Showing posts with label semantics. Show all posts

Sunday, January 8, 2012

RDF, Abiword and Relations

Abiword now has growing GUI support for editing RDF in ODF documents. Much of this support complements what is available in Calligra for RDF handling. There are some areas where Calligra has more features and some areas where Abiword now does. Hopefully both will continue to have a large and growing shared feature base.

As some folks will know, ODF allows one or more RDF/XML files to be shipped inside the ODF file, and that RDF can link to the document content in the content.xml file. This means that you can explicitly say that a 1/2 inch bolt is from a particular maker and was procured on the 3rd of January 2012 by going to their office at geolocation ?x. Handy when you are reading the document a year down the line and want to know which office you bought the bolt from and the exact length of its thread.

In the image below, one can see the purple underlined pieces of text. Each of these has some RDF associated with it. The citation to Dan Brickley has both contact and location RDF associated with it. Looking at the toolbar, towards the right side you see an "R" to start a change tracked document, and the "RF" button. Sorry about the images there; I draw at the 3 year old level, so my icons are not quite polished, shall we say. Anyway, the "RF" button selects the current reference to an RDF item. So if the cursor is inside the purple "James", say between the "a" and the "m", then the whole word will be selected with this button. The ">" and "<" buttons then allow you to move to the next and previous reference to the selected RDF item. You will notice another purple James later in the document; this is the second and only other reference to him, and you can move between the two using these buttons.



These purple links I call "RDF links", which is a bit of a play on hyperlinks or bookmarks. Behind the scenes they are implemented using xml:id values and pkg:idrefs to those from RDF.

A feature added in the last few days is the ability to capture and navigate by the relation between two RDF links. This is currently done by selecting a "source" link, then clicking another RDF link and setting its relation to the source. So in the below screenshot I am saying that Mark foaf:knows James. You will also notice the "Find by Relation" option, which I can then use to see the people that Mark foaf:knows. In the spirit of the foaf definition, I have made this a symmetric relation. So there is no stalking, or "following": if James knows Mark then Mark knows James. Asymmetric relationships are also possible, like son, child, or contains. I am hoping to add this feature to Calligra too in the future, as the relations between RDF objects are one of the more powerful features that can be offered by using RDF in a document.
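The difference between a symmetric and an asymmetric relation can be sketched in a few lines of Python. This is only an illustration over an in-memory set of triples, not how Abiword stores relations; the names `SYMMETRIC`, `add_relation`, `find_by_relation`, the `uri:` subjects and the `rel:childOf` predicate are all made up for the example.

```python
# Hypothetical in-memory triple store; not Abiword's actual data model.
SYMMETRIC = {"foaf:knows"}  # assumption: predicates we choose to treat as symmetric


def add_relation(triples, subject, predicate, obj):
    """Add (subject, predicate, obj); mirror it when the predicate is symmetric."""
    triples.add((subject, predicate, obj))
    if predicate in SYMMETRIC:
        triples.add((obj, predicate, subject))


def find_by_relation(triples, subject, predicate):
    """Return the sorted objects that subject is related to via predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


triples = set()
add_relation(triples, "uri:mark", "foaf:knows", "uri:james")   # symmetric
add_relation(triples, "uri:mark", "rel:childOf", "uri:alice")  # asymmetric

print(find_by_relation(triples, "uri:mark", "foaf:knows"))    # ['uri:james']
print(find_by_relation(triples, "uri:james", "foaf:knows"))   # ['uri:mark']
print(find_by_relation(triples, "uri:alice", "rel:childOf"))  # []
```

Mirroring the triple at insert time keeps the lookup trivial; an alternative would be to store one direction only and check both directions when searching.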



Note that the prev/next RDF item buttons work with relations. If I pick Mark and navigate to the James he knows, then I can "next" from that James to select the second reference to James. This is one recurring theme of RDF in ODF: RDF objects like contacts can be cited or linked zero or more times in the text content of the document.

As mentioned above, the converse is also true: the Dan Brickley text has two logical RDF objects linked to it, Dan's contact information and his location. Handling multiple objects for a single site is a little tricky; in this case, editing brings up a window containing both semantic objects so that you can edit the RDF Abiword knows about. Note that this dialog is actually backed by two (or more) SPARQL queries.



If you are not squeamish about your triples then the "Show RDF" option for an RDF link will let you get right at them and edit away as shown below. There are a few technically cool things about this dialog: firstly, the "Restrict to RDF Link" combo box lets you select one or more RDF links that the triples will have to be associated with, and secondly, Abiword makes sure any edits you make are properly linked to the RDF link you right clicked on. What I mean by this last bit is that if you right click the RDF link "alice33" and add a new triple "uri:alice myvocab:likes uri:bob", then Abiword will add the triple "uri:alice pkg:idref alice33" for you. This is sort of having Abiword do what you mean: you want the new triple to be associated with the link but don't necessarily want to have to say so explicitly every time. By choosing to edit the RDF for the link you have already explicitly said once that you want these things to be linked. This also applies if you change a subject: changing uri:alice to uri:amanda will update the pkg:idref values for you. Keep it linked, keep it valid.
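The relinking behaviour described above can be sketched roughly as follows. This is a toy Python model written for illustration, not the Abiword C++ code: the class name `XmlIdLimitedModel` and its methods are hypothetical, and triples are kept as plain tuples in a set.

```python
PKG_IDREF = "pkg:idref"


class XmlIdLimitedModel:
    """Hypothetical wrapper: edits made through it stay tied to one xml:id."""

    def __init__(self, triples, xmlid):
        self.triples = triples  # shared set of (subject, predicate, object)
        self.xmlid = xmlid      # the RDF link that was right clicked

    def add(self, subject, predicate, obj):
        """Add the triple and make sure its subject is linked to the xml:id."""
        self.triples.add((subject, predicate, obj))
        self.triples.add((subject, PKG_IDREF, self.xmlid))

    def rename_subject(self, old, new):
        """Move every triple about old, including its pkg:idref links, to new."""
        for s, p, o in list(self.triples):
            if s == old:
                self.triples.discard((s, p, o))
                self.triples.add((new, p, o))


model = XmlIdLimitedModel(set(), "alice33")
model.add("uri:alice", "myvocab:likes", "uri:bob")
print(("uri:alice", PKG_IDREF, "alice33") in model.triples)   # True

model.rename_subject("uri:alice", "uri:amanda")
print(("uri:amanda", PKG_IDREF, "alice33") in model.triples)  # True
```

The point of routing edits through a wrapper is that the caller never has to write the pkg:idref triple by hand, so a triple added from a link's dialog cannot end up orphaned from that link.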



Going one level deeper, the model behind the above dialog is actually a subclass of a restricted RDF model created using SPARQL. The SPARQL model is read only, and the subclass, RDFModel_XMLIDLimited, handles mutations by creating a wrapper object which takes care of the automatic triple relinking mentioned above. Those still awake might like to see the Abiword trunk code for src/text/ptbl/xp/pd_DocumentRDF.cpp.

This is part of an ongoing mad hacking sprint that is leading up to a talk at LCA 2012, which starts in a week. Many of the things I mention here are not in trunk yet, and have only been tested on Linux/GTK+3. Those in Ballarat in a week might like to pop in to the talk given by Martin and myself on Friday the 20th.

Saturday, May 8, 2010

Desktop Annotations...

OK, so a post about annotating files and desktop search. KDE guys might be interested because libferris uses Soprano, the base of RDF on that desktop. Maemo guys might be interested because all this works on that platform too, and I have a specialized index structure in libferris for devices at the N810 power level.

Tagging and Annotations are closely related. Tags (or emblems in libferris terms) are great for assigning one or more concepts to a file.
Annotations are great for adding some free text to something. While a short annotation might look like one, two, or three tags, as in the example below, annotations also carry linguistic weight. Normally with tags you don't care about the order in which the tags were added to the file. With an annotation and a full text index you should be able to do proximity matching and ordered searching, so that "seed collection" matches but "collection seed" doesn't. There are also issues of human language stemming, which many tag systems silently ignore but full text indexes tend to have to address.
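To make the tag versus annotation distinction concrete, here is a small Python sketch. It is not how libferris indexes anything; `tag_match` and `phrase_match` are hypothetical helpers, and the phrase match is a naive scan over whitespace-split words with no stemming.

```python
def tag_match(tags, wanted):
    """Tags are an unordered set: every wanted tag must simply be present."""
    return set(wanted) <= set(tags)


def phrase_match(text, phrase):
    """Ordered match: the phrase words must appear consecutively, in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


tags = {"seed", "collection"}
annotation = "photos of the seed collection in the shed"

print(tag_match(tags, ["collection", "seed"]))      # True, order is irrelevant
print(phrase_match(annotation, "seed collection"))  # True
print(phrase_match(annotation, "collection seed"))  # False, wrong order
```

A real full text index would answer the ordered query from stored word positions rather than rescanning the text, and would usually stem words first so that "seeds" and "seed" match.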

Below is the ego file manager 0.30.0 with the Annotation side panel. This panel auto saves the annotation if you select a new file or if you stop changing it for a few seconds. Hotkeys make this all quite handy. I'm using Control-t to start interactive tagging and Control-6 to switch to the annotation side panel with focus in the text block there. Hitting tab in the annotation side panel moves focus to the file list, so you can skip to and from annotating each file without the rodent.


You can of course add, view, and edit these same annotations from the command line. The fedit command runs vi on the annotation, allowing you to freely change it.
I have just fixed a slight inconsistency in the fedit command so it now accepts the "-a attribute" option too. The fcat command, given the same "-a attribute" option, prints that attribute (here the annotation) for the file.

$ >| tfile
$ fedit -a annotation tfile
...
$ fcat -a annotation tfile
hi, the new annotation

$ feaindexquery '(annotation=~new)'
Found 1 matches at the following locations:
file:///tmp/tfile

Another nifty trick is to see the annotation right inside an fls output. Use mtime-display if you want the time to be more human readable than an epoch time_t.

$ fls --show-ea name,size,mtime,annotation tfile
tfile 0 1265954823 hi, the new annotation

Remember also that fls --xml gives you XML output, so with an XSLT stylesheet you can serve a directory of files and their annotation through your web server.
If this is of interest, see apps/phpsearchinterface/xml-results-to-xhtml.xsl for an initial stylesheet with row colour striping. It should be easy to extend the stylesheet to present other attributes. Bonus marks for anyone who makes it handle arbitrary XML attributes and orders them according to a predefined POSET. Patches always welcome...

When libferris saves an annotation, if DBus is enabled a signal is emitted on the session bus ("org.libferris.desktop", "AnnotationSaved") which carries the URL and path of the file you changed. This not only allows reindexing to happen; you are also free to hook up some Perl or whatnot to monitor this signal and run some logic to work out what to do. If an annotation is saved, you might like to update an RSS feed, for example.

And so ends the libferris tip of the day... happy annotating!
This post has been fueled by 99% cocoa, thanks to Jan-Piet Mens ;-)