Monday, January 25, 2010

KOffice and RDF: Say it with Style...

This scattered series of posts has been about the RDF support I'm working on for KOffice. The ODF document format lets you store RDF/XML data inside the document file, which in turn lets both a human reader and a computer know about things that comprise an office document. You can refer to a person, place, or time and have the computer know what you are saying without having to resort to heuristics.
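To make "the computer knows what you are saying" a little more concrete, here is a minimal sketch of a contact expressed as RDF-style triples. It is only an illustration: the subject URI and the phone number are made up, and the choice of the FOAF vocabulary (`foaf:name`, `foaf:phone`) is an assumption for the example, not a description of what KOffice writes into the file.

```python
# Illustrative sketch only; this is not KOffice code.
# The subject URI and the values are hypothetical.
FOAF = "http://xmlns.com/foaf/0.1/"
james = "http://example.com/contacts/james"

# Each fact is a (subject, predicate, object) triple.
triples = [
    (james, FOAF + "name", "James"),
    (james, FOAF + "phone", "tel:+1-555-0100"),
]

def objects(triples, subject, predicate):
    """Return every object stored for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# No heuristics needed: the phone number is looked up, not guessed from text.
phone = objects(triples, james, FOAF + "phone")
```

Because the phone number hangs off an explicit predicate, a program can find it without scanning the text for something that looks like digits.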

Having RDF support in document formats means you can send somebody a single file containing exact information about real world events. The RDF can contain details which can be pulled up in the formatting of the text that you see. For example, for a given contact you might know his phone number, home page, normal business location, email address etc. You might only want to see a small fraction of this information at one place in a document, but perhaps for a header you want to know the postal address too. Stylesheets are what I'm working on right now to let that happen.

At the start of the video below, you can see James, Joyce and Mark. As I click on these contacts, the RDF docker tells me information about them. As you can see, there is more information known to KOffice than is shown in the document (first & last name). However, for Mark, we also know where he is and that is shown in the RDF docker.

James is mentioned in the second paragraph, and the document is talking about giving him a call to verify something. Instead of hunting down his phone number, you can set a semantic stylesheet for that particular reference to him so that his phone number is included inline in the document text. The added advantage here is that if you edit his phone number via the RDF docker, all the places in the document text that cite the phone number are updated for you. KOffice knows that those digits are James' phone number, so it can modify them for you.
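The idea can be sketched in a few lines of Python. This is not how KOffice implements it; the property names, the stylesheet names, and the phone numbers are all hypothetical. A stylesheet is modelled as a template naming which properties to show, and each reference in the text remembers which stylesheet it uses, so the text can be regenerated whenever the underlying data changes.

```python
# Illustrative sketch only; none of these names come from KOffice.
from string import Template

# Properties known about one contact (hypothetical keys and values).
contact = {"name": "James", "phone": "555-0100"}

# A "semantic stylesheet" is modelled as a template over those properties.
stylesheets = {
    "name": Template("$name"),
    "name, phone": Template("$name, $phone"),
}

def render(contact, stylesheet_name):
    """Produce the inline text for one reference to the contact."""
    return stylesheets[stylesheet_name].substitute(contact)

def render_all(contact, references):
    """Re-render every reference using the stylesheet it was assigned."""
    return [render(contact, name) for name in references]

# Two references to James in the text, each with its own stylesheet.
references = ["name", "name, phone"]

before = render_all(contact, references)

# Editing the data once changes every place that displays it.
contact["phone"] = "555-0199"
after = render_all(contact, references)
```

The point of the design is that the document text is derived from the data, so a single edit to the contact is enough to refresh every citation of it.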

Later on we again cite the event itself, just saying it's "next weekend", which isn't an ideal description of when a specific event is happening. Luckily, we have cited the RDF event, so it shows up in the RDF docker and the stylesheets are available to reformat the text. In this case I want to see the summary and when it starts.

I'm working on adding user specified stylesheets now too, as the Format menu shows in the video. When you create a user stylesheet it is also saved in RDF, so the stylesheets you make become part of the document itself. They will be available when some other KOffice user loads the document.

The File/Document Information widget has a new RDF section which lets you see and directly edit the RDF triples if that's your thing. The semantic tab shows you all the higher level things that KOffice has seen in the document, like people, places, and events. Finally, the stylesheets page lets you nominate how you want things formatted by default. For example, you might want to see a person's name and phone number, so setting that as the default lets you then drag and drop some contacts from kaddressbook into the document and see the phone number as part of the document text.

Of course, you can drag and drop items from the RDF docker into kaddressbook and korganizer. These pieces of information should be able to be moved into and out of an ODF file using KOffice without thinking about it. If you want to add Fred to the text, pick him up from your kaddressbook and drop him into the RDF docker. Your default contact stylesheet is then used to insert some text into the document at the current cursor location showing you the Fred contact. Quick and simple... Let's make RDF something everybody uses but nobody needs to learn about (unless they want to).

KOffice and RDF: Say it with Style... from Ben Martin on Vimeo.


rullzer said...

This sounds very cool, more semantics in documents is always a good idea!

Will you also add foaf support? Then it is not me that has to update contact data but the person who owns the foaf account.

Ivan Čukić said...

Awesome! Can't wait to have all (KDE) applications go semantic :)

Michael said...

Sounds really interesting. However, I am not sure about how the data is stored. Is the semantic data stored with the document? If so, this could easily become a privacy issue, couldn't it?

monkeyiq said...

@rullzer: The contacts are indeed just FoaF triples. Calendar events are in the rdfical format, and location data is available as geo84 or other triples.
So you can get them from other folks already if they publish it that way.

monkeyiq said...

@Michael: Yes the RDF/XML is stored inside the zip file that an ODF file actually is. And yes, it could become a privacy issue. No more or less than any other data you transmit though. Although D&D makes it convenient to get semantic data into the document, you still have to manually add the data that way.

But some default "cleaning" options might be nice, so you can remove some information from documents on save/export.

Michael said...

@monkeyiq: I agree about the default cleaning. Personally I just don't think it's a good idea to have hidden private data stored in a document. Could make for some nasty surprises.

Don't get me wrong, I think having semantic data stored in a document is a great idea. It just seems to be a double-edged sword, and I would like to see one side blunted.

rullzer said...

@Michael: It depends, you usually know what you are writing about. Semantics is most of the time useful for other people. The security concern still holds of course!

rullzer said...

@monkeyiq: Then maybe for the next version it would be good to link semantic data related to a person to an email address, as a sha1sum of course.

That way the local host that views the document can look into its own address book and find the corresponding contact.

Or even better link it to a foaf.rdf file.

leobard said...

this rocks! Tell the NEPOMUK-KDE folks more about it. And please add a video-version with audio commentary and add a video where you show how to drag-drop appointments or contact entries from the address book or calendar to KOffice!

because you realized the vision of TimBl's 2001 Article "The Semantic Web" :-)