Sunday, June 10, 2018

A new libferris is coming! 2.0.x

A while back I ported most of the libferris suite over to using boost for smart pointers and for signals. The latter was not such a problem, but there were always some fringe cases with the former, and this led to a delay in releasing it because there were some known issues.

I have moved that code into a local branch and reverted to using the Modern C++ Loki library for intrusive reference counting, and sigc++ for signals. I imported my old test suite into the main libferris repo and will flesh it out over time.

I might do a 2.0.0 or 1.9.9 release soonish so that the entire stack is out there. As this brings back the memory management code that has been working fine for the last 10 years, it shouldn't be any more unstable than it was before.

I was tempted to use Travis CI for testing but will likely move to using a local VM. Virtualization has become much more convenient, and I'm happy to set up a local test VM for this task; it also breaks a dependency on outside companies that really doesn't need to be there. Yes, I will host releases and a copy of the git repo somewhere like github or gitlab to make distribution more convenient. On the other hand, anyone can run the test suite, which will be in the main libferris distro, if they feel the desire.

So after this next release I will slowly, at leisure, work to flesh out the test suite and fix the issues I find by running it over time. This gives much more incremental development, which will hopefully be friendlier to the limited-time patches that I throw at the project.

One upside of being fully at the mercy of my time is that the project is less likely to die or be taken over by a company and lead in an unnatural direction. The downside is that it relies on my free time which is split over robotics, cnc, and other things as well as libferris.

As some have mentioned, a flatpak or docker image for libferris would be nice. Ironically this makes the whole thing a bit more like plan9, with a filesystem microkernel-like subsystem (container), than just running it natively through rpm or deb, but whatever makes it easier.

Wednesday, June 8, 2016

libferris 2.0

A new libferris is coming. For a while I've been chipping away at porting libferris and its tree of projects over to using boost instead of the loki and sigc++ libraries. This has been a little difficult in that it is a major undertaking, and you need to get it fully working or things segv in wonderful ways.

Luckily there are tests for things like stldb4, so I could see that things were in decent shape along the way. I have also started to bring the dejagnu test suite for libferris back into the main tree. This has given me some degree of happiness that libferris is working OK with the new boost port.

As part of that I've been working on allowing libferris to store its settings in a configurable location. Setting that configuration is a chicken-and-egg problem: you need to be able to load a configuration in order to set the setting. At the moment it uses an environment variable. I think I'll expand that to allow a longer list of default locations to be searched. For example, on OSX libferris could check /Applications/libferris.app/whatever as a fallback, so you could just install and run the ferris suite without any more setup than a simple drag and drop.
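The fallback search could look something like this sketch. The LIBFERRIS_CONFIG_DIR variable name and the candidate paths are my own illustrative assumptions, not libferris's actual names:

```python
import os

def find_config_dir(candidates=None):
    """Return the first usable configuration directory.

    Checks a hypothetical environment variable first, then falls back
    to a list of default locations, mirroring the approach described
    above. Names and paths are illustrative only.
    """
    env = os.environ.get("LIBFERRIS_CONFIG_DIR")  # hypothetical variable name
    if env and os.path.isdir(env):
        return env
    if candidates is None:
        candidates = [
            os.path.expanduser("~/.ferris"),
            "/Applications/libferris.app/Contents/Resources/etc/ferris",
        ]
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None
```

The point of the ordered list is that a drag-and-drop install can work with zero setup: the bundled fallback location is only consulted when nothing more specific is configured.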

For those interested, this is all pushed up to github so you can grab and use it right now. Once I have expanded the test suite further I will likely make an announced 2.0 release with tarballs and possibly deb/rpm/dmg distributions.

New filesystems that I've had planned are for mounting MQTT, ROS, and YAML.

Friday, July 17, 2015

OSX Bundling Soprano and other joys

Libferris has been moving to use more Qt/KDE technologies over the years. Ferris is also a fairly substantial software project in its own right, with many plugins and support for multiple libraries. Years back I moved from using raw redland to using soprano for RDF handling in libferris.

Over recent months, from time to time, I've been working on an OSX bundle for libferris. The idea is to make installation as simple as copying Ferris.app to /Applications. I've done some OSX packaging before, so I've been exposed to the whole library-paths-inside-dylib stuff, and also the freedesktop specs expecting things in /etc or wherever when you really want them to look into /Applications/YourApp/Contents/Resources/.../etc/whatever.

The silver test for packaging is to rename the area used to build the source to something unexpected and see if you can still run the tools. The gold test is obviously to install from the app.dmg onto a fresh machine and see that it runs.

I discovered a few gotchas during silver testing and soprano usage. If you get things half right then you can get to a state that allows the application to run but that does not allow a redland RDF model to ever be created. If your application assumes that it can always create an in memory RDF store, a fairly secure bet really, then bad things will befall the app bundle on osx.

Plugins are found by searching for the desktop files first and then loading the shared library plugin as needed. The desktop files can be found with the first line below, while the second line allows the plugin shared libraries to be found and loaded.

export SOPRANO_DIRS=/Applications/Ferris.app/Contents/Resources/usr/share
export LD_LIBRARY_PATH=/Applications/Ferris.app/Contents/Resources/usr/local/lib/soprano/

You have to jump through a few more hoops. You'll find that the plugin ./lib/soprano/libsoprano_redlandbackend.so links to lib/librdf.0.dylib and librdf will link to other redland libraries which themselves link to things like libxml2 which you might not have bundled yet.

There are also many cases of things linking to QtCore and other Qt libraries. These links are normally to nested paths like Library/Frameworks/QtCore.framework/Versions/4/QtCore which will not pass the silver test. Actually, links inside dylibs like that tend to cause the show to segv and you are left to work out where and why that happened. My roll by hand solution is to create softlinks to these libraries like QtCore in the .../lib directory and then resolve the dylib links to these softlinks.
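The relinking step can be sketched as a small plan generator. install_name_tool's -change flag is the real macOS tool for rewriting a dylib's dependency entries, but the helper, the binary name, and the @executable_path prefix here are illustrative assumptions about the hand-rolled approach, not libferris's packaging scripts:

```python
import os

def relink_commands(binary, dep_paths, prefix="@executable_path/../lib"):
    """Plan install_name_tool invocations that repoint each nested
    framework dependency (e.g. .../QtCore.framework/Versions/4/QtCore)
    at a flat softlink under the bundle's lib directory.
    A sketch of the roll-by-hand solution described above."""
    cmds = []
    for dep in dep_paths:
        name = os.path.basename(dep)  # e.g. QtCore
        cmds.append(["install_name_tool", "-change", dep,
                     prefix + "/" + name, binary])
    return cmds
```

Each planned command rewrites one dependency entry; you would still create the matching softlinks in the lib directory so the rewritten paths resolve at load time.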

In the end I'd also like to make an app bundle for specific KDE apps. Just being able to install okular by drag and drop would be very handy. It is my preferred reader for PDF files and having a binary that doesn't depend on a build environment (homebrew or macports) makes it simpler to ensure I can always have okular even when using an osx machine.


Monday, July 29, 2013

GDrive mounting released!

So version libferris-1.5.18.tar.xz is hot off the make dist, including the much-discussed support for mounting Google Drive. The last feature I decided to add before rolling the tarball was support for viewing and adding to the sharing information of a file. Being able to "cp" a file to google://drive wasn't worth much without being able to unlock it for particular people I know should have access to it. So now you can do that from the filesystem as well.

So, since the previous posts have been about the GDrive API and various snags I ran into along the way, this post is about how you can actually use this stuff.

Firstly, run up the ferris-capplet-auth app and select the GDrive tab. I know I should overhaul the UI for this auth tool, but since it's mostly used only once per web service I haven't found the personal desire to beautify it. Inside the GDrive tab, clicking on the "Authenticate with GDrive" button opens a dialog (which should become a wizard). The first thing to do, as it tells you, is visit the console page on google to enable the GDrive API. Then click or paste the auth link in the dialog to allow libferris to get its hands on your data. The auth link goes to google and tells you what libferris is asking for. When you OK that, you are given a "code" that you have to copy and paste back into the lower part of the dialog window. OKing the dialog will then have libferris get a proper auth token from google, and you are all set.

So to get started the below command will list the contents of your GDrive:

$ ferrisls google://drive


To put a file up on there you can do something like;

$ date >/tmp/sample.txt
$ ferriscp /tmp/sample.txt google://drive


And you can get it back with cat if you like. Or ferriscp it somewhere else etc.

$ fcat google://drive/sample.txt
Mon Jul 29 17:21:28 EST 2013


If you want to see your shares for this new sample file use the "shares" extended attribute.

$ fcat -a shares google://drive/sample.txt
write,monkeyiq

The shares attribute is a BINEBO (Bytes In Not Equal Bytes Out). Yay for me coining new terms! This means that what you write to it is not exactly what you will get when you read back from it. The handy part is that if you write an email address into the extended attribute, you are adding that person to the list of folks who can write to the file. Because I'm using libferris without FUSE and bash doesn't understand libferris URLs, I have to use ferris-redirect in the below command. You can think of ferris-redirect like shell redirection (>), but it can also take the extended attribute to redirect data into with (-a). If I read back the shares extended attribute I'll see a new entry in there. Google will also have sent a notification email to my friend with a link to the file.

$ echo niceguy@example.com \
   | ferris-redirect -a shares google://drive/sample.txt
$ fcat -a shares google://drive/sample.txt
write,monkeyiq
write,Really Nice Guy
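A toy model of the BINEBO idea, assuming a simple contacts map. None of this is the actual libferris implementation; it just illustrates why the bytes read back differ from the bytes written:

```python
class SharesAttribute:
    """Toy BINEBO attribute: bytes written are a command (an email
    address to grant write access); bytes read back are the current
    share list, using display names where known."""

    def __init__(self, contacts):
        self.contacts = contacts          # email -> display name
        self.shares = ["write,monkeyiq"]  # the owner's entry

    def write(self, data):
        email = data.strip()
        name = self.contacts.get(email, email)
        self.shares.append("write," + name)

    def read(self):
        return "\n".join(self.shares)
```

Writing "niceguy@example.com" and reading back "write,Really Nice Guy" is exactly the in/out asymmetry the term is naming.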

I could also hook your "contacts" up to this, so your evolution addressbook nicknames or google contacts could be used to look up a person. In the example above (names changed to protect the innocent), google thinks the name for that email address is Really Nice Guy because he is in my contacts on gmail.

All of this extends to the other virtual filesystems that libferris supports. You can "cp" from your scanner or webcam, or a tuple of a database, directly to google drive if that floats your boat.

I've already had a bit of a sniff at the dropbox API and others, so you might be able to bounce data between clouds in a future release.

Saturday, July 27, 2013

The new google://drive/ URL!

The very short story: libferris can now mount Google Drive as a filesystem. I've placed that in google://drive and will likely make an alias from gdrive:// to that same location so either will work.

The new OAuth 2.0 standard is so much easier to use than the old 1.0 version. In short, after being identified and given the nod once by the user, in 2.0 you only have to supply a single secret with each request; in 1.x you have to use a per-message nonce, create hashes, send the key and token, etc. The main drawback of 2.0 is that you have to use TLS/SSL for each request to protect that single auth token. A small price to pay, as you might well want to protect the entire conversation if you are doing things that require authentication anyway.
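The difference in per-request effort can be sketched roughly like this. The OAuth 1.0a side is heavily simplified (real 1.0a also normalizes and percent-encodes the parameters into the signature base string):

```python
import base64
import hashlib
import hmac

def oauth2_headers(token):
    # OAuth 2.0: the whole per-request cost is one bearer header,
    # protected by sending it over TLS.
    return {"Authorization": "Bearer " + token}

def oauth1_signature(consumer_secret, token_secret, base_string):
    # OAuth 1.0a: every request additionally needs a nonce, timestamp,
    # and an HMAC-SHA1 signature over a normalized base string
    # (construction of that base string is elided here).
    key = (consumer_secret + "&" + token_secret).encode()
    mac = hmac.new(key, base_string.encode(), hashlib.sha1)
    return base64.b64encode(mac.digest()).decode()
```

The signing key format (consumer secret and token secret joined by "&") and HMAC-SHA1 are per the OAuth 1.0a spec; everything a 2.0 client skips is the second function.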

A few caveats of the current implementation: mime types on uploaded files are based on file name sniffing. That is because for an upload you might be using cp foo.jpg google://drive, and the filesystem just copies the bytes over; but GDrive needs to know the mimetype for that new File at creation time. The GDrive PATCH method doesn't seem to let you change the mimetype of a file after it has been sent. A better solution will involve the cp code prenotifying the target location so that some metadata (the mimetype) can be prefetched from the source file if desired. That would allow full byte sniffing to be used.
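Filename-based sniffing of the kind described can be done with Python's stdlib, as a rough illustration of the approach (this is not the libferris code):

```python
import mimetypes

def sniff_mimetype(filename, default="application/octet-stream"):
    """Guess a mimetype from the file name alone, as needed when the
    target must be declared at creation time, before any bytes are
    available for content sniffing."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default
```

The fallback default matters: an upload API generally wants some declared type even when the extension tells you nothing.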

Speaking of PATCH, if you change metadata using it, you always get back a 200 response. No matter what. Luckily you also get back a JSON string with all the metadata for the file you have (tried to) update. So I've made my PATCH caller code ignore the HTTP response code and compare the returned file JSON to see whether the changes actually stuck. If a value isn't set as expected, my PATCH throws an exception. This is in contrast to the docs for the PATCH method, which claim that the file JSON is only returned "if successful".
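The workaround amounts to diffing the requested changes against the returned File JSON. A minimal sketch, with made-up field values:

```python
def patch_stuck(requested_changes, returned_file):
    """Compare the metadata changes we asked for with the File JSON the
    server returned, since the HTTP status is always 200. Returns the
    subset of changes that did NOT stick; empty means success."""
    return {k: v for k, v in requested_changes.items()
            if returned_file.get(k) != v}

changes = {"title": "new title", "description": "hello"}
response = {"title": "new title", "description": "old"}
failed = patch_stuck(changes, response)  # → {'description': 'hello'}
```

A caller would raise an exception when the returned dict is non-empty, matching the behaviour described above.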

Oh yeah, one other tiny thing about PATCH. If you patch the description, it didn't show up in Firefox for me until I refreshed the page. Changing the title does update the Firefox UI automatically. I guess the side panel for the description hasn't got the funky web notification love yet.

There are two ways I found to read a directory: using files/list and children/list. Unfortunately the latter, while returning only the direct children of a folder, also returns only a few pieces of information for those children, the most interesting being the child's id. On the other hand, files/list gives you almost all the metadata for each returned File. So on a slower link, one doesn't need thinking music to work out whether one round trip or two is the desired number. The files/list call also returns metadata for files that have been deleted, and files which others have shared with you. It is easy to set a query "hidden = false and trashed = false" for files/list to not return those dead files. Filtering on the server exclusively for files that you own is harder. There is a query alias sharedWithMe but no OwnedByMe to return the complementary set. I guess perhaps "not sharedWithMe" would == OwnedByMe.
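Building such a filtered files/list request looks roughly like this; the endpoint path matches the v2-era Drive API, but treat the details as illustrative rather than a working client:

```python
from urllib.parse import urlencode

def files_list_url(query="hidden = false and trashed = false"):
    """Build a files/list URL with the server-side filter quoted above,
    so hidden and trashed files never come back over the wire at all."""
    base = "https://www.googleapis.com/drive/v2/files"
    return base + "?" + urlencode({"q": query})
```

Pushing the filter into the q parameter avoids fetching full metadata for dead files only to discard it client-side.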

Currently I sort of ignore the directory hierarchy that files/list returns, so all your drive files appear directly in google://drive/ instead of in subdirs as appropriate. I might leave that restriction in for the first release. It's not hard to remove, but I've been focusing on upload, download, and metadata change.

Creating files, updating metadata, and downloading files from GDrive all work and will be available in the next libferris release. I have one other issue to cleanup (rate limiting directory read) before I do the first libferris release with gdrive mounting.

Oh, and big trap #2 for the young players: to actually *use* libferris on gdrive after you have done the OAuth 2.0 "yep, libferris can have access" dance, you have to go to code.google.com/apis/console and enable the drive API for your account, otherwise you get access denied errors for everything. And once you go to the console and do that, you'll have to OAuth again to get a valid token.

A huge thank you to those who contributed to the ferris fundraising after my last post proposing mounting Google Drive!

Monday, July 22, 2013

Mounting Google Drive?

So on the heels of resurrecting and expanding the support for mounting vimeo as a filesystem using libferris I started digging into mounting Google Drive. As is normally the case for these things, the plan is to start out with listing files, then uploading files, then downloading files, then updating the metadata for files, then rename, then delete, and with funky stuff like "tail -f" and append instead of truncate on upload.

One plus of all this is that the index & search in libferris will then extend its claws to GDrive as well as desktop files, since I&S is built on top of the virtual filesystem and uses the virtual filesystem to return search results.

For those digging around, maybe looking to do the same thing, see the oauth page for desktop apps; the meat seems to be in the Files API section. Reading over some of the API, the docs are not too bad. The files.watch call is going to take some testing to work out what is actually going on there. I would like to use the watch call for implementing "tail -f" semantics on the client, which is in turn most useful with open(append) support. The latter I'm still tracking down in the API docs, if it is even possible. PUT seems to update the whole file, and PATCH seems very oriented towards doing partial metadata updates.

The trick that libferris uses of exposing the file content through the metadata interface seems to be less used by other tools. With libferris, using fcat and the -a option to select an extended attribute, you can see the value of that extended attribute. The content extended attribute is just the file's content :)

$ date > df.txt
$ fcat -a name df.txt
df.txt
$ fcat -a mtime-display df.txt
13 Jul 23 16:33
$ fcat -a content df.txt
Tue Jul 23 16:33:51 EST 2013

Of course you can leave out the "-a content" part to get the same effect, but anything that is wanting to work on an extended attribute will also implicitly be able to work on the file's byte content as well with this mechanism.

If anyone is interested in hacking on this stuff (: good ;) patches accepted. Conversely if you would like to be able to use a 'cp' like tool to put and get files to gdrive you might consider contributing to the ferris fund raising. It's amazing how much time these Web APIs mop up in order to be used. It can be a fun game trying to second guess what the server wants to see, but it can also be frustrating at times. One gets very used to being able to see the source code on the other side of the API call, and that is taken away with these Web thingies.

Libferris is available for Debian armhf (hard float) and Debian armel (soft float). I've just recently used the armhf build to install ferris on an OMAP5 board. I also have a build for the Nokia N9 and will update my Open Build Service project to roll fresh rpms for Fedora at some stage. The public OBS desktop targets have fallen a bit behind the ARM builds because I tend to develop on, and thus build from source on, desktop.

Saturday, July 20, 2013

Like a Bird on a Wire(shark)...

Over recent years, libferris has been using Qt to mount some Web stuff as a filesystem. I have a subclass of QIODevice which acts as an intermediary, allowing one to write to a std::ostream and stream that data to the Web, over a POST for example. For those interested, that code is in Ferris/FerrisQt.cpp of the tarball. It's a bit of a shame that this Qt-heavy web code isn't in KIO, or that the two virtual filesystems are not more closely linked, but I digress.

I noticed a little while ago that cp to vimeo://upload didn't work anymore. I had earmarked that for fixing and recently got around to making that happen. It's always fun interacting with these Web APIs. Over time I've found that Flickr sets the bar for well documented APIs that you can start to use if you have any clue about making GET and POST requests etc. At one stage google had documented their API in a way that you could never use it. I guess they have fixed that by now, but it did sort out the pretenders from those who could at least sniff HTTP and were determined to win. The vimeo documentation IIRC wasn't too bad when I added upload support, but the docs have taken a turn for the worse it seems. Oh, one fun tip for the young players: when one API call says "great, thanks, well done, I've accepted your call" and then a subsequent one says "oh, a strange error has happened", you might like to assume that the previous call might not have been so great after all.

So I started tinkering around, adding oauth to the vimeo signup and getting the getTicket call to work. Having getTicket working meant that my oauth-signed call was accepted too. I was then faced with the upload of the core data (which is normally done with a rather complex streaming POST), and the final "I'm done, make it available" call. On vimeo that last call seems to be two calls now: first a verifyChunks call and then a complete call.

So, first things first. To upload you call getTicket, which gives you an endpoint (an HTTP URL to send the actual video data to) as well as an upload ticket to identify the session. If you POST to that endpoint URL and the POST encodes the CGI parameters as multipart/form-data, with boundaries and individual Content-Disposition: form-data elements, you lose. You have to have the ticket_id in the URL itself, right after the POST verb in the request line, in order to upload. One little trap.
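In code, avoiding the trap is just putting the ticket in the URL rather than the form body. A sketch, with the ticket_id parameter name as described above and everything else illustrative:

```python
from urllib.parse import urlencode

def upload_url(endpoint, ticket_id):
    """Append ticket_id to the endpoint URL itself, rather than sending
    it as a multipart/form-data field in the POST body -- the trap
    described above."""
    sep = "&" if "?" in endpoint else "?"
    return endpoint + sep + urlencode({"ticket_id": ticket_id})
```

The request body is then free to carry nothing but the raw video bytes.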

So then I found that verifyChunks was returning "Error 709: Access to the chunk list failed", and that was after the upload had been replied to with "OK. Thanks for the upload.". Oddly, I also noticed that the upload of video data would hang from time to time. So I let the shark out of the pen again, and found that vimeo would return its "yep, we're done, all is well" response to the HTTP POST at about 38-42kb into the data. Not so great.

Mangling the vimeo.php test they supply to upload with my oauth and libferris credentials, I found that the POST had an Expect: 100-continue header. Right after the headers were sent, vimeo gave the nod to continue, and then the POST body was sent. I assume that just ploughing through and giving the headers followed by the body confused the server end, and thus it just said "yep, ok, thanks for the upload" and dropped the line. It then of course forgot the ticket_id because there was no data for it, so verifyChunks got no chunk list and returned the strange error it did. mmm, hindsight!

So I ended up converting from the POST to the newly available PUT method for upload. They call that their "streaming API", even though you can of course stream to a POST endpoint too; you just need to frame the parameters and add the MIME trailer to the POST if you want to stream a large file that way. Using PUT I was then able to verify my chunks (or the one single chunk, in fact) and the upload complete method worked again.

In the end I've added oauth to my vimeo mounting, many thanks to the creators of the QOAuth library!

Wednesday, May 8, 2013

Save Ferris: Show some love for libferris...

Libferris has been gaining some KDE love in recent times. There is now a KIO slave to allow you to see libferris from KDE, also the ability to get at libferris from plasma.

I've been meaning to update the mounting of some Web services like vimeo for quite some time. I'd also like to expand to allow mounting google+ as a filesystem and add other new Web services.

In order to manage time so that this can happen quicker, I thought I'd try the waters with a pledgie. I've left this open ended rather than sticking an exact "bounty" on things. I had the idea of trying a pledgie with my recent investigation into the libferris indexing plugins on a small form factor ARM machine. I'd like to be able to spend more time on libferris, and also pay the rent while doing that, so I thought I'd throw the idea out into the public.

If you've enjoyed the old tricks of mounting XML, Berkeley DB, SQLite, PostgreSQL and other relational databases, flickr, google docs, identica, and others and want to see more then please support the pledgie to speed up continued development. Enjoy libferris!

Click here to lend your support to: Save Ferris: Show some love for libferris and help kick it

Tuesday, April 30, 2013

libferris available for debian arm hard float

To complement my existing packages of the libferris virtual filesystem and index/search suite for soft float debian, I now offer, hot off the press, debian hard float! The distinction between how floating point is handled probably doesn't make a big difference to the operation of libferris, but if you are installing debian on an odroid-u2 then you are likely running hf, and having debs which are hf makes installing libferris a whole bunch simpler.

With the eMMC card on the u2, it is a really enjoyable little server machine to play around with. So far I've done the rudimentary test that XML is mountable as a filesystem and created one or two indexes with the Qt/SQLite index plugin for libferris. Note that in recent releases the sqlite backend is transaction backed, which gives a huge performance increase, and on really IO constrained machines this is even more noticeable. A little tip for those using QtSQL: transactions are not just for making operations atomic; you may find that the whole show runs faster when it is transaction protected.
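The transaction tip is easy to demonstrate with plain sqlite3; the schema here is a made-up stand-in for an index table:

```python
import sqlite3

def bulk_insert(rows, db_path=":memory:"):
    """Insert all rows inside a single transaction. Without it, each
    INSERT becomes its own fsync-bound commit, which is exactly where
    the slowdown on IO-constrained machines comes from."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS idx (path TEXT, mtime INTEGER)")
    with conn:  # one BEGIN ... COMMIT around the whole batch
        conn.executemany("INSERT INTO idx VALUES (?, ?)", rows)
    n = conn.execute("SELECT count(*) FROM idx").fetchone()[0]
    conn.close()
    return n
```

The same shape applies under QtSQL with QSqlDatabase::transaction() and commit() around the batch.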

If you haven't played with libferris, things are auto mounted where possible, and there are many coreutils like tools to make interacting with ferris simple. The ferris-redirect is like the bash redirection but can write to any filesystem that libferris knows how to mount.

ben@odroidu2:/tmp/test$ cat example.xml
<top>
<node name="foo">bar
</node>
</top>
ben@odroidu2:/tmp/test$ fls example.xml
top
ben@odroidu2:/tmp/test$ fls example.xml/top
foo
ben@odroidu2:/tmp/test$ fcat example.xml/top/foo
bar
ben@odroidu2:/tmp/test$ date | ferris-redirect -T example.xml/top/foo
ben@odroidu2:/tmp/test$ cat example.xml
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<top>
  <node name="foo">Tue Apr 30 14:57:26 PDT 2013
</node>
</top>

The above interaction would also work for mounted Berkeley DB files and other filesystems, of course.


I have noticed that one of the scoped destructors in the binaries doesn't like the hard float build for whatever reason. This can cause some of the command line tools to not exit gracefully, which is a shame. I can't get a good backtrace for the situation either, which makes tracking it down a nice day-long adventure in trial and error.

So something productive has been generated during the last round of jet lag after all!

The goods: http://fuuko.libferris.com/debian/debian-armhf/


Save Ferris!

Monday, February 18, 2013

Redland && Fedora 18

It appears that redland might be busted in some respects in Fedora 18. After a fresh install I recompiled libferris as usual after an upgrade, only to find that the RDF/Soprano engine for metadata was busted. Digging into this, looking at the soprano sources and trying many versions of soprano and sopranocmd, I thought maybe the issue was one layer closer to the metal. Switching to rdfproc from redland, I noticed that I couldn't create a new on-disk RDF/db store!

$ rdfproc myrdf add \
  'http://witme.sf.net/libferris-core/0.1/subj' \
  'http://witme.sf.net/libferris-core/0.1/pred' \
  'http://witme.sf.net/libferris-core/0.1/obj1'

rdfproc: Failed to open hashes storage 'myrdf'

After downloading redland's source rpm, recompiling it and installing the resulting rpm files I could then create RDF/db again. Same command, different result:

$ rdfproc myrdf add \
  'http://witme.sf.net/libferris-core/0.1/subj' \
  'http://witme.sf.net/libferris-core/0.1/pred' \
  'http://witme.sf.net/libferris-core/0.1/obj1'
rdfproc: Added triple to the graph

I haven't verified this on a second Fedora 18 machine yet, but it might well mean that anyone trying to use the Berkeley db backends of redland on F18 would need an update to redland or to do some tinkering.

Wednesday, November 21, 2012

Libferris as a KIO Slave

The libferris virtual filesystem can now be exposed as a KIO slave too. This allows you to use KDE applications to list, read, and write the vast number of virtual filesystems that libferris makes available. For those who don't know, ferris is a little project I've been working on over the last decade to make everything a file: XML, db, relational data, web services, and even applications like XWindow, Amarok, pulseaudio, and gstreamer.

The following will create a Berkeley db4 file from the command line and show it to you. To dig into such files with libferris you can just read the file directly. So in this case I just grab the "base" directory in this db4 file with konq. You should see the sample and file2 files in that directory view and be able to load and save those "files" into kwrite. Sorry about the video being a tad jumpy, I have to work out what part of my desktop is causing that :/


Using libferris through the KIO interface from Ben Martin on Vimeo.

The commands as plain text:

$ cd /tmp
$ db45_load -T -t btree foo.db
base/sample
value here
base/file2
contents

$ db_dump -p foo.db
$ konqueror ferris:/tmp/foo.db/base

Little tricks I found during the hacking: your listDir() method might call listEntry(e, false) repeatedly, with a final call using (e, true). That last call is impotent with regard to "e" and is just a finalizer call. The get() method uses data() to deliver bytes to the KIO user (application). The put() method uses dataReq() in a loop to grab chunks from the KIO user. Currently I have lazy methods; for example, put() just grabs everything from the KIO user and then operates on the data once it has everything. Really bad for 4gb files, but for smallish files, to get a feel for things, it works quite well.
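The lazy put() described above boils down to an accumulate-then-write loop. A sketch in Python for clarity (the real code is C++ against the KIO slave API):

```python
def lazy_put(data_req):
    """Model of the lazy put(): call a dataReq()-style chunk source in
    a loop, accumulating everything before operating on the data.
    Fine for smallish files; really bad for 4gb ones."""
    buf = bytearray()
    while True:
        chunk = data_req()
        if not chunk:          # empty chunk signals end of stream
            break
        buf.extend(chunk)
    return bytes(buf)
```

A streaming implementation would instead hand each chunk to the backing filesystem as it arrives, keeping memory use constant.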

Also, if you are using library init, in my case using gmodule to dynamically load some plugins from libferris itself, you might be in for a world of fun and games. Currently I spawn processes and interact with them from the KIO slave to get around this issue. For a more efficient implementation, I imagine being able to tell KDE to load my KIO slave as an application with a normal full init leading to a main(), but I haven't looked at that little technicality yet.

I've mostly been tinkering with kioclient, konq, and kwrite on libferris files at the moment. Things are turning out well, though there are still many glitches to this early days integration. This will be released in the next libferris version once I clean it up a bit more.

Monday, November 19, 2012

Mounting KIO with libferris

I know there are more folks interested in going the other way -- seeing libferris through kio glasses. Not that there are hordes of folks in either camp, of course. Nonetheless, the first move has been to allow libferris to mount kio.

The groundwork is very similar to what I'm thinking of using to allow kio to mount libferris. Top level URL schemes will appear first, allowing you to dig into each URL scheme. For example, in libferris kio appears under the kio: filesystem. The first directory in that filesystem is the KIO URL scheme to use. So something like the following will work:

$ date >| /tmp/df.txt
$ fcat kio:file:/tmp/df.txt
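The URL mapping is simple enough to sketch: the first component after kio: names the slave, and the rest is the path handed to that slave. This is a guess at the parsing, not the actual libferris code:

```python
def split_kio_url(url):
    """Split a libferris kio: URL like 'kio:man:/man' into the KIO
    scheme (the top-level directory in the kio: filesystem) and the
    path passed on to that slave."""
    if not url.startswith("kio:"):
        raise ValueError("not a kio: URL: " + url)
    rest = url[len("kio:"):]
    scheme, _, path = rest.partition(":")
    return scheme, path
```

So kio:file:/tmp/df.txt routes to the file slave with path /tmp/df.txt, matching the command above.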

Some of the more fun KIO slaves to access through this are the man and fonts URLs. The following two commands produce the same man page:

$ fcat kio:man:/man
$ kioclient cat man:/man

Support is preliminary and only allows reading files, not writing to them, through the kio: filesystem as yet. Already, though, kio: can be exposed to XQuery, SQLite, and the libferris REST interface. Yay for cooperating!

I notice some really juicy digikam kio slaves but haven't dug into them enough to use them as yet. Although you can already upload to various web services from digikam, once I get access to digikamalbums through ferris kio mounting I can then 'cp' the images directly from the command line to other things that libferris can already mount such as flickr API sites, printers etc.

Thursday, April 26, 2012

Editing XML and PostgreSQL with ferris REST.

With a few minor tweaks, one can now edit the contents of XML files and PostgreSQL tables with the libferris REST interface. The YUI web interface has been updated to allow that to easily happen. In the below video, example.xml is first shown and then instead of viewing it as a file, I choose to "read" it as a directory, causing libferris to sniff it out and work out that it can mount that file as a directory for you. The same can be done using "ferrisls example.xml" at the command line.

Notice that the read link is only offered on the XML files. This is because libferris tells the client that those are not natively "directories" but can be seen that way if you like.

I then simply click down to the "barry" XML element in the mounted XML file. Editing the barry entry will write the data back to the server asynchronously. The terminal is used to verify that the changes made it back.

The second browser tab shows a mounted PostgreSQL table which has ID and Message columns. If I select edit on the message column I get to update the tuple as desired. Notice that the data grid in the browser is updated when the data is saved back, which you can see by watching the "message" column in the browser. I then use psql to verify the update from the command line.

Other interesting possibilities include mounted log files being split into columns in the web view, or grabbing some data from plasma on the server, streaming data from gstreamer or zoneminder, or anything else that ferris can do as a VFS. And as I tend to want ferris to mount it all^TM, the sky should be the limit! Heh, so I got the catchphrase in there.

Monday, April 23, 2012

VFS in the Cloud? Libferris Web Interface...

I'm syndicating to planet KDE because things in this post might be of interest for KDE. I'd be overjoyed to see some of these features in KDE too; the more powerful the tools available to folks, the better the future tools will be ;) So, on with the show... I decided to add REST and YUI stuff to libferris. This is still very much a work in progress in spare time... Luckily the heavy lifting is all done already in the libferris library.

The initial web interface is still fairly basic, the back and forward buttons are handled by the browser leaving only the parent button in the apps toolbar. Home and Heart are your home dir and bookmarks respectively.

Clicking on a row allows arbitrary annotation of that file. The annotations are stored in either native kernel level Extended Attributes or RDF. A feature I find very useful is that all metadata is presented via the same interface. As you can see the "Annotation" column in the listview is showing your own description of each file. You can filter or sort on annotation just as you can on the file name.

The search page allows you to find files by their text content (full text search) and/or their metadata. As I've mentioned in the past, the metadata indexing modules include many optimizations beyond what the native APIs offer. This includes indexed lookup for certain classes of regular expressions. A naive evaluation of a regular expression query scans linearly over the index, and unless explicit code handles it, you are likely to enjoy this bad performance even with very advanced indexing libraries and databases.

Tags or "emblems" etc are also handled through the same metadata interface. The tagging sidepanel offers suggestions for existing tags as you type and allows you to create new tags as you attach them to files. Removal of a tag is just a click away.

Of course, clicking a filename shows you the file itself over REST. This allows you to stream video files over REST to the Nokia n9 for example. There is partial IO support and write support so I could include a fancy text editor or image editor component in there... Unfortunately YUI 3.5.0 doesn't seem to support "selections" in the datagrid, so that nice visual feedback will have to wait for yui 3.6.0.

Monday, April 9, 2012

REST & Filesystems: A homage to plan9

In my previous post I mentioned, among other things, two things I want to hack together: an improved REST API for libferris, and KDE KIO integration for libferris. It occurred to me that these two, while able to be performed as distinct tasks, are better done as one. Or rather that the latter can be coded to rely on the first being available. With a REST API and a Web server, a virtual filesystem inherits one of the great properties of plan9 - filesystems as separate processes that can run on another machine and access data on a third machine.

Consider wanting to get at information from a PostgreSQL table from a mobile phone. The REST API might be on the "webserver" machine, while the database runs on dbserver. The phone talks to webserver sending GET/POST and receiving XML/JSON depending on its preference. Using libferris, webserver talks to dbserver (by mounting it), runs the query, and returns the results to the phone.

This also lets the phone upload images via the REST API. Instead of setting up software on the phone, just use QML or HTML5 and HTTP POST the image to the REST API using a path of "flickr://me/upload/" and have the server send that image onwards. Anything that libferris can mount and interact with becomes available over the lightweight REST API, ready for QML/Javascript thumb interfaces. Other examples would be getting at a mounted Zoneminder over REST, or a webcam using mounted gstreamer.
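As a sketch of what the phone (or any client) side might look like, with hypothetical REST paths - the real URL layout is up to the server configuration:

```
$ curl -H "Accept: application/json" http://webserver/ferris/home/ben/
$ curl -F "image=@photo.jpg" http://webserver/ferris/flickr/me/upload/
```

The first line asks the server for a directory listing as JSON; the second POSTs an image to a path that the server resolves to a libferris flickr mount and forwards onwards.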

Using REST like this also fairly nicely sweeps away language binding issues. Almost all languages can access http in some way. Sort of like CORBA without the IDL files and tools to create stubs for you. Hmm, a REST API IDL compiler for Qt with OAuth support... mmm...

Monday, April 2, 2012

Support libferris, get at ferris through KIO, and read about OTR messaging

I thought I'd see how well some pledgie tasks would do in the wild. Some of this code is stuff I've been wishing to write for a while but unable to make the time.

If you want to get at some of the funky filesystems offered by libferris from your KDE desktop, you might like to support my "KIO Slave for libferris" pledgie:

Click here to lend your support to: KIO Slave for libferris and make a donation at www.pledgie.com !

If you've always wanted to add support for Off-the-Record messaging to your project, or help entice somebody else in that direction, I have a tutorial article on offer here:

Click here to lend your support to: Off The Record Messaging HOWTO article and make a donation at www.pledgie.com !

If you want a REST interface for libferris, listing directories and getting at files over HTTP/HTTPS, you can help make that happen here:

Click here to lend your support to: Extend the REST interface of libferris and make a donation at www.pledgie.com !

And finally, last for today but not least, if you want to get at jpeg images on your Zoneminder server using nice normal command lines like the following:

$ fcat zoneminder://server/monitor | okular -

Then you might like to throw a little loose change at the zoneminder plugin pledgie:

Click here to lend your support to: libferris mounting Zoneminder and make a donation at www.pledgie.com !

Saturday, March 24, 2012

Ferris on the n9: Search by URL, content, and mtime

The n9 libferris app now allows you to search by the URL of the file as before, but also now by the file's text content and its modification time. Query type is selected by a button in the top right corner which unfortunately isn't nearly as easy to read in the video file as it is in real life.

As libferris handles extraction, update, and storage of metadata from disparate locations I have also added a sprinkling of what that means into the video. Notice that the first search by URL shows a comment "REST interface to libferris" in bold. This is simply the "annotation" metadata of the file, but is much more interesting to the searcher than its URL. Likewise in the second query, which finds a Gutenberg text file by searching on text content, the annotation offers the name of the book that the file contains. Again, much more interesting content to the human who is at the helm.

The third query is on the modification time of the files. There are three ways offered to perform a time search, "more recently than this" or "than last" which can have month or day as options for example, and modified >= X months ago which obviously wants a number as the query text. When querying by time like this, libferris happily accepts some human readable terms like "begin this month" as the time you specify. This makes it just as convenient to use search in scripts as in the dedicated front ends like the n9 app.
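From a script, such queries might look like the following. This is a sketch assuming the feaindexquery client and libferris's LDAP-style filter syntax; check your installed tools for the exact command name and attribute names:

```
$ feaindexquery '(mtime>=begin this month)'
$ feaindexquery '(&(url=~gutenberg)(mtime>=begin last week))'
```

The human readable time terms are parsed by libferris itself, so the same query text works from the n9 app, the shell, or cron.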



The plan is, as always, to push metadata from being an afterthought to being a first class citizen: able to be created, read, written, indexed, and searched on. Any file with metadata should be able to expose that as simply as its mtime or size, which is all currently done via a key-value "Extended Attribute" interface at the lowest level.
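For example, from the shell the same interface reads size, mtime, and your own annotations uniformly, because an annotation is just another EA. A sketch (the directory is hypothetical; the available EA names depend on the file and the plugins installed):

```
$ ferrisls --show-ea=name,size,mtime-display,annotation ~/Documents
```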

If you like libferris or the recent updates (status.net mounting, these indexing tweaks etc) then please consider making a donation. If you want to use this technology at a corporate level, please feel free to contact me.

Friday, March 23, 2012

libferris in 512Mb RAM on arm5 at 1.2Ghz

I mentioned yesterday that I had started hacking my infix indexing optimization into the clucene index module of libferris. The short story is that on an aged ARM machine with fairly slow IO this optimization makes a huge difference to regex query on URLs. The numbers are below; this is on an index of ~/, /etc and /usr on the ARM5 machine running Debian, about 100,000-odd files. cold1 and hot1 are the same query executed against cold and hot caches (the second run done just after the first). Naturally cold2 and hot2 are the same for a different query. cold1 returns 22 results and cold2 returns a single result.


        old   new
cold1   4.9   3.6
hot1    2.0   1.0
cold2   2.6   1.9
hot2    1.7   1.0


As can be seen, the optimization affects both hot and cold cache times, which is quite handy as there are many times I use both: starting with a "-Z" evaluation to see how many matches there are and specializing or generalizing the query from there.

While the numbers may seem large, keep in mind that this is a slow arm running from an IO interface that needs some TLC. Using this indexing on an N9 or desktop machine will be much quicker even if there are 10 times the number of files indexed.

These arm deb builds will be up on fuuko.libferris.com sometime soon. I have to do a release of libferris with this optimization and the funky new JSON/REST support too.

Thursday, March 22, 2012

Libferris on the N9: JSON, REST, QML, Index and Search, VFS all together

Continuing my n9 apps I now have an index search app that uses libferris. Currently I have only exposed URL regular expression search. This is searching an index maintained and stored in a PostgreSQL database. Solving regex query in a timely manner is a fairly complex problem with one useful solution outlined in one of my previous blog entries.

Note that data is downloaded over normal http(s) from the server using the n9 app, so no NFS or other network mounts are needed. This might be handy for grabbing a PDF, image or text file off an Intranet file server while on the move.



I have also been updating the clucene libferris module to allow more effective index use during regex/infix queries. This libferris engine works quite nicely on older ARM machines. Benchmarks on that to follow, once I reindex the arm and produce the comparative figures.

Thursday, March 8, 2012

Libferris on both arms

I now have libferris on a 512MB ARM5t embedded device and a 1GB ARM7 one (the Nokia n9). As part of the fun and games I updated the clucene and boostmmap index modules for libferris and created a QtSQL module which uses SQLite by default. Hey, when you can't choose which cake to have, why not have a slice of them all... or a slice to fit the moment or device at hand. E.g., desktop or server works nicely with the soprano or postgresql indexing modules, embedded is nicer on mmap or clucene.

I find it ironic that I never really thought much of "embedded" devices when hacking libferris. But devices with 512MB or 1GB of RAM are really not so much embedded, I guess.

One thing I've tried to do in the design of libferris index+search is to cater for many machines, and also for federations of them. It is possible to search a router, phone, and desktop's individual indexes as a federation from the laptop. Another thing that helps is that indexing is routed through the "findexadd" commands, so you can use find and split to break up indexing activity and have it done from cron when you want.
The new --total-files-to-index-per-run option works in combination with this, causing findexadd to exit once it has indexed the given number of files. Note that a file which has not changed since it was last indexed is not indexed again (no need), so it does not count toward the total-files-to-index-per-run tally.
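So a cron job can nibble away at a large tree in bounded runs. A one-line sketch (the directory is hypothetical, and unchanged files are skipped without counting toward the cap):

```
$ find ~/Documents | findexadd --total-files-to-index-per-run=5000
```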

The below is a little script to incrementally index just selected metadata from /usr and your home directory using clucene. The WHITELIST environment variable stops libferris from trying to sniff up metadata for files and has it only look for and add what metadata you want. If you have md5 in there then libferris will store the checksum for each file, at a commensurate cost in IO. Splitting into batches of 5000 prevents the process running too long and wanting too much RAM.

$ mkdir -p ~/.ferris/ea-index
$ cd ~/.ferris/ea-index
$ fcreate --create-type eaindexclucene .
$ vi update.sh
#!/bin/bash
# Incrementally index selected metadata from /usr and $HOME using clucene.

TMPDIR=~/tmp
EAIDXPATH=~/.ferris/ea-index

# Break the file lists into batches of 5000 lines so each indexing
# run stays short and bounded in RAM.
cd "$TMPDIR" || exit 1
find /usr | split -l 5000 - usr.split.
find ~    | split -l 5000 - home.split.

# Only sniff and store the whitelisted attributes; leaving md5 out
# avoids reading every file's contents just to checksum it.
export LIBFERRIS_EAINDEX_EXPLICIT_WHITELIST=name,size,mtime,mtime-display,atime,ctime,user-owner-name,group-owner-name,user-owner-number,group-owner-number,inode
echo "whitelist: $LIBFERRIS_EAINDEX_EXPLICIT_WHITELIST"

cd "$EAIDXPATH" || exit 1
rm -f write.lock
for batch in "$TMPDIR"/*split.*
do
    echo "Processing $batch"
    feaindexadd -P "$(pwd)" -1 < "$batch"
done



Ironically the arm5 has given me much less trouble overall. One issue seems to be with gcc-4.4.x on the n9. Charming little errors like my old friend, the undefined __sync_val_compare_and_swap_4, which stops memory mapped boost data structures from working properly and also leaves the clucene-core-2.3.3.4 build lying on the side of the road bleeding. I've hacked the clucene code to get around the atomic errors, but then found that search results are not accurate. I guess my quick hack there was just bad^TM. Especially since the arm5 produces the right results using the virgin clucene codebase.

I've been trying to convince gcc 4.5 and 4.6 to build for me so I can use the updated compiler to generate a proper, working clucene for the device. I keep running into little build issues after time-consuming rebuilds ("uses VFP register arguments", yay). Once I stop compiling compilers, maybe I can get my favourite indexing code onto the n9.