Dr. MonkeyIQ: kde


Friday, July 17, 2015

OSX Bundling Soprano and other joys

Libferris has been moving to use more Qt/KDE technologies over the years. Ferris is also a fairly substantial software project in its own right, with many plugins and support for multiple libraries. Years back I moved from using raw redland to using soprano for RDF handling in libferris.

Over recent months, from time to time, I've been working on an OSX bundle for libferris. The idea is to make installation as simple as copying Ferris.app to /Applications. I've done some OSX packaging before, so I've been exposed to the whole business of library paths embedded inside dylibs, and also to the freedesktop specs expecting things in /etc or wherever when you really want the code to look in /Applications/YourApp/Contents/Resources/.../etc/whatever.

The silver test for packaging is to rename the area that is used to build the source to something unexpected and see if you can still run the tools. The gold test is obviously to install from the app.dmg onto a fresh machine and see that it runs.

I discovered a few gotchas during silver testing and soprano usage. If you get things half right then you can get to a state that allows the application to run but that does not allow a redland RDF model to ever be created. If your application assumes that it can always create an in memory RDF store, a fairly secure bet really, then bad things will befall the app bundle on osx.

Plugins are found by searching for the desktop files first and then loading the shared library plugin as needed. The desktop files can be found with the first line below, while the second line allows the plugin shared libraries to be found and loaded.

export SOPRANO_DIRS=/Applications/Ferris.app/Contents/Resources/usr/share
export LD_LIBRARY_PATH=/Applications/Ferris.app/Contents/Resources/usr/local/lib/soprano/

You have to jump through a few more hoops. You'll find that the plugin ./lib/soprano/libsoprano_redlandbackend.so links to lib/librdf.0.dylib and librdf will link to other redland libraries which themselves link to things like libxml2 which you might not have bundled yet.

There are also many cases of things linking to QtCore and other Qt libraries. These links are normally to nested paths like Library/Frameworks/QtCore.framework/Versions/4/QtCore which will not pass the silver test. Actually, links inside dylibs like that tend to cause the show to segv and you are left to work out where and why that happened. My roll by hand solution is to create softlinks to these libraries like QtCore in the .../lib directory and then resolve the dylib links to these softlinks.
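The relinking step could be scripted along these lines. This is only a sketch: it assumes the rewrite is done with Apple's install_name_tool -change, and the @executable_path/../Resources/usr/local/lib prefix, the function name and the example paths are illustrative assumptions rather than what the libferris bundle actually uses. It only builds the command lines; running them is left to the caller.

```python
import posixpath

# Assumed location of the bundled libraries relative to the main binary;
# adjust to match the real bundle layout.
BUNDLE_LIB_PREFIX = "@executable_path/../Resources/usr/local/lib"

def relink_commands(binary, linked_paths, prefix=BUNDLE_LIB_PREFIX):
    """Return install_name_tool invocations that point `binary` at bundled copies.

    `linked_paths` is the list of install names the binary currently links to
    (for example as reported by `otool -L`). System libraries and names that
    are already bundle-relative are left alone.
    """
    commands = []
    for old in linked_paths:
        if old.startswith(("@", "/usr/lib/", "/System/")):
            continue
        # The new name is the softlink (e.g. QtCore) sitting in the lib directory.
        new = posixpath.join(prefix, posixpath.basename(old))
        commands.append(["install_name_tool", "-change", old, new, binary])
    return commands

if __name__ == "__main__":
    example = relink_commands(
        "lib/soprano/libsoprano_redlandbackend.so",
        ["Library/Frameworks/QtCore.framework/Versions/4/QtCore",
         "/usr/lib/libSystem.B.dylib"],
    )
    for argv in example:
        print(" ".join(argv))
```

Generating the commands rather than running them makes it easy to eyeball the rewrites before touching any binaries, which helps when the failure mode is a segv with no explanation.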

In the end I'd also like to make an app bundle for specific KDE apps. Just being able to install okular by drag and drop would be very handy. It is my preferred reader for PDF files and having a binary that doesn't depend on a build environment (homebrew or macports) makes it simpler to ensure I can always have okular even when using an osx machine.

Saturday, November 9, 2013

RePaper 2.7 inch epaper goodness from the BeagleBone

A little while back I bought a rePaper 2.7 inch eInk display. While the smaller screens, down to 1.4 inch, have few enough pixels to be driven from an Arduino, the 264x176 screen needs about 5.7k for a single frame buffer (264*176/8 = 5808 bytes), and you need two buffers to "wax on, wax off" the image on the display in order to update. The short story is that these displays work nicely from the BeagleBone Black. You have to have a fairly recent kernel in order to get the right sys files for the driver. Hint: if you have no "duty" file for your pwm then your kernel is too old.
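The frame buffer arithmetic is easy to check. The helper below is a small sketch (the function name is made up for illustration) that assumes one bit per pixel and rows padded to a whole byte.

```python
def frame_bytes(width, height, bits_per_pixel=1):
    """Bytes needed for one frame, with each row padded to a whole byte."""
    row_bytes = (width * bits_per_pixel + 7) // 8
    return row_bytes * height

if __name__ == "__main__":
    one = frame_bytes(264, 176)      # 2.7 inch panel
    print(one, 2 * one)              # one buffer, and the pair needed to update
    print(frame_bytes(200, 96))      # 200x96 panel, the size the stock demo assumes
```

For comparison, an ATmega328 based Arduino has 2 KiB of SRAM, so two buffers for the 2.7 inch panel do not fit in memory there.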

So the first image I chose to display after the epd_test was a capture of fontforge editing Cantarell Regular. Luckily, I've made no changes to the splineset so my design skills are not part of the image. The rendering of splines in the charview of fontforge uses antialiasing, as it was switched over to cairo around a year ago. As the eInk display is monochrome the image displayed is dithered back to 1 bit.
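Reducing an antialiased grey image to 1 bit can be done in several ways. The sketch below uses ordered dithering with a 2x2 Bayer matrix; that choice is an assumption made for illustration, since the post does not say which dithering method produced the image shown.

```python
# 2x2 Bayer matrix; each entry selects a different brightness threshold.
BAYER_2X2 = ((0, 2),
             (3, 1))

def dither_1bit(gray):
    """Ordered-dither rows of 0..255 grey values to rows of 0 (black) / 1 (white)."""
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, value in enumerate(row):
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4
            out_row.append(1 if value > threshold else 0)
        out.append(out_row)
    return out

if __name__ == "__main__":
    mid_grey = [[128] * 4 for _ in range(4)]
    for row in dither_1bit(mid_grey):
        print("".join("#" if bit else "." for bit in row))
```

A flat mid grey comes out as a checkerboard, which is how the soft edges of an antialiased spline end up represented on a monochrome panel.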

With the real time collaboration support in fontforge this does raise the new chance to see a font being rendered on eInk as you design it (or hint it). I'm not sure how many fonts are being designed with eInk as the specific consumption ground. If you are interested in font design, check out Crafting Type which uses fontforge to create new type, and you should also be able to see the collaboration and HTML preview modes in action.

Getting the actual eInk display to go from the BeagleBone had a few steps. Firstly, I managed to completely fill up the 2gb of eMMC where my Angstrom was installed. So now I'm running the whole show off a high speed 8gb sandisk card. I spent a little extra cash on a faster card, it's one of the extreme super panda + extra adjective sandisk ones. The older kernel I had didn't have a duty file for the PWM pin that the driver wanted to use. Now that I have a fully updated beaglebone black boot area I have that file. FWIW I'm on kernel version 3.8.13-r23a.49.

Trying out the epd_test initially showed me some broken lines and after a little bit what looked like a bit of the cat from the test image. After rechecking the wireup a few times I looked at the code and saw it was expecting a 2 inch screen. That happens in a few places in the code. So I changed those to reflect my hardware. Then the test loop ran as expected!

The next step was getting the FUSE driver installed (change for size needed too). Then the python demos could run. And thus the photo above was made. My next step is to create a function to render cairo to /dev/epd/display in order to drive the display directly from a cairo app.

A huge thank you to rePaper for making this so simple to get going. The drivers for Raspberry and Beagle are up on their github page. I had been looking at the Arduino driver and its SPI code thinking about porting that over to Linux, but now that's not necessary! I might design some cape love for this, perhaps with a 14 pin IDC connector on it for eInk attaching. Shouldn't look much worse than last night's SPI only monster, though something etched would be nicer.

The 2.7 inch changes are below, the first one is just slightly more verbose error reporting. You'll also want to set EPD_SIZE=2.7 in /etc/init.d/epd-fuse.

diff --git a/PlatformWithOS/BeagleBone/gpio.c b/PlatformWithOS/BeagleBone/gpio.c
index b3ded6f..d1df3df 100644
--- a/PlatformWithOS/BeagleBone/gpio.c
+++ b/PlatformWithOS/BeagleBone/gpio.c
@@ -767,7 +767,7 @@ static bool PWM_enable(int channel, const char *pin_name) {
                                usleep(10000);
                        }
                        if (pwm[channel].fd < 0) {
-                               fprintf(stderr, "PWM failed to appear\n"); fflush(stderr);
+                               fprintf(stderr, "PWM failed to appear pin:%s file:%s\n", pin_name, pwm[channel]
                                free(pwm[channel].name);
                                pwm[channel].name = NULL;
                                break; // failed
diff --git a/PlatformWithOS/demo/EPD.py b/PlatformWithOS/demo/EPD.py
index da1ef12..41cc6c1 100644
--- a/PlatformWithOS/demo/EPD.py
+++ b/PlatformWithOS/demo/EPD.py
@@ -48,8 +48,8 @@ to use:

     def __init__(self, *args, **kwargs):
         self._epd_path = '/dev/epd'
-        self._width = 200
-        self._height = 96
+        self._width = 264
+        self._height = 176
         self._panel = 'EPD 2.0'
         self._auto = False

diff --git a/PlatformWithOS/driver-common/epd_test.c b/PlatformWithOS/driver-common/epd_test.c
index e2f2b5a..afe3cb8 100644
--- a/PlatformWithOS/driver-common/epd_test.c
+++ b/PlatformWithOS/driver-common/epd_test.c
@@ -72,7 +72,7 @@ int main(int argc, char *argv[]) {
        GPIO_mode(reset_pin, GPIO_OUTPUT);
        GPIO_mode(busy_pin, GPIO_INPUT);

-       EPD_type *epd = EPD_create(EPD_2_0,
+       EPD_type *epd = EPD_create(EPD_2_7,
                                   panel_on_pin,
                                   border_pin,
                                   discharge_pin,

Monday, July 29, 2013

GDrive mounting released!

So libferris-1.5.18.tar.xz is hot off the make dist, and it includes the much-ado-about Google Drive mounting support. The last feature I decided to add before rolling the tarball was support for viewing and adding to the sharing information of a file. Being able to "cp" a file to google://drive didn't really do much for me without also being able to open it up to the people I want to have access to it. So now you can do that from the filesystem as well.

So, since the previous posts have been about the GDrive API and various snags I ran into along the way, this post is about how you can actually use this stuff.

Firstly run up the ferris-capplet-auth app and select the GDrive tab. I know I should overhaul the UI for this auth tool, but since it's mostly only used once per web service I haven't found the personal desire to beautify it. Inside the GDrive tab, clicking on the "Authenticate with GDrive" button opens a dialog (which should become a wizard). The first thing to do, as it tells you, is visit the console page on google to enable the GDrive API. Then click or paste the auth link in the dialog to allow libferris to get its hands on your data. The auth link goes to google and tells you what libferris is wanting. When you OK that you are given a "code" that you have to copy and paste back into the lower part of this dialog window. Then OKing the dialog will have libferris get a proper auth token from google and you are all set.

So to get started the below command will list the contents of your GDrive:

$ ferrisls google://drive

To put a file up on there you can do something like:

$ date >/tmp/sample.txt
$ ferriscp /tmp/sample.txt google://drive

And you can get it back with cat if you like. Or ferriscp it somewhere else etc.

$ fcat google://drive/sample.txt
Mon Jul 29 17:21:28 EST 2013

If you want to see your shares for this new sample file use the "shares" extended attribute.

$ fcat -a shares google://drive/sample.txt
write,monkeyiq

The shares attribute is a BINEBO (Bytes In Not Equal Bytes Out). Yay for me coining new terms! This means that what you write to it is not exactly what you will get when you read back from it. The handy part of that is that if you write an email address into the extended attribute, you are adding that person to the list of folks who can write to the file. Because I'm using libferris without FUSE and bash doesn't understand libferris URLs, I have to use ferris-redirect in the below command. You can think of ferris-redirect like the shell redirection (>) but you can also supply the extended attribute to redirect data into with (-a). If I read back the shares extended attribute I'll see a new entry in there. Google will have sent a notification email to my friend with a link to the file for me also.

$ echo niceguy@example.com \
| ferris-redirect -a shares google://drive/sample.txt
$ fcat -a shares google://drive/sample.txt
write,monkeyiq
write,Really Nice Guy

I could also add some hookup to your "contacts" to this, so your evolution addressbook nick names or google contacts could be used to look up a person. In this case, with names changed to protect the innocent, google hypothetically thinks the name for that email address is Really Nice Guy because he is in my contacts on gmail.

All of this extends to the other virtual filesystems that libferris supports. You can "cp" from your scanner or webcam or a tuple of a database directly to google drive if that floats your boat.

I've already had a bit of a sniff at the dropbox API and others, so you might be able to bounce data between clouds in a future release.

Saturday, July 27, 2013

The new google://drive/ URL!

The very short story: libferris can now mount Google Drive as a filesystem. I've placed that in google://drive and will likely make an alias from gdrive:// to that same location so either will work.

The new OAuth 2.0 standard is so much easier to use than the old 1.0 version. In short, after being identified and given the nod once by the user, in 2.0 you only have to supply a single secret with each request, whereas in 1.x you have to use a per message nonce, create hashes, send the key and token, etc. The main drawback of 2.0 is that you have to use TLS/SSL for each request to protect that single auth token. A small price to pay, as you might well want to protect the entire conversation if you are doing things that require authentication anyway.
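To make the difference concrete, here is a rough sketch of what a client has to produce per request in each version. It follows my understanding of OAuth 2.0 bearer tokens and of the OAuth 1.0 HMAC-SHA1 signature; it skips URL normalisation and other details, and all keys, secrets, tokens and URLs in it are placeholders. It is not the libferris code.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def enc(text):
    """Percent-encode as OAuth 1.0 requires (only unreserved characters stay)."""
    return quote(str(text), safe="-._~")

def oauth2_header(access_token):
    """OAuth 2.0 bearer usage: the token itself is the only secret sent."""
    return {"Authorization": "Bearer " + access_token}

def oauth1_signature(method, url, params, consumer_secret, token_secret):
    """OAuth 1.0 HMAC-SHA1 signature over the method, URL and sorted parameters."""
    pairs = sorted((enc(k), enc(v)) for k, v in params.items())
    param_string = "&".join(k + "=" + v for k, v in pairs)
    base_string = "&".join([method.upper(), enc(url), enc(param_string)])
    key = enc(consumer_secret) + "&" + enc(token_secret)
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

if __name__ == "__main__":
    print(oauth2_header("placeholder-access-token"))
    params = {
        "oauth_consumer_key": "placeholder-key",
        "oauth_token": "placeholder-token",
        "oauth_nonce": "fresh-per-request",     # must change on every call
        "oauth_timestamp": "1374900000",
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_version": "1.0",
    }
    print(oauth1_signature("GET", "https://example.invalid/api", params,
                           "placeholder-consumer-secret", "placeholder-token-secret"))
```

The 2.0 side is one header that never changes between calls, which is exactly why it needs TLS around it. The 1.0 side never puts the secrets on the wire, at the cost of recomputing a signature for every request.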

A few caveats of the current implementation: mime types on uploaded files are based on file name sniffing. That is because for an upload you might be using cp foo.jpg google://drive, and the filesystem just copies the bytes over. But GDrive needs to know the mimetype for that new File at creation time. The GDrive PATCH method doesn't seem to let you change the mimetype of a file after it has been sent. A better solution will involve the cp code prenotifying the target location so that some metadata (mimetype) can be prefetched from the source file if desired. That would allow full byte sniffing to be used.

Speaking of PATCH, if you change metadata using it, you always get back a 200 response. No matter what. Luckily you also get back a JSON string with all the metadata for the file you have (tried to) update. So I've made my PATCH caller code ignore the HTTP response code and compare the returned file JSON to see if the changes actually stuck or not. If a value isn't set how it is expected my PATCH call throws an exception. This is in contrast to the docs for the PATCH method, which claim that the file JSON is only returned "if successful".
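The "compare what came back" check can be sketched as below. The exception class, the function name and the example JSON are invented for illustration; only the title and description fields are taken from this post.

```python
import json

class PatchNotApplied(Exception):
    """Raised when the returned file metadata does not match the request."""

def verify_patch(requested, returned):
    """Compare requested metadata changes with the file JSON that came back.

    The HTTP status is deliberately not consulted here, since it was
    observed to be 200 whether or not the change stuck.
    """
    mismatched = {
        key: (wanted, returned.get(key))
        for key, wanted in requested.items()
        if returned.get(key) != wanted
    }
    if mismatched:
        raise PatchNotApplied(mismatched)
    return returned

if __name__ == "__main__":
    # Hypothetical response body for a PATCH that tried to set both fields.
    body = json.loads('{"title": "sample.txt", "description": "old text"}')
    try:
        verify_patch({"title": "sample.txt", "description": "new text"}, body)
    except PatchNotApplied as failure:
        print("did not stick:", failure)
```

Only the keys that were asked for are compared, so the rest of the metadata in the response can change freely without tripping the check.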

Oh yeah, one other tiny thing about PATCH. If you patch the description it didn't show up in Firefox for me until I refreshed the page. Changing the title does update the Firefox UI automatically. I guess the sidepanel for description hasn't got the funky web notification love yet.

There are two ways I found to read a directory, using files/list and children/list. Unfortunately the latter, while returning only the direct children of a folder, also only returns a few pieces of information for those children, the most interesting being the child's id. On the other hand the files/list gives you almost all the metadata for each returned File. So on a slower link, one doesn't need thinking music to work out if one round trip or two are the desired number. The files/list also returns metadata for files that have been deleted, and files which others have shared with you. It is easy to set a query "hidden = false and trashed = false" for files/list to not return those dead files. Filtering on the server exclusively for files that you own is harder. There is a query alias sharedWithMe but no OwnedByMe to return the counter set. I guess perhaps "not sharedWithMe" would == OwnedByMe.
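A request URL carrying that filter could be put together like this. It is a sketch: the endpoint is what I believe the Drive v2 files/list URL to be and should be checked against the API docs, and the helper name is made up.

```python
from urllib.parse import urlencode

# Assumed Drive v2 endpoint; check it against the current API documentation.
FILES_LIST_URL = "https://www.googleapis.com/drive/v2/files"

def files_list_url(clauses, base=FILES_LIST_URL):
    """Join search clauses with 'and' and encode them as the q parameter."""
    query = " and ".join(clauses)
    return base + "?" + urlencode({"q": query})

if __name__ == "__main__":
    print(files_list_url(["hidden = false", "trashed = false"]))
```

Keeping the clauses as a list makes it simple to bolt on another condition later, such as an ownership test if the API turns out to offer one.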

Currently I sort of ignore the directory hierarchy that files/list returns. So all your drive files are just in google://drive/ instead of subdirs as appropriate. I might leave that restriction in the first release. It's not hard to remove, but I've been focusing on upload, download, and metadata change.

Creating files, updating metadata, and downloading files from GDrive all work and will be available in the next libferris release. I have one other issue to cleanup (rate limiting directory read) before I do the first libferris release with gdrive mounting.

Oh and big trap #2 for the young players. To actually *use* libferris on gdrive after you have done the OAuth 2.0 "yep, libferris can have access" you have to go to code.google.com/apis/console and enable the drive API for your account, otherwise you get access denied errors for everything. And once you go to the console and do that, you'll have to OAuth again to get a valid token.

A huge thank you to those who contributed to the ferris fund raising after my last post proposing mounting Google Drive!

Monday, July 22, 2013

Mounting Google Drive?

So on the heels of resurrecting and expanding the support for mounting vimeo as a filesystem using libferris I started digging into mounting Google Drive. As is normally the case for these things, the plan is to start out with listing files, then uploading files, then downloading files, then updating the metadata for files, then rename, then delete, and then funky stuff like "tail -f" and append instead of truncate on upload.

One plus of all this is that the index & search in libferris will then extend its claws to GDrive as well as desktop files, as I&S is built on top of the virtual filesystem and uses the virtual filesystem to return search results.

For those digging around maybe looking to do the same thing, see the oauth page for desktop apps, and the meat seems to be in the Files API section. Reading over some of the API, the docs are not too bad. The files.watch call is going to take some testing to work out what is actually going on there. I would like to use the watch call for implementing "tail -f" semantics on the client, which is in turn most useful with open(append) support. The latter I'm still tracking down in the API docs, if it is even possible. PUT seems to update the whole file, and PATCH seems very oriented towards doing partial metadata updates.

The trick that libferris uses of exposing the file content through the metadata interface seems to be less used by other tools. With libferris, using fcat and the -a option to select an extended attribute, you can see the value of that extended attribute. The content extended attribute is just the file's content :)

$ date > df.txt
$ fcat -a name df.txt
df.txt
$ fcat -a mtime-display df.txt
13 Jul 23 16:33
$ fcat -a content df.txt
Tue Jul 23 16:33:51 EST 2013

Of course you can leave out the "-a content" part to get the same effect, but anything that is wanting to work on an extended attribute will also implicitly be able to work on the file's byte content as well with this mechanism.

If anyone is interested in hacking on this stuff (: good ;) patches accepted. Conversely if you would like to be able to use a 'cp' like tool to put and get files to gdrive you might consider contributing to the ferris fund raising. It's amazing how much time these Web APIs mop up in order to be used. It can be a fun game trying to second guess what the server wants to see, but it can also be frustrating at times. One gets very used to being able to see the source code on the other side of the API call, and that is taken away with these Web thingies.

Libferris is available for Debian Hard Float and Debian armel soft floating point. I've just recently used the armhf to install ferris on an OMAP5 board. I also have a build for the Nokia N9 and will update my Open Build Service Project to roll fresh rpms for Fedora at some stage. The public OBS desktop targets have fallen a bit behind the ARM builds because I tend to develop on and thus build from source on desktop.

Saturday, July 20, 2013

Like a Bird on a Wire(shark)...

Over recent years, libferris has been using Qt to mount some Web stuff as a filesystem. I have a subclass of QIODevice which acts as an intermediary to allow one to write to a std::ostream and stream that data to the Web, over a POST for example. For those interested, that code is in Ferris/FerrisQt.cpp of the tarball. It's a bit of a shame that Qt heavy web code isn't in KIO or that the two virtual filesystems are not closer linked, but I digress.

I noticed a little while ago that cp to vimeo://upload didn't work anymore. I had earmarked that for fixing and recently got around to making that happen. It's always fun interacting with these Web APIs. Over time I've found that Flickr sets the bar for well documented APIs that you can start to use if you have any clue about making GET and POST etc. At one stage google had documented their API in a way that you could never use it. I guess they have fixed that by now, but it did sort out the pretenders from those who could at least sniff HTTP and were determined to win. The vimeo documentation IIRC wasn't too bad when I added support to upload, but the docs have taken a turn for the worse it seems. Oh, one fun tip for the young players, when one API call says "great, thanks, well done, I've accepted your call" and then a subsequent one says "oh, a strange error has happened", you might like to assume that the previous call might not have been so great after all.

So I started tinkering around, adding oauth to the vimeo signup, and getting the getTicket call to work. Having getTicket working meant that my oauth signed call was accepted too. I was then faced with the upload of the core data (which is normally done with a rather complex streaming POST), and the final "I'm done, make it available" call. On vimeo that last call seems to be two calls now, first a VerifyChunks call and then a Complete call.

So, first things first. To upload you call getTicket which gives you an endpoint that is an HTTP URL to send the actual video data to, as well as an upload ticket to identify the session. If you try to post to that endpoint URL and the POST converts the CGI parameters using multipart/form-data with boundaries into individual Content-Disposition: form-data elements, you lose. You have to have the ticket_id in the URL after the POST text in order to upload. One little trap.

So then I found that verifyChunks was returning Error 709 Access to the chunk list failed. And that was after the upload had been replied to with "OK. Thanks for the upload.". Oddly, I also noticed that the upload of video data would hang from time to time. So I let the shark out of the pen again, and found that vimeo would return its "yep we're done, all is well" response to the HTTP POST call at about 38-42kb into the data. Not so great.

Mangling the vimeo.php test they supply to upload with my oauth and libferris credentials I found that the POST had a header Expect: 100-continue. Right after the headers were sent vimeo gave the nod to continue, and then the POST body was sent. I assume that just ploughing through and giving the headers followed by the body confused the server end and thus it just said "yep, ok, thanks for the upload" and dropped the line. Then of course forgot the ticket_id because there was no data for it, so the verifyChunks got no chunk list and returned the strange error it did. mmm, hindsight!

So I ended up converting from the POST to the newly available PUT method for upload. They call that their "streaming API" even though you can of course stream to a POST endpoint. You just need to frame the parameters and add the MIME trailer to the POST if you want to stream a large file that way. Using PUT I was then able to verify my chunks (or the one single chunk in fact) and the upload complete method worked again.

In the end I've added oauth to my vimeo mounting, many thanks to the creators of the QOAuth library!

Friday, June 7, 2013

BeagleBone Black: Walking the dog.

My software guy with a soldering iron fun has recently extended to the BeagleBone Black. This is a wonderful little ARM machine with a 1GHz CPU, a whole bunch of GPIO pins, I2C, SPI, AIN.. all the fun things packed into a $45 board.

On an unrelated purchase, I got a small 1.8 inch TFT display that can do 128x160 with a bunch of colours using the st7735 chip. That's shown above running the qtdemo on the framebuffer. Of course, an animation might serve to better show that off. The display was on sale for $10 and so it was then on its way to me :) My original plan was to drive that from an Arduino... Looking around I noticed that Matt Porter had generously contributed a driver to run the st7735 over SPI from the Linux kernel. The video of him talking at ELC about this framebuffer driver was also very informative :) It seems the same TFT can be run from the Raspberry or Beagle series of hardware.

The wiring for the panel I got was a bit different than the adafruit one that Matt used. But once you have the pinouts it's not so hard to figure out. I've currently left the 5V rail unconnected on my TFT. On the BeagleBone Black the HDMI output captures a whole bunch of pins when it starts. Unfortunately some of those pins are needed for the little TFT. One might be able to reroute the SPI to the other bus or mux the pins differently to get around that and have HDMI and the TFT at once. But I wanted to get the TFT going to see if/how it worked before changing the pins.

I had found some info on putting a line in uEnv.txt to stop the HDMI cape from loading but that didn't work for me. On my board I saw that in /sys/devices/bone_capemgr.9/slots the HDMI was the 5th cape. When I first echoed "-5" into the slots file to unload that cape the kernel gave a backtrace. If I did the same on a freshly booted bone it would cleanly remove the HDMI cape though. So something was using the HDMI cape driver before which didn't want to be removed.

With the HDMI cape unloaded the next step is to load a "firmware" file that reserves the pins that the st7735fb driver wants to use. Since I used the same pins on the bone as the adafruit display wants I could just use the below.

echo cape-bone-adafruit-lcd-00A0 > /sys/devices/bone_capemgr.9/slots

A dmesg showed that a new framebuffer device fb0 had come into existence.

[   85.280471] bone-capemgr bone_capemgr.9: slot #6: Requesting firmware 'cape-bone-adafru-00A0.dtbo' for board-name 'Override Board Name', version '00A0'
[   85.284645] bone-capemgr bone_capemgr.9: slot #6: dtbo 'cape-bone-adafru-00A0.dtbo' loaded; converting to live tree
...
[   86.235178] fb0: ST7735 frame buffer device,
[   86.235178] using 40960 KiB of video memory
[   86.236687] bone-capemgr bone_capemgr.9: slot #6: Applied #5 overlays.
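As a sanity check on the video memory figure in that log, the size of one frame can be worked out from the panel geometry. The sketch assumes the driver uses 16 bits per pixel (RGB565), which the post does not state.

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Size in bytes of one unpadded frame."""
    return width * height * bits_per_pixel // 8

if __name__ == "__main__":
    print(framebuffer_bytes(128, 160, 16))
```

If the 16 bit assumption holds, one frame is 40960 bytes, which would suggest the number printed by the driver is a byte count rather than KiB.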

After a bunch of searching around trying various things, I found that prescaling in mplayer can display to the framebuffer:

# mplayer -ao null -vo fbdev2:/dev/fb0 -x 128 -y 160 -zoom ArduSat_Open_Source_in_orbit.mp4

The qtdemo also runs "ok" by executing the below. I say ok because it obviously expects a higher resolution display than 128x160.

# qtdemoE -qws

It is tempting to have two screens and add a touch sensitive film to them. With a QML/QtQuick/TodaysRebrand^TM interface the GUI should work well and be flickable to many screens.

A great hack I look forward to is running a 32x16 LED DMD using a deferred rendering framebuffer driver like the st7735fb does. I see the evil plan now, release the BeagleBone Black for $45 and draw more C/C++ programmers to being kernel hackers rather than userland ones :)

Wednesday, May 29, 2013

FontForge: Rounding out the platforms for binary distribution

Earlier this year I made it simple to install FontForge on OSX. The process boiled down to expanding a zip file into /Applications. The libraries that fontforge uses have been all tinkered to work from inside the package, and the configuration files and other dynamically opened resources and theme are sought in the right place too.

Now after another stint I have FontForge running under 32bit Windows 7. So finally I had a use for that other OS sitting on my laptop for all this time ;) The first time I got it to run it looked like below. I created a silly glyph to make sure that bezier editing was responsive...

The plan is to have the theme in use so nice modern fonts are used in the menu, and other expected tweaks before making it a simple thing to install on Windows.

One, IMHO, very cool thing I did to get all this happening was to use the Open Build Service (OBS) to make the binaries. There are some DLL and header file drops for X floating around, but I tend to like to know where libraries that are being linked into the program have come from. Call me old fashioned. So in the process I cross compiled chunks of X Window for Windows on the OBS servers. My OBS win32 support repository contains these needed libraries, right through cairo and pango using the Xft backends to render.

There is a major schism there: if you are porting a native GTK+2 application over to win32, then you will naturally want to use the win32 backends to cairo et al and have a more native win32 outcome. For FontForge however, the program wants to use the native X Window APIs and the pango xft backend. So you need to be sure that you can render text to an X Window using pango's xft backend to make your life simpler. That is what the pangotest project I created does, just put "hello world" on an X Window using pango-xft.

A big thanks to Keith Packard who provided encouragement at LCA earlier this year that my crazy cross compile on OBS plan should work. I had a great moment when I got xeyes to run, thinking that things might turn out well after the hours and hours trying to cross compile the right collection of X libraries.

I should also mention that I'm looking for a bit of freelance hacking again. So if you have an app you want to also run on OSX/Windows then I might be the guy to make that happen! :) Or if you have cool C/C++ work and are looking to expand your team then feel free to email me.

Wednesday, May 8, 2013

Save Ferris: Show some love for libferris...

Libferris has been gaining some KDE love in recent times. There is now a KIO slave to allow you to see libferris from KDE, also the ability to get at libferris from plasma.

I've been meaning to update the mounting of some Web services like vimeo for quite some time. I'd also like to expand to allow mounting google+ as a filesystem and add other new Web services.

In order to manage time so that this can happen quicker, I thought I'd try the waters with a pledgie. I've left this open ended rather than sticking an exact "bounty" on things. I had the idea of trying a pledgie with my recent investigation into the libferris indexing plugins on a small form factor ARM machine. I'd like to be able to spend more time on libferris, and also pay the rent while doing that, so I thought I'd throw the idea out into the public.

If you've enjoyed the old tricks of mounting XML, Berkeley DB, SQLite, PostgreSQL and other relational databases, flickr, google docs, identica, and others and want to see more then please support the pledgie to speed up continued development. Enjoy libferris!

Thursday, May 2, 2013

Indexing on limited hardware... what to do

Libferris supports many indexing libraries and technologies through its plugin interface. Larger systems can use a PostgreSQL plugin which is tailored explicitly to get the most out of that RDBMS for larger file server indexes. For the smaller end, there are memory mapped files, clucene, soprano, or SQLite. I've been doing some tinkering trying to milk extra performance out of the indexing plugins for ARM machines lately. Note that if you are using debian, the CLucene you'll want is the 2.x series, currently only packaged for experimental.

For testing purposes I built a fairly tiny index of only 130k files. An interesting test case is looking for specific files which have paths that match a regular expression, returning a fairly small chunk of results. For this case, about 115 resulting files using a four character substring search as the regex. This is a common kind of query when you are looking for files and don't recall the exact ordering of the directory names or where a directory was. Small number of results, regex to pick them.

The memory mapped index implementation (boostmmap) uses boost IPC and multi indexed collections created in memory mapped files to maintain the index. The index also has a digram index for each URL, allowing regular expressions to be resolved through the index rather than needing evaluation against full URLs.
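The idea behind a digram prefilter can be shown with a toy in-memory version. This is only a sketch of the general technique, not the boostmmap code: the class and function names are invented, it handles just a literal substring rather than a general regex, and the example URLs are made up.

```python
import re
from collections import defaultdict

def digrams(text):
    """All adjacent two-character pairs in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}

class DigramIndex:
    def __init__(self, urls):
        self.urls = list(urls)
        self.postings = defaultdict(set)   # digram -> ids of URLs containing it
        for url_id, url in enumerate(self.urls):
            for pair in digrams(url):
                self.postings[pair].add(url_id)

    def search(self, literal):
        """Return (matches, regex_evaluations) for a literal substring query."""
        needed = digrams(literal)
        if needed:
            candidates = set.intersection(
                *(self.postings.get(pair, set()) for pair in needed))
        else:
            candidates = set(range(len(self.urls)))   # too short to prefilter
        pattern = re.compile(re.escape(literal))
        matches = [self.urls[i] for i in sorted(candidates)
                   if pattern.search(self.urls[i])]
        return matches, len(candidates)

if __name__ == "__main__":
    index = DigramIndex([
        "file:///home/ben/music/abba.mp3",
        "file:///home/ben/src/ferris/README",
        "file:///tmp/ferris-test.txt",
        "file:///tmp/err-fe.txt",
        "file:///var/log/syslog",
    ])
    print(index.search("ferr"))
```

Only URLs containing every digram of the query survive to the regex stage. The prefilter can let through false positives (a URL with the right digrams in the wrong order), which is why the regex still runs on the survivors, but it never drops a real match.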

The SQLite index is fairly vanilla and doesn't include many customizations for sqlite, whereas the PostgreSQL index implementation does use many of the features specific to that database. Neither the SQLite nor the boostmmap index in the public libferris repo attempts to do any compression on URL strings or the like.

A fairly basic index on 130k files is about 80mb using either memory mapped files or SQLite. Caches are cleared by echo 3 > drop_caches. Using an odroid-u2 with emmc flash, on a cold cache the SQLite index comes out about 10% faster than the boostmmap for a query finding 115 files. Turning off the regex prefilter index in the boostmmap makes it 10% slower again. This is a trade off: a very fast CPU and a disk with great file location and single extents will show less or no difference with the prefilter, as reading 80mb from disk will take less time and the CPU can run 130k regexes very quickly. The prefilter required only 124 regex evaluations; without the prefilter all 130611 URLs needed a regex evaluation.

The interesting part is with a warm cache the boostmmap is about twice as fast overall as the SQLite index. This is a big difference as the timing is for overall complete run time from the command line, and there is some overhead in starting up the index query itself. As usual, things vary depending on if you are expecting frequent queries (warm cache), have a very fast CPU (regex eval is relatively less costly), or need multiple updaters (SQLite allows it, my memory mapped doesn't).

To then experiment a little further, I brought the ferris clucene plugin into the mix. I disabled the explicit prefilter index on regex code for initial testing; the index became about 70mb and could resolve the query on a cold cache in about 65% of the time of the SQLite plugin. On warm cache the clucene was slowest, which is mainly due to the prefilter being disabled and the fallback code making the URL query a WildcardQuery with no pre or postfix to anchor the query on.

Next time around I'll see how speed effective the prefilter index is on clucene. I know it slows down adding documents (you are indexing more), and is larger (I haven't optimized for index size), but it will be interesting to see the performance on the eMMC device for the prefilter.

Tuesday, April 30, 2013

libferris available for debian arm hard float

To complement my existing packages of the libferris virtual filesystem and index/search suite for soft float debian, I now offer hot off the press, debian hard float! The distinction between how floating point is handled probably doesn't make a big difference to the operation of libferris, but if you are installing debian on an odroid-u2 then you are likely running hf, and as such having debs which are hf makes installing libferris a whole bunch simpler.

With the eMMC card on the u2, it is a really enjoyable little server machine to play around with. So far I've done the rudimentary test that XML is mountable as a filesystem and created one or two indexes with the Qt/SQLite index plugin for libferris. Note that in recent releases the sqlite backend is transaction backed which gives a huge performance increase, and on really IO constrained machines this is even more noticeable. This is a little tip for those using QtSQL, transactions are not just for making operations atomic, you may find that the whole show runs faster when it is transaction protected.
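The transaction tip is easy to try outside of Qt. Below is a small sketch using Python's built-in sqlite3 module rather than QtSQL, so the API differs from what libferris uses; the build_index() helper and the docs table are invented for the example. With autocommit, every INSERT is its own transaction, while the batched variant wraps all of them in a single BEGIN/COMMIT pair.

```python
import sqlite3


def build_index(db_path, paths, batch_in_one_transaction):
    """Insert paths into a fresh table and return the resulting row count.

    isolation_level=None puts the connection in autocommit mode, so
    transactions only exist where BEGIN/COMMIT are issued explicitly.
    """
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, url TEXT)")
    if batch_in_one_transaction:
        conn.execute("BEGIN")
    for p in paths:
        conn.execute("INSERT INTO docs (url) VALUES (?)", (p,))
    if batch_in_one_transaction:
        conn.execute("COMMIT")
    count = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
    conn.close()
    return count
```

Timing the two variants against an on-disk file on slow storage should show the difference most clearly, since each standalone commit has to reach the disk. The exact ratio depends on the device, so I won't quote numbers here.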

If you haven't played with libferris, things are auto mounted where possible, and there are many coreutils like tools to make interacting with ferris simple. The ferris-redirect is like the bash redirection but can write to any filesystem that libferris knows how to mount.

ben@odroidu2:/tmp/test$ cat example.xml
<top>
<node name="foo">bar
</node>
</top>
ben@odroidu2:/tmp/test$ fls example.xml
top
ben@odroidu2:/tmp/test$ fls example.xml/top
foo
ben@odroidu2:/tmp/test$ fcat example.xml/top/foo
bar
ben@odroidu2:/tmp/test$ date | ferris-redirect -T example.xml/top/foo
ben@odroidu2:/tmp/test$ cat example.xml
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<top>
<node name="foo">Tue Apr 30 14:57:26 PDT 2013
</node>
</top>

The above interaction would also work for mounted Berkeley DB and other filesystems of course.

I have noticed one of the binary scoped destructors doesn't like the hard float build for whatever reason. This can cause some of the command line tools to not exit gracefully, which is a shame. I can't get a good backtrace for the situation either, which makes tracking it down a nice day long adventure into trial and error.

So something productive has been generated during the last round of jet lag after all!

The goods http://fuuko.libferris.com/debian/debian-armhf/

Save Ferris!

Thursday, March 28, 2013

FontForge: Rolling Type Design with No Save Using Collab

FontForge has some support for collaborative type design. It's early days, but things are moving along in the right direction. In order to test things I've been looking at the python scripting support with an eye to moving control points around etc, or doing design and modification from the script and having the collab server keep up with things. This way I can "save" the type that is being designed at certain points and compare the saved font with what I would expect the result of multiple collaborators (the previous python scripts) to be. Automated testing for the win!

So, No Save is only for the "doing" scripts. I of course want to save the current type at my leisure :)

You might be wondering what a script that creates a glyph might look like. I had to do a bit of trial and error to figure out how to use the scripting API myself. With that in mind, I might roll some of my scripts into the mainline FontForge git repo so others can enjoy the little snippets to base their own scripts on.

Anyway, the following script will load a font and create a new capital "C" glyph. The core of this that wasn't that intuitive to me is that you have to set g.layers[] to the layer you got earlier from g.layers[] or the new contour will not show up in the saved /tmp/new.sfd file. See the MUST comment for the needed line.

import fontforge

f=fontforge.open("test.sfd")       
fontforge.logWarning( "font name: " + f.fullname )

g = f.createChar(-1,'C')
l = g.layers[g.activeLayer]
c = fontforge.contour()
c.moveTo(100,100)   
c.lineTo(100,700)   
c.lineTo(800,700)   
c.lineTo(800,600)   
c.lineTo(200,600)   
c.lineTo(200,200)   
c.lineTo(800,200)   
c.lineTo(800,100)   
c.lineTo(100,100)   
l += c
g.layers[g.activeLayer] = l  #### MUST do this for changes to show up

f.save("/tmp/new.sfd")

At first blush I was expecting to do something like
c = layer.createContour()
c.moveTo()
...
c.lineTo()
and then not have to do anything special. At that stage calling f.save() should know about the new contour etc and save. But without setting the g.layers[active] to the layer that contains the contour you will not see it.

Digging into the C code in python.c I see that point, contour etc are all basically abstractions for python use only. When you assign to a layer in the "g.layers[] = l" call, the C function PyFF_LayerArrayIndexAssign() calls PyFF_Glyph_set_a_layer() which uses something like SSFromLayer() to convert the python only data structure (contour or what have you) into a native "c" SplineSet object.
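The behaviour is easier to see in a toy model. The ToyGlyph class below is pure Python and entirely hypothetical; it is not FontForge code and only mimics the pattern described above, where reading a layer hands back a converted copy and assigning a layer converts it back into the native data.

```python
class ToyGlyph:
    """Stand-in for a glyph whose real contours live outside Python."""

    def __init__(self):
        # Pretend this dict is the native (C side) contour storage.
        self._native = {"Fore": []}

    def get_layer(self, name):
        # Reading converts native data into a fresh Python list (a copy).
        return list(self._native[name])

    def set_layer(self, name, layer):
        # Assigning converts the Python list back into native data.
        self._native[name] = list(layer)

    def saved_contours(self, name):
        # What a save would see: only the native data.
        return list(self._native[name])


g = ToyGlyph()
layer = g.get_layer("Fore")
layer.append("C contour")
before_assign = g.saved_contours("Fore")   # the copy was changed, not the glyph
g.set_layer("Fore", layer)
after_assign = g.saved_contours("Fore")    # now the contour is visible
```

In this model, appending to the layer only touches the copy, which is why the explicit assignment back is the step that matters.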

The good news is that with all this mining into python.c I now have some collab sprinkles in there. So when you do "g.layers[] = l" the FontForge in script mode will send updates to the layer off to the server as a collab update message.

The test is quite easy to run as three consecutive scripts. First, start the collab server process (the collab server remains running, the script ends). Next, attach to the collab server and update the C glyph. Finally, attach to the collab server, grab all its data and save an out.sfd file.

fontforge -script collab-sessionstart.py
fontforge -script collab-sessionjoin-and-change-c.py
fontforge -script collab-sessionjoin-and-save-to-out.sfd.py

The middle script connects to the collab server and makes its changes with the python API and then exits. No Save. To know if the changes made it to the collab server, the last script grabs all the updates etc and builds the "current" font to save into /tmp/out.sfd.

It took a bit of hacking in the python code, but now the little changes to the contour (path) of the C glyph are sent to the server as one would expect.

The python scripts are still out of repo. Since they are interesting in and of themselves I'll likely put them into my fontforge fork as a prestage to having them mainline.

Now to move on to the next thing that needs to be sent to the collab server and updated in all clients.

Monday, March 4, 2013

FontForge Design: Two is a party!

FontForge can now allow multiple people to collaborate on designing a font in real time. This is all still rather alpha level code, and in fact the pull request including this code was only pushed over the fence in the last hour ;) If real time collaboration sounds interesting to you, or something in this post seems of interest, then you might like to attend an upcoming Interactivos Workshop which is right after the Libre Graphics Meeting in Madrid (10-27 April).

At interactivos I will be expanding on the current collab code and discussing future directions and possible cross tool network specifications to allow different font tools to talk to each other. Even if you are not convinced that you want anyone else joining your collab session, you might like something like a Web sink watching your work and updating a page on your tablet as you modify the font, so you can see in near real time if changes are going to work well on a target device and design or not. Why burden your workflow by having your local fontforge export to an OTF and slow you down when a server can do all that for you in the background? There is no reason that the processes in the collab session can't all be running on your behalf.

If you are running osx then I have a binary drop for you which bundles all the needed stuff for collab at:
http://fuuko.libferris.com/osx/packages/201303/05_1630/

To start a server select the menu item Collaborate/Start Session... a dialog will show you the IP address the server will run on so you can start fontforge on another computer (+other OS) and connect to that address using Collaborate/Connect to Session...

Things are in very early days and focus has been only on the glyph view. It seems that there is some fluff holding the undo system from working on the metrics view (thus on kerning etc).

The design uses FontForge's undo/redo system to know what the changes are in the font, and since those changes can be serialized to an SFD format that code is reused to send and receive the undo information across a zeromq broadcast server.

The design is in line with the MCT (Merge Enabled Change Tracking) for ODF that is up on the OASIS lists. In FontForge's case, that means a baseline full SFD snapshot and sequential fragments thereafter describing updates.
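As a rough illustration of the baseline plus sequential fragments idea, here is a small Python sketch. It does not use zeromq or the SFD format; the ToySession class, its methods, and the string "glyph data" are all placeholders I made up to show how a late joiner can catch up.

```python
class ToySession:
    """Hypothetical collab session: one baseline plus ordered update fragments."""

    def __init__(self, baseline):
        self.baseline = dict(baseline)   # full snapshot, glyph name -> data
        self.updates = []                # list of (sequence, glyph name, data)

    def publish(self, glyph, data):
        """Record an update fragment and return its sequence number."""
        seq = len(self.updates) + 1
        self.updates.append((seq, glyph, data))
        return seq

    def updates_since(self, seq):
        """Fragments a client still needs if it has applied up to seq."""
        return [u for u in self.updates if u[0] > seq]

    def current(self):
        """What a new client ends up with: baseline, then every fragment in order."""
        state = dict(self.baseline)
        for _seq, glyph, data in self.updates:
            state[glyph] = data
        return state
```

A client that joins late takes the baseline and replays every fragment, while a client that is already connected only needs the fragments after the last sequence number it has applied.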

If you are compiling from sources then you'll need the czmq and zeromq version 3 development packages installed. Then grab master from github and enjoy.
https://github.com/fontforge/fontforge

Monday, February 18, 2013

Redland && Fedora 18

It appears that redland might be busted in some respects in Fedora 18. After a fresh install I recompiled libferris as per usual after an upgrade, only to find that the RDF/Soprano engine for metadata was busted. Digging into this, looking at the soprano sources and with many versions of soprano and sopranocmd I thought maybe the issue was one layer closer to the metal. Switching to rdfproc from redland I noticed that I couldn't create a new on disk RDF/db store!

$ rdfproc myrdf add \
'http://witme.sf.net/libferris-core/0.1/subj' \
'http://witme.sf.net/libferris-core/0.1/pred' \
'http://witme.sf.net/libferris-core/0.1/obj1'

rdfproc: Failed to open hashes storage 'myrdf'

After downloading redland's source rpm, recompiling it and installing the resulting rpm files I could then create RDF/db again. Same command, different result:

$ rdfproc myrdf add \
'http://witme.sf.net/libferris-core/0.1/subj' \
'http://witme.sf.net/libferris-core/0.1/pred' \
'http://witme.sf.net/libferris-core/0.1/obj1'
rdfproc: Added triple to the graph

I haven't verified this on a second Fedora 18 machine yet, but it might well mean that anyone trying to use the Berkeley db backends of redland on F18 would need an update to redland or to do some tinkering.

Friday, December 14, 2012

Cross platform package building..?

Sorry for the chatter post, but if anybody has recommendations for a tool that can build for win, osx, and lin that would be great. The project is an autofools one, mainly coded in 100s of kloc of C++. Builds like a treat on a Fedora machine, can be beaten unto submission to build on osx, and I assume on a suitably tainted windows machine it will gcc into binaries too. At least it has built on those other platforms in the past.

I've had great success with OBS, but that was mainly for Linux packages. It seems OBS can do mingw too, but I've not walked the valley of darkness into building for the more closed platforms on OBS before.

The saucelabs looks pretty cool, but it seems targeted to web code if I am reading it correctly.

The initial plan is to get 24hr rolling packages for all platforms and have feedback as to which day a github commit has broken the package build. It might be nice to have it for each github commit, but I think it would be easy enough to bisect a break given a 24 hour window unless an armada of contributors rushes at the ship.

A separate build issue I've been tinkering with in my mind for a while is grabbing from a github repo and creating android packages. Different code base for this though, mainly some of my n9 apps, as such, preferably for a mixed C++/QML app. But I think for that project I'll wind up taking my chisel and hammer and coming back with a cron job.

Wednesday, November 21, 2012

Libferris as a KIO Slave

The libferris virtual filesystem can now be exposed as a KIO Slave too. This allows you to use KDE applications to list, read and write the vast number of virtual filesystems that libferris makes available. For those who don't know, ferris is a little project I've been working on over the last decade to make everything a file: XML, db, relational data, web services, and even applications like XWindow, Amarok, pulseaudio, and gstreamer.

The following will create a Berkeley db4 file from the command line and show it to you. To dig into such files with libferris you can just read the file directly. So in this case I just grab the "base" directory in this db4 file with konq. You should see the sample and file2 files in that directory view and be able to load and save those "files" into kwrite. Sorry about the video being a tad jumpy, I have to work out what part of my desktop is causing that :/

Using libferris through the KIO interface from Ben Martin on Vimeo.

The commands as plain text:

$ cd /tmp
$ db45_load -T -t btree foo.db
base/sample
value here
base/file2
contents

$ db_dump -p foo.db
$ konqueror ferris:/tmp/foo.db/base

Little tricks I found during the hacking: your listDir() method might call listEntry( e, false ) with a final call using ( e, true ). The last call does nothing with "e" and is just a finalizer call. The get() method uses data() to deliver bytes to the KIO user (application). The put() method uses dataReq() in a loop to grab chunks from the KIO user. Currently I have lazy methods, for example the put() just grabs everything from the KIO user and then operates on the data once it has everything. Really bad for 4gb files, but for smallish files to get a feel for things it works quite well.
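The difference between the lazy put() and a streaming one can be sketched without KIO at all. In the Python below, read_chunk is a hypothetical stand-in for the "ask the KIO user for the next chunk" step and returns empty bytes when there is no more data; neither function is real KIO API.

```python
import io


def put_buffered(read_chunk):
    """Lazy approach: collect every chunk, then hand back the whole payload."""
    parts = []
    while True:
        chunk = read_chunk()
        if not chunk:
            break
        parts.append(chunk)
    return b"".join(parts)   # memory use grows with the file size


def put_streaming(read_chunk, sink):
    """Write each chunk to the sink as it arrives; return total bytes written."""
    total = 0
    while True:
        chunk = read_chunk()
        if not chunk:
            break
        sink.write(chunk)
        total += len(chunk)
    return total


source = io.BytesIO(b"x" * 10000)
out = io.BytesIO()
written = put_streaming(lambda: source.read(4096), out)
```

The streaming version only ever holds one chunk in memory, which is what you would want for the 4gb case.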

Also, if you are using library init, in my case using gmodule to dynamically load some plugins from libferris itself, you might be in for a world of fun and games. Currently I spawn processes and interact with them from the KIO slave to get around this issue. I imagine for a more efficient implementation being able to tell KDE to load my KIO Slave as an application with a normal full init leading to a main() but haven't looked at that little technicality yet.

I've mostly been tinkering with kioclient, konq, and kwrite on libferris files at the moment. Things are turning out well, though there are still many glitches to this early days integration. This will be released in the next libferris version once I clean it up a bit more.

Monday, November 19, 2012

Mounting KIO with libferris

I know there are more folks interested in going the other way -- seeing libferris through kio glasses. Not that there are hordes of folks in either camp of course. Nonetheless the first move has been to allow libferris to mount kio.

The groundwork is very similar to what I'm thinking of using to allow kio to mount libferris. Top level URL schemes will appear first, allowing you to dig into each URL scheme. For example, in libferris kio appears under the kio: filesystem. The first directory in that filesystem is the KIO URL scheme to use. So something like the following will work:

$ date >| /tmp/df.txt
$ fcat kio:file:/tmp/df.txt
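The nesting in that URL is just "kio:", then the KIO scheme, then the path handed to that scheme. A tiny Python sketch of the split follows; split_kio_url() is a made up helper for illustration, not how libferris parses things.

```python
def split_kio_url(url):
    """Split 'kio:<scheme>:<path>' into (scheme, path).

    Raises ValueError if the URL does not have that shape.
    """
    outer, sep, inner = url.partition(":")
    if outer != "kio" or not sep:
        raise ValueError("not a kio: URL: %r" % url)
    scheme, sep, path = inner.partition(":")
    if not scheme or not sep:
        raise ValueError("missing KIO scheme in %r" % url)
    return scheme, path
```

So kio:file:/tmp/df.txt splits into the "file" scheme and the path /tmp/df.txt, with the scheme being the first directory level under kio:.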

Some of the more fun KIO slaves to access through this are the man and fonts URLs. The following two commands produce the same man page:

$ fcat kio:man:/man
$ kioclient cat man:/man

Support is preliminary and only allows reading the files but not writing to them through the kio: filesystem as yet. Already though the kio: can be exposed to XQuery, SQLite, and through the libferris REST interface. Yay for cooperating!

I notice some really juicy digikam kio slaves but haven't dug into them enough to use them as yet. Although you can already upload to various web services from digikam, once I get access to digikamalbums through ferris kio mounting I can then 'cp' the images directly from the command line to other things that libferris can already mount such as flickr API sites, printers etc.

Sunday, October 14, 2012

Mini Xplus Hard Float

Since the A10 chip has floating point support in hardware, I thought I'd bring my "little machine" experience kicking and screaming into the 486DX era... After reading around I found a thread on Linaro 12.06 armhf and was soon up and running on the minix 1gb machine.

For an initial test I ran the following
time cjpeg -quality 90 img_01_l.pnm >| out.jpg
on a sample jpeg image from Nikon at
http://imaging.nikon.com/lineup/dslr/d3s/sample.htm

Three machines for cross comparison: an Intel 2600k, a kirkwood 2ghz running debian armel, and the Linaro hard float on the minix running an A10 at 1ghz.

0m0.097s 2600K
0m1.236s 1ghz A10 MiniX HF
0m3.250s 2ghz kirkwood armel

So you can see a clear advantage for the hardfloat; a chip at twice the clock rate needs 2.6x the time to perform the same process.
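For the curious, the arithmetic behind that figure, using only the timings and clock rates listed above (the variable names are mine):

```python
# Wall clock seconds from the cjpeg runs above, with clock rates in GHz.
a10_seconds, a10_ghz = 1.236, 1.0
kirkwood_seconds, kirkwood_ghz = 3.250, 2.0

# How much longer the soft float kirkwood takes in wall clock time.
time_ratio = kirkwood_seconds / a10_seconds

# The same comparison in clock cycles, which factors out the 2x clock rate.
cycle_ratio = (kirkwood_seconds * kirkwood_ghz) / (a10_seconds * a10_ghz)

print("time ratio:  %.2f" % time_ratio)
print("cycle ratio: %.2f" % cycle_ratio)
```

Counted in clock cycles the gap roughly doubles, though the two chips differ in more than just floating point, so this is only a coarse comparison.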

Friday, October 12, 2012

Mini Xplus

So I got to tinker with one of the miniand Xplus devices. About $100 lands you an A10 with 1gb of RAM and a HDMI output. Comes with android 4 and is very easy to get to run Fedora 17.

Under android this thing ranges down to 2 watts with wifi up, around 3 under moderate use and 4 watts with a reasonable browser load put on it (mjpeg streaming to firefox). All with wifi up.

My growing little "openssl speed" comparison now has another value point. For ciphers this A10 does about as well as an n9. For md5 shown below, it is the best cat in the camp by a good way:

The next tests will be the floating point and simd stuff, which is why I'm interested in the chip. One little discovery, the wifi seems to be very short range, the nexus 7 is happy to connect to one of my APs but I get the minix dropping the connection all the time to that AP from the same room. Don't let the cute aerial fool you it seems.

Monday, July 30, 2012

Off the Record Messaging: A Tutorial

OK, a rather long post about effective digital communication. Hopefully an interesting read to folks who would like to add some code to protect communications but haven't gotten around to that TODO item just yet.

A commonly used method for sending messages to others when you need authentication and privacy is to use an OpenPGP tool such as GNU Privacy Guard (GnuPG). For real time communications such as instant messaging, IRC, and socket IO, using Off The Record (OTR) messaging provides Perfect Forward Secrecy and secure identification of the remote party without the need for a web of trust.

In order to operate without a web of trust, libotr implements the Socialist Millionaires' Protocol (SMP). The SMP allows two parties to verify that they both know the same secret. The secret might be a passphrase or the answer to a private joke that two people will easily know. The SMP operates fine in the presence of eavesdroppers (who don't get to learn the secret). Active communications tampering is not a problem, though of course it might cause the protocol not to complete successfully.

Because the SMP doesn't rely on the fingerprint of the user's private key for authentication, the private key becomes almost an implementation detail. Once generated, the user generally doesn't need to know about the key or its fingerprint. The only time a user really cares to know is when a key is created, because a bit of entropy has to go into that process. Of course, an application should avoid regenerating keys for no reason because each time the key is replaced the user has to use the SMP again to allow remote parties to authenticate them.

In this article I'll show you how to use the current release, libotr 3.2.0+, to provide OTR messaging. I'll present two examples which are both in C++ and use the boost library for socket IO. I have gone this way so we can focus on the OTR action and not the details of sockets.

The first example does not use the Socialist Millionaires' Protocol (SMP). So the new_fingerprint() callback is essential to establishing a secure session. When not using the SMP, authentication is performed by comparing the sent fingerprints of those you are wishing to communicate with against known good values. These known values must be sent beforehand through a secure secondary channel, such as a face to face meeting. Once fingerprints have been accepted, subsequent OTR communications with the same party can be performed without explicit fingerprint verification.
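The comparison step itself is simple, and a short Python sketch shows the shape of it. This is not libotr code: it assumes fingerprints are exchanged as hex strings that may contain spaces or mixed case, and the normalize() and is_trusted() helpers are invented for the example.

```python
import hmac


def normalize(fingerprint):
    """Drop whitespace and upper-case so formatting differences don't matter."""
    return "".join(fingerprint.split()).upper()


def is_trusted(received, known_good):
    """True if the received fingerprint equals one of the known good values.

    hmac.compare_digest avoids leaking how many leading characters matched.
    """
    candidate = normalize(received).encode("utf-8")
    return any(
        hmac.compare_digest(candidate, normalize(k).encode("utf-8"))
        for k in known_good
    )
```

A callback along the lines of new_fingerprint() would run a check like this and refuse to continue when it fails, rather than silently accepting an unknown key.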

The second example makes things simpler for the user by using the SMP for authentication of the remote party. This way, the information exchanged beforehand becomes shared experiences you and the other party have had such that a question can be raised that only you and they can easily answer.

A central abstraction in using the libotr library is the struct s_OtrlMessageAppOps vtable. This is used by libotr to call back into your code when something happens, such as a cryptographic fingerprint being received or libotr wanting to send a message to the other end. The latter happens frequently during OTR session establishment.

If a program monitors its socket IO using select() or some other mainloop abstraction, then having these internal protocol messages being sent is not so much of an issue. Alas, for the simple echo server I present one must remember that there might be one or more internal OTR protocol messages sent from what seems like outside of the normal program flow. I'll get back to this point while describing the relevant section of the first example.

Many of the callback functions in s_OtrlMessageAppOps might be simple stubs, but there are a few you should be aware of. The inject_message() callback will be called when libotr itself wants to send something. The notify and display_otr_message callbacks can both provide feedback to the user. The new_fingerprint() method is called when a remote key is discovered, in order to allow you to inform the user and possibly abort the session. The gone_secure() method is called to allow you to inform the user that they are off the record. When you call libotr functions you supply both a pointer to a s_OtrlMessageAppOps structure uiops and a void* opdata. When libotr calls a method in uiops it will pass opdata back to you.

Another three common parameters you will pass to libotr functions are the accountname, protocol and sender or receiver name. The protocol string can be anything as long as both ends of the system use the same protocol string. The state data that libotr uses is stored in an OtrlUserState object which is created with otrl_userstate_create() and passed to many of the libotr functions along the way.

The code below loads a private key or creates a new one if none already exists. Because creating a new key is an entropy heavy operation, the setupKey() function warns the user that if they are erratic the process might move along a bit quicker. Note that the uiops has a callback create_privkey to generate a key if needed. I just prefer to make this codepath explicit and out of the main callback logic.

bool ok( gcry_error_t et )
{
    return gcry_err_code(et) == GPG_ERR_NO_ERROR;
}

void setupKey( const std::string& filename )
{
    gcry_error_t et;
    
    et = otrl_privkey_read( userstate, filename.c_str() );
    if( !ok(et) )
    {
        cerr << "can't find existing key, generating a new one!" << endl;
        cerr << "this needs a bunch of entropy from your machine... so please" << endl;
        cerr << "move the mouse around and slap some keys mindlessly for a while" << endl;
        cerr << "a message will be printed when keys have been made..." << endl;
        et = otrl_privkey_generate( userstate, filename.c_str(),
                                    accountname, protocol );
        if( !ok(et) )
        {
            cerr << "failed to write new key file at:" << filename << endl;
            return; // key generation failed, so do not report success below
        }
        cerr << "Have keys!" << endl;
    }
}

The main.cpp program implements both the client and server. The server mode is selected by passing -s at startup. Firstly, a userstate is created, some variables set depending on if we are a client or server, and the correct private key is loaded or created.

    OTRL_INIT;
    userstate = otrl_userstate_create();

    keyfile = "client.key";
    accountname = "client";
    recipientname = "server";
    if( ServerMode )
    {
        keyfile = "server.key";
        accountname = "server";
        recipientname = "client";
    }
    setupKey( keyfile );

The core logic for the echo client is to read a string from the user, send it to the server, grab a reply from the server and show it to the user. The start of the client code connects to a given port on localhost and reads a string from the user.

        VMSG << "client mode..." << endl;
        stringstream portss;
        portss << Port;
        iosockstream stream( "127.0.0.1", portss.str() );
        if (!stream)
        {
            cerr << "can't connect to server!" << endl;
            exit(1);
        }

        string s;
        while( true )
        {
            getline(cin,s);
            cerr << "your raw message:" << s << endl;
            cerr << "send plaintext:" << colorsend(s) << endl;

We certainly do not want to send the raw string s over the wire to the server though. That would very much be "on the record". So the next fragment of the client gets libotr to encrypt the string s so we can send it off the record to the server. The userstate is the value created during program initialization using otrl_userstate_create(). The ui_ops is the vtable s_OtrlMessageAppOps structure described above, and opdata is the value we want libotr to pass back to our methods in ui_ops when it uses them. In this case, we use the address of the iostream for the socket as the opdata so callbacks can send and receive data on the socket if they so desire. The newmessage will point to an off-the-record message that the server can decrypt to read the string s. The tests on the return value for message_sending() ensure that we have a new, encrypted off the record message to send instead of the plaintext s.

void* opdata = &stream;
OtrlTLV* tlvs = 0;
gcry_error_t et;
char* newmessage;

et = otrl_message_sending( userstate, &ui_ops, opdata,
                           accountname, protocol, recipientname,
                           s.c_str(), tlvs, &newmessage,
                           myotr_add_appdata, &ui_ops );
cerr << "encoded... ok:" << ok(et) << endl;
if( !ok(et) )
{
    cerr << "OTR message_sending() failed!" << endl;
}
if( ok(et) && !newmessage )
{
    cerr << "There was no error, but an OTR message could not be made." << endl;
    cerr << "perhaps you need to run some key authentication first..." << endl;
}
if( newmessage )
{
    VMSG << "have new OTR message:" << newmessage << endl;
    s = newmessage;
}

Since we have replaced the plaintext s with the off the record version, we send that to the server using the socket iostream and then wait a moment before reading a response. The while loop is slightly hairy in that it will block for new messages if we are not secure. As I mentioned above, libotr can call the inject_message() callback to write a new off the record message to the socket. Outgoing messages will be generated and injected during session establishment. There is no incoming version of inject_message() so the client needs to keep reading these injected messages before it tries to send another off the record message. One will find that there are many messages exchanged between libotr at each end when the string s is written to the socket. This only happens the first time through, to set up the OTR protocol.

When reading messages from the server, the encrypted string is read and passed to otrl_message_receiving(). If the received message was an OTR message that was sent from the other end by libotr using inject_message() then otrl_message_receiving() will indicate to the client that it should simply ignore this message. Otherwise a real message was encrypted and sent by the server and so the client will show the user the decrypted newmessage.

cerr << "WRITE:" << s << endl;
stream << s << endl;
usleep( 200 * 1000 );
while( !secure && stream.peek() != std::iostream::traits_type::eof()
       || secure && stream.rdbuf()->available() )
{
    s = "junk";
    VMSG << "reading data from server" << endl;
    getline(stream,s);
    VMSG << "READ:" << s << endl;

    int ignore_message = otrl_message_receiving(
        userstate, &ui_ops, opdata,
        accountname, protocol, recipientname,
        s.c_str(),
        &newmessage,
        &tlvs,
        myotr_add_appdata, &ui_ops );

    VMSG << "ignore:" << ignore_message << " newmsg:" << maybenull(newmessage) << endl;
    if( ignore_message )
    {
        VMSG << "libotr told us to ignore this message..." << endl;
        VMSG << "available:" << stream.rdbuf()->available() << endl;
        VMSG << " in_avail:" << stream.rdbuf()->in_avail() << endl;
        
        continue;
    }
    if( newmessage )
        s = newmessage;
    otrl_message_free( newmessage );

    cout << color( s ) << endl;
}

Server mode is handled by a thread which executes server_session() using the std::iostream for the new socket.

if( ServerMode )
{
    VMSG << "server mode..." << endl;

    boost::asio::io_service io_service;
    tcp::acceptor a( io_service, tcp::endpoint( tcp::v4(), Port ));
    for (;;)
    {
        h_iosockstream stream(new iosockstream());
        a.accept( *(stream->rdbuf()) );
        boost::thread t(boost::bind(server_session, stream));
    }
}

The server implementation would look like the below if OTR messaging was not being used.

void server_session( h_iosockstream streamptr )
{
    iosockstream& stream = *(streamptr.get());
    while( stream )
    {
       std::string s;
       getline( stream,s );
       cout << "server got:" << s << endl;
       stream << s << endl;
    }
}

The OTR server implementation starts out the same way, reading a string from the socket. Then our old friend otrl_message_receiving() is called to decrypt that message. If ignore_message is set then there is nothing to be done and we simply continue to the top of the loop to read another string from the client. Also, if we are not yet secure, there is no point in trying to send a new OTR message back to the client, so we simply continue at the top of the while loop again. This way we avoid writing replies to the client when session establishment messages are sent by libotr on the client side.

This might seem a little strange at first: how will we ever become secure and start replying to the client if all we do is read messages from them and throw them away? The thing to keep in mind is that messages sent with inject_message() on the client will be seen by libotr when we call otrl_message_receiving(), which in turn might cause libotr on the server to call inject_message() with a reply to the session establishment message. Eventually libotr will call the gone_secure() OtrlMessageAppOps callback, in which we set the global variable secure to true, thus allowing the server to start replying to the client as it normally would.
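
The control flow described in the last two paragraphs boils down to two checks per line read from the client. Here is a minimal sketch of just that decision, with the libotr calls left out. The Action enum and next_action() are names I made up for illustration; ignore_message stands for the return value of otrl_message_receiving() and secure for the flag set from gone_secure().

```cpp
// Hypothetical names, for illustration only.
// What the server loop does with one line read from the client.
enum class Action
{
    SkipInternal,   // libotr consumed the line (handshake traffic): read the next one
    SkipNotSecure,  // a message arrived but the session is not secure yet: do not reply
    Reply           // session is secure: build, encrypt and send a reply
};

// ignore_message: return value of otrl_message_receiving()
// secure:         flag set from the gone_secure() callback
inline Action next_action( int ignore_message, bool secure )
{
    if( ignore_message )
        return Action::SkipInternal;
    if( !secure )
        return Action::SkipNotSecure;
    return Action::Reply;
}
```

Only the last case writes anything to the socket from server_session() itself; during session establishment any replies come from libotr through inject_message().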

void server_session( h_iosockstream streamptr )
{
    iosockstream& stream = *(streamptr.get());
    while( stream )
    {
        gcry_error_t et;
        std::string s;
        VMSG << "getting more data from the client..." << endl;
        getline( stream,s );
        VMSG << "READ:" << s << endl;

        void* opdata = &stream;
        OtrlTLV* tlvs = 0;
        char *newmessage = NULL;
        int ignore_message = otrl_message_receiving(
            userstate, &ui_ops, opdata,
            accountname, protocol, recipientname,
            s.c_str(),
            &newmessage,
            &tlvs,
            myotr_add_appdata, &ui_ops );

        VMSG << "ignore:" << ignore_message << " newmsg:" << maybenull(newmessage) << endl;
        if( newmessage )
            s = newmessage;
        otrl_message_free( newmessage );
        if( ignore_message )
        {
            VMSG << "libotr told us to ignore this message..." << endl;
            continue;
        }

        cout << "ignore:" << ignore_message << " server got:" << s << endl;
        cout << "message from client:" << color(s) << endl;

        // do not echo back messages when we are establishing the session
        if( !secure )
            continue;

The remainder of server_session() creates the echo reply message, encrypts it with otrl_message_sending() and sends the OTR message over the socket.

  static int count = 0;
  stringstream zz;
  zz << "back to you s:" << s << " count:" << count++;
  s = zz.str();
  cout << "writing...s:" << s << endl;
  cerr << "send plaintext:" << colorsend(s) << endl;

  et = otrl_message_sending( userstate, &ui_ops, opdata,
     accountname, protocol, recipientname,
     s.c_str(), tlvs, &newmessage,
     myotr_add_appdata, &ui_ops );
  if( !ok(et) )
  {
     cerr << "OTR message_sending() failed!" << endl;
  }
  if( ok(et) && !newmessage )
  {
     cerr << "There was no error, but an OTR message could not be made." << endl;
     cerr << "perhaps you need to run some key authentication first..." << endl;
  }
  if( newmessage )
  {
     VMSG << "have new OTR message:" << newmessage << endl;
     s = newmessage;
  }
                
  VMSG << "writing otr...s:" << s << endl;
  stream << s << endl;
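
Note that the snippet above writes s to the socket even when no OTR message was produced, which is fine for a demo where you want to see what happens. A stricter variant would refuse to write anything unless an encrypted message exists. This is a minimal sketch of that check, assuming a hypothetical helper wire_payload() that is not in the example programs; send_ok stands for ok(et) and otr_message for the newmessage out-parameter.

```cpp
#include <string>

// Hypothetical helper, not part of the example programs.
// Decide what may be written to the socket after otrl_message_sending().
//   send_ok:     true when the call reported no error
//   otr_message: the newmessage out-parameter, or nullptr if none was made
// On success `out` receives the OTR message and true is returned.
// Otherwise `out` is left untouched and false is returned, so the
// plaintext never reaches the wire by accident.
inline bool wire_payload( bool send_ok, const char* otr_message, std::string& out )
{
    if( !send_ok || !otr_message )
        return false;
    out = otr_message;
    return true;
}
```

With this in place the final write would become conditional on wire_payload() returning true.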

Because the security of OTR messaging in the first example relies on fingerprints, the new_fingerprint callback presents our fingerprint and the remote fingerprint and asks the user whether to continue establishing the session. Unfortunately, this means the user has to compare the remote fingerprint by eye against an expected value they have obtained from the remote party beforehand over a secure channel.

static void myotr_new_fingerprint( void *opdata, OtrlUserState us,
                                   const char *accountname, const char *protocol,
                                   const char *username, unsigned char fingerprint[20])
{
    cerr << "myotr_new_fingerprint(top)" << endl;

    char our_fingerprint[45];
    if( otrl_privkey_fingerprint( us, our_fingerprint, accountname, protocol) )
    {
        cerr << "myotr_new_fingerprint() our   human fingerprint:" << embold( our_fingerprint ) << endl;
    }
    
    cerr << "myotr_new_fingerprint() their human fingerprint:"
         << embold( fingerprint_hash_to_human( fingerprint )) << endl;
    cerr << "do the fingerprints match at the remote end (enter YES to proceed)" << endl;
    std::string reply;
    getline( cin, reply );
    if( reply != "YES" )
    {
        cerr << "You have chosen not to continue to talk to these people... good bye." << endl;
        exit(0);
    }
}
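
The fingerprint_hash_to_human() helper used above is not shown in this post. As a rough sketch of what such a helper might do, assuming it uses the same layout that libotr uses for human readable fingerprints (five groups of eight uppercase hex digits separated by spaces), the conversion could look like the following. That layout is 44 characters long, which is why our_fingerprint above is a 45 byte buffer. The name fingerprint_to_human() is mine and returns a std::string rather than filling a buffer.

```cpp
#include <cstddef>
#include <string>

// Sketch only: render a 20-byte fingerprint hash as five groups of eight
// uppercase hex digits separated by single spaces (44 characters).
inline std::string fingerprint_to_human( const unsigned char hash[20] )
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    out.reserve( 44 );
    for( std::size_t i = 0; i < 20; ++i )
    {
        // start a new group after every four bytes
        if( i != 0 && i % 4 == 0 )
            out += ' ';
        out += digits[ hash[i] >> 4 ];
        out += digits[ hash[i] & 0x0F ];
    }
    return out;
}
```

Grouping the digits is what makes the eyeball comparison bearable; the user can check one block of eight characters at a time.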

Simpler authentication with SMP

The second example uses the Socialist Millionaires' Protocol (SMP) to avoid having to verify fingerprints by hand. For good measure, the fingerprints that are established are saved to disk and loaded again at startup, so that subsequent conversations do not need any SMP or user fingerprint verification.

During process startup, fingerprints are read from file if they exist:

 std::stringstream fn;
 fn << "fingerprints-" << accountname;
 gcry_error_t e = otrl_privkey_read_fingerprints( userstate, fn.str().c_str(), 0, 0 );

The otrl_message_sending() and otrl_message_receiving() functions both have a parameter OtrlTLV *tlvs. The tlvs allow data to be sent and received as sideband information that does not affect what you send with libotr. The SMP uses the tlvs to communicate the information that it needs in order to authenticate. In server_session() the main change is a check on the tlvs variable after calling otrl_message_receiving().

 if( tlvs )
 {
    handle_smp( stream, tlvs, userstate, &extended_ui_ops, opdata );
 }
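
If you have not met type/length/value records before, the idea is a chain of small tagged payloads that travel next to the main message. The following is an illustrative stand-in, not libotr's OtrlTLV; the Tlv struct and find_tlv() are hypothetical names, and find_tlv() only mirrors the kind of lookup that otrl_tlv_find() is used for later in this post.

```cpp
#include <cstddef>

// Illustrative stand-in for a type/length/value record.
// This is NOT libotr's OtrlTLV, just the same general shape.
struct Tlv
{
    unsigned short       type;  // what kind of record this is
    unsigned short       len;   // number of bytes in data
    const unsigned char* data;  // payload, may be null when len == 0
    const Tlv*           next;  // next record in the chain, or null
};

// Return the first record in the chain with the given type, or nullptr.
inline const Tlv* find_tlv( const Tlv* head, unsigned short type )
{
    for( const Tlv* t = head; t; t = t->next )
        if( t->type == type )
            return t;
    return nullptr;
}
```

A receiver walks the chain looking for the record types it understands and ignores the rest, which is how the SMP steps can ride along without touching the user's message.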

The client initiates the SMP and has heavier changes to its code. After creating an iosockstream to localhost, the client calls run_smp_client() to set up the OTR session and run the SMP to authenticate. Apart from the call to run_smp_client(), the client main loop while(true) doesn't need to change. This makes sense because the SMP is normally only used at session establishment, when we do not already know the remote key (fingerprint).

In the run_smp_client function, the first while( !secure... loop will establish an OTR session using fingerprints just like the first example. This time we do not stop to ask the user to verify the fingerprints, we simply record that a new fingerprint was seen. This is done by setting runSMP=true to force the SMP if we are using a fingerprint that we didn't already have on disk.

If runSMP is set then we read a secret from the user and call otrl_message_initiate_smp() to get the SMP ball rolling with libotr. This leads to the second while( !secure loop which will stop when we are secure again.

void run_smp_client( iosockstream& stream )
{
    void* opdata = &stream;
    OtrlTLV* tlvs = 0;

    // establish session using fingerprints
    stream << "?OTR?v2?" << endl;
    usleep( 200 * 1000 );
    while( !secure && stream.peek() != std::iostream::traits_type::eof() )
        client_read_msg_from_server( stream );

    if( !runSMP )
    {
        return;
    }
    
    VMSG << "Starting the Socialist Millionaires' Protocol " << endl
         << " to work out who the other guy is..." << endl
         << endl;
    VMSG << "please give me a secret that only you and the other guy know..." << endl;
    std::string s;
    getline( cin, s );
    int add_if_missing = true;
    int addedp = 0;
    ConnContext* smpcontext = otrl_context_find( userstate,
                                                 recipientname, accountname, protocol,
                                                 add_if_missing, &addedp,
                                                 myotr_add_appdata, &ui_ops );

    cerr << "addedp:" << addedp << " smpcontext:" << smpcontext << endl;
    if( !smpcontext )
        return;
    otrl_message_initiate_smp( userstate, &ui_ops, opdata, smpcontext,
                               (const unsigned char*)s.c_str(), s.length() );

    // we are only secure if the SMP succeeds
    secure = 0;
    while( !secure && stream.peek() != std::iostream::traits_type::eof() )
        client_read_msg_from_server( stream );
        
    cerr << "secure:" << secure << endl;
    if( secure == SMP_BAD )
    {
        cerr << "couldn't authenticate server, exiting..." << endl;
        exit(1);
    }
}
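
The runSMP flag checked in run_smp_client() is set when a fingerprint turns up that was not already on disk. The bookkeeping behind that decision is small; here is a minimal sketch, assuming a hypothetical KnownFingerprints class that keeps the human readable fingerprints in a std::set. In the real programs libotr holds the fingerprints in its userstate, so treat this purely as an illustration of the rule.

```cpp
#include <set>
#include <string>

// Hypothetical bookkeeping for fingerprints that are already trusted.
// Illustration only: the example programs rely on libotr's userstate.
class KnownFingerprints
{
public:
    // Record a fingerprint that was loaded from disk at startup.
    void add_trusted( const std::string& fp ) { trusted_.insert( fp ); }

    // True when the SMP should run, i.e. the fingerprint was not
    // among those loaded from disk.
    bool needs_smp( const std::string& fp ) const
    {
        return trusted_.find( fp ) == trusted_.end();
    }

private:
    std::set<std::string> trusted_;
};
```

On the first conversation needs_smp() is true and the user is asked for the shared secret; once the fingerprint has been saved and reloaded it is false and run_smp_client() returns early.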

The client_read_msg_from_server() function calls otrl_message_receiving() and checks if tlvs is set and if so calls handle_smp() with that tlvs value.

As you can see from the above, whenever tlvs is set in the client or server, handle_smp() is called. If you look at the UPGRADING file in libotr 3.2.0+ you will see skeleton code in "3.3.4. Control Flow and Errors" on which handle_smp() is based. The handle_smp() function uses otrl_tlv_find() on tlvs to check for internal OTR messages sent from libotr itself which describe a stage in the SMP. handle_smp() is like a primitive state machine working through from SMP1 (the server asking for the secret to respond to the client's initial request), through to SMP3 and SMP4, which are handled when the protocol completes with either success or failure (same or different secrets).

  if( tlv = otrl_tlv_find(tlvs, OTRL_TLV_SMP2))
  {
    if (nextMsg != OTRL_SMP_EXPECT2)
    {
       cerr << "smp: spurious SMP2 received, aborting" << endl;
       otrl_message_abort_smp( userstate, ui_ops, opdata, smpcontext);
       otrl_sm_state_free(smpcontext->smstate);
    }
    else
    {
       cerr << embold("SMP2 received, otrl_message_receiving will have sent SMP3") << endl;
       smpcontext->smstate->nextExpected = OTRL_SMP_EXPECT4;
    }
  }
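
To see the whole state machine at a glance, here is a simplified model of the "what do I expect next" bookkeeping with libotr taken out. The SmpMsg and SmpExpect enums and smp_step() are hypothetical names; the transitions follow my reading of the UPGRADING skeleton, so check them against that file before relying on them.

```cpp
// Simplified model only; the names are hypothetical and not libotr's.
enum class SmpMsg    { Smp1, Smp2, Smp3, Smp4, Abort };
enum class SmpExpect { Expect1, Expect2, Expect3, Expect4 };

// Advance the expectation for one received SMP message.
// Returns false when the message was out of order or was an abort,
// in which case the exchange is reset to its initial state.
inline bool smp_step( SmpExpect& next, SmpMsg msg )
{
    switch( msg )
    {
    case SmpMsg::Smp1:
        if( next != SmpExpect::Expect1 ) break;
        next = SmpExpect::Expect3;   // responder answers, then waits for SMP3
        return true;
    case SmpMsg::Smp2:
        if( next != SmpExpect::Expect2 ) break;
        next = SmpExpect::Expect4;   // initiator has sent SMP3, waits for SMP4
        return true;
    case SmpMsg::Smp3:
        if( next != SmpExpect::Expect3 ) break;
        next = SmpExpect::Expect1;   // responder is done
        return true;
    case SmpMsg::Smp4:
        if( next != SmpExpect::Expect4 ) break;
        next = SmpExpect::Expect1;   // initiator is done
        return true;
    case SmpMsg::Abort:
        break;
    }
    next = SmpExpect::Expect1;       // spurious message or abort: start over
    return false;
}
```

The client that initiates the SMP starts out expecting SMP2 and walks through SMP2 then SMP4, while the server starts out expecting SMP1 and walks through SMP1 then SMP3. Reaching the end of the exchange only means the protocol completed; whether the secrets matched is a separate check.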

If the SMP proves that the secrets are the same, it is advantageous to save the fingerprints to disk so that future communications do not require user fingerprint verification or the SMP.

// each assignment needs its own parentheses, || binds tighter than =
if(  (tlv = otrl_tlv_find(tlvs, OTRL_TLV_SMP4))
  || (tlv = otrl_tlv_find(tlvs, OTRL_TLV_SMP3)) )
{
    if( smpcontext->smstate->sm_prog_state == OTRL_SMP_PROG_SUCCEEDED )
    {
        std::stringstream fn;
        fn << "fingerprints-" << accountname;
        gcry_error_t e = otrl_privkey_write_fingerprints( userstate, fn.str().c_str() );
        if( !ok(e) )
            cerr << "failed to save fingerprints to " << fn.str() << endl;
    }
}

Hopefully you are now in a better position to add libotr support to your real-time network programs. The full source code for these programs, as well as the HTML for this post itself, is up on my GitHub page. Remember, using off-the-record messaging doesn't necessarily mean you have anything to hide, just that you have nothing to show.