Dr. MonkeyIQ: 2007

Saturday, December 22, 2007

Eclipse... now it runs ok, but...

Yes, with heaps of RAM and a ulimit higher than 90% of the apps I execute, eclipse runs up... Don't get me wrong, it can have 2gb of RAM if it wants, if it works well...

The emacs bindings will leave you in despair. An emacs user expects some bindings to exist at a fundamental level, C-x C-b for one, and that one almost happens. But you are left with a dropdown of buffer names, and the most recent non-active buffer is not highlighted by default.

Cap SenSitive search and replace seems to be coming in 2008. And although you can start an incremental search with C-s without a dialog, hitting C-s C-w C-w will not grab a nice chunk of identifier to isearch-forward with. With the s and w keys being so close, one quickly gets into the habit of that for isearching. The search and replace bug makes me wonder how folks actually use eclipse to edit code, though I guess you don't miss what you didn't have.

So ye olde emacs will likely live on for another 2-5 years as my C++ editor of choice. Either that or somebody will drop enough cash to actually get eclipse-emacs working to some retro level that can be somewhat tolerated. I guess emacs users are not in the target demographic of eclipse anyway.

Friday, December 21, 2007

Keeping up with the joneseses

After I noticed recently that the boost::serialization binary format is platform dependent, some folks told me that I should maybe just use the text format for b::serial. The claim that relative efficiency is hard to tell came up of course. So this time I actually benchmarked with valgrind.

For the text archive: 223,879,650 instructions and 2.97% of overall runtime. For the binary archive: 140,057,074 instructions and 2.1% relative runtime. The relative runtime is for a short app interaction, not just application startup.

Since the b::ser is only used for loading data that is changed very infrequently but is loaded all the time, I might end up having the text version as a fallback. Just in case you change architectures, libferris can try to load the binary version, fail, then try to load the text version and force a resave right at that point. This means loading after an architecture switch will be about 5-6 times slower, but only the first time you run any libferris app after the arch change.
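A rough sketch of that fallback, under loud assumptions: it uses plain iostreams rather than boost::serialization so that it stands alone, and the Cache type, the platformTag() header, the .bin/.txt file naming and every function name are made up for illustration, not libferris code. The point is only the order of operations: try the native binary file, and if that fails load the text file and resave the binary one straight away.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical stand-in for the cached data; not a libferris type.
using Cache = std::vector<std::int32_t>;

// Made-up header tag recording sizeof(long) and byte order, so a file
// written on a different architecture is rejected instead of misread.
static std::uint32_t platformTag() {
    const std::uint16_t probe = 1;
    const bool little = *reinterpret_cast<const unsigned char*>(&probe) == 1;
    return (static_cast<std::uint32_t>(sizeof(long)) << 8) | (little ? 1u : 0u);
}

// Native-layout dump: fast to load, but platform dependent.
bool saveBinary(const std::string& path, const Cache& c) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    const std::uint32_t tag = platformTag();
    const std::uint64_t n = c.size();
    out.write(reinterpret_cast<const char*>(&tag), sizeof tag);
    out.write(reinterpret_cast<const char*>(&n), sizeof n);
    if (n) out.write(reinterpret_cast<const char*>(c.data()),
                     static_cast<std::streamsize>(n * sizeof(std::int32_t)));
    return static_cast<bool>(out);
}

bool loadBinary(const std::string& path, Cache& c) {
    std::ifstream in(path, std::ios::binary);
    std::uint32_t tag = 0;
    std::uint64_t n = 0;
    if (!in.read(reinterpret_cast<char*>(&tag), sizeof tag)) return false;
    if (tag != platformTag()) return false;          // wrong architecture
    if (!in.read(reinterpret_cast<char*>(&n), sizeof n)) return false;
    Cache tmp(static_cast<std::size_t>(n));
    if (n && !in.read(reinterpret_cast<char*>(tmp.data()),
                      static_cast<std::streamsize>(n * sizeof(std::int32_t))))
        return false;
    c.swap(tmp);
    return true;
}

// Portable text form: slower to parse, readable on any architecture.
bool saveText(const std::string& path, const Cache& c) {
    std::ofstream out(path, std::ios::trunc);
    out << c.size() << '\n';
    for (std::int32_t v : c) out << v << '\n';
    return static_cast<bool>(out);
}

bool loadText(const std::string& path, Cache& c) {
    std::ifstream in(path);
    std::size_t n = 0;
    if (!(in >> n)) return false;
    Cache tmp(n);
    for (auto& v : tmp)
        if (!(in >> v)) return false;
    c.swap(tmp);
    return true;
}

enum class LoadedFrom { Binary, Text, Nothing };

// Try binary first; on failure fall back to text and resave binary at once,
// so only the first load after an architecture change pays the text cost.
LoadedFrom loadCache(const std::string& base, Cache& c) {
    if (loadBinary(base + ".bin", c)) return LoadedFrom::Binary;
    if (loadText(base + ".txt", c)) {
        saveBinary(base + ".bin", c);
        return LoadedFrom::Text;
    }
    return LoadedFrom::Nothing;
}
```

With boost the two load functions would presumably wrap a binary_iarchive and a text_iarchive instead, with the failure case being whatever the binary archive does when fed a file from another platform.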

Fun to look ahead to the maemo / moko invasion of the ferris.

I added a few little optimizations here and there to libferris for the next release too. Many of these make the filemanager more responsive in some use cases I have... YMMV.

Tuesday, November 27, 2007

That's one hell of a package diane

Now I have packages of libferris for Fedora 7 and 8 and opensuse 10.3. So the next obvious move for someone who has rolled many an rpm would be to jump into the deep end of debian packaging...


root@ubuntu710:/tmp/PKG/libferris-1.2.1# ls -lh  ../*deb
-rw-r--r-- 1 root root 2.4M 2007-11-28 04:39 ../ferris_1.2.1-1_i386.deb
-rw-r--r-- 1 root root 5.4M 2007-11-28 04:40 ../libferris1_1.2.1-1_i386.deb
-rw-r--r-- 1 root root 228K 2007-11-28 04:39 ../libferris-dev_1.2.1-1_i386.deb
-rw-r--r-- 1 root root 249K 2007-11-28 04:40 ../libferrisui1_1.2.1-1_i386.deb
-rw-r--r-- 1 root root  64K 2007-11-28 04:40 ../libferrisxslt1_1.2.1-1_i386.deb

I still need to actually add some pre/post scripts and the like, but it is looking somewhat decent so far. The downside is that the default builds for all three distros are somewhat conservative, leaving out much of the higher level, higher dependency functionality. But for folks just wishing to mount xml, db4 and do some desktop search these packages should work, though you will get a reduced selection of which index formats are supported etc.

Tuesday, November 13, 2007

Open-C for s60, so close but yet so far

Like the vast tribes who look over /. from time to time I noticed the new rewards for developing for Google's new mobile platform. I found this quite interesting considering the recent openc comp and the differences between the pecuniary motivation offered by the two camps.

I tried to port rsync over to s60 using open-c. The massive problems with the API not offering anything to cleanly emulate fork() and exec() are bad enough. Try submitting a patch to an open source project with ugly stuff in a #ifdef clause to emulate these calls with pthreads and see how far you get. I haven't tried this with the rsync guys as I haven't managed to get it running yet :/

The killer, so to speak, is the lack of signal() and kill() calls. It is quite common for an app to do self IPC by fork()ing itself and then signalling itself or waiting for itself at a later time. Waiting is one thing; that can be quite readily ported to pthreads type calls using IPC mechanisms. When a parent sends, for example, a USR2 signal to the child is where you run into unbounded joy, especially since all of this is happening inside a do_recv() function which is used quite heavily in both client and server mode.

I might throw together a kludge for it at some point because it actually compiles (though will not run because I haven't ported some fork() code yet). There are three or so places that use fork() which will need attention. It compiles because I have a nasty pthreads implementation of fork() in an effort to avoid dumping a huge bunch of garbage boilerplate code into apps to be ported. I don't know why the symbian dudes didn't include an int s60_pthreads_fork() implementation....

Thursday, October 25, 2007

Berkeley db and libstldb4 for s60

It turns out the problem I was having with bdb on s60/openc was that the default stack size is very small (by desktop standards). Adding EPOCSTACKSIZE 65536 to the mmp file of the test client made things work as one would expect when the sis file is installed on the device itself.

If your stack is too small then the application just magically goes away on the device and you are left wondering why. Not only that but if the function that smashes the stack with auto variables is tracked down you will have a function that, once entered, will have a corrupt stack right from the start. Functions like open(2) will not work properly because things are already on their way to the bit bucket. These sorts of things take a while to track down when you are used to stacks that adjust their size when they get full.

So, long and the short, stldb4 now works on s60/openc :)

I notice that there is also SDL for symbian, which, taken together with the fact that evas has a backend engine for SDL, makes edje/evas GUIs for desktop and s60 seem like a good choice. You could debug the look and feel of the app on a desktop using a window of device size and then just run the edje on the s60 to see if any special effects run fast enough.

Tuesday, October 23, 2007

Symbian and Open-C: A porting we shall go

Yay, I now have libsigc++, libferrisloki, libferrisstreams and stldb4 compiling and all but the last one actually useful on a Nokia E61. It makes a huge difference to the whole coding for embedded knowing that you have decent std iostreams and intrusive reference counted objects.

The stldb4 is a pain though, as it crashes in the test client. It's just so much fun that the remote debug stuff is worth thousands of Euros, making on device testing just so easy for folks who don't have suitcases of cash lying around.

Looks like ye olde cout << "...1" << endl; on device debugging until I manage to move something like Enamel out of libferris onto the s60. Perhaps a port of syslog to symbian would help; at least I'd be able to get network streamed log events.

Very few changes actually were required in libferrisstreams. Unfortunately the madvise() call was one of them. This gets used in ferrisstreams for memory mapped IO to tell the kernel of sequential IO access with MADV_SEQUENTIAL. This is probably more relevant to desktop and server machines than embedded anyway, seek times being so different between a flash disk and regular HDD.
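For the curious, a minimal sketch of that kind of hint, assuming a POSIX system. sumFileBytes() is an invented name and this is not the ferrisstreams code. The #ifdef shows one way a missing madvise() might be stepped around, assuming the platform headers simply do not define MADV_SEQUENTIAL; since the call is only advice to the kernel, dropping it changes performance at most, never the result.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Sum every byte of a file through a read-only memory mapping.
// Returns -1 on error, 0 for an empty file.
long long sumFileBytes(const std::string& path) {
    const int fd = ::open(path.c_str(), O_RDONLY);
    if (fd < 0) return -1;

    struct stat st {};
    if (::fstat(fd, &st) != 0) { ::close(fd); return -1; }
    if (st.st_size == 0) { ::close(fd); return 0; }   // mmap rejects length 0

    const std::size_t len = static_cast<std::size_t>(st.st_size);
    void* p = ::mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    ::close(fd);                       // the mapping outlives the descriptor
    if (p == MAP_FAILED) return -1;

#ifdef MADV_SEQUENTIAL
    // Hint that the pages will be read front to back. Failure is harmless,
    // so the return value is deliberately ignored.
    ::madvise(p, len, MADV_SEQUENTIAL);
#endif

    const unsigned char* bytes = static_cast<const unsigned char*>(p);
    long long sum = 0;
    for (std::size_t i = 0; i < len; ++i) sum += bytes[i];

    ::munmap(p, len);
    return sum;
}
```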

All I need to do now is actually iron out why Berkeley db-4.x on symbian is deciding to do funny things for me and I'll have quite a nice little start making a nicer coding environment for s60.

Shame I'm too late for the little Open-C port a posix contest that was floating around.

Friday, September 28, 2007

Mounting DBus. Nothing is sacred anymore....

In the next release of libferris (due out $DATE) there will be initial support for mounting dbus. Not everything can be mounted effectively, but interaction with simple APIs is very easy from the command line :)

DBus methods are callable just like postgresql functions. I also added the option to use the [] bracket pairs to call a method so that you don't always have to escape a call from bash.

Command lines can get longish with descriptive service and interface names.
fcat dbus://localhost/session/org.freedesktop.DBus.Examples.Echo/org/freedesktop.../org.freedesktop.DBus.EchoDemo/Random[]

But one major advantage this has is that you can use the file manager to browse a dbus filesystem. Of course you can then drag and drop a full method name (because it's just a url after all) to the shell to add some arguments and whack return.

Handling the stranger input/output argument pairing is something that will be delayed until I actually need to do it. Wallet/Code patches welcome as usual ;-p

So there is now one fewer thing on the list of things that libferris can not optionally mount.

Saturday, September 8, 2007

DBus for filesystem metadata... now working for some types.

And now the DBus magic is complete... getting metadata out of process is quite simple for all clients in an async manner. Clients are nicely isolated from metadata extractors deciding to segv and the API is non blocking... yes it is a little bit slower to do this over the wire (in fact over two wires bidirectionally because the broker actually talks to a worker to do the actual work). But with libferris and its automatic RDF caching the hit is only felt the first time you extract for many of the more expensive operations anyway. On a multicore machine it can actually be faster because you can have many workers attacking files for metadata at once.

I pass the URL back in the signals so that clients don't have to worry about mapping reqid -> url in their process. Such annoying things should be done by the broker for you at the slight cost of sending URLs across process boundaries in both directions.

<node name="/net/sf/witme/libferris/Metadata">
  <interface name="net.sf.witme.Libferris.Metadata.Broker">

    <method name="asyncGet">
      <arg type="s" name="earl" direction="in"/>
      <arg type="s" name="name" direction="in"/>
      <arg type="i" name="reqid" direction="out"/>
    </method>

    <signal name="asyncGetResult">
      <arg type="i" name="reqid" />
      <arg type="s" name="earl" />
      <arg type="s" name="name" />
      <arg type="ay" name="value" />
    </signal>

    <signal name="asyncGetFailed">
      <arg type="i" name="reqid" />
      <arg type="s" name="earl" />
      <arg type="i" name="eno" />
      <arg type="s" name="ename" />
      <arg type="s" name="edesc" />
    </signal>

  </interface>
</node>
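
To show why passing the URL back is handy, here is a small in-process mock, not real DBus code. MockBroker, flush(), ResultTable and the fake values are assumptions for illustration; only the asyncGet arguments and the asyncGetResult payload follow the interface above. The client never keeps a reqid -> url table because each signal already says which url and attribute it is about.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Payload of the asyncGetResult signal: reqid, earl, name, value ("ay").
struct AsyncGetResult {
    int reqid;
    std::string earl;
    std::string name;
    std::vector<std::uint8_t> value;
};

// In-process stand-in for the broker. A real client would subscribe to the
// signal over DBus; here a std::function plays the part of the signal.
class MockBroker {
public:
    std::function<void(const AsyncGetResult&)> onResult;

    // Returns a request id immediately; the value arrives later by signal.
    int asyncGet(const std::string& earl, const std::string& name) {
        const int reqid = ++lastId_;
        AsyncGetResult pending;
        pending.reqid = reqid;
        pending.earl = earl;
        pending.name = name;
        queue_.push_back(pending);
        return reqid;
    }

    // Pretend the workers have finished: emit one signal per queued request.
    void flush() {
        std::vector<AsyncGetResult> done;
        done.swap(queue_);
        for (auto& r : done) {
            const std::string fake = r.name + "@" + r.earl;  // made-up value
            r.value.assign(fake.begin(), fake.end());
            if (onResult) onResult(r);
        }
    }

private:
    int lastId_ = 0;
    std::vector<AsyncGetResult> queue_;
};

// The client stores results keyed by (url, attribute), with no reqid lookup.
using ResultTable = std::map<std::pair<std::string, std::string>, std::string>;
```

A GUI client would hang its update code off onResult, while a console client could simply loop until the result for its own request turns up.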

Monday, August 27, 2007

dbus desktop metadata API

Hi,
I've started a little light hacking allowing metadata extraction to work out-of-process in libferris. I really want metadata extraction to be able to scale to 4-8 threads of concurrent extraction and caching for use on the 4-8 core machines of today and tomorrow. Also doing metadata extraction out-of-process like this means that apps wanting metadata will not segv because a strange file is given that causes the metadata extraction path to segv. So folks will not blame libferris when libY.so that libferris uses to handle extracting data from Y.foo files has a segv causing bug in it.

I'm planning on using dbus for this at the moment. This should finally make code sharing for metadata extraction on the desktop somewhat easier. Hopefully I can just drop in libferris, strigi and other metadata extraction services and have dbus automatically pick them up and use them. This part isn't much of a gain to me personally because I tend to just add native support in libferris for metadata that I am interested in (or using other libs like strigi from ferris ;)

Below are my current design thoughts;

The plan is to have an object broker and many worker objects. The API should be async by default and clients can fairly easily block for a metadata reply if they are designed to work that way (like console apps). Other apps can just issue a bunch of metadata requests and update the GUI as the results come in.

I plan to have two APIs, one for quick get me the value of X from url Y and a bulk API for get it all.

The broker API might be something like this;
void registerClient( string callbackname )
long request( string url, string attribute )
long put( string url, string attribute, string value )

registerClient() tells the broker which object to call back with metadata, so the request() call will find the metadata and call the registered dbus object to tell it the value. The other option is to just use signals to reply to request() and put() from the broker.

The abstract api for workers is synchronous in nature and the broker handles managing many workers and remembering if any segv and under what conditions so it doesn't invoke those cases again. Of course the broker will have to use threading or async dispatch of dbus calls to be able to remain responsive while the workers are doing their, um, "work".

Single attribute API;
string get( string url, string attribute )
void put( string url, string attribute, string value )

Bulk API;
map getbulk( string url, stringset restriction )
void setbulk( string url, map values )
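
The single attribute worker API and the broker's memory of crashes could be sketched roughly like this. Everything here is an assumption for illustration: the Worker and Broker class names, the WorkerCrashed exception (standing in for a worker process that really died with a segv), and remembering crashes per (url, attribute) pair. There is no dbus and no threading in it, only the bookkeeping.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Synchronous worker API: get/put of one attribute on one url.
struct Worker {
    virtual ~Worker() = default;
    virtual std::string get(const std::string& url,
                            const std::string& attribute) = 0;
    virtual void put(const std::string& url, const std::string& attribute,
                     const std::string& value) = 0;
};

// Thrown in this sketch where a real worker process would have crashed.
struct WorkerCrashed : std::runtime_error {
    using std::runtime_error::runtime_error;
};

class Broker {
public:
    explicit Broker(Worker& worker) : worker_(worker) {}

    // Returns false, without touching the worker, when this exact request
    // is already known to have crashed one.
    bool request(const std::string& url, const std::string& attribute,
                 std::string& out) {
        const auto key = std::make_pair(url, attribute);
        if (crashed_.count(key)) return false;
        try {
            out = worker_.get(url, attribute);
            return true;
        } catch (const WorkerCrashed&) {
            crashed_.insert(key);      // remember, so it is not invoked again
            return false;
        }
    }

private:
    Worker& worker_;
    std::set<std::pair<std::string, std::string>> crashed_;
};
```

A real broker would notice the crash by the worker process going away rather than by an exception, and would want to persist the crash list so the lesson survives a restart.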

The values might well become byte arrays or something else more streamy. But overall the use of streams here doesn't gain much unless you add the complexity of a streambuf API over dbus to be able to do partial reads and seeks on metadata. Overall using strings should be dandy fine for value.length() <>

Saturday, April 7, 2007

Indexing and Searching and Triples, oh my!

Seems that the beagle and tracker guys are having a good time of late. Things like this make me wonder if the so called "win" in open source is achieved by loud blog posts. I still don't see a win per se; does it matter if oodles of folks use projectX or projectY? Apart from getting a system to do what you want it to, there is very rarely any other ROI from open source.

As an aside, I don't think that using the word triple scares any werewolves. Indexing can be done effectively with Lucene, PostgreSQL or redland, depends on what you want from your index.

The next libferris will also allow mounting web photo sites like flickr and now has f-spot metadata support. Yay, cp'ing images to 23hq via a filesystem and having the remote images tagged based on your local (indexable, searchable, extensible) tags.

Monday, April 2, 2007

Yay another waste of time that can be integrated into the planets.