Monday, May 31, 2010

redrum a distro: rm -rf /Fedora13

Some discussion recently came up as to what the outcome of a "rm -rf /" would be on a Linux machine. It's been ages since I last tested, and at the time there was no LVM in use, and /dev was actual nodes instead of a virtual filesystem. So, I was installing Fedora 13 64-bit to test some hardware out and couldn't resist "seeing what happens". Needless to say, it's not going to be a pleasant outcome, but perhaps an academically interesting one. If you don't know what the command does, don't execute it! And probably, don't even if you know what it does ;)

Also, I ran this from a gnome-terminal under a normal graphical session.

The filesystem layout was:


Filesystem                  1K-blocks    Used Available Use% Mounted on
/dev/mapper/vg_test-lv_root  51606140 3111784  45872916   7% /
tmpfs                          769704     548    769156   1% /dev/shm
/dev/sda1                      495844   28538    441706   7% /boot
/dev/mapper/vg_test-lv_home 560152184  203192 531494900   1% /home


The home partition still contained /home/whiteele/.gvfs and the root logical volume had var/lib/nfs/rpc_pipefs and empty boot, dev, home, proc, selinux, sys directories. But other than that everything else was eaten by the rm -rf. Oh yes, and you have to supply the --no-preserve-root option to assure the rm command that you know you are being silly.
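
For the curious, the kind of guard that --no-preserve-root switches off can be sketched in a few lines of Python. This is only an illustration and not the actual rm source; the function name may_remove and its parameter are made up for the example. It deletes nothing and only decides whether a recursive delete would be allowed to start.

```python
import os

def may_remove(path: str, no_preserve_root: bool = False) -> bool:
    """Return True if a recursive delete of `path` would be allowed to start.

    Hypothetical helper for illustration; it never deletes anything.
    """
    # Compare by identity (device and inode) rather than by string, so that
    # other spellings of the root directory such as "/.." are still caught.
    is_root = os.path.samefile(path, "/")
    return no_preserve_root or not is_root

print(may_remove("/"))                         # refused by default
print(may_remove("/", no_preserve_root=True))  # allowed: you asked for it
```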

Friday, May 28, 2010

Libferris and flickr, vimeo, facebook etc.

If you don't read Linux Format, issues 132 and 133 have some good material on how to mount web services as filesystems using libferris.

Friday, May 14, 2010

Plasma, libferris, and Google Spreadsheets

In my previous post I talked about how libferris now has a Plasma DataEngine which lets you get at the entire virtual filesystem (on demand) from plasmoids.

I've since extended things so that the DataEngine now includes a Service object, allowing plasmoids to write data too. Instead of tracking an XWindow, this time around I'll track a Google Spreadsheet and update it through the plasmoid. Doing this makes a mounted web service like GSpreads into a pastebin with access control applied. So you can share your clipboard with anyone in your "group" or company and stop others getting at it.

The two plasmoids show two spreadsheet cells: the top plasmoid shows the top coloured cell and the bottom plasmoid shows the lower, purple cell. The purple cell in the spreadsheet is the sum of the three cells in the column around the peachy cell above it.

First, updating the spreadsheet through the browser is reflected in the plasmoid. Towards the end I copy and paste a number into the top plasmoid, which updates the peach cell in the spreadsheet, which in turn causes the purple cell to update and so the lower plasmoid shows this updated sum. Of course, these two plasmoids don't have to be on the same machine. For tracking sums or other formulas, the ferris_graph might be more interesting if you are more interested in the trends than the current value.

plasma-google-spread.avi from Ben Martin on Vimeo.

Wednesday, May 12, 2010

Plasma & the libferris DataEngine

In a previous post I mentioned that libferris can now mount Plasma DataEngines. Of course, the opposite would have to follow; you can now access the entire virtual filesystem of libferris as a Plasma DataEngine. For those who are unfamiliar with libferris, it is my little virtual filesystem project which can mount xml, isam files, relational databases, flickr, vimeo, google spreadsheets, Firefox, XWindows, and shall we say one or two other things :)

To play around, I now have a DataEngine that ships with libferris itself (which exposes libferris to plasma), and a few custom plasmoids which basically ask the libferris data engine to "cat" files and metadata.

One of the things that libferris can mount is the XWindow system. This lets you see all your windows and their size and location. This data is exposed in xwin://localhost/window and the Extended Attributes (EA) x,y show the x and y position of each window. For example, the below command will show you the name and location of the window "foo" on your local X:
fls --show-ea=name,x,y xwin://localhost/window/foo

The libferris data engine makes "sources" on demand. You ask for a source by supplying the URL you want to read, the data engine makes the source for you and the content key contains the contents of that URL. If you want to get at metadata through the EA interface, just use @attribute in XML fashion.
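
To make the "sources on demand" idea concrete, here is a minimal Python sketch. It is not the libferris DataEngine code (that is C++ inside libferris); the names split_source and OnDemandSources, and the rule used to tell an @attribute suffix from a user@host part, are assumptions made for the example.

```python
from typing import Callable, Dict, Optional, Tuple

def split_source(source: str) -> Tuple[str, Optional[str]]:
    """Split 'url@attribute' into (url, attribute); attribute is None if absent."""
    url, sep, attr = source.rpartition("@")
    # Only treat the '@' as an attribute marker when it sits in the last
    # path component, so 'user@host' in the authority part is left alone.
    if sep and "/" not in attr:
        return url, attr
    return source, None

class OnDemandSources:
    """Create a source the first time it is asked for, then reuse it."""

    def __init__(self, read: Callable[[str, Optional[str]], str]) -> None:
        self._read = read  # hypothetical reader: (url, attribute) -> text
        self._sources: Dict[str, Dict[str, str]] = {}

    def source(self, name: str) -> Dict[str, str]:
        if name not in self._sources:
            url, attr = split_source(name)
            # The 'content' key holds whatever the URL (or its EA) contains.
            self._sources[name] = {"content": self._read(url, attr)}
        return self._sources[name]

calls = []

def fake_read(url: str, attr: Optional[str]) -> str:
    """Stand-in for a real filesystem read; records what was requested."""
    calls.append((url, attr))
    return "42"

engine = OnDemandSources(fake_read)
engine.source("xwin://localhost/window/foo@x")
engine.source("xwin://localhost/window/foo@x")  # second request reuses the source
print(calls)
```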

The two plasmoids I made are libferris_cat and libferris_graph. The former just shows you text of the URL you have configured, the latter allows up to four files to be read and graphed. Obviously the latter plasmoid is meant for numeric data.

So to see the location of a window with ferris_cat set the URL to:
xwin://localhost/window/foo@x

Which is what I've done in the below video. Notice that there are two plasmoids so I can track the X and Y coordinates of the window as I move it.

plasma-cat-window-position-encoded.avi from Ben Martin on Vimeo.



The same data is shown using ferris_graph below.

plasma-graph-window-position-encoded.avi from Ben Martin on Vimeo.



Libferris can also mount postgresql and other relational databases. For postgresql you can run SQL and execute database functions through the filesystem as well as interact with the base tables. Let's assume you have a simple database like the one shown in the below setup:

drop database if exists testplasma;
create database testplasma;
\c testplasma

create table folks ( name varchar, salary int, id serial );
create view stats as
select min(salary) as min, max(salary) as max, avg(salary) as avg
from folks;

insert into folks values ( 'Fred', 15 );
insert into folks values ( 'Mary', 17 );
insert into folks values ( 'Henry', 21 );

select * from stats;
min | max | avg
-----+-----+---------------------
15 | 21 | 17.6666666666666667
(1 row)

To get at this with libferris you might start by probing around the pg:// or postgresql:// URLs:

fls pg://localhost/testplasma
folks stats

fls -0 pg://localhost/testplasma/stats
15 21 17.6666666666666667 17.6666666666666667-21-15 avg-max-min
Adding the --xml switch shows output including:
avg="17.6666666666666667" max="21" min="15"
name="17.6666666666666667-21-15" primary-key="avg-max-min"

Because the view has no primary key, libferris has used the values of the whole tuple to form a unique file name. This is less than optimal for our needs when using the DataEngine because we want a stable filename. The solution is to leave out the name and just use "*" to have libferris expand it for us! So the below URL will have ferris_cat track the "min" value in the view:
pg://localhost/testplasma/stats/*@min
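
To see why the generated name is not stable, here is a small Python sketch that rebuilds the name and primary-key strings from the sample output above. It is a guess at the scheme based only on that output (column names sorted, values joined with "-") and not libferris's actual code; synthetic_name is a made-up helper, and the "after" row is invented for illustration.

```python
def synthetic_name(row: dict) -> tuple:
    """Return (primary_key, name) for a row that has no real primary key."""
    columns = sorted(row)  # assumption: columns are ordered by name
    key = "-".join(columns)
    name = "-".join(str(row[c]) for c in columns)
    return key, name

before = {"min": 15, "max": 21, "avg": "17.6666666666666667"}
print(synthetic_name(before))
# ('avg-max-min', '17.6666666666666667-21-15')

# Invented example: once someone with a salary of 11 is added, the
# aggregates change, and so does the file name built from them.
after = {"min": 11, "max": 21, "avg": "16.0000000000000000"}
print(synthetic_name(after))
# ('avg-max-min', '16.0000000000000000-21-11')
```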

The below video shows ferris_cat on the left viewing the min, the ferris_graph in the center shows min, max and average, and the avg is shown in the ferris_cat on the right. I add and remove a few folks from the table to see the effect on the plasmoids.

plasma-ferris-postgresql-encoded.avi from Ben Martin on Vimeo.



Of course, for a production PostgreSQL server you would use a scratch table and triggers to update it so that aggregates are not computed over mid to large sized tables all the time. Another advantage of triggers and a scratch table is you can easily handle rolling averages and delve into more advanced statistics while keeping the overhead known.
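
As a rough sketch of the scratch table idea, the Python below uses the standard sqlite3 module instead of PostgreSQL so that it runs anywhere; the trigger syntax differs in PostgreSQL, and the table name folks_stats and the helper average_salary are made up for the example. Triggers keep a running count and total, so reading the average never scans the folks table.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE folks (name TEXT, salary INTEGER);

    -- Scratch table: a single row holding running totals.
    CREATE TABLE folks_stats (n INTEGER NOT NULL, total INTEGER NOT NULL);
    INSERT INTO folks_stats VALUES (0, 0);

    CREATE TRIGGER folks_ins AFTER INSERT ON folks BEGIN
        UPDATE folks_stats SET n = n + 1, total = total + NEW.salary;
    END;
    CREATE TRIGGER folks_del AFTER DELETE ON folks BEGIN
        UPDATE folks_stats SET n = n - 1, total = total - OLD.salary;
    END;
""")

def average_salary(conn):
    """Read the average from the scratch table; None when the table is empty."""
    n, total = conn.execute("SELECT n, total FROM folks_stats").fetchone()
    return total / n if n else None

db.executemany(
    "INSERT INTO folks VALUES (?, ?)",
    [("Fred", 15), ("Mary", 17), ("Henry", 21)],
)
print(average_salary(db))   # average over Fred, Mary and Henry

db.execute("DELETE FROM folks WHERE name = 'Henry'")
print(average_salary(db))   # 16.0
```

Min and max are left out on purpose: after a delete they need a recompute (or a smarter structure), which is exactly the kind of known overhead you want to decide on yourself.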

In short, if you can ferrisls and fcat something interesting, you should be able to drop it onto your desktop and monitor it now too :) All I need now is to get plasma onto my n810 :/ I have a feeling I'll be playing with tracking facebook and online spreadsheets using plasma+ferris soon ;p

Saturday, May 8, 2010

Desktop Annotations...

OK, so a post about annotating files and desktop search. KDE guys might be interested because libferris uses soprano, the base of RDF on that desktop, maemo guys might be interested because all this works on that platform too, and I have a specialized index structure for n810 power level devices in libferris.

Tagging and Annotations are closely related. Tags (or emblems in libferris terms) are great for assigning one or more concepts to a file.
Annotations are great for adding some free text to something. While a short annotation might seem like one, two, or three tags, as in the example below, annotations also carry linguistic weight. Normally with tags you don't care about the order the tags are added to the file. With an annotation and a full text index you should be able to do proximity matching and ordered searching, so that "seed collection" matches but "collection seed" doesn't. There are also issues of human language stemming which many tag systems silently ignore but full text indexes tend to have to address.
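
The difference between the two can be shown with a toy Python sketch. This is not how the libferris full text indexes work internally; tags_match and phrase_match are hypothetical helpers, and stemming is ignored entirely.

```python
import re

def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def tags_match(file_tags, wanted):
    """Tags are a set: the order they were added in does not matter."""
    return set(wanted) <= set(file_tags)

def phrase_match(annotation, phrase, max_gap=0):
    """True if the words of `phrase` occur in order in `annotation`,
    with at most `max_gap` other words between neighbouring words."""
    words, wanted = tokens(annotation), tokens(phrase)

    def follows(pos, k):
        # wanted[k - 1] was matched at index pos; try to place wanted[k:].
        if k == len(wanted):
            return True
        window = range(pos + 1, min(pos + 2 + max_gap, len(words)))
        return any(words[i] == wanted[k] and follows(i, k + 1) for i in window)

    return bool(wanted) and any(
        w == wanted[0] and follows(i, 1) for i, w in enumerate(words)
    )

note = "my seed collection from 2009"
print(phrase_match(note, "seed collection"))   # True
print(phrase_match(note, "collection seed"))   # False
print(tags_match(["collection", "seed"], ["seed", "collection"]))  # True
```

Raising max_gap turns the ordered search into a proximity search: with max_gap=1, "seed collection" would also match an annotation containing "seed packet collection".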

Below is the ego file manager 0.30.0 with the Annotation side panel. This panel auto saves the annotation if you select a new file or if you stop changing it for a few seconds. Hotkeys make this all quite handy. I'm using Control-t to start interactive tagging and Control-6 to switch to the annotation sidepanel with focus in the text block there. Hitting tab in the annotation sidepanel moves focus to the file list, so you can skip to and from annotating each file without the rodent.


You can of course add, view, and edit these same annotations from the command line. The fedit command runs vi on the annotation, allowing you to freely change it.
I have just fixed a slight inconsistency in the fedit command so it now accepts the "-a attribute" option too. The fcat command takes the same "-a attribute" option to show the annotation from the file.

$ >| tfile
$ fedit -a annotation tfile
...
$ fcat -a annotation tfile
hi, the new annotation

$ feaindexquery '(annotation=~new)'
Found 1 matches at the following locations:
file:///tmp/tfile

Another nifty trick is to see the annotation right inside an fls output. Use mtime-display if you want the time to be more human readable than an epoch time_t.

$ fls --show-ea name,size,mtime,annotation tfile
tfile 0 1265954823 hi, the new annotation
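
If you end up with a raw epoch value in a script anyway, converting it yourself is a one-liner. The Python below is just a sketch of that conversion and has nothing to do with how mtime-display is implemented; human_mtime is a made-up name, and it prints UTC rather than local time.

```python
from datetime import datetime, timezone

def human_mtime(epoch: int) -> str:
    """Render an epoch time_t as a readable UTC timestamp."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).isoformat(sep=" ")

print(human_mtime(1265954823))
# 2010-02-12 06:07:03+00:00
```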

Remember also that fls --xml gives you XML output, so with an XSLT stylesheet you can serve a directory of files and their annotation through your web server.
If this is of interest, see apps/phpsearchinterface/xml-results-to-xhtml.xsl for an initial stylesheet with row colour striping. It should be easy to extend the stylesheet to present other attributes. Bonus marks for anyone who makes it handle arbitrary XML attributes and orders them according to a predefined POSET. Patches always welcome...

When libferris saves an annotation, if DBus is enabled a signal is emitted on the session bus: "org.libferris.desktop", "AnnotationSaved"
which carries the URL and Path for the file you changed. This allows not only reindexing to happen, but you are free to hook up some Perl or whatnot to monitor this signal, then you can actually run some logic to work out what to do. If an annotation is saved, you might like to update an RSS feed for example.
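
As a sketch of the "update an RSS feed" idea, the Python below shows what a handler might do once it has the URL and Path from the signal. The DBus wiring is deliberately left out, since that depends on which binding you use; on_annotation_saved and the feed layout are assumptions for illustration, not part of libferris.

```python
import xml.etree.ElementTree as ET

def on_annotation_saved(feed: ET.Element, url: str, path: str) -> ET.Element:
    """Append an <item> for the changed file to an in-memory RSS tree.

    Hypothetical handler: call it from whatever receives the
    AnnotationSaved signal, passing along the URL and Path it carries.
    """
    channel = feed.find("channel")
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = "Annotation saved: " + path
    ET.SubElement(item, "link").text = url
    return item

feed = ET.fromstring(
    '<rss version="2.0"><channel><title>Annotations</title></channel></rss>'
)
on_annotation_saved(feed, "file:///tmp/tfile", "/tmp/tfile")
print(ET.tostring(feed, encoding="unicode"))
```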

And so ends the libferris tip of the day... happy annotating!
This post has been fueled by 99% cocoa, thanks to Jan-Piet Mens ;-)

Thursday, May 6, 2010

Plasma: Tree Shaped Eyes

After digging into KDE4's plasma a little bit it smelt a bit like an old friend. In Plasma you have one or more data engines, and each engine can have many sources. Each source offers a list of key-value pairs, which are updated at a nominated interval.

Normally folks think of filesystems as directories and files. But these days, you have to consider the Extended Attribute (EA) interface that filesystems offer as well. This makes a filesystem much closer to a large XML repository than just a collection of files accessible through a tree namespace (the directories). In libferris, each EA key-value pair can also tell the developer/user what schema that value has. This closes the gap between what a filesystem is and what a plasma data engine is just a little bit more. In fact, one might think of a plasma data engine as a virtual filesystem with a touch of extra stuff to allow a plasmoid to poll the data engine easily. This is not to detract from plasma at all; saying it's "just a filesystem" means it is like postgresql, xml, or emacs to me ;)

For example, to see some current weather using my ABOMiNation plasma data engine and libferris (dev trunk), I can see the wind and also what type of value that EA or data engine key-value is:


$ fls -l \
--show-ea=name,air-temp,wind-speed,wind-gust,schema:wind-gust \
plasma://abomination_observations/nnn1

nnn1 22 15 26 schema://xsd/attributes/decimal/integer/long/int


One major upshot of looking at plasma in this way is the usual upshot of everything being a filesystem: I can "cat" values directly from a data engine and also use fls to inspect data engines and their values from the command line while developing. plasmaengineexplorer is very nice, but it's a bit of a pain to use a GUI tool to test whether the data engine is working when you are in a compile, run, test cycle. It is also really easy to pluck out data from plasma with libferris, for example, the above fls with a --xml on the command line will do what you imagine. And if you are a nepomuk fan, using fls --rdf will give you an RDF/XML file to enjoy.

I found a few of the data engines would crash if they are started with a QApplication that forces GUI to off. So I blacklist
s == "tasks" || s == "mouse" || s == "keystate"
in order to mount plasma at the moment. Also, I'm using the signal/slot callbacks to get at the source key-value hash because the immediate mode methods don't seem to want to work for me :/