Wednesday, June 6, 2012

An ARMs race, with a Core i7 in there too for relativity

After doing some power benchmarking recently (a 1.2 GHz Kirkwood with gigabit networking up draws 5 watts), I decided to work out how fast these ARMs can do useful work^TM. In the running are a Synology DS212j, a DreamPlug running the FreedomBox, a Nokia N9 mobile phone, and an Intel 2600K, included just to show where those numbers sit relative to a desktop machine.



The above image shows the cipher performance reported by "openssl speed" across the machines. The 2600K figures are single-threaded only, so it could be many times faster in real-world use by taking advantage of all its cores. One interesting point right off the bat is that the 1.2 GHz Kirkwood in the Synology NAS is bested by the 1.0 GHz CPU of the Nokia N9. ARMs is not ARMs.
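For anyone wanting to graph their own runs, here is a minimal Python sketch for pulling the numbers out of the summary table. It assumes the summary lines look like an algorithm name followed by throughput columns with a `k` suffix (thousands of bytes per second); the function name `parse_speed_line` and the sample line in the usage note are made up for illustration and are not measurements from these machines.

```python
import re

# Matches throughput columns such as "61234.56k" (thousands of bytes per second).
_RATE = re.compile(r"(\d+(?:\.\d+)?)k\b")


def parse_speed_line(line):
    """Split one summary line into (algorithm name, list of rates in kB/s).

    Returns None for lines that carry no k-suffixed rate, e.g. the header row.
    """
    first = _RATE.search(line)
    if first is None:
        return None
    # Everything before the first rate column is treated as the algorithm name.
    name = line[:first.start()].strip()
    rates = [float(m.group(1)) for m in _RATE.finditer(line)]
    return name, rates
```

Feeding it a made-up line such as `"aes-128 cbc   1000.00k   2000.50k"` gives back the name and one rate per block size, ready to plot per machine.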

Removing the Intel i7 2600K, which swamps the scale, from the graph, we see that the DreamPlug is very close to the DS212j in terms of cipher performance.
On the other hand, the digests show a distinct advantage to the DreamPlug setup. Again the N9 has a nice little lead on the others. Since a mobile phone can perform some useful work, one should perhaps demand that a NAS also offer handy features beyond just serving data blocks.


The RSA sign and verify graphs both show the same large slump for the DS212j unit. So for connection-heavy workloads there would seem to be a large difference in the throughput you might get depending on the ARM you choose. On the other hand, the DreamPlug and DS212j both have similar performance on stream ciphers, so if connections are longer lived the difference will be smaller.

I would love to add benchmarks for the CuBox and the QNAP 2 GHz (TS-219PII) NAS units. It would also be interesting to run aftermarket software on the DS212j and see the difference.

7 comments:

punit said...

I would be interested in seeing these numbers for the CuBox. The ARM based devices you've tested are based on an older version of the ARM ISA.

The CuBox uses v7 so you might even be able to try some of the hard float ports of Linux which should give a nice performance boost compared to the older designs.

Riku Voipio said...

Hi,

Here's cubox numbers:

http://kos.to/cubox-ssl.txt

Running cubox 2.6.32 kernel, openssl from ubuntu precise armhf. Hope the lack of openssl config doesn't mess the performance of the test

And here pandaboard (OMAP4430)

http://nchippin.kos.to/pandaboard-ssl.txt

Running linaro 12.05 image (3.3 kernel).

While you are benchmarking, how about speed/watt graphs? The PandaBoard is also around 5 W, while the CuBox is advertised at 3 W. It would be interesting to see how they compare to the i7 in speed/watt.

I think the AES code in openssl might be missing some important optimization in my tests.

Riku Voipio said...

Indeed, the Debian/Ubuntu packages were compiled without ARM assembler support. New numbers at:

http://kos.to/pandaboard-ssl-new.txt

monkeyiq said...

A problem when trying to get watts/op for the i7 is how the PSU is put together and the rest of the system. Ideally, a no card system with a microPSU should give a low watt/op ratio at a reasonable cost.

Thanks for the numbers, I have OLPC and Xoom now too, so some follow up graphs will be coming at some stage ;)

Kevin Kofler said...

And as expected, the x86 machine completely blows away all those ARMs.

monkeyiq said...

Although, as mentioned above, if the graph was more on ops/watt it would be interesting to see how the arms do against the x86.
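A rough sketch of that normalisation in Python, in case it helps anyone redo the graphs. The helper names `per_watt` and `rank_by_efficiency` are hypothetical, and it assumes you have one throughput figure and one measured wall-power figure per machine; nothing here comes from the benchmark runs themselves.

```python
def per_watt(throughput, watts):
    """Return throughput divided by power draw (e.g. kB/s per watt)."""
    if watts <= 0:
        raise ValueError("power draw must be positive")
    return throughput / watts


def rank_by_efficiency(devices):
    """Sort device names from most to least efficient.

    `devices` maps a name to a (throughput, watts) pair. Both values must be
    measured the same way on every machine for the ranking to mean anything.
    """
    return sorted(devices, key=lambda name: per_watt(*devices[name]), reverse=True)
```

With placeholder figures (not measurements) such as `{"arm_board": (50000.0, 5.0), "desktop": (600000.0, 100.0)}`, the small board ranks first despite its much lower raw throughput, which is exactly the effect an ops/watt graph would be looking for.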

A.T. said...

@Kevin Kofler: for an x86 perspective, normalise speed by power consumed; that gives a better picture for the target market.