Sunday, 17 June 2012

Raspberry Pi forever - getting the SD card(s) to work along with some numbers and graphs

I have finally got my hands on the awesome Raspberry Pi board, with a vicious plan of running a hardened Gentoo on it, of course ;] But before that could happen, I had to get a decent SD card for it, which turned out to be less obvious than expected. There's a wiki page with a list of SD cards that should and shouldn't work with your Raspberry. There's also a discussion thread on the Raspberry Pi forum about the performance of various cards, which is vital to the overall performance of the system. I took an SD card from my camera - a 16GB SanDisk Extreme SD card, which is a Class 10 card and should do "up to 45MB/s". I also decided to buy a 16GB micro SDHC SanDisk card with adapter (the "Ultra" card in the results further down). SanDisk claims this card can do "up to 30MB/s", and it is marked as a '200x' class 6 card. From what I could tell, at least some people were able to get it to work with the Pi and got decent performance out of it.

So I had the following candidates:
  • 16GB SanDisk Extreme SD card, class 10, 45 MB/s
  • 16GB micro SDHC SanDisk card with adapter (the "Ultra"), class 6, 30 MB/s
For an easy start, I grabbed the Debian image from the Raspberry Pi site, put it on the cards and booted the Pi. Neither card would boot the system. Ooops! My hope was that updating the kernel would make a difference... Fortunately, updating the kernel image on the Pi is easy: you just need to grab the kernel files from here. All I did was replace kernel.img, kernel_emergency.img and start.elf on the first partition of the card with the files from the /boot folder of the firmware repository. The next step was to update the modules folder, which lives in the /lib/ folder on the second partition of the card, with the /modules folder from the firmware repository. Voila! With the new kernel, both cards booted the Debian image successfully!

root@raspberrypi:~# uname -a
Linux raspberrypi 3.1.9+ #122 PREEMPT Sun Jun 17 00:30:41 BST 2012 armv6l GNU/Linux
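
For reference, the file shuffling described above boils down to a handful of copies. Here is a minimal sketch as a shell function, assuming a local checkout of the firmware repository and both partitions of the card already mounted. The function name and its three directory arguments are made up for illustration; they are not part of any Pi tooling.

```shell
# Hypothetical helper: copy kernel files and modules from a firmware
# checkout onto a mounted SD card.
#   $1 = firmware checkout (assumed to contain boot/ and modules/)
#   $2 = mount point of the card's first (boot) partition
#   $3 = mount point of the card's second (root) partition
update_pi_firmware() {
    fw=$1
    boot=$2
    root=$3

    # Replace the three boot files named in the text.
    for f in kernel.img kernel_emergency.img start.elf; do
        cp "$fw/boot/$f" "$boot/$f" || return 1
    done

    # Copy the contents of modules/ into /lib/modules on the root partition.
    mkdir -p "$root/lib/modules" || return 1
    cp -R "$fw/modules/." "$root/lib/modules/" || return 1
}
```

With the card mounted at, say, /mnt/boot and /mnt/root (hypothetical mount points), this would be called as `update_pi_firmware ~/firmware /mnt/boot /mnt/root`.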

Now I was curious how well each of the cards could perform. I ran the CrystalMark tool on both of them, but the results were fairly inconclusive - both cards scored pretty much the same, with random write speeds fluctuating between 1.0 and 1.3 MB/s. Not a bad result anyway! Regardless, I don't like running Windows...;) and I wanted to run something on the Pi itself, as I think that gives results that are easier to compare between users, because everyone tests on the very same hardware - the Pi itself :) So hdparm went first, simply run as

hdparm -t /dev/mmcblk0

...and I got the following results for the Extreme 45 MB/s card:

root@raspberrypi:~# hdparm -t /dev/mmcblk0
/dev/mmcblk0:
 Timing buffered disk reads: 60 MB in 3.08 seconds = 19.47 MB/sec

...and the Ultra 30 MB/s card scored:

root@raspberrypi:/home/pi# hdparm -t /dev/mmcblk0
/dev/mmcblk0:
 Timing buffered disk reads: 60 MB in 3.08 seconds = 19.48 MB/sec

Again, pretty much the same results - a taste of things to come. Let's see...

I was interested in how they would both perform with random reads and writes, tested in a similar manner to what the CrystalMark tool does. I found that there's a great Linux tool that can achieve this - fio. Fio is a very versatile tool and provides a lot of testing options. It can also log results to a file, which can then be plotted with gnuplot using a script that is also provided with fio. I created three simple cases:
  • random read
  • random write
  • random read/write
Each one of them was defined as follows, in three separate files, as otherwise fio would run all three jobs at the same time (not something I wanted, but an interesting feature for creating more complex tests):

[random-read]
rw=randread
size=100m
directory=/tmp/fio-testing/data
write_bw_log=bandwidth-read
write_lat_log=latency-read

[random-write]
rw=randwrite
size=100m
directory=/tmp/fio-testing/data
write_bw_log=bandwidth-write
write_lat_log=latency-write

[random-read-write]
rw=randrw
size=100m
directory=/tmp/fio-testing/data
write_bw_log=bandwidth-read-write
write_lat_log=latency-read-write
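
As an alternative to three separate files, fio's stonewall option makes a job wait until the jobs before it have finished, so all three cases can live in one job file. The sketch below is not what was used for the results in this post: the function name is made up, the shared options have been moved into a [global] section, and bs=4k only spells out what I understand to be fio's default block size, so treat that line as an assumption.

```shell
# Hypothetical helper: write a single fio job file containing all three
# cases to the path given as $1. 'stonewall' serialises the jobs.
write_combined_job() {
    cat > "$1" <<'EOF'
[global]
size=100m
directory=/tmp/fio-testing/data
bs=4k

[random-read]
rw=randread
write_bw_log=bandwidth-read
write_lat_log=latency-read

[random-write]
stonewall
rw=randwrite
write_bw_log=bandwidth-write
write_lat_log=latency-write

[random-read-write]
stonewall
rw=randrw
write_bw_log=bandwidth-read-write
write_lat_log=latency-read-write
EOF
}
```

Something like `write_combined_job all-tests.fio` followed by `fio all-tests.fio` would then run the three cases one after another (the file name is arbitrary). Changing bs is also the way to try other block sizes.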


All tests were done over ssh; other than the login shell, the system was idle.
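
Repeating a test a number of times and keeping only the summary lines is easy to script. This is a small sketch of one way to do it, not the exact procedure used here: the function name is made up, and it simply filters for the READ:/WRITE: summary lines that fio prints at the end of a run.

```shell
# Hypothetical helper: run a command N times and print only the
# READ:/WRITE: summary lines from its output.
#   $1 = number of runs, remaining arguments = the command to run
repeat_runs() {
    runs=$1
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" | grep -E '^[[:space:]]*(READ|WRITE):'
        i=$((i + 1))
    done
}
```

For example, `repeat_runs 10 fio random-read.fio` would collect ten summary lines, where random-read.fio stands for whatever the job file is called.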

Results below. First the Extreme card, random read, the 2 best and 2 worst results out of approximately 10 runs:

READ: io=102400KB, aggrb=3370KB/s, minb=3451KB/s, maxb=3451KB/s, mint=30377msec, maxt=30377msec
READ: io=102400KB, aggrb=2881KB/s, minb=2951KB/s, maxb=2951KB/s, mint=35531msec, maxt=35531msec
READ: io=102400KB, aggrb=2783KB/s, minb=2850KB/s, maxb=2850KB/s, mint=36789msec, maxt=36789msec
READ: io=102400KB, aggrb=2766KB/s, minb=2832KB/s, maxb=2832KB/s, mint=37018msec, maxt=37018msec


...and the Ultra card:

READ: io=102400KB, aggrb=3355KB/s, minb=3436KB/s, maxb=3436KB/s, mint=30513msec, maxt=30513msec
READ: io=102400KB, aggrb=3337KB/s, minb=3417KB/s, maxb=3417KB/s, mint=30682msec, maxt=30682msec
READ: io=102400KB, aggrb=3218KB/s, minb=3295KB/s, maxb=3295KB/s, mint=31814msec, maxt=31814msec
READ: io=102400KB, aggrb=3210KB/s, minb=3287KB/s, maxb=3287KB/s, mint=31891msec, maxt=31891msec


More or less the same results, though the Ultra card actually seems more consistent than the Extreme card...
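
To compare runs by more than eyeballing, the aggrb values can be pulled out of the summary lines and reduced to a minimum, maximum and mean. A rough sketch with awk follows; the function name is made up, and it assumes the output format shown in this post, with values reported in KB/s.

```shell
# Hypothetical helper: read fio summary lines on stdin and print count,
# min, max and mean of the aggrb= values (KB/s) for one direction.
#   $1 = first field to match, i.e. "READ:" or "WRITE:"
summarize_aggrb() {
    awk -v dir="$1" '
        $1 == dir {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^aggrb=/) {
                    v = $i
                    sub(/^aggrb=/, "", v)    # drop the key
                    sub(/KB\/s,?$/, "", v)   # drop the unit and comma
                    v += 0                   # force numeric
                    if (n == 0 || v < min) min = v
                    if (n == 0 || v > max) max = v
                    sum += v
                    n++
                }
            }
        }
        END {
            if (n) printf "n=%d min=%d max=%d mean=%.1f\n", n, min, max, sum / n
        }
    '
}
```

Fed the four Extreme random-read lines listed above, `summarize_aggrb READ:` prints `n=4 min=2766 max=3370 mean=2950.0`; the four Ultra lines give `n=4 min=3210 max=3355 mean=3280.0`. Keep in mind these are only the 2 best and 2 worst runs, not the full set.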

Now the random write results below, 2 best and 2 worst results out of approximately 10 runs, the Extreme card:

WRITE: io=102400KB, aggrb=1192KB/s, minb=1221KB/s, maxb=1221KB/s, mint=85842msec, maxt=85842msec
WRITE: io=102400KB, aggrb=1181KB/s, minb=1209KB/s, maxb=1209KB/s, mint=86696msec, maxt=86696msec
WRITE: io=102400KB, aggrb=1111KB/s, minb=1138KB/s, maxb=1138KB/s, mint=92104msec, maxt=92104msec
WRITE: io=102400KB, aggrb=956KB/s, minb=979KB/s, maxb=979KB/s, mint=107096msec, maxt=107096msec


...and the Ultra card:

WRITE: io=102400KB, aggrb=1244KB/s, minb=1274KB/s, maxb=1274KB/s, mint=82269msec, maxt=82269msec
WRITE: io=102400KB, aggrb=1221KB/s, minb=1250KB/s, maxb=1250KB/s, mint=83851msec, maxt=83851msec
WRITE: io=102400KB, aggrb=1027KB/s, minb=1051KB/s, maxb=1051KB/s, mint=99697msec, maxt=99697msec
WRITE: io=102400KB, aggrb=645KB/s, minb=660KB/s, maxb=660KB/s, mint=158708msec, maxt=158708msec


Interesting...apart from the one particularly slow run, the Ultra card is as quick as the Extreme card! And the best result for the Ultra is better than for the Extreme!
And potentially the most interesting one: random read and write results, again the 2 best and 2 worst out of approximately 10 runs. First the Extreme card:

READ: io=51484KB, aggrb=1059KB/s, minb=1084KB/s, maxb=1084KB/s, mint=48600msec, maxt=48600msec
WRITE: io=50916KB, aggrb=1047KB/s, minb=1072KB/s, maxb=1072KB/s, mint=48600msec, maxt=48600msec

READ: io=51660KB, aggrb=812KB/s, minb=831KB/s, maxb=831KB/s, mint=63605msec, maxt=63605msec
WRITE: io=50740KB, aggrb=797KB/s, minb=816KB/s, maxb=816KB/s, mint=63605msec, maxt=63605msec

READ: io=50528KB, aggrb=748KB/s, minb=766KB/s, maxb=766KB/s, mint=67502msec, maxt=67502msec
WRITE: io=51872KB, aggrb=768KB/s, minb=786KB/s, maxb=786KB/s, mint=67502msec, maxt=67502msec

READ: io=50456KB, aggrb=723KB/s, minb=740KB/s, maxb=740KB/s, mint=69733msec, maxt=69733msec
WRITE: io=51944KB, aggrb=744KB/s, minb=762KB/s, maxb=762KB/s, mint=69733msec, maxt=69733msec



...and the Ultra card:

READ: io=51028KB, aggrb=894KB/s, minb=915KB/s, maxb=915KB/s, mint=57071msec, maxt=57071msec
WRITE: io=51372KB, aggrb=900KB/s, minb=921KB/s, maxb=921KB/s, mint=57071msec, maxt=57071msec

READ: io=50664KB, aggrb=897KB/s, minb=918KB/s, maxb=918KB/s, mint=56471msec, maxt=56471msec
WRITE: io=51736KB, aggrb=916KB/s, minb=938KB/s, maxb=938KB/s, mint=56471msec, maxt=56471msec

READ: io=51112KB, aggrb=777KB/s, minb=795KB/s, maxb=795KB/s, mint=65773msec, maxt=65773msec
WRITE: io=51288KB, aggrb=779KB/s, minb=798KB/s, maxb=798KB/s, mint=65773msec, maxt=65773msec

READ: io=51308KB, aggrb=741KB/s, minb=759KB/s, maxb=759KB/s, mint=69182msec, maxt=69182msec
WRITE: io=51092KB, aggrb=738KB/s, minb=756KB/s, maxb=756KB/s, mint=69182msec, maxt=69182msec


As I mentioned, fio comes with the fio_generate_plots script, which lets you plot the collected data. Below are the results for the Extreme card.

Extreme - Random read bandwidth


Extreme - Random read latency

That spike in latency and resulting downward spike in bandwidth is interesting...

Extreme - Random write bandwidth


Extreme - Random write latency


Extreme - Random read/write bandwidth


Extreme - Random read/write latency

...and we can also combine them on one graph:
Extreme - bandwidth - combined


Extreme - latency - combined



And now the graphs for the Ultra card:

Ultra - Random read bandwidth


Ultra - Random read latency

Interesting...the very same spike as for the Extreme card!

Ultra - Random write bandwidth


Ultra - Random write latency


Ultra - Random read/write bandwidth


Ultra - Random read/write latency

...and when combined on one graph:

Ultra - bandwidth - combined


Ultra - latency - combined



Now, using the very same script, I have combined the respective graphs for both cards. Please bear in mind that these graphs do not contain the best results for each card, so they should not be treated as a definitive point of reference. What you can see, though, is that the cards share very similar patterns, even if the actual values are slightly different (again, these are from random runs, rather than the best ones)...

Combined results - Random read bandwidth


Combined results - Random read latency


Combined results - Random write bandwidth


Combined results - Random write latency


Combined results - Random read and write bandwidth


Combined results - Random read and write latency



Conclusion

Hmm...it seems that the Extreme card does not perform much better, if at all, than the Ultra card. The Extreme's fastest result in the random read/write test (1059 KB/s read, 1047 KB/s write) was noticeably better than the Ultra's best (897 KB/s read, 916 KB/s write); however, on average the two performed more or less the same... Whether this testing was enough to give any conclusive results is a different matter...;)

The choice is yours...!

Next step - Gentoo on Pi, but for now...Enjoy the Pi! ;]

6 comments:

  1. What was the block size used for the random reads/writes? This really determines the performance.

    Could you do me a favor and try out iozone running on the Pi? This lets you specify block size, so you can mimic CrystalDiskMark's 4K block size random read/write and 512K block size random read/write. The command to measure this would be the following:

    iozone -e -I -a -s 50M -r 4k -r 512k -i 0 -i 1 -i 2

    iozone is installable via the iozone3 package in Debian.

  2. Cheers for the info!

  3. This comment has been removed by a blog administrator.

  4. For anyone still seeing this in 2013 (this post ranks high on Google...) The SD is connected via USB 2. That's your bottleneck!

    Replies
    1. Completely false. The SD card is connected to the eMMC module directly. The bottleneck is the transfer mode, which currently tops out at 50 MHz on four data lines. 50 MHz * 4 lines / (8 bits/byte) = a current theoretical maximum throughput of 25 MB/s on the Pi. Then there's overhead. I believe that the eMMC controller can go faster, but it's not implemented in the kernel.

  5. Hi,

    @runeks: if you want to do it in fio you can use the bs parameter for modifying the blocksize

    @radegand: you could also write your jobs in one file and separate them with the stonewall keyword. Then the current job will wait for the previous job to finish.

    Enjoyed reading your post but hoped for some gentoo stuff with the pi. Will read the other posts - perhaps you mentioned it there ;)

