Category Archives: Stuff


SeekMark Update


I just updated SeekMark to include a write seek test. I was initially reluctant to do this, because nobody would ever want to screw up their filesystem by performing a random write test to the disk it resides on, right?? Of course not, but occasionally you need to benchmark a disk, for the sake of benchmarking, and aren’t worried about the data. And of course, I didn’t care about that functionality until I needed it myself!

So here we have version 0.8 of SeekMark, which adds the following features:

  • write test via “-w” flag, with a required argument of “destroy-data”
  • allows specification of the I/O size via the “-i” flag, from 1 byte to 1048576 bytes (1 megabyte). The intended purpose of the benchmark (testing max iops and latency) is still best fulfilled by the default I/O size of 512 bytes, but changing it can be useful in certain situations.
  • added “-q” flag per suggestions, which skips per-thread reporting and limits output to the result totals and any errors that arise
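Putting the new flags together, a write-test invocation might look something like this (the device path is a placeholder; `-t`, `-f`, and `-s` work as in the read examples below):

```shell
# DANGER: the write test overwrites random blocks on the target.
# "-w destroy-data" is the required safety argument; -i sets the I/O size
# in bytes; -q quiets per-thread output.
# /dev/sdb is a placeholder -- point it at a disk you don't care about.
./seekmark -w destroy-data -t4 -f/dev/sdb -s1000 -i4096 -q
```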

Now head on over to the SeekMark page and get it!




I just added a new page for SeekMark, a little program I put together recently to test the number of random accesses per second to disk. It’s threaded and will handle RAID arrays well, depending on the number of threads you select. I’m fairly excited about how this turned out; it helped me prove someone wrong about whether or not a particular RAID card did split seeks on RAID1 arrays. The page is here, or linked at the top of my blog, for future reference. I’d appreciate hearing results/feedback if anyone out there gives it a try.

Here are some of my own results, comparing a 5-disk Linux md RAID10 array against one of its underlying disks. I’ll also show the difference that threading the app makes:

single disk, one thread:

  [root@server mlsorensen]# ./seekmark -t1 -f/dev/sda4 -s1000
  Spawning worker 0
  thread 0 completed, time: 13.46, 74.27 seeks/sec

  total time: 13.46, time per request(ms): 13.465
  74.27 total seeks per sec, 74.27 seeks per sec per thread

single disk, two threads:

  [root@server mlsorensen]# ./seekmark -t2 -f/dev/sda4 -s1000
  Spawning worker 0
  Spawning worker 1
  thread 0 completed, time: 27.29, 36.64 seeks/sec
  thread 1 completed, time: 27.30, 36.63 seeks/sec

  total time: 27.30, time per request(ms): 13.650
  73.26 total seeks per sec, 36.63 seeks per sec per thread

Notice we get pretty much the same result, about 74 seeks/sec total.
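The totals are just arithmetic: each thread performs the number of seeks given by -s, so two threads at 1000 seeks each over a 27.30-second run works out to:

```shell
# 2 threads x 1000 seeks each, 27.30 seconds total wall time
awk 'BEGIN { printf "%.2f seeks/sec\n", (2 * 1000) / 27.30 }'
```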

5-disk md RAID10 on top of the above disk, one thread:

  [root@server mlsorensen]# ./seekmark -t1 -f/dev/md3 -s1000
  Spawning worker 0
  thread 0 completed, time: 13.09, 76.41 seeks/sec

  total time: 13.09, time per request(ms): 13.087
  76.41 total seeks per sec, 76.41 seeks per sec per thread

Still pretty much the same thing. That’s because we’re reading one small thing and waiting for the data before continuing. Our test is blocked on a single spindle!

four threads:

  [root@server mlsorensen]# ./seekmark -t4 -f/dev/md3 -s1000
  Spawning worker 0
  Spawning worker 1
  Spawning worker 2
  Spawning worker 3
  thread 1 completed, time: 15.02, 66.57 seeks/sec
  thread 2 completed, time: 15.46, 64.69 seeks/sec
  thread 3 completed, time: 15.57, 64.24 seeks/sec
  thread 0 completed, time: 15.69, 63.74 seeks/sec

  total time: 15.69, time per request(ms): 3.922
  254.96 total seeks per sec, 63.74 seeks per sec per thread

Ah, there we go. 254 seeks per second. Now we’re putting our spindles to work!




I’ve just added a page for FractMark, a simple multi-threaded fractal-based benchmark. Read more about it (and download it) here.

On a side note, some of you may be familiar with a similarly simple I/O benchmark called PostMark. It was written under contract for Network Appliance and is known as an easy, portable, random I/O generator. At this point the source code has been pretty much abandoned as far as I can tell, so I’ve picked it up and have begun adding some bugfixes as well as some enhancements. The primary things I’ve done so far are to add an option for synchronous writes in Linux, as well as threaded transactions, which should give people the flexibility to test scenarios where many processes are creating random I/O.

If this interests you, I’ll be posting the source code and patches soon!


VM I/O benchmarks


OK, I’ve been sitting on these for a few days and wanted to get them out there. I’ve got an old server configured with 4x750GB Western Digital Black SATA drives on an LSI 2008 controller in RAID10 with the default 64k stripe size. It’s a 4-core Xeon 5200 series, I believe, with 24GB of RAM. The OS for every test was CentOS 5.4 with default virtual memory/sysctl configurations and minimal packages.

The VMs were built with 300GB virtual disks and 4GB of memory. The KVM guests had disk drivers and cache settings as indicated, and were on ext3 with a QCOW2 image (options were 1MB cluster size, preallocated metadata). The ESX guest had the pvscsi driver enabled, and a 2MB cluster size was used on the filesystem due to the file size limitations.
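For reference, an image with those options can be created with qemu-img along these lines (the filename is made up; cluster_size and preallocation are the qcow2 options in question):

```shell
# 300GB qcow2 image, 1MB clusters, metadata preallocated
qemu-img create -f qcow2 -o cluster_size=1M,preallocation=metadata guest.qcow2 300G
```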

First off, postmark. For those who don’t know, postmark is a simple yet decent utility designed to give an idea of small I/O workloads. It’s tunable, but geared toward web/mail server type loads. It creates a ton of small files, then does random operations on them, and spits out the results.

Here’s the config:

  set buffering false
  set number 100000
  set transactions 50000
  set size 512 65536
  set read 4096
  set write 4096
  set subdirectories 5
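Postmark takes a file of these commands as its argument, so with the above saved as, say, pmark.cfg (my filename), a run looks like this. If memory serves, the file also needs a `run` line at the end to actually kick off the benchmark:

```shell
# execute the config above (saved as pmark.cfg, with "run" appended)
./postmark pmark.cfg
```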

The primary things I’m interested in here are the KVM virtio performance (specifically writethrough and nocache) compared to ESX 4.1 and the native host disks. The IDE driver and writeback tests were done just to see what would happen; they’re not what I’d prefer to use in production. The KVM virtio driver is nearly on par with native speed when it comes to reads and writes. It falls behind a bit on the actual operations per second, but it must have made up the time somewhere, since the benchmark overall only took a hair longer. (If this confuses you: the benchmark goes through several stages, creating dirs, creating files, performing transactions, deleting files, and deleting dirs, and only the transactions stage counts toward the ops/sec number.) The ESX guest didn’t do as well with the pvscsi drivers. In fact, this benchmark alone would be enough to put to rest any of my concerns about virtio performance when choosing KVM over the tried-and-true ESX.

Next up, iozone. This benchmark tries to create a sort of ‘map’ of the disk I/O by testing various file sizes at various record sizes, creating sort of a matrix. I regret to say that the read numbers are a bit skewed, as my setup didn’t include an unmountable volume on every system, which is really the only way to clear out the read caches between tests with this benchmark. Still, we’ve got some good write numbers and some interesting cache comparisons.

As you can see, the host’s read cache is much faster than the guests’. Still, some of those guests are posting upwards of 1GB/s. Not bad, but it does give us some insight into the overhead of the VM: we’re likely seeing the added latency of fetching the data from cache and passing it through, which can be pretty big when memory speed is measured in nanoseconds.

Also of note, yet again, is the good performance of virtio, and that writethrough and writeback score roughly the same. The KVM IDE driver didn’t fare so well; in fact, in writeback mode it caused the mount to go read-only repeatedly, so I gave up on it. ESX, again, not so good, beating out only the VMs that aren’t using any cache.

Here we see the huge performance boost that a VM using writeback cache can attain; the virtio driver has no problem with cheating. Now, we’re not talking about battery-backed writeback on a storage controller; we’re talking about writes going into the host’s dirty memory and being considered complete. As such, you had better trust that your host won’t crash or suddenly reboot, or at the least make sure you’ve got snapshots you can roll back to in case of an emergency. You can be fairly certain that your writes will be committed within a minute or so at the worst (check the host’s dirty_expire_centisecs), and most likely much sooner unless the host is spending a lot of time in iowait. My point being that if you choose to go this route, you can at least be certain you’ve got a good, recent snapshot, provided catastrophe holds off for a few minutes after the latest one.
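You can check that expiry window on the host directly; the value is in centiseconds:

```shell
# maximum age of dirty pages before the kernel forces writeback
cat /proc/sys/vm/dirty_expire_centisecs
```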

Here is the same data, with the writeback taken out, so we can get a better look at the rest of the pack.

Not really too much exciting about the sequential graph, except for the nocache and writethrough VMs being faster than the host. As a guess, I’d attribute this to the 1MB cluster size on the qcow2 file; even though we’re writing 4k at a time in the VM, the writes are probably hitting the host in much larger chunks. I also did some ‘dd’ tests in each of these systems, but the results were very similar so I’m not going to rehash them.

Random writes… here we actually see ESX perk up a bit and hold its own on 4-8k record sizes. The host is even faster on the low end, and in some respects this random write graph mirrors the iops results from our postmark test if you kind of average together the left half.

In all I must say that I’m fairly pleased with the progress of KVM and its I/O performance.


Invitation from NetApp


A few of us from work were invited to meet with Vice Chairman Tom Mendoza from NetApp downtown this morning. He seems to have something of a side job as a fairly popular inspirational speaker in the business world. Apparently he stopped by between flights from the east coast to the west, and the local NetApp folks talked him into holding an informal seminar over breakfast with some of us customers. There were only 12 or 15 of us, and it was in a lofty conference room with a wonderful city view, which gave us the feeling that it was a special event.

He spoke a lot about NetApp, but I didn’t get the feeling that he was doing it as a salesman so much as he was doing it because they were valid experiences to illustrate his point. He spoke a lot about how they’ve run their business, their culture, and the problems he’s seen in other companies.

He pointed out a few things he’s seen, such as companies saying that people are their greatest resource, but not spending any time in board meetings discussing how they show that as a company.  He talked about how NetApp has a program to let the executives know if you feel that someone has done a good job and given extra effort, and various ways in which they recognize those extraordinary people.  He even talked about the one layoff they had, where 50 of the 70 affected individuals wrote thank-you letters to the company for how they handled it, which was first to let the people know that it wasn’t their fault, that the company had to do it, second to compensate them fairly in their severance, and last to be involved in helping them find jobs elsewhere.  He said that later on, many of the individuals came back.

He also spoke of their business culture, and how they’re more than happy to save their customers money by coming up with new technologies; for example, when Oracle asked them for read/write snapshots, they created what they now call Flex Clone. It allowed Oracle to buy fewer NetApp products, but it made NetApp products better and made them money in the long run. He spoke about the economic downturn, and how many companies throw their arms around what they’ve got and try to protect it, looking at what they need to cut to stay the same, when they should be meeting, changing, and figuring out new strategies that will allow them to adapt and grow. ‘Either you’re moving forward or you’re moving backward. If you’re standing still then you’re moving backward.’ On this same topic, he spoke about candor and how it’s crucial to the company, that people shouldn’t be afraid to say what they think, and the productivity that comes along with that.

Last, he spoke about personal goals, and how the majority of people who become successful have them. He detailed a bit about how he thought one should go about managing their goals, and offered to e-mail any of us a more detailed outline  that he’s come up with.

In all, it was a pretty good speech and I’m glad that I went. Part of me did wonder whether it was a roundabout recruiting mechanism, since I’m pretty sure everyone there went away wishing they worked for NetApp, but at the same time  I think some of the information he shared really is valuable if  we implement what we can in our current environments.