In the future I’d like to discuss some tests I’ve done with linux md-raid and hardware raid, as well as write up a few instructional documents on md-raid, but thought it might be interesting to talk a bit about parity before digging into some details on raid 5 performance, caching, and stripe size. This is a pretty basic explanation, but hopefully it will give some insight to those beginner/intermediate folks like me regarding exactly how we’re able to provide fault tolerance and redundancy simply by adding one extra drive to the array.
You may already be familiar with parity or at least heard of it, perhaps in the context of a parity bit, which works in a similar manner but is used for error detection. More on that some other time, perhaps. Most system admins have at least a general idea of what the parity data in a RAID array is, i.e. ‘extra data that can be used to rebuild an array’, but I find it interesting to go through the exercise of exactly how it works.
If you’re only somewhat familiar with how RAID 5 works, you may have at least heard something about XOR calculations or XOR hardware on your controller. XOR is a logic operation, which stands for “exclusive or”, meaning ‘I’ll take this or that but not both’. Without getting too much into what that means, it’s basically just binary addition. In other words, if we were to XOR bits ‘0’ and ‘0’, we’d get a ‘0’ (0 + 0 = 0). If we XOR bits ‘1’ and ‘0’, we get 1. It doesn’t get tricky until we XOR bits ‘1’ and ‘1’, which basically rolls the digit over and we get a binary 2, or ’10’. We’re only interested in the least significant bit, however, so in this case 1 + 1 = 0. Using the following four equations, we should be able to XOR anything we need to:
- 0 + 0 = 0 (nothing)
- 1 + 0 = 1 (I’ll take this)
- 0 + 1 = 1 (I’ll take that)
- 1 + 1 = 0 (but not both)
Now, let’s apply this to our parity striped raid set. In the following example, we’ll look at a single stripe set across three disks. Two hold data and the third holds parity for that data.
If we lose Stripe 1, we can determine what it held by reversing the equation, or solving ? + 0 = 1. Likewise, if we lose Stripe 2 we can do the same ( 1 + ? = 1). If we lose the Parity stripe we haven’t really lost any data and can just recalculate parity. Now, the interesting thing about the math is that rather than looking at it as, for example, 1 + ? = 1 when we lose Stripe 2, we can restore Stripe 2 by doing an XOR with Stripe 1 and the Parity stripe, 1(Stripe 1) + 1(Parity) = 0(Stripe 2)
Now, in reality, the stripes are much larger than one single bit as described above, you’ll likely have a stripe of 4 kilobytes up to 256 kilobytes or maybe more, but the mechanics are the same. Let’s upgrade the previous example to a three bit stripe just to show how it works with multiple bits in a stripe. Stripe 1 will contain ‘101’, stripe 2 will contain ‘011’. So, to calculate the parity, we match up the bit placements of stripe 1 and 2 and then XOR each set of bits individually. It helps to stack them vertically and work on each column one by one, as the following figure shows:
And again, an example of ‘losing’ stripe 2 and recalculating from parity.
Now, that’s great, but what about four, five, six disk arrays? Well, once you get the idea of how XOR works, it’s pretty simple. If you understand binary math you can continue to use that to add four, five, six bits together, but another rule of thumb is to think of ‘0’ as even and ‘1’ as odd. For example, 1+1+1, an odd plus an odd plus an odd will be an odd number, so parity equals 1. Another example, 0+1+0+1, an odd plus an even will be odd, plus an even will again be odd, plus an odd will be even, so parity equals 0. If that gets too confusing, you can simply count how many ‘1’s you have. If you’ve got an odd number of ‘1’s, then parity is 1, if you have an even number of ‘1’s, parity is 0.
These parity calculations are fairly simple, yet interesting for the fact that we can provide protection for the data on any number of disks by simply adding a single disk. No need to have a full copy of the data. The caveat, of course is that you can only lose 1 disk per parity stripe. Yes, that’s right, you can have dual parity as well, also known as RAID 6.
Another drawback is that while you’re missing one of your disks any stripe sets that had data on that disk will take a performance hit because they’ll be doing XOR calculations to read the data is lost, rather than just reading it from the now lost disk. Stripes that had parity on the lost disk won’t take a performance hit, but it will take some processing time to rebuild the parity once the disk is replaced. These performance hits are going to be an important part of later discussions regarding raid 5 performance.
Probably worst of all, in order to modify data on disk, the system has to read the *entire stripe set* in order to recalculate parity with the changed data, effectively causing a read along with every write. This is where on-card caches can be helpful.
To finish off the exersize, I’ve put together this simple RAID 5 array. Each disk has four stripe sets, each stripe set (in matching color), has three data stripes and one parity stripe. You can see that the parity stripe(???) alternates on to a different disk for each stripe, which is the primary difference between RAID 5 and RAID 4, which places all parity stripes on the same disk. If you’d like to test yourself and see just how your data is protected, first calculate the parity stripe for each set, then cover up a whole column (disk) at random and see if you can reconstruct its contents by doing the math.
As always, if anyone is reading this ;-), feel free to leave your feedback, correct me, whatever.