Linchi Shea

Checking out SQL Server via empirical data points

Is RAID 5 Really That Bad?

RAID 5 is a dirty word in the DBA community and beyond. There are websites devoted to trashing RAID 5. I've seen DBAs declare the performance root cause found the very moment they learned that some database files were placed on RAID 5 volumes. You'd be ridiculed and run out of town if you dared to suggest putting the transaction log file on RAID 5. Is this knee-jerk reaction of blaming RAID 5 for a database's performance failings really justified?

Why am I even asking this question? Isn't it already settled that RAID 5 is bad for writes? Well, if only life were that simple.

Below are two charts showing the performance of the same 8K sequential writes at various I/O queue depths applied to two RAID volumes presented to the same server: one is RAID 5 and the other is RAID 10. Both of them are presented from SANs. For all the data points in the charts, sqlio.exe--which is a free download from Microsoft--is configured with 32 threads to generate the I/O loads.

In the charts, the RAID 5 volume sustains much higher I/O throughput than the RAID 10 volume, and it has significantly lower I/O latency. Before you question the validity of the data, let me assure you that the performance difference illustrated in the charts is no fluke. The pattern is consistently reproducible.
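Throughput and latency at a given queue depth are not independent numbers. Little's Law ties them together: average outstanding I/Os = IOPS x average latency. The sketch below is only a sanity-check aid. The IOPS figures are hypothetical (they are not read from the charts), and it assumes the queue depth is applied per thread, with 32 threads as in the test.

```python
def implied_latency_ms(iops: float, threads: int, depth_per_thread: int) -> float:
    """Average latency (ms) implied by Little's Law for a saturated queue."""
    outstanding = threads * depth_per_thread  # assumption: depth is per thread
    return outstanding / iops * 1000.0


def throughput_mb_per_s(iops: float, block_bytes: int = 8 * 1024) -> float:
    """Convert IOPS at a fixed block size (8K by default) into MB/s."""
    return iops * block_bytes / (1024 * 1024)


# Hypothetical measurements for two volumes at 32 threads, depth 1 per thread.
for label, iops in (("volume A", 20000.0), ("volume B", 8000.0)):
    latency = implied_latency_ms(iops, threads=32, depth_per_thread=1)
    print(f"{label}: {throughput_mb_per_s(iops):.1f} MB/s, ~{latency:.2f} ms per I/O")
```

If a reported latency is far from what the reported throughput implies, the queue was probably not kept full, or the two numbers were measured at different points on the I/O path.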

How can RAID 5 outperform RAID 10 on 8K sequential writes?

Let's note that behind each of the two volumes is an I/O path consisting of many hardware/software components. Some of them are shared between the volumes, others are not. Many of the components have a significant impact on the I/O throughput and latency of the volume; they include HBAs and the software that manages the I/O paths on the host, SAN architecture, model of the SAN frames, cache on the SAN, RAID configuration, specifications of the physical disks inside the SAN frames, configuration of hardware replication, number of spindles used by the volume, and so on.

So the RAID configuration is but one of many factors that influence the I/O performance of a volume. Differences in some of these other factors can result in a RAID 5 volume outperforming a RAID 10 volume on 8K sequential writes. The easiest example is a RAID 10 volume that has synchronous replication enabled in the SAN: every write then has to wait for the remote copy to acknowledge it.
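To see how a non-RAID factor can dominate, here is a toy latency model. It assumes a synchronously replicated write is acknowledged only after the remote round trip completes, and that the local write is absorbed by the SAN cache. All latency figures are hypothetical, chosen only to show the shape of the effect.

```python
def write_latency_ms(local_ms: float, replication_round_trip_ms: float = 0.0) -> float:
    """Per-write latency: local (cached) write plus any synchronous replication wait."""
    return local_ms + replication_round_trip_ms


def max_iops(latency_ms: float, outstanding_ios: int) -> float:
    """Upper bound on IOPS with a fixed number of outstanding I/Os (Little's Law)."""
    return outstanding_ios / (latency_ms / 1000.0)


# Hypothetical: RAID 5 is slower locally but is not replicated;
# RAID 10 is faster locally but pays a 2 ms synchronous replication round trip.
raid5_ms = write_latency_ms(local_ms=1.0)
raid10_ms = write_latency_ms(local_ms=0.5, replication_round_trip_ms=2.0)

for label, ms in (("RAID 5, no replication", raid5_ms),
                  ("RAID 10, sync replication", raid10_ms)):
    print(f"{label}: {ms:.1f} ms per write, at most {max_iops(ms, 32):,.0f} IOPS")
```

In this made-up case the replicated RAID 10 volume is slower even though its local write path is twice as fast, which is the kind of outcome the RAID level alone would never predict.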

Okay, you can accuse me of comparing apples and oranges--i.e. it's not strictly RAID 5 vs. RAID 10 in the charts. Guilty as charged.  But when you automatically declare the performance root cause found at the first sight of RAID 5, you have just committed the same fallacy of comparing apples and oranges.

I'm not disputing that RAID 5 is not as good as RAID 10 for writes with everything else being equal. However, the fact of life in a real enterprise SAN environment is that a RAID 5 volume and a RAID 10 volume almost never differ only in their RAID configuration. In such an environment, to be certain about the performance of a volume, you need to actually evaluate/test it. The information on the RAID configuration alone is far from being sufficient in making a recommendation.
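For a first rough look at a volume, even a few lines of script can time 8K sequential writes. The sketch below is not how the charts were produced (those used sqlio.exe with 32 threads). It is single-threaded, keeps only one write in flight, and relies on `os.fsync` to push each write past the OS file cache, so treat its numbers as a crude lower bound. The file name and block count are placeholders.

```python
import os
import sys
import tempfile
import time


def time_sequential_writes(path, block_size=8192, count=1000):
    """Write `count` blocks sequentially, fsync after each one.

    Returns (throughput in MB/s, average milliseconds per write).
    """
    block = b"\0" * block_size
    # O_BINARY exists only on Windows; fall back to 0 elsewhere.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
            os.fsync(fd)  # force the write down to the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    mb_per_s = block_size * count / (1024 * 1024) / elapsed
    avg_ms = elapsed / count * 1000.0
    return mb_per_s, avg_ms


if __name__ == "__main__":
    # Pass a path on the volume under test; defaults to the temp directory.
    default_path = os.path.join(tempfile.gettempdir(), "seqwrite_test.dat")
    target = sys.argv[1] if len(sys.argv) > 1 else default_path
    mb_per_s, avg_ms = time_sequential_writes(target, count=256)
    print(f"{target}: {mb_per_s:.2f} MB/s, {avg_ms:.3f} ms per 8K write")
    os.remove(target)
```

Running it once against a file on each candidate volume gives a like-for-like number, but a multi-threaded tool with configurable queue depth is still needed to reproduce a comparison like the one in the charts.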

If you have been asked to decide on which of these two volumes to place the transaction log file of a database, how would you proceed?

 

Published Wednesday, February 07, 2007 10:31 AM by Linchi Shea

Attachment(s): RAID.gif

Comments

 

Alexander Gladchenko said:

You have not correctly chosen the type of SQL Server I/O. Look at http://blogs.msdn.com/sqlcat/archive/2005/11/17/493944.aspx

What size of SQL Server IO workload?

February 8, 2007 1:22 AM
 

Linchi Shea said:

But I didn't choose any type of SQL Server I/Os. I did a pure I/O test with 8K sequential writes, which does have some relevance to the SQL Server log writes in that SQL Server does mostly 512-byte to 64K sequential writes. I was more interested in the I/O performance itself than in the SQL Server performance. Using SQL Server to test I/O characteristics is valid and useful, but has too many confounding factors.

> What size of SQL Server IO workload?

The exact block size of the log writes varies with the actual SQL Server workload at the time. But if this is a general question on the characteristics of the SQL Server I/O workload, the discussions in the link you referenced provide excellent information on that.

The point I wanted to get across in this blog post is that there are many factors on an I/O path, and RAID configuration is only one of many such factors. You can't use the RAID configuration as the only input in making decisions on the I/O performance of such an I/O path.

Linchi

February 8, 2007 10:14 AM
 

Dave Markle said:

A great article once again!

I've always wondered when people compare RAID 5 and RAID 10 side by side... What variables are they keeping constant?  The statement "Everything being equal" always made little sense to me.  Are folks comparing two RAID arrays of the same size, or of the same cost?  Because in order to have a 1TB RAID 5 system, that'll cost you a lot less than a 1TB RAID 10 system if you're using the same drives.   This is an important distinction when money's an object...

As far as throughput goes, I can totally understand why, for the same cost, a RAID 5 setup outperforms a RAID 10 system -- there are a lot more spindles distributing out the writes than on the RAID 10 system...

Am I missing something here?

February 8, 2007 11:45 AM
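The "same size or same cost" distinction raised in this comment is easy to put into numbers. The sketch below counts drives for a target usable capacity, assuming identical drives, a single RAID 5 group that gives up one drive's worth of space to parity, and RAID 10 built from mirrored pairs. The 146 GB drive size is a hypothetical choice, not something from the post.

```python
import math


def drives_for_capacity(usable_gb: float, drive_gb: float, raid: str) -> int:
    """Number of identical drives needed to present `usable_gb` of space."""
    data_drives = math.ceil(usable_gb / drive_gb)
    if raid == "raid5":
        return max(data_drives + 1, 3)   # one drive's worth of parity, 3-drive minimum
    if raid == "raid10":
        return max(2 * data_drives, 4)   # every data drive is mirrored, 4-drive minimum
    raise ValueError(f"unsupported RAID level: {raid}")


# Hypothetical sizing: 1 TB usable on 146 GB drives.
for raid in ("raid5", "raid10"):
    print(raid, drives_for_capacity(1000, 146, raid), "drives")
```

Sized for equal capacity, the RAID 10 volume ends up with many more spindles; sized for equal drive count, the RAID 5 volume ends up with more usable space. Which of those counts as "everything else being equal" has to be stated before any comparison means much.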
 

Linchi Shea said:

I'd argue that the following is one such 'everything else being equal' scenario. Say you have X number of drives hanging off a SCSI channel. You can either configure all these drives into RAID 10 (e.g. with 10 drives, a stripe across 5 mirrored sets) or configure them into RAID 5. In this case, nothing else has changed except the RAID configuration.

But even in this well-controlled case, if you happen to choose a workload that is bottlenecked on some other component (e.g. the SCSI controller), you may not see the expected performance difference.

Linchi

February 8, 2007 1:51 PM
 

Chuck Boyce said:

Hi Dave,

I do not think Linchi is saying (and Linchi, correct me if I'm wrong) "for the same cost, a RAID 5 setup outperforms a RAID 10 system".

I think what he is saying is TEST and observe.

All things being equal, RAID 10 performs better on sequential writes than RAID 5.  Microsoft has published as much:  (see Table 1A of http://www.microsoft.com/technet/prodtechnol/sql/2005/physdbstor.mspx)

The crux of the issue, though, is: ARE things equal?  Are you sure?  

For instance: has your company employed, for example, SRDF (http://en.wikipedia.org/wiki/SRDF) on one RAID implementation and not on another?  If so, there are significant implications on I/O performance!

All the best and have a good day,

Chuck

February 8, 2007 2:01 PM
 

David Markle said:

Linchi:

Thanks, that totally clears things up for me -- it was just a question of semantics.  When it comes to performance testing, I think your setup is the most honest way of doing it.  You're using the same hardware (keeping the cost variable constant) and comparing the two configurations.  

When folks are planning for projects, though, their requirements tend to dictate how much disk space their project needs.  That inherently makes the hardware unequal because you need that many more drives for a RAID 10 config to yield the same amount of space.  And in that scenario, I'd believe that the results would probably show the opposite -- you'd probably see a significant bump in write performance across the board on the RAID 10 system.  But that would be a somewhat disingenuous test...

February 8, 2007 2:30 PM
 

David said:

On my IE6 browser, the pictures in the text, and each

individual comment line is wider than the box for it, and the

right side of the picture, or the rightmost letters in each

comment are hidden from my view.

To keep this comment visible on my browser, I have inserted

a carriage return when the line gets close enough to the

right edge of the box.

February 14, 2007 8:32 AM
 

David said:

In looking at it more closely, each line of the

original article is also hidden under the grey

area on the right.

This problem remains even when I choose

the smallest text size from the IE 'view' tab.

It is worse with each increase in text size.

The only way I can read the article and

comments is to 'select all' and then copy

and paste into another tool.

February 14, 2007 8:38 AM
 

Jonathan B. said:

Linchi,

I think your post is only the start of a come-back for RAID 5.  Not so much because RAID 10 is bad, but as Dave pointed out, we haven't exactly been honest with ourselves when looking at the RAID 10 vs RAID 5 comparison.

For example, we just got a new server that has 4 SAS drives connected to a PERC 5e controller.  We don't care about space since it has *way* more than we need.  So, how should we set it up?  If you look at the most recent MS docs, they state with no exception that RAID 10 is better than RAID 5.  But when we did a round of tests, virtually all tests pointed to RAID 5 being the best setup for these four drives.

What gives??  Why has industry fallen so much in love with RAID 10 that we have posed the question in terms of space needed instead of for a given set of hardware?  Is this a way to sell more drives?

I mean seriously.  Suppose you use the space-needed approach, which leads one to conclude that they need to set up a RAID 10 (b/c RAID 10 is better).  So, one purchases 10 drives instead of 5.  Then the 10 drives come in.  Shouldn't the first thing to do be to test RAID 5 vs RAID 10 on ALL the drives, not just half of them?  Instead, we say we know RAID 10 is better and skip the comparison.

Another point I've concluded, although this may not be news to most, is that the controller has a LOT to do with the outcome.  I am down-right flabbergasted at the sorry-A** results that PERC controllers (both 4 and 5) offer on anything but 64K-striped blocks or anything but RAID 5.  Clearly the best choice for fast drives with no fault-tolerance would be RAID 0.  But even tests on RAID 0 do not perform according to their theoretical performance capabilities given the number and capability of the drives (on the PERC controllers anyway).  I am very disappointed in the industry for not doing more testing and keeping our vendors honest.

Cheers to Linchi for keeping us on point and reminding us to test our hardware.  I'd add only one point, and that is to complain to your vendors if things don't perform like you feel they should.  It helps to have hard numbers on your side though -- so thanks to Linchi for sharing (his?) knowledge on this topic!

-Jonathan

February 19, 2007 11:01 AM
 

ebraekke said:

Linchi,

I strongly suggest that you check out Cary Millsap’s article on the subject (http://www.miracleas.com/BAARF/1.Millsap2000.01.03-RAID5.pdf).  Given that you haven’t published details about your tests, it is very difficult to assess whether you have taken into account the *big* problem with RAID5: the write penalty and its effect on the overall throughput of the IO system. A good test should take into account read and write performance, potentially with different r/w ratios. During periods of high write activity, RAID5 systems can display exceptionally poor performance characteristics.

July 27, 2007 3:11 AM
 

Marc said:

"If you have been asked to decide on which of these two volumes to place the transaction log file of a database, how would you proceed?"

What's the answer?  The SQL servers I have configured (and are in production) have OS & trans logs on RAID1 volume (C:) and DB on RAID5 volume (D:) -that used to be the suggested method.

I need to configure a new server and would like to just put it all on one RAID5 volume (4 disks).

August 8, 2007 1:19 PM
 

humbleDBA said:

I watch these discussions about RAID 5 vs RAID 10 with some amusement.

Linchi, I think it is great to challenge these things and I totally agree that Testing for 'your' environment is the best way of choosing a route.

A number of things spring to mind here for me...based upon experience. Whenever I have calculated my requirements for a database system on expected throughput, I have found that it is not the size of the volume that I need to focus on, but I/O. And whenever I calculate the I/O requirements balanced on the ratios of expected reads to writes, etc., I have always found that RAID 5 requires more spindles to satisfy those requirements than RAID 10 - and this has been found even with a Read/Write ratio of 9:1. Time and again, in high performance arenas, I have come across others talking of how much volume space I am asking for when I have asked for spindles based on I/O requirements. Of course, if performance is not an issue for you then this may be arbitrary - but I keep getting called to places that are having performance problems, so I guess for most it is important.

I also believe that quite often another area is missed with regard to these two RAID types. This is based on the redundancy characteristics (and effect on performance too) - sorry, I realise this is a bit of a digression from what Linchi is trying to get across, but I think it is a very important area that is often missed in these discussions. Lose just one drive in RAID 5 and you will see a significant performance drop. Even when a hot-spare or replacement drive is swapped in, the rebuild of that drive will severely impact performance until it is complete. On top of this, from the time the drive failed to the time it is back online you are at risk of losing your array if you lose another physical drive. In contrast, RAID 10 (based on a set of mirrors that are then striped) can suffer multiple drive failures, as long as they are not both disks of a mirror pair, and will step-degrade in performance, i.e. in a 20-disk array (10 disks mirrored) the loss of one disk will see a performance degradation of approx 5%, and two disks result in 10% degradation. RAID 5 degradation can be as much as 80% or more with just one disk out. I have experienced multiple drive failures several times in different companies, so it can and does happen. Obviously, this is all based on the risk and cost for your business, but remember, the RAID levels are part of, and can affect, the risk decision.

But to try and answer your question Linchi...I've not come across many DBs where the TLog throughput is so high that it stretches a RAID 1 setup, and where I have, it is usually due to multiple TLogs on the same drive and so the writes are random in nature. Also, I'm not likely to use more than 6 or 8 spindles for TLogs (set up for SQL2000 and heavy T-Replication work) and so cost-wise I don't think using RAID 5 will really save much. So, no, I don't think I will be using RAID 5 for TLogs just yet...but I am open to evidence suggesting I should.

Thanks

November 29, 2007 7:12 AM
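The spindle arithmetic described in this comment can be sketched with the usual rule-of-thumb write penalties: about 4 back-end I/Os per small random write on RAID 5 (read data, read parity, write data, write parity) and 2 on RAID 10 (one write per mirror side). The model ignores controller cache and full-stripe writes, which is one reason sequential-write tests can look very different. The 150 IOPS per spindle figure and the 5,000 IOPS workload are hypothetical.

```python
import math

# Rule-of-thumb back-end I/Os per front-end small random write.
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def backend_iops(read_iops: int, write_iops: int, raid: str) -> int:
    """Back-end disk I/Os per second generated by a front-end workload."""
    return read_iops + WRITE_PENALTY[raid] * write_iops


def spindles_needed(read_iops: int, write_iops: int, raid: str,
                    iops_per_spindle: int = 150) -> int:
    """Spindles required to absorb the back-end load (hypothetical per-disk rate)."""
    return math.ceil(backend_iops(read_iops, write_iops, raid) / iops_per_spindle)


# Hypothetical workload: 5,000 front-end IOPS at a 9:1 read/write ratio.
for raid in ("raid5", "raid10"):
    print(raid, spindles_needed(read_iops=4500, write_iops=500, raid=raid), "spindles")
```

Under these assumptions RAID 5 needs more spindles even at 9:1, and the gap widens quickly as the write share grows.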
 

FormulaTroll said:

The "problem" with this comparison is that it's purely sequential, which in my experience is almost never the real-world circumstance. Random reads and writes are generally what I experience more often, even on the log file volume, since I usually don't have the budget to dedicate hardware to a single database. The log file volume is holding the log files for 20-50 databases, not just one.

I do agree that testing is always a good idea though, and that if your circumstances do allow for sequential access, then RAID5 deserves more consideration.

July 13, 2010 12:32 PM
 

Konstantin S, Ivanov said:

RAID 10 is faster than RAID 5 by design. Some RAID controller implementations may handle RAID 5 better than RAID 10, but that is an issue with the controller board, not a criterion for comparing RAID types.

RAID 5 is a more complicated algorithm - that is why it is slower. But you can use it if you have no money but want a lot of space with redundancy.

There is still a chance you will come across a synthetic test that performs better on RAID 5. Pray that the real thing works exactly the same way (it never does).

October 27, 2010 9:14 AM
