
Joe Chang

Consumer SSDs with SQL Server

Over the last two years, I have stood up several proof-of-concept (POC) database server systems with consumer grade SSD storage at cost $2-4K per TB. Of course production servers are on enterprise class SSD, Fusion-IO and others, typically $25K+ per TB. (There are some special situations where it is viable to deploy a pair of data warehouse servers with non-enterprise SSD).

PCI-E SSDs - OCZ RevoDrive, RevoDrive X2, & RevoDrive 3 X2
The first POC system was a Dell T710 with 2 Xeon X5670 processors, 96GB (12x8GB) memory, 16 10K SAS HDDs, and 6 OCZ RevoDrive (original version) PCI-E SSDs, supporting table scans at nearly 3GB/s. The most difficult query repeatedly hashed a large set of rows (that is, there were multiple large intermediate result sets), generating extremely heavy tempdb IO. With tempdb on 12 of the 10K HDDs, the query time was 1 hour. With tempdb on the 6 OCZ RevoDrives, the query time was reduced to 20min.

Before SSDs became viable, I would normally have configured a 2-socket system with 48 (2x24) 15K HDDs, with one RAID controller for each 24-disk enclosure. This setup costs about $11K per enclosure with 24 146GB 15K SAS drives and can be expected to deliver 4GB/s sequential bandwidth, 10K IOPS at low queue and low latency (200 IOPS per 15K disk), and in the range of 15-20K IOPS at high queue and high latency. As it was my intent to deploy on SSD, I only configured 16 HDDs in the internal disk bays and did not direct the purchase of external HDDs.

The 6 OCZ RevoDrive 120GB PCI-E SSDs in the POC system cost about $400 each at the time (now $280?). I recall that the tempdb IO traffic was something like 40K IOPS (64KB), around 2.5GB/s bandwidth. This was consistent with the manufacturer's specifications of 540MB/s read and 480MB/s write at 128K IO, considering that there is some degradation in aggregating performance over 6 devices. The IO latency was somewhere in the range of 40-60ms (note that the SQL Server engine issues tempdb IO at high queue depth). OK, so the real purpose of the POC exercise was to tell the SAN admin in no uncertain terms that the 350MB/s from his $200K iSCSI storage system (4x1GbE) was pathetic, and even the 700MB/s on 2x4Gbps FC ports does not cut the mustard for DW.
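For anyone wanting to observe this kind of tempdb IO traffic from inside SQL Server, a minimal sketch using the sys.dm_io_virtual_file_stats DMV follows (not necessarily how the numbers above were collected; take a snapshot before and after the heavy query and compare):

-- Snapshot tempdb file stats before the tempdb-intensive query
SELECT file_id, num_of_writes, num_of_bytes_written, io_stall_write_ms
INTO #before
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL);

-- ... run the tempdb-intensive query here ...

-- Delta after the query: write count, MB written, and average write latency
SELECT a.file_id,
       a.num_of_writes - b.num_of_writes AS writes,
       (a.num_of_bytes_written - b.num_of_bytes_written) / 1048576 AS mb_written,
       (a.io_stall_write_ms - b.io_stall_write_ms)
         / NULLIF(a.num_of_writes - b.num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL) AS a
JOIN #before AS b ON b.file_id = a.file_id;

Dividing the write count and MB written by the elapsed seconds of the run gives IOPS and MB/s.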

The next set of systems was ordered with 4 OCZ RevoDrive X2, 160GB (<$500 each). There was some discussion on whether to employ the OCZ enterprise class Z-Drive R3, but this product was cancelled and the OCZ substitute, the VeloDrive (4 SandForce 1565 controllers, rated for ~1GB/s), was not yet available. I was expecting somewhat better performance from 4 RevoDrive X2s (4 SandForce 1222 controllers each, rated for 700MB/s) over 6 of the original RevoDrives (2 SandForce controllers each). The tempdb IO intensive query that took 20min with the 6 RevoDrives now ran in 15min with the 4 RevoDrive X2s. In addition, IO latency was under 10ms.

I was hoping to test the new OCZ RevoDrive 3 X2 with 4 SandForce 2281 controllers, rated for 1500MB/s read and 1200MB/s write. Unfortunately, there is an incompatibility with the Dell T110 II and its E3-1240 (Sandy Bridge) processor, which has new UEFI firmware replacing the BIOS. OCZ does not provide server system support for their workstation/enthusiast products. Hopefully Dell will eventually resolve this.

SATA SSDs - OCZ Vertex 2, Vertex 3 & Vertex 3 Max IOPS, Crucial C300 & m4
My preference is to employ PCI-E rather than SATA/SAS SSD devices. This is mostly driven by the fact that disk enclosures reflect the IO capability of HDDs, with 24 bays on 4 SAS lanes. An SSD-oriented design should have 4 SSDs on each x4 SAS port. Of course, 4 SSDs and 4-8 HDDs on each x4 SAS port is also a good idea.

So I have also looked at SATA SSDs. Earlier this year, I started with the OCZ Vertex 2 and Crucial C300 SSDs. After encountering the issue with the RevoDrive 3 on the new Dell server, I acquired OCZ Vertex 3, Vertex 3 Max IOPS, and Crucial m4 SATA SSDs. The OCZ Vertex 2 has a 3Gbps interface; the Vertex 3 and both the Crucial C300 and m4 support the 6Gbps SATA interface.

The OCZ Vertex SSDs use SandForce controllers; the Vertex 2 uses the previous generation SandForce 1222 and the Vertex 3 uses the current generation 2281 controller. The Crucial SSDs use Marvell controllers (both?). Perhaps the most significant difference between the OCZ Vertex and Crucial SSDs is that the SandForce controllers implement compression. The OCZ Vertex SSDs have far better write performance with compressible data, but are comparable for incompressible data. It does appear that SQL Server tempdb IO is compressible and benefits from the compression feature.

Another difference is that OCZ offers 60, 120, 240 and 480GB capacities while Crucial offers 64, 128, 256 and 512GB capacities. All capacities are in decimal, that is, 1GB = 10^9 bytes. Both the OCZ 60GB and the Crucial 64GB presumably have 64GB of NAND flash, the 64GB being binary, meaning 1GB = 1024^3 bytes, or 7.37% more than 1GB decimal. Basically, OCZ has more over-provisioning than Crucial, which in theory should also contribute to better write performance. (Earlier Vertex drives had 50 and 100GB capacities. But there are so many varieties of the Vertex 2 that I cannot keep track.)
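A rough worked check of the over-provisioning, assuming both vendors put 64GB (binary) of NAND on the drive:

64 x 1024^3 bytes = 68.7 x 10^9 bytes of raw flash
OCZ 60GB exposes 60 x 10^9 bytes, leaving ~8.7GB spare, roughly 13% of the raw capacity
Crucial 64GB exposes 64 x 10^9 bytes, leaving ~4.7GB spare, roughly 7% of the raw capacity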

In brief, the performance difference between SSD generations, both from the OCZ Vertex 2 to Vertex 3 and from the Crucial C300 to m4, is substantial, so I will focus mostly on the newer Vertex 3 and m4 drives. The performance observed in SQL Server operations seemed to be consistent with manufacturer specifications for both generations of OCZ and Crucial SSDs. It was not determined whether writes of compressed SQL Server database tables were further compressible by the SandForce controller. This may be because it is difficult to achieve the high write rates necessary to stress modern SSDs in SQL Server while preserving transactional integrity.

Test System - Dell T110 II, Xeon E3 quad-core Sandy Bridge processor
The test system is a Dell PowerEdge T110 II with a Xeon E3-1240 3.30GHz quad-core processor (Sandy Bridge) and 16GB memory. This system has 2 PCI-E Gen2 x8 slots and 1 Gen1 x4 slot. All SSDs were attached to an LSI MegaRAID SAS 9260 controller (PCI-E Gen2 x8, 8 6Gbps SAS ports). I did do some testing with 2 SSDs on the SATA ports (3Gbps) but did not make detailed observations.

Incidentally, the cost of this system with processor, memory and 1 SATA HDD was $1078(?). The 128GB SSDs were about $220 ($268 for the Max IOPS). So a very capable system with 2 SSDs could be built for $1300-1500 (64GB or 128GB SSDs). A better configuration with 2 SATA HDDs, 4 SSDs and a SAS controller would push this to $2500. But if Dell and OCZ could resolve this RevoDrive 3 / UEFI issue, then I would recommend the T110 II, E3 processor, 16GB memory, 2 SATA HDDs and 1 RevoDrive 3 X2.

One unfortunate aspect of this system is that the SATA ports are all 3Gbps per the Intel C202 PCH, even though the ever so slightly more expensive C204 supports 6Gbps SATA on 2 ports. Basically, this system has similar characteristics to the database laptop with super IO that I proposed earlier, except that the laptop would be 2.5GHz to keep power reasonable.

Performance tests with the TPC-H SF 100 database
With 8 SSDs (2 Vertex 3, 2 Vertex 3 Max IOPS, 2 m4, and 2 C300) I was able to generate 2.4GB/s in a table scan aggregation query, possibly gated by the 355MB/s rating of the C300s. A configuration consisting of the 4 Vertex 3s and 2 m4s would have been gated by the 415MB/s rating of the m4. If I can get a total of 8 Vertex 3s, which are rated at 550/500MB/s for compressible and incompressible data, then I would be limited either by the adapter or by the PCI-E Gen2 x8 limit of 3.2GB/s. There is an LSI SAS 9265 adapter with a dual-core controller that has even higher IOPS capability, but it is not known whether this is necessary for large block IO.
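A table scan aggregation query of this kind might look like the sketch below, assuming the standard TPC-H LINEITEM schema (essentially Q1 without the date filter, which forces a full scan of the largest table):

SELECT L_RETURNFLAG, L_LINESTATUS,
       COUNT_BIG(*) AS row_count,
       SUM(L_QUANTITY) AS sum_qty,
       SUM(L_EXTENDEDPRICE) AS sum_base_price
FROM dbo.LINEITEM
GROUP BY L_RETURNFLAG, L_LINESTATUS;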

The tests consisted of running the TPC-H queries in a single stream (but not per official benchmark requirements). The figure below shows the time to run the 22 queries (excluding statistics, parse and compile time) for 2, 4 and 6 SSDs with no data compression (raw) and with page mode data compression.
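One simple way to separate parse/compile time from execution time per query (a sketch, not necessarily how these timings were collected) is SET STATISTICS TIME:

SET STATISTICS TIME ON;
-- run the TPC-H query here; the Messages output reports
-- "SQL Server parse and compile time" separately from
-- "SQL Server Execution Times", so compile time can be excluded
SET STATISTICS TIME OFF;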

[Figure: TPC-H SF100 total query time for 2, 4 and 6 SSDs, raw vs. page compressed]

Run time on 2 OCZ Vertex 3 (regular) SSDs was 815 sec with compression and 936 sec raw (without compression). On 4 OCZ Vertex 3 SSDs (2 regular, 2 Max IOPS), total query times were reduced to 658 sec with compression and 622 sec raw. On 6 SSDs (4 OCZ and 2 Crucial m4), total query times were 633 sec with compression and 586 sec raw.

The figure below shows tempdb IO write latency for 2, 4 and 6 SSDs, with raw and compressed tables.

[Figure: tempdb IO write latency for 2, 4 and 6 SSDs, raw vs. page compressed]

On the 2 OCZ SSDs, IO latency from fn_virtualfilestats (for the entire run) averaged 30ms (temp write) and 90ms (data read) with compression, and 60ms (temp write) to 130ms (data read) without compression. The performance with 2 Crucial m4 drives was lower, showing much higher write latencies. On 4 OCZ Vertex 3 SSDs, IO latency was 14ms (temp write) and 30ms (data read) with compression, and 18-60ms without compression. The IO latencies on the Max IOPS models were lower than on the regular Vertex 3 models. For 6 SSDs, IO latencies are down to the 15ms range, with somewhat higher latency on the Crucial m4 SSDs.
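A minimal sketch of the per-file latency calculation, using the DMV form sys.dm_io_virtual_file_stats (the database name TPCH100 is a placeholder; the counters are cumulative since startup, so take deltas for a single run):

SELECT DB_NAME(database_id) AS db, file_id,
       io_stall_read_ms  / NULLIF(num_of_reads,  0) AS avg_read_ms,
       io_stall_write_ms / NULLIF(num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL)
WHERE database_id IN (DB_ID('tempdb'), DB_ID('TPCH100'))
ORDER BY db, file_id;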

With data and tempdb on 2 OCZ Vertex 3 SSDs, performance was decent but IO constrained. Performance was 15% better with data compression (page) than without, even though CPU was 23% higher. Performance with 4 OCZ Vertex 3 SSDs (2 regular, 2 Max IOPS) was 20% better with compression and 34% better without compression, relative to performance with 2 SSDs. The performance without compression was now 6% better than with compression. At 6 SSDs (4 Vertex 3, 2 m4), there was another 5% performance improvement relative to 4 SSDs, both compressed and uncompressed.
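For anyone reproducing the compressed vs. raw comparison: page compression is applied per table (and per index), roughly along these lines (the table name is illustrative):

-- Optional: estimate the space savings before committing to a rebuild
EXEC sp_estimate_data_compression_savings
     @schema_name = 'dbo', @object_name = 'LINEITEM',
     @index_id = NULL, @partition_number = NULL,
     @data_compression = 'PAGE';

-- Rebuild the table (and, separately, its indexes) with page compression
ALTER TABLE dbo.LINEITEM REBUILD WITH (DATA_COMPRESSION = PAGE);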

In the above tests, each SSD was kept as a standalone disk, i.e., I did not use RAID. There was 1 data file and 1 tempdb file on each SSD. I noticed that the uncompressed (raw) database tended to generate 64K or 128K IO, while the compressed database tended to generate 256K IO. Two queries, 17 and 19(?), generated 8KB IO and would have much better performance with the data in memory. There was also wide variation from query to query in whether performance was better with or without compression.
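The file layout described above (one data file and one tempdb file per SSD, no RAID) can be set up along these lines; the drive letters, file names and sizes are illustrative only:

-- One tempdb data file per SSD volume
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev2, FILENAME = 'E:\tempdb\tempdb2.ndf', SIZE = 20GB, FILEGROWTH = 1GB);
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev3, FILENAME = 'F:\tempdb\tempdb3.ndf', SIZE = 20GB, FILEGROWTH = 1GB);
-- The user database gets one data file per SSD in the same manner,
-- so that proportional fill stripes scans across all devices.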


Published Saturday, September 17, 2011 12:21 AM by jchang


Comments

 

Black Bart said:

What about using a RAID Controller with the SSDs?

Wondering if after all these years RAID Calcs were slowing down

all those IOs.

Now what is left to put in front of and speed up the SSDs?

(Rhetorical question)

Answer = nothing

this is the end of the road folks

(AKA Scale-up is finished)

"Everything is now in DRAM"

Now What?

Now nothing:

Get back to optimizing your code and writing a better engine....

;)

September 25, 2011 10:45 PM
 

jchang said:

????

the RAID controller is first for RAID, second for aggregating a lot of units. Even though the RAID controller has cache, it was always better to bypass the cache with direct IO, using it only for read-ahead and sometimes write-back.

The new LSI 2nd generation 6Gb/s SAS controllers, the 9265 and 9285, are just fine at supporting SSD IOPS, and RAID if you want it.

I do think that RAID should be incorporated into the SSD, probably as chip-kill, but dual controllers are probably good too.

The main reason I have favored PCI-E SSD is more because the SAS disk enclosures typically have 24 bays on dual-ported x4 SAS, and I think it should be 12-16 bays (8 SSDs + 4-8 HDDs), but this should change once the storage vendors figure it out.

September 26, 2011 9:56 AM
 

Maarten said:

Hi,

I'm investigating the OCZ REVODRIVE 3 X2 MAX IOPS PCI-EXPRESS SSD "RVD3MIX2-FHPX4-240G" for use in our servers.

Let me explain our current setup:

At the moment we have a:

Dell 2900 purchased in 2006

2 CPU’s

4 * 15K SAS disks in a RAID5 config

4Gb memory

1 * 1Gb nic

SQL server 2005

All the databases are on one volume

This has been replaced already with:

SuperMicro H8SGL-F-O Mobo

AMD 6128 CPU

16Gb memory

Raid controller LSI MegaRaid 24i4e

12 Hitachi 3TB 6G 7.2K drives in a RAID 50 config + one global hot spare (expandable to 24 drives).

The RAID stripe and NTFS block size are 64K.

The new machine is not living up to its expectations. YES, as soon as SQL has the data in cache and memory, it's really fast. But on the disk side it's not a winner. (We are playing with smaller block sizes, but the 7.2K spindle speed and average seek of at least 8ms will be (and stay) a bottleneck.)

From the start we had our doubts, but the specs and the ATTO benchmark looked fine. In practice it's too slow.

This brings me to your site:

My 'old' Dell 2900 purchased in 2006 is keeping latency <4ms under real-world scenarios. And I'm surprised each day that this solution keeps delivering such good results. But it's 5 years old...

Questions:

1. Did you already get an "RVD3MIX2-FHPX4" (120/240/480/960 variant) and play with it?

2. How do you and/or your users experience the speed of consumer-grade / SMB-level SSDs?

3. Have you stressed the card with SQLIOSimX86.exe? What are your results in queue depth and latency? When I stress my 'old' system I get high latency, but this is not a realistic scenario in my situation.

I want to use the card for 6 months to a year or so for the DATA and LOGS, until the enterprise solutions are more affordable.

So I'm aware of the risks of using this in an enterprise environment. On the other hand, if I buy 2 to 4 cards and make them highly available in our storage software, then it's a calculated risk, and data does not get lost in case of an SSD failure.

The tempdb will be moved to a RAM drive, from which I get good results (the reason for this is the high volume of tempdb writes), and I do not want to put this on the SSD.

Hope to hear from you. Maarten

November 24, 2011 8:52 AM
 

jchang said:

1. As I said, the Revo 3 X2 (and probably the Revo 3) had a BIOS incompatibility with the Dell T110 II with the Sandy Bridge processor and brand new UEFI. A later hotfix from Dell resolved this, but my R3 X2 seemed to be running at 2.5Gbps PCI-E instead of 5Gbps. So I was seriously disappointed.

2. One thing that was noticeable was that consumer SSD performance fell off quickly with use. I am thinking this is because there is too little reserve/set-aside capacity: 13% for OCZ and as little as 7% for Crucial. The enterprise SSDs typically set aside 25% or more(?). This is not important in testing, because the performance can be recovered with a reformat and reload. But in production? Maybe OK for DW, depending on how bitchy your users are.

3. I don't bother with SQLIOSim; if you want meaningful measurements, you will learn how to generate specific SQL Server execution plans that test the range of IO operations. At high queue depth, which SQL Server will generate, a single SSD or a few SSDs will get swamped. That is why my other blog recommends 4-8 of the Vertex 3 Max IOPS SSDs per quad-core socket.
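For example (a sketch only; the index name is hypothetical), a forced scan drives large sequential reads while a forced nonclustered index with key lookups drives small random 8KB reads:

-- Large-block sequential IO: force a full scan of the base table
SELECT COUNT_BIG(*), SUM(L_EXTENDEDPRICE)
FROM dbo.LINEITEM WITH (INDEX(0));

-- Small random IO: force a nonclustered index, generating 8KB key lookups
SELECT SUM(L_EXTENDEDPRICE)
FROM dbo.LINEITEM WITH (INDEX(IX_LINEITEM_PARTKEY))
WHERE L_PARTKEY < 100000;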

I don't understand why you think tempdb needs to be in RAM; I find that the big array of SSDs works. I suggest paying close attention to configuration: it is never a single SSD but many SSDs, so fill the slots!

November 26, 2011 7:06 AM
 

Maarten said:

Hi, Thanks for your comments.

December 24, 2011 10:45 AM
 

Michael Brown said:

Well, I have tried a few OCZ setups now as well with my Dell R805 servers. I initially went with (2) RevoDrive 3 X2 (Max IOPS not available yet), but my read and write were maxed around 800MB/sec because of the PCIe 1.x slots. I put one of these cards in my Dell T7500 workstation with PCIe 2.x slots and I immediately got 1,100MB/sec write and 1,500MB/sec read.

So I decided to go another route on my DB server. My web server is just fine with the RevoDrive 3 X2 for the amount of IIS web requests I have. So for my DB server (SQL Server 2012) I purchased (4) OCZ Vertex 3 Max IOPS (120GB) and (2) IBM M1015 RAID controllers flashed to LSI 9410 (very common). I picked all that up for $1,100, which is a great price. I mirrored (2) drives per controller, then used software RAID 0 (striped) once I got into Windows Server 2008. This gave me 1,000MB/sec write and over 1,500MB/sec read! Remember this is on PCIe 1.x slots; imagine if that was PCIe 2.x. My 4KB sequential is also much higher due to dual controllers and the mirrored setup. MOST IMPORTANTLY, I now have gained redundancy, which the RevoDrive does not give, at all.

I used that setup for my DB database files, as you can't software RAID 0 the OS. You can software RAID 1, just not RAID 0. I did try a RAID 0+1 by striping the (2) drives on the controller, then software RAID 1. My results were around 800MB/sec write AND read. So my decision was to purchase (2) more SSD drives and one more RAID controller, mirror them, and use that to run my OS. This way I have redundancy and performance for my DB files.

Now for testing I also decided to try a RAID 10 setup with the (4) drives on ONE controller card. I only got 700MB/sec read and a bit slower write, so nowhere near as good. This may be because of the class of controller card; a much higher performance card will do better, as this card is not known for good RAID 10 performance. Regardless, using multiple controllers seems to be a no-brainer.

Hope this helps someone out there who is considering RevoDrives or multiple Raid controllers and SSD.

February 6, 2012 12:46 PM
 

tim said:

Any suggestions for installing the Revo 3 on Windows Server 2003 64-bit?

March 6, 2012 7:19 PM
 

Yi said:

Hi, All,

I just got a RevoDrive 3 X2 PCIe card to fit into a Dell 1950. It only came with a Win 7 driver? Can someone tell me where to find the driver for Windows 2003 32-bit? Seems everyone here has gotten it to run in servers with no problem. Thanks much!

Yi

yizhang22@hotmail.com

October 4, 2012 11:38 PM
 

Bob said:

I appreciate your thorough review.  I would like to use a PCIe RevoDrive with my Dell T300 server running 2008 SBS Server.  It will not be the boot drive.  Will this work?

July 12, 2013 12:34 AM


About jchang

Reverse engineering the SQL Server Cost Based Optimizer (Query Optimizer), NUMA System Architecture, performance tools developer - SQL ExecStats, mucking with the data distribution statistics histogram - decoding STATS_STREAM, Parallel Execution plans, microprocessors, SSD, HDD, SAN, storage performance, performance modeling and prediction, database architecture, SQL Server engine
