Merrill Aldrich

Flash-y Re-index: Why Defrag on SSDs?

Some time ago, I blogged about how to really comprehensively re-index a data warehouse:
http://sqlblog.com/blogs/merrill_aldrich/archive/2009/08/09/hexagonal-close-packing-for-your-fact-data.aspx.
(Wow, 2009. Time flies.)

Today I am reviving that old post, because I had a chance this week to put that technique into practice against a +/- 1 TB warehouse running on SSDs. Noteworthy: the technique reclaimed 20 percent of the disk space on the SSD disk set. This is significant on a number of levels - but mostly it was just a whole lot more than I ever expected. The database was already being maintained with a conventional re-indexing scheme, so I did not imagine this process would have such a dramatic effect.
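The heart of the page-density part of that technique is just a full rebuild at maximum fill factor. A minimal sketch follows, with a hypothetical fact table name; the linked post covers the more complete, partition-aware process.

    -- Minimal sketch only: rebuild every index on a (hypothetical) fact table
    -- at maximum page density. The original post describes a more elaborate,
    -- partition-by-partition process for a large warehouse.
    ALTER INDEX ALL ON dbo.FactSales
        REBUILD WITH (FILLFACTOR = 100, SORT_IN_TEMPDB = ON);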

The orange line in this graph from our SCOM dashboard represents percent free space on the data drive array, before (left) and after (right):

[Figure: BISpaceRecaptured - percent free space on the data drive array, before and after the rebuild]

Where did all that space come from? Two places. First, page density: a fully defragmented index structure can be built with the maximum possible density of rows per page, especially in a warehouse scenario where much of the data is read-only, which means very little empty space on each data page. Second, file headroom: the free space in each file - in a partitioned warehouse there can be tens or hundreds of data files - that had been provided as room to re-index the data partitions within the file is no longer needed once the older data stops receiving inserts or updates.
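If you want to see where your own space is hiding, both sources can be measured directly. A sketch is below; the table name is hypothetical, and SAMPLED mode is used because LIMITED does not populate the page-density column.

    -- 1) Page density: how full each index's pages actually are.
    SELECT OBJECT_NAME(ips.object_id)          AS table_name,
           ips.index_id,
           ips.avg_page_space_used_in_percent,
           ips.avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.FactSales'),
                                        NULL, NULL, 'SAMPLED') AS ips;

    -- 2) File headroom: allocated file size vs. space actually used, in MB.
    SELECT name,
           size / 128                                      AS file_size_mb,
           FILEPROPERTY(name, 'SpaceUsed') / 128           AS space_used_mb,
           (size - FILEPROPERTY(name, 'SpaceUsed')) / 128  AS free_in_file_mb
    FROM sys.database_files;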

With all the buzz about how SSDs eliminate most of the performance penalty caused by disk fragmentation, it’s tempting to dial back the defrag strategy, but there are a couple of reasons you might not want to do that:

Sparse Data Pages Still Waste RAM

In the buffer cache, the unit of memory is a whole data page. That means, if you are doing decision support, it’s helpful to pack as many rows onto each data page as possible, because the net effect is that more rows fit in memory on your server. (By “sparse” here I mean partly empty, not the sparse columns feature of SQL Server.)
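A quick way to see how much of the buffer pool is currently holding empty page space, along the lines of the post linked in the edit below, is to total free_space_in_bytes from the buffer descriptors DMV. This is only a sketch; adjust the filtering to taste.

    -- How much buffer pool memory is "empty space" on cached data pages, per database.
    -- free_space_in_bytes is the unused portion of each 8 KB page in cache.
    SELECT DB_NAME(database_id)                                  AS database_name,
           COUNT_BIG(*) * 8 / 1024                               AS cached_mb,
           SUM(CAST(free_space_in_bytes AS bigint)) / 1024 / 1024 AS empty_space_mb
    FROM sys.dm_os_buffer_descriptors
    WHERE database_id <> 32767   -- exclude the resource database
    GROUP BY database_id
    ORDER BY empty_space_mb DESC;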

SSDs are Still Expensive

In the past we had to buy spindles by the boatload just to get to the random IOPS numbers we needed, and the disks were often mostly empty. The main cost benefit of SSDs, apart from raw datacenter space and power, is that we can replace many, many spindles with a few SSD devices, and fill them up. An interesting side effect: while I didn’t care much about ever filling the disks on the old system, I now have to start worrying that the SSDs could actually fill to capacity, and it would be costly to expand the total volume of available space by adding more SSDs.
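Most shops will watch that free space through monitoring tools such as SCOM, as in the graph above, but one simple way to spot-check it from T-SQL is the old (undocumented, but long-standing) extended stored procedure below.

    -- Quick check of free space (MB) per logical drive on the server.
    EXEC master.dbo.xp_fixeddrives;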

SSDs can Slow as they Fill Up

Another reason, maybe more esoteric, is that the write performance of many SSD technologies can deteriorate as the disk becomes full of data. That implies a bit of tension: you want to pre-allocate enough file space to be comfortable (rather than relying on auto-grow, for example), but not over-allocate so much that the performance of the disks suffers. In a warehousing scenario with partitioned tables, this can be handled by automating the process of adding new files incrementally over time - weekly, monthly, or whatever your partition size is - rather than creating all the files and filling the disks up at the outset, as might be tempting on an array of spinning disks.
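As a rough illustration of that incremental approach (not the exact process from this environment), one might add a filegroup and file for each new period, then extend the partition scheme and function to use it. All names, sizes, and paths below are hypothetical.

    -- Add storage for a new monthly partition instead of pre-allocating everything up front.
    ALTER DATABASE BigWarehouse
        ADD FILEGROUP FG_2011_06;

    ALTER DATABASE BigWarehouse
        ADD FILE
        (
            NAME = N'BigWarehouse_2011_06',
            FILENAME = N'E:\Data\BigWarehouse_2011_06.ndf',
            SIZE = 20GB,
            FILEGROWTH = 1GB
        )
        TO FILEGROUP FG_2011_06;

    -- Point the partition scheme at the new filegroup and split in the new boundary.
    ALTER PARTITION SCHEME PS_FactDate NEXT USED FG_2011_06;
    ALTER PARTITION FUNCTION PF_FactDate() SPLIT RANGE ('20110601');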

For all these reasons, it might be a good idea to continue to give all your indexes that same defragging TLC we have come to know and love.

It’s always nice to finish out a Friday on a high note, especially before a holiday weekend. So if you are reading this in the US on Friday, enjoy the long weekend! Otherwise, happy re-indexing!

 

Edit 6/3/2011: Paul Randal has an excellent post on wasted (or potentially wasted) memory from low-density data pages:
http://www.sqlskills.com/BLOGS/PAUL/post/Performance-issues-from-wasted-buffer-pool-memory.aspx

Published Friday, May 27, 2011 5:56 PM by merrillaldrich


Comments

 

Greg Linwood said:

Re your last point - SSDs are not designed to be run at full capacity under heavy I/O, so it shouldn't be surprising that write performance degrades if you write heavily to a fully formatted SSD - this is basically misusing the device.

This phenomenon stems from the fact that data cannot be overwritten in place in flash memory. Blocks are moved during every write, leaving old data in place awaiting garbage collection before it can be reused by future writes.

If you fill the device and continue to perform heavy I/O, there is no room for the block relocation required by flash memory, leaving writes to wait on background garbage collection to free up previously dirtied blocks. Naturally performance hits the "write cliff" if this occurs, but it's only due to misconfiguration, not failure of the device, which is only doing what it is designed to do.

Hence, "over-provisioning" of SSDs is necessary to maintain performance under heavy load. This basically means you should only format the drive to a lower capacity (eg 75% or even lower in extreme cases) than the full capacity of the device. This allows the device to have sufficient space for background garbage collection to support high level of writes.

Re the cost of SSDs, TPC.org-audited benchmarks show that SSDs are vastly cheaper than HDDs, because HDDs require massive levels of striping to achieve high performance, whereas SSDs only require striping to consolidate mounted storage. Note that the highest HDD TPC-C result in 2008 required 11,000 HDDs and cost US$20M, yet SSDs have achieved much higher performance for a fraction of the cost.

May 29, 2011 6:25 AM
 

merrillaldrich said:

Greg - thank you for reading, and for the comments. Can you help me clarify one thing: in terms of provisioning, does the write performance penalty apply to allocated but unused portions of the NTFS file system (meaning areas presented to Windows and formatted by NTFS, but not part of files)? That would affect the decision, I guess, about whether to even disk-format the drive to capacity vs. leave truly unformatted space, which is different from leaving empty space inside the disk volume.

May 29, 2011 12:27 PM
 

merrillaldrich said:

Also, I totally agree about the cost argument - SSDs have become very good in cost/performance. Still, they are expensive from a cost/GB point of view, so it makes sense to me to squeeze the most net benefit out of each one.

May 29, 2011 12:36 PM
 

Greg Linwood said:

It's both, really. Wear leveling (the phenomenon resulting from "write amplification" or "relocated writes") occurs throughout the device, but "over-provisioning" is the storage not allocated to the disk. All SSDs have at least 7% over-provisioning built in due to the nature of how NAND flash capacity calculations work, but you can increase this if you expect extremely high levels of I/O, such as occurs with SQL Server. Fusion-io presents these options as "performance" options during the initial "low-level formatting" setup process, though they really do nothing to improve performance per se; they just leave sufficient space to keep performance from degrading under load.

Re the cost/GB point of view - single-device comparisons aren't very meaningful with DB systems. What really matters is the combined capacity/IOPS, as there's no point sticking a multi-TB SATA drive into a DB server just because it's cheap.

May 30, 2011 7:07 AM
