THE SQL Server Blog Spot on the Web


Linchi Shea

Checking out SQL Server via empirical data points

Beware of Shifting SAN

Let’s say you are trying to determine the performance impact of a neat database design change you have just devised on an application. So you run some tests with the existing design and the tests run for several hours. Coming back the next day, you make the change and re-run the same tests. The test results look fantastic. Now, before you jump up and down announcing to the world how great your new design trick is, double check whether your change is the only variable responsible for the performance improvement.

 

If you are using a Storage Area Network (SAN) and storage is a significant factor in your tests, there is a danger that your conclusion is built on shifting sand, because the performance of the drives provisioned from the SAN may have changed between your tests. Unbeknown to you, the test results may primarily reflect not your design change but some uncontrolled change inside the SAN.

 

I ran into a similar situation when I was checking the performance impact of disk partition alignment (or misalignment). Fortunately, the SAN change didn’t happen between tests of different configurations; it took place while I was repeating the same tests, so I caught it right away. The following chart shows that when I ran exactly the same SQLIO benchmark tests for the 3rd and 4th time, the I/O performance profile of the drive changed dramatically.

Your SAN performance may shift underneath you for many reasons, some good and some not so good. You can try to cozy up to your SAN folks, but that would not guarantee you inside knowledge of all the changes. Nor would you want to know all the nitty-gritty details of what happens inside the SAN.

 

The best way to keep this situation from misleading you into an embarrassingly wrong conclusion is to randomize your test schedule. You may want to alternate between the different test configurations as you carry out your tests.
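As a minimal sketch of what alternating between configurations could look like, the Python snippet below builds a run order in blocks, where each block contains every configuration once in random order. The function name `interleaved_schedule`, the placeholder names `config 1` and `config 2`, and the number of runs are all assumptions for illustration; actually launching a benchmark for each slot is left out.

```python
import random


def interleaved_schedule(configs, runs_per_config, seed=None):
    """Return a run order built from blocks.

    Each block contains every configuration exactly once, shuffled,
    so no configuration is confined to a single stretch of time.
    """
    rng = random.Random(seed)  # a fixed seed makes the schedule reproducible
    schedule = []
    for _ in range(runs_per_config):
        block = list(configs)
        rng.shuffle(block)
        schedule.extend(block)
    return schedule


if __name__ == "__main__":
    # Hypothetical configuration names; replace with your own.
    order = interleaved_schedule(["config 1", "config 2"], runs_per_config=4, seed=1)
    for slot, config in enumerate(order, start=1):
        print(slot, config)
```

Shuffling within blocks, rather than shuffling the whole list, keeps the runs of each configuration spread across the entire test window, so a shift in the SAN part way through affects both configurations rather than only one.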

 

Say you are testing the performance difference between config 1 and config 2. Instead of doing all the tests on config 1 on day one and all the tests on config 2 on day two, schedule your tests so that config 1 tests are interleaved with config 2 tests throughout the two days. If that’s not possible, conduct all your config 1 tests followed by all your config 2 tests as usual. But before you draw any conclusion, repeat some of your config 1 tests to verify that the results you obtained previously are still repeatable.
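The repeatability check can be as simple as comparing the earlier config 1 results with the repeated ones. The sketch below is one hedged way to do that in Python; the function name `baseline_shifted`, the 10% tolerance, and the sample numbers are assumptions for illustration, not measurements from the tests described here.

```python
from statistics import mean


def baseline_shifted(earlier, repeated, tolerance=0.10):
    """Return True if the repeated runs differ from the earlier runs
    by more than `tolerance`, measured as a relative change in the mean.

    `earlier` and `repeated` are sequences of the same metric
    (for example MB/s or IOs/s) from the same configuration.
    """
    base = mean(earlier)
    if base == 0:
        raise ValueError("earlier results have a zero mean; cannot compare")
    return abs(mean(repeated) - base) / abs(base) > tolerance


if __name__ == "__main__":
    # Made-up throughput numbers for the same configuration.
    day_one = [100, 102, 98]
    repeat = [150, 148]
    if baseline_shifted(day_one, repeat):
        print("Baseline moved: do not compare configurations yet.")
    else:
        print("Baseline is stable within tolerance.")
```

If the check reports a shift, the difference between configurations cannot be attributed to your change alone, and the affected tests are worth re-running. The right tolerance depends on how noisy your own runs normally are.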

 

Now, don’t get me wrong. I think SAN has been a boon to our industry. It pushes storage down the infrastructure stack and abstracts us away from the nasty details of its management. The adoption of SAN moves us closer to the grand vision of utility computing (okay, not that we are really close to that vision).

 

Today, many enterprises have deployed a SAN of one kind or another, and some large enterprises rely exclusively on SAN for storage provisioning. You just have to learn to live with its idiosyncrasies.

Published Wednesday, January 03, 2007 10:18 PM by Linchi Shea

Attachment(s): SANIO.gif


Comments

 

Declan Mcgrory said:

Linchi,

We constantly run into issues where the SAN gives different performance characteristics, and we have traced this to other servers accessing the SAN. Are there any tools that can help me diagnose this, rather than my users calling me up and saying things are slow? I suppose what we are looking for is a SAN-aware performance monitor.

Thanks

Declan

January 27, 2007 10:21 AM
 

Linchi Shea said:

SAN vendors have tools that collect perf stats on the SAN side. And these perf stats can highlight whether the I/O path is being shared during a test. Unfortunately, in most cases these perf stats are not available to the database folks.

I/O traffic from other hosts that share components on the I/O path (e.g. a frontend port on an EMC DMX) can certainly mess up your tests. And I have had no shortage of frustration in this department. The most effective approach is to become friends with your SAN folks, and work with them to find a 'quiet time' to conduct your tests.

Linchi

February 20, 2007 9:43 AM
