
Linchi Shea

Checking out SQL Server via empirical data points

There are no sequential or random disk I/Os

Okay, that’s a bit of hyperbole. But let me elaborate.

I regularly bump into SQL Server discussions, online or elsewhere, in which I hear people state with confidence that an operation behaves a certain way because it is doing sequential disk I/Os, or because it is doing random disk I/Os. That always makes me wonder how they know, and how they can be so confident about it. In most cases, I have absolutely no idea whether the disk I/Os are sequential or random.

It makes sense to talk about I/Os issued by SQL Server being sequential or random at the SQL Server level. For instance, if a SQL Server operation (e.g. a table scan or a backup) accesses the database pages in file page-number order, it’s considered to be issuing sequential I/Os, and SQL Server certainly has control over that. Similarly, if you write a program (outside of SQL Server, in the OS) to access a file, you can control how your I/O requests are issued with respect to how the data is laid out in that file. Whether the I/Os are sequential or random is relative to that file. No question there!

However, in today’s storage environments, there is usually a very long path and many layers of API calls between the SQL Server operations we are interested in and the I/Os on the physical drives (i.e. spindles and platters). It’s rare that you can control how your I/O requests at the application level are serviced at the physical disk level. In a lot of cases, you don’t have any visibility into how your I/O requests are eventually handled on the disk platters.

In addition, unlike other I/O access characteristics, whether I/Os are sequential or random, or to what extent they are sequential or random, is almost never exposed as a measurable metric. If you can’t measure or monitor whether I/Os are sequential or random, you can’t be sure whether they are sequential or random. I don’t know where people get their confidence when talking about sequential disk I/Os or random disk I/Os.

All is not lost, though. For all intents and purposes of discussing sequential vs. random disk I/Os, we can turn our attention to large vs. small disk I/Os.

First of all, disk I/O sizes are measurable metrics and they are readily exposed by many tools.
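
For example, the Avg. Disk Bytes/Read and Avg. Disk Bytes/Write counters in Windows Performance Monitor report I/O sizes at the logical/physical disk level, and inside SQL Server you can get a rough picture from sys.dm_io_virtual_file_stats. Here is a minimal sketch (mine, not part of the original argument) that computes average read and write sizes per database file from that DMV’s cumulative counters:

-- Rough sketch: average read and write sizes per database file, computed from
-- the cumulative counters in sys.dm_io_virtual_file_stats. These are averages
-- since the last SQL Server restart, not per-statement figures.
SELECT  DB_NAME(vfs.database_id)                                   AS database_name,
        mf.physical_name,
        vfs.num_of_reads,
        vfs.num_of_bytes_read    / NULLIF(vfs.num_of_reads, 0)     AS avg_bytes_per_read,
        vfs.num_of_writes,
        vfs.num_of_bytes_written / NULLIF(vfs.num_of_writes, 0)    AS avg_bytes_per_write
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN    sys.master_files AS mf
        ON  mf.database_id = vfs.database_id
        AND mf.file_id     = vfs.file_id
ORDER BY database_name, mf.physical_name;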

In addition, when people speak of a database system such as SQL Server being optimized for sequential I/Os, they really mean that the system is optimized for large I/Os. Even if they didn't mean it that way, you are free to interpret it as such. From an I/O perspective, you can view key database techniques such as database checkpoints, read-ahead reads, and transaction logging as methods to optimize for large disk I/Os (among their other purposes).
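
If you want to see one of these techniques at work, here is a rough sketch (my illustration; dbo.SomeLargeTable is a placeholder table, and clearing the buffer pool should only be done on a test server) that scans a table from a cold cache and lets STATISTICS IO show how many pages were brought in by read-ahead:

-- Illustration only: dbo.SomeLargeTable is a placeholder, and
-- DBCC DROPCLEANBUFFERS requires elevated permissions; don't run it on production.
DBCC DROPCLEANBUFFERS;                      -- start the scan from a cold buffer pool
SET STATISTICS IO ON;
SELECT COUNT(*) FROM dbo.SomeLargeTable;    -- forces a scan of the table
SET STATISTICS IO OFF;
-- The messages output includes a "read-ahead reads" count: pages brought in by
-- the read-ahead mechanism, which issues larger, multi-page read requests.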

On a disk platter, fewer sequential disk I/Os are indeed much more efficient than many random disk I/Os for accessing the same amount of data. The bet is that when you make a single large I/O request at the application level (e.g. SQL Server or your own code), it has a high probability of being translated into sequential disk I/Os at the lowest level (i.e. disk I/Os on the spindles and platters). Or at least, the chance of getting sequential disk I/Os is much higher for a large application-level I/O request than it is for multiple small I/O requests at the same application level. And that is a good bet!

Personally, I pay more attention to the I/O size than to whether disk I/Os are sequential or random.

Published Friday, April 01, 2011 1:42 PM by Linchi Shea


Comments


DaniSQL said:

It's hard to take the message of a blog seriously when it's written on April Fools' Day :-)

April 3, 2011 8:47 AM

GrumpyOldDBA said:

It was after midday

April 5, 2011 4:22 AM

AlanW said:

I have indeed observed some applications transition to a bigger request size with a lower sequential ratio, instead of the smaller request size with a high sequential ratio they started with.

People care about sequential vs. random because they hope to get better performance, but there is a tradeoff between sequentiality and request size.

April 7, 2011 2:17 AM

KJON said:

I had an MS Support tech educate me on sequential and random I/Os. The recommendations that resulted were:

pagefile.sys should be on its own drive

The SQL install and the OS should have their own drives

TempDB needs its own drive

LOG files (specifically sequential) should have their own drive

Using a network SAN, we had long ago decided that splitting up the drives had little impact on performance... We were partially right. To take advantage of sequential vs. random writes, the SAN has to be configured for the specific access pattern. Random and sequential access methods typically require separate disk pools. Our network techs had never thought of that, and only configured one pool optimized for the most common (random) access.

As a result, we hit performance degradation issues due to I/O pressure far below the limits we were told to expect (at about 50% of the expected throughput, according to our vendor's stats).

The only drive that they wanted to have SPECIFICALLY configured for sequential write was the log drive.

May 26, 2011 4:27 PM
