THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Linchi Shea

Checking out SQL Server via empirical data points

Parse the sqlio.exe Output

Let's say you run sqlio.exe as follows:

D:\sqlIO -kW -t32 -s30 -dE -o1 -fsequential -b8 -BH -LS Testfile.dat

The output typically may look like this:

sqlio v1.5.SG
using system counter for latency timings, 1999980000 counts per second
32 threads writing for 30 secs to file E:Testfile.dat
	using 8KB sequential IOs
	enabling multiple I/Os per thread with 1 outstanding
	buffering set to use hardware disk cache (but not file cache)
using current size: 20000 MB for file: E:Testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec:  7851.80
MBs/sec:    61.34
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 3
Max_Latency(ms): 18
histogram:
ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%:  6  7 12 16 30 19  6  2  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

The data you most likely want to get from the above output include the throughput metrics and the latency metrics (highlighted in boldface). Now, getting the throughput and latency metrics from the above single output is easy. But very rarely you'd just run sqlio.exe once to get a single data point, which in itself would not be very useful.

Often, you'd run sqlio.exe in a series with a range of parameter values such as different outstanding I/Os so that you can profile an I/O path. You can write a script to drive sqlio.exe with different parameters. Or, you can accomplish the I/O profiling with a simple batch file such as the following with the outstanding I/Os ranging from 1 through 128:

sqlIO -kW -t32 -s30 -dE -o1 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o2 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o4 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o8 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o16 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o32 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o64 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o128 -fsequential -b8 -BH -LS Testfile.dat
sleep 30

In practice, your batch file may be much longer than this when you want to test, for instance, different I/O paths and different block sizes. In any event, you'll end up with lengthy output and/or have to deal with the output of a batch file multiple times or even many times when you repeat your test to check for data consistency.

Manually extracting throughput metrics and latency metrics is simply out of the question.

Fortunately, parsing the sqlio.exe output to extract the metrics data is rather simple with any language that supports regular expressions and facilitates text manipulation. Below is a Perl script I use to parse the sqlio.exe output and dump the metrics data and the sqlio.exe command-line parameter values into a csv formatted text file for import into Excel.

use strict;

my $logFile = shift or 
    die "***Err: $0 expects a log file on the commend line.\n";

my ($drive, $thread, $rw, $duration, $iotype, $depth, $size, $ref);
open(LOG, $logFile) or 
     die "***Err: caould not open $logFile for reads.\n";
while (<LOG>) {
    if (/sqlio(?:\.exe)?\s+  
                \-k(\w)\s+
                \-t(\d+)\s+
                \-s(\d+)\s+
                \-d(\w+)\s+
                \-o(\d+)\s+
                \-f(\w+)\s+
                \-b(\d+)/ix) {
        ($rw, $thread, $duration, $drive, $depth, $iotype, $size) 
                        = ($1, $2, $3, $4, $5, $6, $7);
        $thread =~ /^\d+$/ or 
            die "***Err: Thread parameter $thread must be a number.\n";
        if ($iotype =~ /(random|sequential)/i) {
            $iotype = ucfirst($iotype);
        }
        else {
            die "***Err: IOType parameter must be random or sequential.\n";
        }
        if ($rw =~/^R/i) {
            $rw = 'Reads';
        }
        elsif ($rw =~ /^W/i) {
            $rw = 'Writes';
        }
        else {
            die "***Err: Read/Write parameter $rw must be R or W.\n";
        }
        
        $size =~ /^\d+$/ or
            die "***Err: Block size parameter $size must be a number.\n";
        $depth =~ /^\d+$/ or
            die "***Err: Queue depth parameter $depth must be a number.\n";
    }                        
    if (/IOs\/sec:\s+([\.\d]+)/i) {
           $ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{IOps} 
                       = int($1 + 0.5);
           next;
    }
    if (/MBs\/sec:\s+([\.\d]+)/i) {
           $ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{MBps} 
                       = int($1 + 0.5);
           next;
    }
    if (/Avg\_Latency\(ms\):\s+([\.\d]+)/i) {
           $ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{Avg_Latency} 
                       = int($1 + 0.5);
           next;
    }
    if (/Min\_Latency\(ms\):\s+([\.\d]+)/i) {
           $ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{Min_Latency} 
                       = int($1 + 0.5);
           next;
    }
    if (/Max\_Latency\(ms\):\s+([\.\d]+)/i) {
           $ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{Max_Latency} 
                       = int($1 + 0.5);
           next;
    }
}
close(LOG);

foreach my $drive (sort keys %{$ref}) {
   foreach my $rw (sort keys %{$ref->{$drive}}) {
      foreach my $iotype (sort keys %{$ref->{$drive}->{$rw}}) {            
foreach my $thread (sort {$a <=> $b} keys %{$ref->{$drive}->{$rw}->{$iotype}}) { print "\n$drive ${size}K $iotype $rw\n"; print "Type,Threads,Depth,Block_Size,IOps,MBps,Avg_Latency(ms),Min_Latency(ms),Max_Latency(ms)\n"; foreach my $depth (sort {$a <=> $b}
keys %{$ref->{$drive}->{$rw}->{$iotype}->{$thread}}) { foreach my $size (sort {$a <=> $b}
keys %{$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}}) { print "$iotype,$thread,$depth,${size}K" . "," .
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{IOps} . "," .
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{MBps} . "," .
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{Avg_Latency} . "," .
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{Min_Latency} . "," .
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{Max_Latency} .
"\n"; } } } } } }

This is really just a throw-away script. If you need to arrange the columns or the rows differently, you should modify the script to suit your own need. Or if you run sqlio.exe with the parameters in different order, you'll have to change the regular expression that parses the sqlio.exe command line. I can't predict your requirements (heck, I can't even forecast my own requirements) to put the script in a canned program. But I have no reason at all to worry about the constantly changing requirements or the circumstances I have not programmed the script to handle, because all I have to do is to change the script as I see the need. No heavy initial investment at all! That's the beauty of using a simple throw-away script like this with the source code in plain view.

The script has saved me a lot of time working with sqlio.exe. Hopefully, you'll find it useful as well if you happen to use sqlio.exe.

You can download the script from the attachment of this blog.

Published Wednesday, February 21, 2007 11:12 PM by Linchi Shea
Filed under: , , ,

Attachment(s): grepSQLIO.zip

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Yoni Nakache said:

hi all

inspired by your script ...

wrote a C# win form that execute the sqlio.exe and uses RedirectStandardOutput

to parse the output and it build a datagrid view with the results and setting, you can later to save the results to Excel or CSV file.

download from : http://www.usaupload.net/d/e8ooc6ve0hk

comment,tips,bugs

send to nakache AT gmail dot com

September 9, 2007 2:16 AM
 

Yoni Nakache said:

Hi again,

Updated Version of SQLIO GUI (several Bug fix and some new features)

http://www.usaupload.net/d/4qmkm6bg5k4

Special thanks to :

Kevin G. Boles

TheSQLGuru

Indicium Resources, Inc.

and

Eric Olson

for sending their tips/bugs encountered

good week to all

November 27, 2007 3:57 AM
 

Milena said:

Linchi:

when I use your perl script, only the last test gets populated into the file, here is how my output file is looking:

K  

Type,Threads,Depth,Block_Size,IOps,MBps,Avg_Latency(ms),Min_Latency(ms),Max_Latency(ms)

,,,,359,3,354,1,402

I am not sure why it is not printing the variables or all the other results. I am not familiar with perl either, so I was wondering if you had any thoughts.

Thank you!!

Milena

January 14, 2008 12:27 PM
 

Lucas said:

I had the exact same experience as Milena.

July 1, 2008 2:36 PM
 

Gene said:

Shea

You probably had to mention about sqlio test that the numbers it provides can be used ONLY for comparison purposes between different drives or configurations. Numbers by itself make no sence most of the time.

For instance, when i run sequential read on my local drive, i get 4037 IOPS and write sequential: 8258 IOPS.

No drive in a world (at least to my knowledge) provides even 1/10 of that throughput. we probably have clues how to explain it. But fact is the fact about it.

Now, some 'smart' manufacturers of SAN devices, like Equalogic,Inc use these numbers to over report their devices performance. They use absolute numbers reported for their devices:

http://www.equallogic.com/uploadedfiles/Resources/Tech_Reports/tr-iSCSI_SAN_deployment_and_test.pdf

This is pure missleading. Again, what is important to understand is the limitations of this utility. by the way, Equalogic does not use sqlio test which would report 10 - 20 times smaller numbers. And I beleive on purpose.

October 30, 2008 11:32 AM
 

Thijs Vijn said:

It doen't work on windows 2008 64bit.

But thanks anyway.

January 20, 2010 10:46 AM
 

Boe said:

Same problem as Milena and Lucas!

The script is only writing the last entry in the log file to the csv!

Any help out there?

March 17, 2010 6:17 AM
 

BlackHawk said:

One thing I would like to understand is how to use SQLIO to emulate the ASYNC nature of SQL. I assume outstanding depth would be the option. The question is how deep would that be? 1, 2, 64?

What if there are multiple threads? Does this change the -o setting required?

And along these lines... should we use the -m parameter to kick off scatter/gather routines?

March 27, 2010 11:18 AM
 

The Rambling DBA: Jonathan Kehayias said:

Today Joe Webb ( Blog | Twitter ) blogged about The Power of Regex in Powershell, and in his post he

May 25, 2010 9:12 PM
 

Alex Aguilar said:

Why do you sleep for 30 seconds between each test?  I remember having to do with the Storage Array but I cannot remember.

January 15, 2013 9:30 PM
 

M.Harrison said:

Alex, I have read that sleeping between tests allows the write cache to flush so that tests do not influence each other.

Mart

March 19, 2013 9:10 AM

Leave a Comment

(required) 
(required) 
Submit

About Linchi Shea

Checking out SQL Server via empirical data points

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement