Let's say you run sqlio.exe as follows:
D:\sqlIO -kW -t32 -s30 -dE -o1 -fsequential -b8 -BH -LS Testfile.dat
The output typically may look like this:
sqlio v1.5.SG
using system counter for latency timings, 1999980000 counts per second
32 threads writing for 30 secs to file E:Testfile.dat
using 8KB sequential IOs
enabling multiple I/Os per thread with 1 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 20000 MB for file: E:Testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 7851.80
MBs/sec: 61.34
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 3
Max_Latency(ms): 18
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 6 7 12 16 30 19 6 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
The data you most likely want to get from the above output include the throughput metrics and the latency metrics (highlighted in boldface). Now, getting the throughput and latency metrics from the above single output is easy. But very rarely you'd just run sqlio.exe once to get a single data point, which in itself would not be very useful.
Often, you'd run sqlio.exe in a series with a range of parameter values such as different outstanding I/Os so that you can profile an I/O path. You can write a script to drive sqlio.exe with different parameters. Or, you can accomplish the I/O profiling with a simple batch file such as the following with the outstanding I/Os ranging from 1 through 128:
sqlIO -kW -t32 -s30 -dE -o1 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o2 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o4 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o8 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o16 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o32 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o64 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
sqlIO -kW -t32 -s30 -dE -o128 -fsequential -b8 -BH -LS Testfile.dat
sleep 30
In practice, your batch file may be much longer than this when you want to test, for instance, different I/O paths and different block sizes. In any event, you'll end up with lengthy output and/or have to deal with the output of a batch file multiple times or even many times when you repeat your test to check for data consistency.
Manually extracting throughput metrics and latency metrics is simply out of the question.
Fortunately, parsing the sqlio.exe output to extract the metrics data is rather simple with any language that supports regular expressions and facilitates text manipulation. Below is a Perl script I use to parse the sqlio.exe output and dump the metrics data and the sqlio.exe command-line parameter values into a csv formatted text file for import into Excel.
use strict;
my $logFile = shift or
die "***Err: $0 expects a log file on the commend line.\n";
my ($drive, $thread, $rw, $duration, $iotype, $depth, $size, $ref);
open(LOG, $logFile) or
die "***Err: caould not open $logFile for reads.\n";
while (<LOG>) {
if (/sqlio(?:\.exe)?\s+
\-k(\w)\s+
\-t(\d+)\s+
\-s(\d+)\s+
\-d(\w+)\s+
\-o(\d+)\s+
\-f(\w+)\s+
\-b(\d+)/ix) {
($rw, $thread, $duration, $drive, $depth, $iotype, $size)
= ($1, $2, $3, $4, $5, $6, $7);
$thread =~ /^\d+$/ or
die "***Err: Thread parameter $thread must be a number.\n";
if ($iotype =~ /(random|sequential)/i) {
$iotype = ucfirst($iotype);
}
else {
die "***Err: IOType parameter must be random or sequential.\n";
}
if ($rw =~/^R/i) {
$rw = 'Reads';
}
elsif ($rw =~ /^W/i) {
$rw = 'Writes';
}
else {
die "***Err: Read/Write parameter $rw must be R or W.\n";
}
$size =~ /^\d+$/ or
die "***Err: Block size parameter $size must be a number.\n";
$depth =~ /^\d+$/ or
die "***Err: Queue depth parameter $depth must be a number.\n";
}
if (/IOs\/sec:\s+([\.\d]+)/i) {
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{IOps}
= int($1 + 0.5);
next;
}
if (/MBs\/sec:\s+([\.\d]+)/i) {
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{MBps}
= int($1 + 0.5);
next;
}
if (/Avg\_Latency\(ms\):\s+([\.\d]+)/i) {
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{Avg_Latency}
= int($1 + 0.5);
next;
}
if (/Min\_Latency\(ms\):\s+([\.\d]+)/i) {
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{Min_Latency}
= int($1 + 0.5);
next;
}
if (/Max\_Latency\(ms\):\s+([\.\d]+)/i) {
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{Max_Latency}
= int($1 + 0.5);
next;
}
}
close(LOG);
foreach my $drive (sort keys %{$ref}) {
foreach my $rw (sort keys %{$ref->{$drive}}) {
foreach my $iotype (sort keys %{$ref->{$drive}->{$rw}}) {
foreach my $thread (sort {$a <=> $b} keys %{$ref->{$drive}->{$rw}->{$iotype}}) {
print "\n$drive ${size}K $iotype $rw\n";
print "Type,Threads,Depth,Block_Size,IOps,MBps,Avg_Latency(ms),Min_Latency(ms),Max_Latency(ms)\n";
foreach my $depth (sort {$a <=> $b}
keys %{$ref->{$drive}->{$rw}->{$iotype}->{$thread}}) {
foreach my $size (sort {$a <=> $b}
keys %{$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}}) {
print "$iotype,$thread,$depth,${size}K" .
"," .
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{IOps} .
"," .
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{MBps} .
"," .
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{Avg_Latency} .
"," .
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{Min_Latency} .
"," .
$ref->{$drive}->{$rw}->{$iotype}->{$thread}->{$depth}->{$size}->{Max_Latency} .
"\n";
}
}
}
}
}
}
This is really just a throw-away script. If you need to arrange the columns or the rows differently, you should modify the script to suit your own need. Or if you run sqlio.exe with the parameters in different order, you'll have to change the regular expression that parses the sqlio.exe command line. I can't predict your requirements (heck, I can't even forecast my own requirements) to put the script in a canned program. But I have no reason at all to worry about the constantly changing requirements or the circumstances I have not programmed the script to handle, because all I have to do is to change the script as I see the need. No heavy initial investment at all! That's the beauty of using a simple throw-away script like this with the source code in plain view.
The script has saved me a lot of time working with sqlio.exe. Hopefully, you'll find it useful as well if you happen to use sqlio.exe.
You can download the script from the attachment of this blog.