<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Search results matching tag 'Performance'</title><link>http://sqlblog.com/search/SearchResults.aspx?o=DateDescending&amp;tag=Performance&amp;orTags=0</link><description>Search results matching tag 'Performance'</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>SQL Saturday #220 (Atlanta): Demos</title><link>http://sqlblog.com/blogs/adam_machanic/archive/2013/05/18/sql-saturday-220-atlanta-demos.aspx</link><pubDate>Sat, 18 May 2013 18:33:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:49114</guid><dc:creator>Adam Machanic</dc:creator><description>&lt;p&gt;&lt;b&gt;Today at SQL Saturday #220 in Atlanta I presented a new brand new session&lt;/b&gt;, "SQL Server Query Plan Analysis: The 5 Culprits That Cause 95% of Your Performance Headaches." &lt;/p&gt;&lt;p&gt;This session is designed to help people quickly analyze query plans and find likely culprits without being query tuning experts; I find that in a huge number of cases the root cause of problems is one of just a few potential situations.&lt;br&gt;&lt;/p&gt;&lt;p&gt;Thanks to everyone who joined me today for the deliveries! In addition to it being a new session this was also the first time that I've ever been asked to present the same session twice in a single day. So it was quite an experience.&lt;/p&gt;&lt;p&gt;&lt;b&gt;The demo script is attached&lt;/b&gt;. As always, let me know if you have any questions or comments.&lt;/p&gt;&lt;p&gt;Enjoy! &lt;br&gt;&lt;/p&gt;</description></item><item><title>Israeli SQL Server Usergroup: Locking &amp;amp;amp; Blocking in active environment</title><link>http://sqlblog.com/blogs/michael_zilberstein/archive/2013/05/13/49047.aspx</link><pubDate>Mon, 13 May 2013 13:38:47 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:49047</guid><dc:creator>mz1313</dc:creator><description>&lt;p&gt;A week ago I gave my session on the subject. It is only first part of the two – second will be next time when there is an open slot. Demos from my session are available &lt;a href="https://www.dropbox.com/s/okhmfmbp82ms1ut/LockingInRealEnvironment_Part1.rar"&gt;here&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>SQLCLR Performance Session at TechEd US</title><link>http://sqlblog.com/blogs/adam_machanic/archive/2013/05/08/sqlclr-performance-session-at-teched-us.aspx</link><pubDate>Wed, 08 May 2013 16:14:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:48998</guid><dc:creator>Adam Machanic</dc:creator><description>&lt;p&gt;I am super-excited to visit New Orleans next month for Microsoft TechEd; it will be my sixth time speaking at the show.&lt;/p&gt;&lt;p&gt;&lt;a href="http://channel9.msdn.com/Events/TechEd/NorthAmerica/2013/DBI-B404"&gt;My session&lt;/a&gt; takes an in-depth look at some of the techniques I've developed for using SQLCLR modules -- and some of the great performance gains I've been able to achieve.&lt;/p&gt;&lt;p&gt;Hope to see you in NOLA! If you're not attending the show, the video will be available on demand a few days after I give the talk.&lt;br&gt;&lt;/p&gt;</description></item><item><title>Read the New TPC Database Benchmarking Series</title><link>http://sqlblog.com/blogs/kevin_kline/archive/2013/04/22/read-the-new-tpc-database-benchmarking-series.aspx</link><pubDate>Mon, 22 Apr 2013 18:17:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:48816</guid><dc:creator>KKline</dc:creator><description>&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;span style="line-height:19px;"&gt;Let's talk about database application benchmarking.&lt;/span&gt;&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;span style="line-height:19px;"&gt;This is a skill set which, in my opinion, is one of the major differentiators between a journeyman-level DBA and a true master of the trade. In this article published in my monthly column at&amp;nbsp;&lt;a target="_blank" href="http://www.dbta.com/"&gt;&lt;em&gt;Database Trends &amp;amp; Applications magazine&lt;/em&gt;&lt;/a&gt;, I'll give you a brief introduction to TPC benchmarks and, in future articles, I'll be telling you how to extract specific pieces of valuable information from the published benchmark results.&lt;/span&gt;&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;span style="line-height:19px;"&gt;But let's get started with an overview …&amp;nbsp;&lt;/span&gt;&lt;a target="_blank" style="line-height:19px;" href="http://www.dbta.com/Articles/Columns/SQL-Server-Drill-Down/Introduction-to-TPC-Database-Benchmarks-86891.aspx"&gt;read more.&lt;/a&gt;&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;span style="font-family:Arial, Helvetica, sans-serif;font-size:small;"&gt;&lt;/span&gt;&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;Many thanks,&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;span style="font-family:Arial, Helvetica, sans-serif;font-size:small;"&gt;&lt;/span&gt;&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;-Kevin&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;span style="font-family:Arial, Helvetica, sans-serif;font-size:small;"&gt;&lt;/span&gt;&lt;/p&gt;&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;a href="http://twitter.com/kekline"&gt;- Follow me on Twitter!&lt;/a&gt;&lt;br&gt;&lt;a href="https://plus.google.com/u/1/113032055249023350257?rel=author"&gt;- Google Author&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Storage Performance</title><link>http://sqlblog.com/blogs/joe_chang/archive/2013/03/24/storage-performance.aspx</link><pubDate>Mon, 25 Mar 2013 03:52:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:48393</guid><dc:creator>jchang</dc:creator><description>&lt;p&gt;Storage has changed dramatically over the last three years driven by SSD developments.  
Most of the key components necessary for a powerful storage system are available
and the cost is highly favorable for direct placement of data files. 
Some additional infrastructure elements could greatly enhance the flexibility of storage systems with SSDs. There is still some discussion on whether SSD should interface directly to PCI-E 
or continue using the SAS/SATA interfaces originally designed for hard disks. 
New products coming this year include Express Bay, an ecosystem of connectors allowing 
both PCI-E and SAS/SATA to co-exist until a clear direction is established. 
Also expected in the coming year are PCI-E SSDs based on the NVM Express interface.
&lt;/p&gt;

&lt;h4&gt;System Level&lt;/h4&gt;
&lt;p&gt;
The Intel Xeon E5 processors, codename Sandy Bridge-EP, have 40 PCI-E gen 3 lanes&amp;nbsp;on each processor socket. Even though PCI-E gen 3 is 8GT/s, a change in the encoding means that the usable bandwidth is double that of PCI-E gen2 at 5GT/s. 
The net realizable bandwidth of a PCI-E gen 3 x8 slot is 6.4GB/s versus 3.2GB/s for gen 2.
&lt;/p&gt;

&lt;p&gt;
The unfortunate aspect is that the major server vendors all implement a mix of x16 and x4 slots, 
while the HBA vendors seem to be concentrating on products for PCI-E x8. 
Only Supermicro has a system with 10 &lt;strike&gt;8&lt;/strike&gt; PCI-E gen 3 x8 slots. 
Could a vendor put 2 HBA/RAID Controllers designed for x8 onto a single card for a x16 slot? 
Perhaps the new Express Bay form factor will have some means to use x16 slots?
&lt;/p&gt;

&lt;p&gt;&lt;img alt="Sandy Bridge EP 2-socket" src="http://www.qdpma.com/SystemArchitecture_files/SandyBridgeEP2f.png"&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;
Another disappointment is that the 4-socket Xeon E5-46xx systems only connect half of the PCI-E lanes. This might be because the base system configuration is 2-socket populated. 
If a full set of slots are provided, there would no connection to half of the slots unless all four sockets are populated. But this is also an issue on the 2-socket systems if only 1 socket is populated.
&lt;/p&gt;
&lt;h4&gt;Direct-Attach&lt;/h4&gt;
&lt;p&gt;For the most part, I will discuss direct-attach storage configuration, 
as we can pick and choose among the latest components available. 
Technically, direct-attach with SAS can support  a 2-node cluster, 
but few system vendors promote this configuration. 
Dell sells the MD3200 as direct-attach storage supporting 4 hosts in a cluster (or not), 
but technically it is a SAN that just happens to use&amp;nbsp;SAS interfaces on both the front-end and back-end.
&lt;/p&gt;

&lt;p&gt;The objective in the baseline storage configuration below is to achieve 
very high IO bandwidth even in the low capacity configuration. 
Of course it will also have very high IOPS capability because the main elements are SSD. 
My recommended storage system has both SSD and HDD in each IO channel.
&lt;/p&gt;

&lt;p&gt;&lt;img width="600" height="300" alt="PCIe" src="http://www.qdpma.com/Storage_files/Config_2013b.png"&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;
This intent is to place the main databases on SSD and use the HDD for backups and for restore verification. For an Inmon style data warehouse, the HDD might also be used for older data. 
The reason for having both SSD and HDD on each IO channel is to take advantage of simultaneous bi-directional IO. On a database backup, the IO systems reads from SSD, and simultaneously writes to HDD.
&lt;/p&gt;

&lt;p&gt;
4 RAID Controllers, 4GB/s per controller, 16GB/s total IO bandwidth&lt;br&gt;
4 Disk enclosures (yes, I am showing 8 enclosures in the diagram above)&lt;br&gt;
4 x 16 = 64 SSD&lt;br&gt;
4 x 8 = 32 (10K) HDD 
&lt;/p&gt;

&lt;p&gt;
The standard 2U enclosure has 24 x 15mm bays. My preference is for a 16 SSD and 8 HDD mix. 
With the small capacity 100GB SSD, there will be 1.6TB per enclosure and 6.4TB over 4 enclosures before RAID. In 7+1 RAID 5 groups, there will be 5.6TB net capacity, and 4.8TB in 3+1 RG across 4 units. The goal is 4GB/s per controller because the SAS infrastructure is still 6Gbps, 
supporting 2.2GB/s on each x4 port. With 16 SSDs per controller, each SSD needs to support 250MB/s. Most of the recent enterprise class SSDs are rated for well over 300MB/s per unit, 
allowing for a large degree of excess capability. 
Another option is to configure 12 SSDs per controller, expecting each SSD to support 333MB/s.
&lt;/p&gt;

&lt;p&gt;
The cost structure for the above is as follows:&lt;br&gt;
&amp;nbsp; RAID controller $1K &lt;br&gt;
&amp;nbsp; 2U Enclosure $3K &lt;br&gt;
&amp;nbsp; Intel SSD DC 3700 100GB SSD $235 x&amp;nbsp;16 = $3760, RAID 5 7+1: 1.6TB&lt;br&gt;
&amp;nbsp; Seagate Savvio 600GB 10K HDD $400
x 8 = $3200.&lt;/p&gt;

&lt;p&gt;
This works out to $11K per unit or $44K for the set of 4. 
The set of 16 x 100GB contributes $3760. For the 800GB SSD, the R5 7+1 capacity is 44.8TB at cost $148K. &lt;/p&gt;

&lt;p&gt;At maximum expansion of 4 enclosures per  RAID controller, capacity is 170TB at cost is $590K. Of course at this level, I would elect a system with more PCI-E slots for greater IO bandwidth. Another option is a RAID controller with 4 x4 SAS ports. 
Unfortunately none of these have 4 external ports.
&lt;/p&gt;

&lt;p&gt;
While the Intel SSD DC 3700 drew reviews for pioneering consistency of IO performance 
instead over peak performance, it is only available in SATA interface.
Micron &lt;strike&gt;Crucial&lt;/strike&gt; has &lt;strike&gt;announced&lt;/strike&gt; the P410m with similar specifications but with SAS interface. This is listed on the Micron website as in production, but probably only to&amp;nbsp;OEM customers.&amp;nbsp;There are other enterprise grade high endureance MLC SSDs with SAS interface as well.
&lt;/p&gt;

&lt;p&gt;
Note: I do not recommend anything less than 10K HDD even to support database backups. 
The 10K HDDs are not particularly expensive as direct-attach components ($400 for the 600GB model). Only SAN vendors sell $400 HDDs for $2K or more. 
&lt;/p&gt;
&lt;h4&gt;SAS 12Gbps Enclosures&lt;/h4&gt;
&lt;p&gt;
Disk enclosures supporting SAS at 12Gbps might become available as early as this year. 
Each of the 12Gbps SAS x4 uplink and down link ports would then support 4GB/s. 
The RAID controller (HBA) can support 6GB/s+ in a PCI-E gen 3 x8.  
The system with 4 RAID controllers could then deliver 24GB/s instead of 16GB/s. 
At 16 SSDs per controller, this would require 400MB/s per SSD. While SSDs are rated as high as 550MB/s, achieving the full aggregate bandwidth in an array&amp;nbsp;is not necessarily practical. So&amp;nbsp;400MB/s per SSD in an array is a more reasonable expectation. Also, enterprise SAS SSDs
many only be rated to 400MB/s.&lt;/p&gt;

&lt;p&gt;
We should not need 12Gbps SAS SSDs or HDDs in the near future (but 8 NAND channels&amp;nbsp;is a good match for a 1.1GB/s interface).&amp;nbsp;The internal wires in the enclosure connect through a SAS expander. 
The IO from each device bay can signal at 6Gbps, then uplink to the HBA at 12Gbps, 
assuming that packets are buffered on the expander.
&lt;/p&gt;

&lt;p&gt;&lt;img width="600" height="520" alt="PCIe" src="http://www.qdpma.com/Storage_files/Enclosure_b.png"&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;
The standard 2U disk enclosure today supports 24 or 25 2.5in (SFF) bays, with 15mm thickness. 
This is the dimension of an enterprise class 10K or 15K HDD with up to 3 platters. 
The older full size notebook used a 9mm HDD supporting 2 platters. 
Thin notebooks used a 7mm HDD restricted to a single platter.  
There is no particular reason for an SSD to be more than 7mm. 
&lt;/p&gt;

&lt;p&gt;
 It would be better if the new 12Gbps SAS enclosures support more than 24 bays. 
My preference is  for 16 x 15mm and 16 x 7mm bays. The key&amp;nbsp;is to match the practically realizable aggregate bandwidth of SSDs to the uplink with some&amp;nbsp;degree of excess.
Personally, I would like to discard the SSD case to further reduce thickness.
&lt;/p&gt;

&lt;p&gt;&lt;img width="600" height="221" alt="PCIe" src="http://www.qdpma.com/Storage_files/Enclosure.png"&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;
Another option is to employ the NGFF, perhaps a 1U stick, at 5mm or less. 
There could be 2 rows of 24 for SSD, and the 16 x 15mm bays.
&lt;/p&gt;

&lt;p&gt;
I believe that the all-SSD idea is misguided. SSDs are wonderful, but HDD still have an important role. One example is having the HDDs available for backup and restores. 
I want local HDD for backups because so very few people know how to configure for multiple parallel 10GbE network transmission, not to mention IO bandwidth on the backup system.
&lt;/p&gt;

&lt;p&gt;
A database backup that has not been actually verified to restore (with recovery) is a potentially useless backup. 
Having HDDs for backup and restore verification preserves the write endurance on the SSD. 
This allows the use of high-endurance MLC instead of SLC. 
In some cases, it might even be possible to use consumer grade MLC if and only if&amp;nbsp;the database organization maintenance strategy&amp;nbsp;is architected to minimize wear on the SSD.
&lt;/p&gt;
&lt;h4&gt;PCI-E SSD&lt;/h4&gt;
&lt;p&gt;
Some of the discussion on PCI-E versus SATA/SAS interface for the NAND/Flash controller 
incorrectly focuses on the bandwidth of a single 6Gbps lane versus 4 or 8 lanes on PCI-E. 
It is correct that PCI-E was designed to distribute traffic over multiple lanes 
and that hard drives were never expected to exceed to bandwidth of a single lane 
at the contemporary SATA/SAS signaling rate. 
The transmission delay across an extra silicon trip, 
from NAND controller with SATA interface to a SATA to PCI-E bridge chip, 
on the order of 50ns, 
is inconsequential compare this with the 25-50µsec access time of NAND.
&lt;/p&gt;

&lt;p&gt;
The more relevant matter is matching NAND bandwidth to the upstream bandwidth. 
All (or almost all?) the SATA interface flash controllers have 8 NAND channels. 
Back when SATA was 3Gbps and NAND was 40-50MB/s, 8 channel s to the 280MB/s net bandwidth of SATA 3G was a good match. 
About the time SATA moved to 6Gbps, NAND at 100-133MB/s became available so 8 channels was still a good choice.
&lt;/p&gt;

&lt;p&gt;&lt;img alt="PCIe" src="http://www.qdpma.com/Storage_files/SSD_SATA.png"&gt;&lt;br&gt;NAND is now at 200 and 333MB/s, while SATA is still 6Gpbs. 
The nature of silicon product cost structure is such that there is only minor cost reduction in building a 4 channel flash controller. 
The 8 channel controller only requires 256-pin package.
&lt;/p&gt;

&lt;p&gt;
The PCI-E flash controllers have been designed with 32 NAND channels. 
The IDT 32-channel controller has 1517 pins, 
which is not excessively difficult or expensive for a high-end server product. 
Note the Intel Xeon processors are 2011-pins. 
As noted earlier a PCI-E gen 3 x8 port supports 6.4GB/s. 
Over 32 channels, each channel needs to provide 200MB/s. 
The new 333MB/s NAND is probably a better fit to sustain the full PCI-E gen 3 x8 bandwidth after RAID (now RAIN because disks are replaced by NAND).
&lt;/p&gt;

&lt;p&gt;&lt;img alt="PCIe" src="http://www.qdpma.com/Storage_files/SSD_PCIE.png"&gt;
&lt;/p&gt;
&lt;p&gt;
Based on 64Gbit die, and 8 die per package, a package has 64GB raw capacity. The 32-channel PCI-E with 1 package per channel would have 2TB raw capacity (net capacity with 0.78 for over-provisioning and 0.875 for RAIN would be 1400GB) versus 512GB on an 8-channel SATA/SAS SSD.
The IDT document states capacity is 4TB raw for their 32-channel controllers, so perhaps it allows 2 packages per channel? The Micron datasheet mentions 32-channel and 64 placements.
&lt;/p&gt;

&lt;p&gt;
As is today, a PCI-E SSD can achieve maximum bandwidth at lower NAND capacity and in a more compact form factor than with SAS SSD.
On the other hand, SAS infrastructure provides flexible expansion. 
Capacity can be increased without replacing existing devices. 
Some systems support hot swap PCI-E slots. 
However the orientation of the connector in the system chassis makes this a complicated matter. 
The implications are that PCI-E slot SSDs are highly suitable for high density requirements with limited expansion needs. One database server example is tempdb on SSD.
&lt;/p&gt;
&lt;h4&gt;NVM Express&lt;/h4&gt;
&lt;p&gt;
The new generation of PCI-E SSDs may employ the NVMe interface standard. 
There is a standard driver for Windows and other operating systems, 
which will later be incorporated into to the OS distribution media 
allowing boot from an NVMe device, as with SATA devices today. 
This is mostly a client side feature.
&lt;/p&gt;

&lt;p&gt;
For the server side, the NVMe driver is designed for massive bandwidth and IOPS. 
There can be up to 64K queues, 64K commands (outstanding IO requests?). 
The driver is designed for IO to be both super-efficient in cpu-cycles 
and scalable on NUMA systems with very many processor cores. 
&lt;/p&gt;

&lt;p&gt;&lt;img width="600" height="228" alt="EMC VNX" src="http://www.qdpma.com/Storage_files/NVME_ScalableQueuingInterface.png"&gt;&lt;br&gt;&lt;/p&gt;

&lt;h4&gt;Express Bay&lt;/h4&gt;
&lt;p&gt;
To promote the growth of SSD without betting on which interface, 
the Express Bay standard defines a connector that can support both PCI-E and SATA or SAS. 
Some Dell servers today support PCI-E to SSDs in the 2.5in HDD form factor (SFF), 
but I am not sure if this is Express Bay. 
This form factor will allow PCI-E devices to hot-swapped with the same ease as SAS devices today.
&lt;/p&gt;

&lt;p&gt;&lt;img alt="PCIe" src="http://www.qdpma.com/Storage_files/ExpressBay.png"&gt;&lt;br&gt;&lt;/p&gt;

&lt;h4&gt;PCI-E Switches&lt;/h4&gt;
&lt;p&gt;
As mentioned earlier, the PCI-E slot arrangement in server systems does not facilitate hot-add,
even if it is supported.
Existing PCI-E SSDs also do not provide a mechanism for capacity expansion, 
aside from adding a card to an empty slot or replacing an existing card.
&lt;/p&gt;

&lt;p&gt;
Of course, there are PCI-E switches, just like the SAS expanders.
A 64 lane PCI-E switch could connect 5 x8 PCI-E devices over a x8 upstream link.
Other possibilities is a x16 link supporting 12.8GB/s to host with 4 ports for SSD,
or 8 x4 ports to SSD for finer grain expansion.
It may also be possible to support multiple hosts, as in a cluster storage arrangement?
&lt;/p&gt;

&lt;p&gt;&lt;img alt="PCIe" src="http://www.qdpma.com/Storage_files/PCI-E_switch.png"&gt;&lt;br&gt;&lt;/p&gt;

&lt;h4&gt;SAN Configuration&lt;/h4&gt;
&lt;p&gt;
Below is a representation of a typical configuration sold to customers by the SAN vendor. 
I am not joking in that it is common to find 2 ports FC or FCOE on each host. 
The most astounding case was a SAN with 240 x 15K disks and 2 single port FC HBAs in the server. 
Even though the storage system service processors had 4 FC port each (and the FC switches had 48-ports), only 1 on each SP was connected. 
Obviously the storage engineer understood single component and path failure fault-tolerant design. It was just too bad he built a fault-tolerant garden hose system when a fire hose was needed.
&lt;/p&gt;

&lt;p&gt;&lt;img width="465" height="558" alt="SAN_Configuration1c" src="http://www.qdpma.com/Storage_files/SAN_Configuration1c.png"&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;
As I understand it, what happened was the SAN engineer asked how much space is needed for the databases accounting for growth, and then created one volume for it. 
The Windows OS does support multi-path IO. 
Originally the storage vendor provided the MPIO driver, but now it is managed by Microsoft. 
Apparently it was not considered that even with MPIO, all IO for a single volume has a primary path. The secondary path is only used when the primary is not available. 
&lt;/p&gt;
&lt;h4&gt;High-bandwidth SAN Configuration&lt;/h4&gt;
&lt;p&gt;
A proper SAN configuration for both OLTP and DW database servers is shown below. 
Traditionally, a transaction processing database generates small block&amp;nbsp;random IO (2KB in the old days, 8KB since SQL Server 7). As it&amp;nbsp;was difficult to get 10K IOPS (x8KB = 80MB/s),&amp;nbsp;it was thought that&amp;nbsp;IO bandwidth was not a requirement. 
This was 15 years ago. Apparently the SAN vendors read documents from this period, but not more recently, hence their tunnel vision on IOPS, ignoring bandwidth. &lt;/p&gt;

&lt;p&gt;For the last 10 or more years, people have been running large queries on the OLTP system. I have noticed report queries that saturate the storage&amp;nbsp;IO channels could essentially shutdown&amp;nbsp;transaction processing. This is because the report query generates asynchronous IO at high queue depth, while the transaction queries issue synchronous IO at queue depth 1.&amp;nbsp;And the report may escalate to a table lock (or it may use nolock).&amp;nbsp;Furthermore, it is desirable to be able to backup and restore the transaction database quickly. This means bandwidth. 
&lt;/p&gt;

&lt;p&gt;
Note that the system below shows 8Gbps FC, not 10Gbps FCOE. 
A single 10Gbps FCOE may have more bandwidth than a single 8Gbps FC port. 
But no serious storage system will have less than 4 or even 8 ports. 
Apparently FCOE currently does not scale well over multiple ports, due to the overhead of handling Ethernet packets? 
An Intel IDF 2012 topic mentions that this will be solved in the next generation.&lt;/p&gt;

&lt;p&gt;&lt;img width="428" height="635" alt="SAN_Configuration2e" src="http://www.qdpma.com/Storage_files/SAN_Configuration2e.png"&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;
The above diagram shows 8 x 8Gbps FC ports between host and storage system.
Each 8Gbps FC port can support 700MB/s for a system IO bandwidth target of 5.6GB/s.
An OLTP system that handles&amp;nbsp;very high transaction volume may benefit from a dedicated HBA and  
FC ports for log traffic. This would allow the log HBA to be configured for low latency,
and the data HBA to be configured with interrupt moderation and high throughput.
&lt;/p&gt;

&lt;p&gt;
An alternate SAN configuration for SQL Server 2012 is shown below with local SSD for tempdb.
&lt;/p&gt;

&lt;p&gt;&lt;img width="485" height="633" alt="SAN_Configuration3b" src="http://www.qdpma.com/Storage_files/SAN_Configuration3b.png"&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;
The write cache on a SAN must be mirrored for fault tolerance. 
There is very little detail on the bandwidth capability of the link between controllers 
(or SP) on SAN systems, beyond what can be deduced from the fact that the sustained write bandwidth is much lower than the read bandwidth.
So keeping tempdb off the SAN should preserve IO write bandwidth for traffic that
actually needs protection.
&lt;/p&gt;


&lt;p&gt;
The number of volumes for data and temp should be some multiple of 8.
It would be nice to have 1 volume for each FC path.
However we do need to consider how SQL Server place extents over multiple files.
This favors RAID groups of 4 disks.
&lt;/p&gt;
&lt;h4&gt;File Layout&lt;/h4&gt;
&lt;p&gt;
In a HDD storage system, 
the objective for bandwidth is to simultaneously issue large block IO
to all data disks across all IO channels.
A 256K block could be sufficiently large to generate 100MB/s per disk (400 IOPS, not random).
If this were issued at low queue depth (2?), 
then the storage system would not only generate high IO bandwidth
and still be perceptually responsive to other requests for small block IO.
&lt;/p&gt;

&lt;p&gt;
For small block random IO, it is only necessary to distribute IO over all hard disks with reasonable uniformity.
&lt;/p&gt;

&lt;p&gt;
The file layout strategy has two objectives. 
One is to not overwhelm any single IO channel.
In direct-attach, this is not a proper as the smallest pipe is x4 SAS for 2GB/s.
In a SAN, even using 8Gbps FC, this is a concern as 8Gb FC can support only 700-760MB/s.
Although 10Gb FCoE seems to have higher bandwidth, this may not scale with the number of channels
as well as straight FC.
The new Intel Xeon E5 (Sandy-Bridge EP) processors may be able to scale 10Gb FCoE 
with Data Direct IO (DDIO) - but this needs to be verified.
&lt;/p&gt;

&lt;p&gt;
The second is to ensure IO goes to every disk in the RAID Group (or volume).
By default, SQL Server allocates a single 64K extent from each file
before round-robin allocating from the next file.
This might be the reason that many SAN systems generate only 10MB/s per disk (150 IOPS at 64K),
along with no read-ahead.
&lt;/p&gt;

&lt;p&gt;&lt;img width="463" height="496" alt="FileLayout_1" src="http://www.qdpma.com/Storage_files/FileLayout_1.png"&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;
The -E startup flag instructs SQL to allocate up to 4 consecutive extents
before proceeding to the next file.
See James Rowland-Jones 
&lt;a href="http://consultingblogs.emc.com/jamesrowlandjones/archive/2010/04/25/focus-on-fast-track-understanding-the-e-startup-parameter.aspx" target="blank"&gt;
Focus on Fast Track : Understanding the –E Startup Parameter&lt;/a&gt;
for more on this.
In a 4-disk RAID group with stripe-size 64K, a 256K IO to the file
would generate a 64K IO to each disk.
&lt;/p&gt;

&lt;p&gt;&lt;img width="463" height="496" alt="FileLayout_2" src="http://www.qdpma.com/Storage_files/FileLayout_2.png"&gt;&lt;br&gt;&lt;/p&gt;

&lt;p&gt;
It would be necessary to rebuild indexes before this scheme takes effect.
Somewhere it was mentioned that it is important to build indexes with max degree of parallelism 
limited to either 4 or 8. It might be in the 
&lt;a href="http://msdn.microsoft.com/en-us/library/hh918452.aspx" target="blank"&gt;
Microsoft Fast Track Data Warehouse Reference Architecture&lt;/a&gt;.
Start with version 4 for SQL Server 2012, and work backwards?
&lt;/p&gt;

&lt;p&gt;
Trace Flag 1117 (-T1117) causes all files in a filegroup to grow together.
See
&lt;a href="http://blogs.technet.com/technet_blog_images/b/sql_server_sizing_ha_and_performance_hints/archive/2012/02/09/sql-server-2008-trace-flag-t-1117.aspx" target="blank"&gt;
SQL Server 2008 Trace Flag -T 1117&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;
With SSD, the second may not be important as the SQL Server read-ahead strategy
(1024 pages?) should generate IO to all units.
On the hard disk, generate close-to-sequential IO was important.
On SSD, it is sufficient beneficial just to generate large block IO, with 64K being large.
&lt;/p&gt;
&lt;h4&gt;Summary&lt;/h4&gt;
&lt;p&gt;
The old concept of distributing IO over both devices and channels still apply.
The recent pricing of SSD is sufficiently low to warrant serious consideration ($2-3K/TB eMLC).
While there is more flexibility in SSD configuration, 
it is still necessary to validate performance characteristics
with real queries to an actual database. 
SQLIO or other synthetic tests are not sufficient.
If the SAN vendor advised in the configuration, 
then chances are IO bandwidth will be not be good.
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Addendum&lt;/strong&gt;&lt;br&gt;
If anyone thinks I am being unfair to or&amp;nbsp;overly critical of SAN vendors, do the following test.&lt;br&gt;
Find the biggest table in your database, excluding LOB fields.&lt;br&gt;
Run this:&lt;br&gt;
&lt;font color="#0000ff" face="Consolas"&gt;DBCC &lt;/font&gt;
&lt;font color="#008080" face="Consolas"&gt;DROPCLEANBUFFERS&lt;/font&gt;&lt;br&gt;
&lt;font color="#0000ff" face="Consolas"&gt;GO&lt;/font&gt;&lt;br&gt;
&lt;font color="#0000ff" face="Consolas"&gt;SET STATISTICS&lt;/font&gt;
&lt;font color="#0000ff" face="Consolas"&gt;IO ON&lt;/font&gt; &lt;br&gt;
&lt;font color="#0000ff" face="Consolas"&gt;SET STATISTICS&lt;/font&gt;
&lt;font color="#0000ff" face="Consolas"&gt;TIME ON&lt;/font&gt;&lt;br&gt;
&lt;font color="#0000ff" face="Consolas"&gt;GO&lt;/font&gt; &lt;br&gt;
&lt;font color="#0000ff" face="Consolas"&gt;SELECT&lt;/font&gt;
&lt;font color="#ff00ff" face="Consolas"&gt;COUNT&lt;/font&gt;&lt;font color="#808080" face="Consolas"&gt;(*)&lt;/font&gt;
&lt;font color="#0000ff" face="Consolas"&gt;FROM&lt;/font&gt;
&lt;font color="#008080" face="Consolas"&gt;TableA&lt;/font&gt;
&lt;font color="#0000ff" face="Consolas"&gt;WITH&lt;/font&gt;
&lt;font color="#808080" face="Consolas"&gt;(&lt;/font&gt;&lt;font color="#0000ff" face="Consolas"&gt;INDEX&lt;/font&gt;&lt;font color="#808080" face="Consolas"&gt;(&lt;/font&gt;&lt;font face="Consolas"&gt;0&lt;/font&gt;&lt;font color="#808080" face="Consolas"&gt;))&lt;/font&gt; &lt;br&gt;
&lt;font color="#0000ff" face="Consolas"&gt;GO&lt;/font&gt;
&lt;/p&gt;

&lt;p&gt;
Then compute 8 (KB/page) * (physical reads + read-ahead reads)/(elapsed time in ms)&lt;br&gt;Is this closer to 700 MB/s or 4GB/s? What did your SAN vendor tell you?&lt;/p&gt;

&lt;p&gt;I am also not fan of SSD caching or auto-tiering on critical databases, meaning the database that runs your business, that is managed by one or more full-time DBAs. In other applications, there may not be a way to segregate the placement of hot data differently from inactive data. In SQL Server, there are filegroups and partitioning. We have all the necessary means of isolating and placing hot data whereever we want it. SSD caching or auto-tiering will probably require SLC NAND. With active management using database controls, we should be able to use HET MLC or&amp;nbsp;even MLC.&amp;nbsp;&lt;/p&gt;

&lt;p&gt;I stress the importance of analyzing the complete system and how it will be used instead of over-focusing on the components. There are criteria that might be of interest when there is only a single device or even single HBA. Today it is possible to over-configure the storage performance without unwarranted expense, and&amp;nbsp;this is best accomplished by watching the big picture.&lt;/p&gt;

&lt;p&gt;Adaptec reports that their Series 7 SAS RAID Controller&amp;nbsp;(72405 - PCI-E gen 3 x8 on the upstream side and 6 x4 SAS 6Gpbs) using the PMC&amp;nbsp;PM8015 controller can do&amp;nbsp;500K IOPS and 6.6GB/s.&lt;/p&gt;

&lt;p&gt;I will keep this topic up to date on &lt;a href="http://www.qdpma.com"&gt;www.qdpma.com&lt;/a&gt; 
&lt;a href="http://www.qdpma.com/Storage/Storage2013.html" target="blank"&gt;Storage 2013&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;related posts on storage:&lt;br&gt;
&lt;a href="http://sqlblog.com/blogs/joe_chang/archive/2010/10/18/io-queue-depth-strategy.aspx"&gt;io-queue-depth-strategy&lt;/a&gt; (2010-08)
&lt;br&gt;
&lt;a href="http://sqlblog.com/blogs/joe_chang/archive/2010/03/23/data-log-and-temp-file-placement.aspx"&gt;data-log-and-temp-file-placement&lt;/a&gt; (2010-03)
&lt;br&gt;
&lt;a href="http://sqlblog.com/blogs/joe_chang/archive/2008/09/04/io-cost-structure-preparing-for-ssd-arrays.aspx"&gt;io-cost-structure-preparing-for-ssd-arrays&lt;/a&gt; (2008-09)
&lt;br&gt;
&lt;a href="http://sqlblog.com/blogs/joe_chang/archive/2008/03/04/storage-performance-for-sql-server.aspx"&gt;storage-performance-for-sql-server&lt;/a&gt; (2008-03)
&lt;/p&gt;

&lt;p&gt;
ps,&lt;br&gt;
If you are using SQL Server 2012 clustering on a SAN, 
I do suggest placing tempdb on local SSD, making use of the new 2012 feature 
that does not require tempdb to be on shared storage.
Keep in mind on the SAN, writes must be mirrored between two storage processors for fault recovery, and this is not a cheap thing to do.
We should plan redo whatever was using tempdb at the time.
&lt;/p&gt;</description></item><item><title>Halloween Protection – The Complete Series</title><link>http://sqlblog.com/blogs/paul_white/archive/2013/02/20/halloween-protection-the-complete-series.aspx</link><pubDate>Wed, 20 Feb 2013 23:29:42 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:47843</guid><dc:creator>Paul White</dc:creator><description>&lt;p align="left"&gt;&lt;a href="http://sqlblog.com/blogs/paul_white/image_10B79A4D.png"&gt;&lt;img title="image" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;float:left;padding-top:0px;padding-left:0px;margin:0px 12px 0px 0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="image" align="left" src="http://sqlblog.com/blogs/paul_white/image_thumb_00C85589.png" width="240" height="86" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;I have just published a four-part series&lt;/font&gt;&lt;font size="3" face="Calibri"&gt; for &lt;/font&gt;&lt;a href="http://www.sqlperformance.com/" target="_blank"&gt;&lt;font size="3" face="Calibri"&gt;SQLPerformance.com&lt;/font&gt;&lt;/a&gt;&lt;font size="3" face="Calibri"&gt; on the Halloween Problem. Some of you will never have heard of this issue, and those that have might associate it only with T-SQL &lt;code&gt;UPDATE&lt;/code&gt; queries. In fact, the Halloween problem affects execution plans for &lt;code&gt;INSERT, UPDATE, DELETE&lt;/code&gt; and &lt;code&gt;MERGE&lt;/code&gt; statements.&lt;/font&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;This is a topic I have been meaning to write about properly for years, ever since I read Craig Freedman’s 2008 &lt;a href="http://blogs.msdn.com/b/craigfr/archive/2008/02/27/halloween-protection.aspx" target="_blank"&gt;blog post&lt;/a&gt; on the topic, which ended with the cryptic comment:&lt;/font&gt;&lt;/p&gt;  &lt;blockquote&gt;   &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;&lt;em&gt;“…although I've used update statements for all of the examples in this post, some insert and delete statements also require Halloween protection, but I'll save that topic for a future post.”&lt;/em&gt;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;That future post never materialized, sadly, so I thought I would have a go. &lt;/font&gt;&lt;font size="3" face="Calibri"&gt;The four parts of the series are summarized and linked below, I hope you find the material interesting.&lt;/font&gt;&lt;/p&gt;  &lt;hr /&gt;  &lt;h3 align="left"&gt;&lt;a href="http://www.sqlperformance.com/2013/02/t-sql-queries/halloween-problem-part-1" target="_blank"&gt;&lt;font face="Calibri"&gt;Part 1 – The Halloween Problem and UPDATE statements&lt;/font&gt;&lt;/a&gt;&lt;/h3&gt;  &lt;ul&gt;   &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;The SQL standard and three-phase separation&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Logical update processing&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Pipelined execution&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;The Halloween problem&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Avoiding the problem in UPDATE statements&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt; &lt;/ul&gt;  &lt;h3 align="left"&gt;&lt;a href="http://www.sqlperformance.com/2013/02/t-sql-queries/halloween-problem-part-2" target="_blank"&gt;&lt;font face="Calibri"&gt;Part 2 – The Halloween Problem in INSERT and DELETE queries&lt;/font&gt;&lt;/a&gt;&lt;/h3&gt;  &lt;ul&gt;   &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;INSERT examples&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;DELETE examples&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Constraint checking and phase separation&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt; &lt;/ul&gt;  &lt;h3 align="left"&gt;&lt;a href="http://www.sqlperformance.com/2013/02/t-sql-queries/halloween-problem-part-3" target="_blank"&gt;&lt;font face="Calibri"&gt;Part 3 – Halloween Problem optimizations for MERGE&lt;/font&gt;&lt;/a&gt;&lt;/h3&gt;  &lt;ul&gt;   &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;MERGE contains several optimizations the other DML statements do not&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Hole-filling with merge join&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Hole-filling with nested loops&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Avoiding an extra B-tree navigation&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Avoiding the join&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt; &lt;/ul&gt;  &lt;h3 align="left"&gt;&lt;a href="http://www.sqlperformance.com/2013/02/t-sql-queries/halloween-problem-part-4" target="_blank"&gt;&lt;font face="Calibri"&gt;Part 4 – The Halloween Problem and the Query Optimizer&lt;/font&gt;&lt;/a&gt;&lt;/h3&gt;  &lt;ul&gt;   &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Early optimization approaches&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;The SQL Server optimizer approach&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;The case of the redundant sort&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;HP levels and properties&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Plan changes for Halloween Protection&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Non-spool options&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Row versioning&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;Heaps and forwarded records&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;&lt;font size="3" face="Calibri"&gt;T-SQL functions&lt;/font&gt;&lt;/div&gt;   &lt;/li&gt; &lt;/ul&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;As always, I appreciate your comments and feedback.&lt;/font&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;font size="3" face="Calibri"&gt;Paul White      &lt;br /&gt;&lt;/font&gt;&lt;a href="http://twitter.com/SQL_Kiwi"&gt;&lt;font size="3" face="Calibri"&gt;@SQL_Kiwi&lt;/font&gt;&lt;/a&gt;&lt;font size="3" face="Calibri"&gt;      &lt;br /&gt;&lt;/font&gt;&lt;a href="mailto:SQLkiwi@gmail.com"&gt;&lt;font size="3" face="Calibri"&gt;SQLkiwi@gmail.com&lt;/font&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p align="left"&gt;&lt;a title="SQL Intersection" href="http://www.sqlintersection.com/" target="_blank"&gt;&lt;img title="Ill_Be_There4" style="border-left-width:0px;border-right-width:0px;background-image:none;border-bottom-width:0px;padding-top:0px;padding-left:0px;margin:0px;display:inline;padding-right:0px;border-top-width:0px;" border="0" alt="Ill_Be_There4" src="http://sqlblog.com/blogs/paul_white/Ill_Be_There4_2D04DC6D.jpg" width="140" height="110" /&gt;&lt;/a&gt;&lt;/p&gt;</description></item><item><title>One more reason to to understand query plans, not directly performance related</title><link>http://sqlblog.com/blogs/louis_davidson/archive/2013/01/22/one-more-reason-to-to-understand-query-plans-not-directly-performance-related.aspx</link><pubDate>Wed, 23 Jan 2013 04:28:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:47252</guid><dc:creator>drsql</dc:creator><description>&lt;p&gt;One of the things that separates a good programmer from a great one is a firm understanding about what is going on inside the computer. For some programming languages, it is very obvious what is going on inside the computer because you are working at a very low level. For example, if you are a C/C++ programmer writing an OS, you will know a lot about the hardware as you will interact with it directly. As a .NET programmer you are more encapsulated from the hardware experience, making use of the .NET framework.&lt;/p&gt;  &lt;p&gt;None of the aforementioned programming languages comes anywhere close to the level of encapsulation that we SQL programmers work with.&amp;nbsp; When you execute a statement like:&lt;/p&gt;  &lt;p&gt;SELECT *   &lt;br&gt;FROM&amp;nbsp;&amp;nbsp;&amp;nbsp; Tablename&lt;/p&gt;  &lt;p&gt;A firestorm of code is executed to optimize your query, find the data on disk, fetch that data, format it for presentation, and then send it to the client. And this is the super dumbed down version.&amp;nbsp; SQL is a &lt;a href="http://en.wikipedia.org/wiki/Declarative_language"&gt;declarative language&lt;/a&gt;, where basically we format a question or task for the system to execute without telling it how.&amp;nbsp; It is my favorite type of language because all of the pushing bits around get tedious.&amp;nbsp; However, what is important for the professional SQL programmer is to have some understanding of what is going under the covers, understanding query plans, disk IO, CPU, etc. Not necessarily to the depth that Glenn Alan Berry (&lt;a title="http://sqlserverperformance.wordpress.com/" href="http://sqlserverperformance.wordpress.com/"&gt;http://sqlserverperformance.wordpress.com/&lt;/a&gt;) does, but certainly a working knowledge.&lt;/p&gt;  &lt;p&gt;Performance is the obvious reason, since it is clearly valuable to be able to optimize a query, but sometimes it can come in handy to debug an issue you are having with a query. Today, I ran across an optimizer condition that, while perfectly understandable in functional terms, would have driven me closer to nuts if I hadn’t been able to read a query plan. The problem came in based on the number of rows returned, either it worked perfectly or it failed with an overflow condition. Each query seemingly touches the exact same rows in the table where the overflow data exists…or did it.&lt;/p&gt;  &lt;p&gt;The setup. The real query that the problem was discovered in was our data warehouse, and was a star schema configuration with 20+ joins. In the reproduction, I will use a simple table of numbers to serve as the primary table of the query. &lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;create table ExampleTable&amp;nbsp; -- It really doesn’t matter what this table has. The datevalue column will be used to      &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -- join to the date table, that I will load from the       &lt;br&gt;(&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -- from the values I put in this table to make sure all data does exist      &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; i int constraint PKExampleTable primary key,       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; dateValue date       &lt;br&gt;)&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;;with digits (i) as( --(The code for this comes from my snippet page: &lt;a title="http://www.drsql.org/Pages/Snippets.aspx" href="http://www.drsql.org/Pages/Snippets.aspx"&gt;http://www.drsql.org/Pages/Snippets.aspx&lt;/a&gt;).      &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; select 1 as i union all select 2 as i union all select 3 union all&amp;nbsp; &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; select 4 union all select 5 union all select 6 union all select 7 union all       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; select 8 union all select 9 union all select 0)       &lt;br&gt;,sequence (i) as (&amp;nbsp; &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; select D1.i + (10*D2.i) + (100*D3.i) + (1000*D4.i) + (10000*D5.i) + (100000*D6.i)&amp;nbsp; &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; from digits as D1, digits AS D2, digits AS D3 ,digits AS D4, digits as D5, digits As D6       &lt;br&gt;)       &lt;br&gt;insert into ExampleTable(i, dateValue)       &lt;br&gt;select i, dateadd(day, i % 10,getdate()) -- Puts in 10 different date values      &lt;br&gt;from sequence       &lt;br&gt;where i &amp;gt; 0 and i &amp;lt; 1000       &lt;br&gt;order by i&lt;/font&gt;     &lt;br&gt;&lt;/p&gt;  &lt;p&gt;Next I will load the date table with all of the distinct dateValue values that we loaded into the ExampleTable, plus one, which is the max date value for the datatype. In the “real” world case, this is one of our surrogate null values we use to indicate that it is the end date. (Yes, we are ignoring the Y10K problem.)&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;create table date      &lt;br&gt;(       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; datevalue date constraint PKDate primary key       &lt;br&gt;)&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;insert into date      &lt;br&gt;select distinct dateValue       &lt;br&gt;from&amp;nbsp;&amp;nbsp; ExampleTable       &lt;br&gt;union all       &lt;br&gt;select '99991231'       &lt;br&gt;go&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;In the typical usage, the number of rows is quite small.&amp;nbsp; In our queries, we are adding 1 to the dateValue to establish a range of a day (in the real query it was actually a month). Executing the following query that returns 99 rows is successful:&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;select *, dateadd(day,1,date.dateValue)      &lt;br&gt;from&amp;nbsp;&amp;nbsp; ExampleTable       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; join date       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; on date.dateValue = ExampleTable.dateValue       &lt;br&gt;where&amp;nbsp; i &amp;lt; 100&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;However, remove the where clause (causing the query to return 999 rows):&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;select *, dateadd(day,1,date.dateValue)      &lt;br&gt;from&amp;nbsp;&amp;nbsp; ExampleTable       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; join date       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; on date.dateValue = ExampleTable.dateValue&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;And you will see that this results in an overflow condition... &lt;/p&gt; &lt;font face="Courier New"&gt;Msg 517, Level 16, State 3, Line 2    &lt;br&gt;Adding a value to a 'date' column caused an overflow.&lt;/font&gt;  &lt;p&gt;Hmmm, this could be one of those days where I don’t get a lot of sleep :).&amp;nbsp; Next up, I check the max date value that can be returned.&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;--show that the top value that could be returned is &amp;lt; maxdate      &lt;br&gt;select max(date.dateValue)       &lt;br&gt;from&amp;nbsp;&amp;nbsp; ExampleTable       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; join date       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; on date.dateValue = ExampleTable.dateValue&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;At this point, I start feeling like I am going nuts. The value returned is 2013-01-30. So no data is actually returned that should be too large for our date column… So then I think, well, let's add one to that value and take the max: &lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;select max(date.dateValue), max(dateadd(day,1,date.dateValue))      &lt;br&gt;from&amp;nbsp;&amp;nbsp; ExampleTable       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; join date       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; on date.dateValue = ExampleTable.dateValue&lt;/font&gt; &lt;/p&gt;  &lt;p&gt;This returns, mockingly:&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;Msg 517, Level 16, State 3, Line 2      &lt;br&gt;Adding a value to a 'date' column caused an overflow.&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;So, since it worked with fewer rows earlier. I decide to try lowering the number of rows again, this time using a derived table, and it DOESN’T error out, even though it is obvious (because I stacked the deck…data) that the same data is just repeated for the dateValue, particularly since we get the same max dateValue as we did earlier. &lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;select max(date.dateValue), max(dateadd(day,1,date.dateValue))      &lt;br&gt;from&amp;nbsp;&amp;nbsp; (select top 100 * from ExampleTable order by i) as ExampleTable       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; join date       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; on date.dateValue = ExampleTable.dateValue&lt;/font&gt;&amp;nbsp; &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;br&gt;Well, you are possibly thinking, this just doesn't make sense. It is how I felt too after trying to do the logic in my head. I will admit that I didn’t know about query plans I would have been completely lost. But alas, the answer was fairly easily located in the plan. Taking a look at the plan for the query version that returns 99 rows:&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;select *, dateadd(day,1,date.dateValue)      &lt;br&gt;from&amp;nbsp;&amp;nbsp; ExampleTable       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; join date       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; on date.dateValue = ExampleTable.dateValue       &lt;br&gt;where&amp;nbsp; i &amp;lt;= 100&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;We get the following estimated plan:&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/louis_davidson/image_68ABBCE3.png"&gt;&lt;img style="border:0px currentColor;padding-top:0px;padding-right:0px;padding-left:0px;display:inline;background-image:none;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/louis_davidson/image_thumb_316535A8.png" width="743" height="214"&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;In this plan, it uses a nested loops operator, which basically will do 100 seeks from the top input (the ExampleTable), for each row fetching the date value, and then calculating the scalar value (dateadd(day,1,date.dateValue) ) on the values that match in the plan. Since the 9999-12-31 date is never used, there is no overflow.&lt;/p&gt;  &lt;p&gt;However, when the number of rows in the when the size of the output reaches a certain tolerance (in this case 999 instead of 99) from the following query:&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;select *, dateadd(day,1,date.dateValue)      &lt;br&gt;from&amp;nbsp;&amp;nbsp; ExampleTable       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; join date       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; on date.dateValue = ExampleTable.dateValue&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;We get a different plan, one that is causing us issues:&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/louis_davidson/image4_013DEAE5.png"&gt;&lt;img style="border:0px currentColor;padding-top:0px;padding-right:0px;padding-left:0px;display:inline;background-image:none;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/louis_davidson/image4_thumb_7CC76A1D.png" width="724" height="184"&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;Instead of nested loops, it uses a &lt;a href="http://en.wikipedia.org/wiki/Hash_join"&gt;Hash Match Join&lt;/a&gt;, which takes the entirety of the smaller table and builds an internal hash index (basically setting up buckets that can be scanned much faster than an entire table…in our case, probably just a single bucket), and then scan the other set checking to see if the row exists in the hash index. &lt;/p&gt;  &lt;p&gt;It is in the process of building the hash index that our query runs into trouble. Since the date table is so much smaller, it plans to build the hash index on that table, and pre-creates the scalar values as it is doing the scan, since there are 11 rows in the date table, rather than having to calculate the value 999 times if it did it after the join. When it adds a day to the 9999-12-31 date, it fails.&lt;/p&gt;  &lt;p&gt;I know, the question of how practical is this scenario is bound to arise. I won’t lie to you and suggest that it is likely to happen to you as it it did to me. However, the point of this blog isn’t that this one scenario is bound to happen to you, but rather that understanding how SQL Server executes queries will help to give you insight to fix problems with your system, mostly performance, but sometimes every esoteric issues that won't just leap out as being based on the query plan that was chosen. (For more reading on query plans, check out Grant Fritchey’s Simple-Talk book on query plans: &lt;a title="http://www.amazon.com/Server-Execution-Plans-Grant-Fritchey/dp/1906434026" href="http://www.amazon.com/Server-Execution-Plans-Grant-Fritchey/dp/1906434026"&gt;http://www.amazon.com/Server-Execution-Plans-Grant-Fritchey/dp/1906434026&lt;/a&gt;).&lt;/p&gt;  &lt;p&gt;In the end, the fix to my problem was simple. Make sure that the value that has meaning in the table, but not in the query, was filtered out:&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;select *, dateadd(day,1,date.dateValue)      &lt;br&gt;from&amp;nbsp;&amp;nbsp; ExampleTable       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; join date       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; on date.dateValue = ExampleTable.dateValue      &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; and date.dateValue &amp;lt; '9999-12-31'&lt;/font&gt;&lt;/p&gt;&lt;p&gt;Note: a commenter noted that in some cases, excluding the offensive data using the ON criteria/WHERE clause may not solve the issue. This is very true, and really will be made evident in the plan. I would expect it to be more likely to be definitely excluded in the JOIN clause, but you really can't guarantee anything that the optimizer might do without changing the source data (or representing the source data using a derived table as):&lt;/p&gt;&lt;p&gt;&lt;font face="Courier New"&gt;select *, dateadd(day,1,date.dateValue)      &lt;br&gt;from   ExampleTable       &lt;br&gt;        &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; join (select * from date       where date.dateValue &amp;lt; '9999-12-31') as date&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; on date.dateValue = ExampleTable.dateValue&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/font&gt;&lt;/p&gt;&lt;p&gt;Looking at the different variances to the plan you should be able to diagnose a "hidden" problem such as I have described by finding the flow of data and making sure that the filtering operation happens before the calculating of the scalar that causes the overflow error. This may harm performance in my query for even the more "ideal" case where it could have used indexes, so you may yet have more work to do...But this is what makes data programming fun, now isn't it? &lt;/p&gt;</description></item><item><title>The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx</link><pubDate>Sun, 30 Dec 2012 05:32:25 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46882</guid><dc:creator>rob_farley</dc:creator><description>&lt;p&gt;It’s very easy to get in the habit of imagining the way that a query should work based on the Logical Order of query processing – the idea that the FROM clause gets evaluated first, followed by the WHERE clause, GROUP BY, and so on – finally ending with whatever is in the SELECT clause. We even get in the habit of creating indexes that focus on the WHERE clause, and this is mostly right. &lt;/p&gt;  &lt;p&gt;But it’s only &lt;strong&gt;mostly&lt;/strong&gt; right, and it will often depend on statistics.&lt;/p&gt;  &lt;p&gt;There are other situations where statistics have to play a major part in choosing the right plan, of course. In fact, almost every query you ever run will use statistics to work out the best plan. What I’m going to show you in this post is an example of how the statistics end up being incredibly vital in choosing the right plan. It also helps demonstrate an important feature of the way that Scans work, and how to read execution plans.&lt;/p&gt;  &lt;p&gt;I’m going to use AdventureWorks2012 for this example. I’m going to ask for the cheapest product according to the first letter of the product name. This kind of query:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT MIN(ListPrice)       &lt;br /&gt;FROM Production.Product        &lt;br /&gt;WHERE Name LIKE 'H%';&lt;/font&gt;      &lt;br /&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;Don’t run it yet. I want to ask you how you’d solve it on paper.&lt;/p&gt;  &lt;p&gt;Would you prefer I give you a list of the products sorted by name, or would you prefer I give you that list sorted by price?&lt;/p&gt;  &lt;p&gt;If you want the ‘sorted by name’ option, then you’ll have to look through all the products that start with H, and work out which is the cheapest (notice that my predicate is not an equality predicate – if I knew what the name had to be exactly, then I could have an index which ordered by name and then price, and very quickly find the cheapest with that name). This approach could be good if you don’t have many products starting with that particular letter. But if you have lots, then finding them all and then looking for the cheapest of them could feel like too much work. Funnily enough, this is the way that most people would imagine this query being run – applying the WHERE clause first, and applying the aggregate function after that.&lt;/p&gt;  &lt;p&gt;On the other hand, if you have lots of products with that particular letter, you might be better off with your list sorted by price, looking through for the first product that starts with the right letter. &lt;/p&gt;  &lt;p&gt;Let me explain this algorithm a little more.&lt;/p&gt;  &lt;p&gt;If you’re at a restaurant and are strapped for cash, you might want to see what the cheapest thing is. You’d pick the “sorted by price” menu, and go to the first item. But then if you saw it had peanut in, and you have an allergy, then you’d skip it and go to the next one. You wouldn’t expect to have to look far to find one that doesn’t have peanut, and because you’ve got the “sorted by price” menu, you have the cheapest one that satisfies your condition after looking through just a few records.&lt;/p&gt;  &lt;p&gt;It’s clearly not the same algorithm as finding all the things that satisfy the condition first, but it’s just as valid. If you’re only going to have to look through a handful of products before you find one that starts with the right letter, then great! But what if there are none? You’d end up having to look through the whole list before you realised.&lt;/p&gt;  &lt;p&gt;The Query Optimizer faces the same dilemma, but luckily it might have statistics, so it should be able to know which will suit better.&lt;/p&gt;  &lt;p&gt;Let’s create the two indexes – one sorted by Name, one sorted by Price. Both will include the other column, so that the query will only need one of them.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;CREATE INDEX ixNamePrice ON Production.Product (Name) INCLUDE (ListPrice);       &lt;br /&gt;CREATE INDEX ixPriceName ON Production.Product (ListPrice) INCLUDE (Name);&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;Now let’s consider two queries. Both queries give the same result – $0.00. But that’s not important, I’m only interested in how they run.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT MIN(ListPrice)       &lt;br /&gt;FROM Production.Product        &lt;br /&gt;WHERE Name LIKE 'I%';&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;SELECT MIN(ListPrice)       &lt;br /&gt;FROM Production.Product        &lt;br /&gt;WHERE Name LIKE 'H%';&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;The two queries are almost identical, but they run quite differently.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_78098898.png" width="575" height="273" /&gt;&lt;/p&gt;  &lt;p&gt;Ok, they’re fairly similar – they both use a Stream Aggregate operator, for example. And they have similar cost. But significantly, one is performing a Seek, while the other is doing a Scan. Different indexes, but nevertheless a Scan and a Seek.&lt;/p&gt;  &lt;p&gt;People will tell you that Scans are bad and Seeks are good, but it’s not necessarily the case. Here, we see that the Scan plan is no more expensive than the Seek plan – it’s just different. We should consider why.&lt;/p&gt;  &lt;p&gt;Those two indexes are the two different stories that I described earlier. There are very few products that start with the letter ‘I’, and quite a number than start with ‘H’, and so the Query Optimizer has chosen differently.&lt;/p&gt;    &lt;p&gt;There are exactly 10 products that start with I. From a total of 504. That’s less than 2% of the products.&lt;/p&gt;  &lt;p&gt;There are 91 products that start with H. That’s 18%. You might not have expected it to be that high, but that’s okay – if SQL has been maintaining statistics for you on this, it hopefully won’t be as surprised as you.&lt;/p&gt;  &lt;p&gt;18% – nearly 1 in 5. So by the time you’ve looked at, oh, a dozen records, you will have almost certainly found one that starts with an H. (Actually, the chance of NOT finding one in the first 12 would be power(.82, 12), which is 0.09. That’s just 9%.) If I do a bit of digging into the internals, I can discover that the pages in my index typically have over a hundred records on them each. The chance of not finding a product that starts with an H on that first page – you’d need lottery-scale luck (1 in 444 million).&lt;/p&gt;  &lt;p&gt;On the other hand, the cost of finding the cheapest value from 91 records is a lot more expensive than finding the cheapest from just 10. And getting all 10 records should be a small number of reads too.&lt;/p&gt;  &lt;p&gt;But a Scan! Really? It has to look through the whole table, right?&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;No. That’s not how it works.&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;You see, execution plans go from left to right. If you start reading these plans from the right, you’ll start thinking that the whole index has been scanned, when it’s simply not the case. That Top operator asks for a single row from the index, and that’s all it provides. Once that row has been found, the Scan operation stops.&lt;/p&gt;  &lt;p&gt;For this information, I don’t even need to pull up the Properties window for the Scan (but I would recommend you get in the habit of doing that). No – this is all available in the Tool Tip. Look at the number of “Actual number of rows” – it’s just one.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_743BBDFB.png" width="345" height="559" /&gt;&lt;/p&gt;  &lt;p&gt;A predicate is applied – it looks through the index for rows that start with H – but it’s doing this in Order (see Ordered = True), and it’s stopping after the first row is found. Remember I mentioned that there are actually 91 rows that satisfy the predicate? The Scan doesn’t care – it only needs one and it stops right then.&lt;/p&gt;  &lt;p&gt;You might figure this is because we are using MIN. What if we needed the MAX though? Well, that’s just the same, except that the Direction of the Scan is BACKWARD (you’ll need F4 for that one).&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_63E04642.png" width="291" height="96" /&gt;&lt;/p&gt;  &lt;p&gt;MIN goes forward, because it’s most interested in the ‘smallest’ ones, MAX will go backward because it wants the ‘largest’. (And as you’d probably expect, if you’d created your index to be descending, then it would be reversed.)&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;But again – being able to tell which is the better algorithm depends entirely on your statistics being known&lt;/strong&gt;.&lt;/p&gt;  &lt;p&gt;I see so many systems have bad statistics for one reason or another, and typically because the data most frequently queried is the newest data, and that makes up such a small percentage of the table. The statistics will think that there is almost no data for ‘today’, as they probably haven’t been updated since at least some number of hours ago.&lt;/p&gt;  &lt;p&gt;When you look at how a query is running, always have a think about you’d solve it on paper, and remember that you might actually have a better (or worse) picture of the statistics than what the Query Optimizer has.&lt;/p&gt;  &lt;p&gt;And remember that a Scan is not necessarily bad. I might do another post on that soon as well.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://twitter.com/rob_farley" target="_blank"&gt;@rob_farley&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Learn More About SQL Server IO and Query Tuning in These Webcasts</title><link>http://sqlblog.com/blogs/kevin_kline/archive/2012/12/14/learn-more-about-sql-server-io-and-query-tuning-in-these-webcasts.aspx</link><pubDate>Fri, 14 Dec 2012 18:50:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46662</guid><dc:creator>KKline</dc:creator><description>
&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;line-height:19px;"&gt;I'm doing two new webcasts next week on Wednesday, December 19th, one in the morning and the other after lunch.&lt;/p&gt;
&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;line-height:19px;"&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;line-height:19px;"&gt;SSDs are a Game Changer for SQL Server Storage&lt;/h2&gt;
&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;line-height:19px;"&gt;No, session is not exclusively about SSDs. &amp;nbsp;But this is my first session on IO and storage tuning that emphasizes SSDs over hard disks. &amp;nbsp;As Bob Dylan said "Times, they are a'changin'". &amp;nbsp;This session on Wednesday, December 19th at 11:30 AM EST, sponsored by Astute Networks, takes you through all of the basics of storage and IO tuning, regardless of the underlying storage technology. &amp;nbsp;I'll show you how SQL Server handles storage structures, how to identify IO activity on Windows and SQL Server, and best practices for minimizing IO bottlenecks. &amp;nbsp;Register now for:&lt;a title="Kevin Kline's Storage IO Best Practices for SQL Server" href="http://bit.ly/UcXYI3"&gt;&amp;nbsp;Storage IO Best Practices for SQL Server and a New Approach to Solving Application Performance Issues&lt;/a&gt;.&lt;/p&gt;
&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;line-height:19px;"&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;line-height:19px;"&gt;Write Better SQL Queries&lt;/h2&gt;
&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;line-height:19px;"&gt;The next webcast on Wednesday, December 19th at 2 PM EST, is with me, Aaron Bertrand &amp;nbsp;(&lt;a href="https://twitter.com/#!/AaronBertrand"&gt;Twitter&amp;nbsp;&lt;/a&gt;|&amp;nbsp;&lt;a href="http://sqlblog.com/blogs/aaron_bertrand/rss.aspx"&gt;Blog&lt;/a&gt;)&amp;nbsp;and SQLCruise Impresario &amp;amp; Microsoft MVP Tim Ford &amp;nbsp;(&lt;a href="https://twitter.com/#!/sqlagentman"&gt;Twitter&lt;/a&gt;&amp;nbsp;|&amp;nbsp;&lt;a href="http://www.ford-it.com/sqlagentman/"&gt;Blog&lt;/a&gt;)&amp;nbsp;as we take you through the query tuning process, discussing important DMVs to use during query tuning, as well as demonstrating several essential query tuning techniques that every SQL developer should know. &amp;nbsp;Not only are we presenting an hour of top quality technical content, we’ll also be giving away some cool prizes, including the grand prize of a paid registration for the upcoming&amp;nbsp;&lt;a target="_blank" href="http://elink.sqlsentry.net/c/1/?aId=67857085&amp;amp;requestId=b34612-273953cd-e600-4a18-979a-a9f2ded860bd&amp;amp;rId=lead-a407ed107f65de119513001e0b614992-c233a49718324979b0d8efc0614ff5d0&amp;amp;ea=aunefuonetre=pbz=vagrepreir&amp;amp;dUrl=http%3A%2F%2Fsqlcruise.com%2F2013-cruises%3F_cldee%3DbmhhcnNoYmFyZ2VyQGludGVyY2VydmUuY29t&amp;amp;uId=0"&gt;SQLCruise Miami&lt;/a&gt;, a $1,395 value! &amp;nbsp;Register now for:&amp;nbsp;&lt;a title="SQL Server Query Tuning Best Practices, Hosted by Kevin Kline, Aaron Bertrand, and Tim Ford" href="http://bit.ly/UskPPm"&gt;SQL Server Query Tuning Best Practices, Hosted by Kevin Kline and Aaron Bertrand with special guest Tim Ford&lt;/a&gt;&lt;/p&gt;
&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;line-height:19px;"&gt;&amp;nbsp;&lt;/p&gt;
&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;line-height:19px;"&gt;I hope to see you at both of these sessions next week! &amp;nbsp;Best regards,&lt;/p&gt;
&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;line-height:19px;"&gt;-Kev&lt;/p&gt;
&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;line-height:19px;"&gt;&lt;a title="Kevin E. Kline on Twitter" href="http://twitter.com/kekline"&gt;-Follow me on Twitter!&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Beware of SQL Server and PerfMon differences in disk latency calculation</title><link>http://sqlblog.com/blogs/michael_zilberstein/archive/2012/11/27/46419.aspx</link><pubDate>Wed, 28 Nov 2012 01:20:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46419</guid><dc:creator>mz1313</dc:creator><description>&lt;p&gt;Recently &lt;a href="http://www.brentozar.com/blitz/"&gt;sp_blitz&lt;/a&gt; procedure on one of my OLTP servers returned alarming notification about high latency on one of the disks (more than 100ms per IO). Our chief storage guy didn’t understand what I was talking about – according to his measures, average latency is only about 15ms. In order to investigate the issue, I’ve recorded 2 snapshots of sys.dm_io_virtual_file_stats and calculated latency per read and write separately. Results appeared to be even more alarming: while for read average latency was only 9ms, for write it skyrocketed to 260ms. Even more strange – PerfMon results for the same time interval showed similar read latency but only 17ms per write! We’ve almost opened support request for it – during informal phone conversations with colleagues in Microsoft nobody could explain it.&lt;/p&gt;  &lt;p&gt;Finally we’ve found the answer after more thorough analysis of PerfMon file. It appears that calculation formula is different. sys.dm_io_virtual_file_stats contains total number of writes and total write latency (stall), so by dividing them we receive average latency per single write operation. PerfMon on another hand measure Disk sec/write counter every sample interval (default is 15 seconds, I used 1 second interval in order to see more detailed picture). And according to PerfMon, average latency per write during certain period is average of samples measured during this period. Disk we’re talking about contains only datafiles of a single database. So we’ve noticed that while number of read operations per second is pretty equal along entire interval, writes behave totally different. Most of the time we see 0 (zero) writes – and 0 write latency accordingly. But once every minute we see massive writes – SQL tries to write almost 4000 IOs per second for several seconds in a row (several means sometimes 2, sometimes 10). I guess, it is checkpoint. And during these peaks latency jumps to huge values – up to 1.8 seconds (!) per write. The catch is that when we have 59 samples with 0 latency (we don’t have writes, remember?) and 1 second with 120ms latency, PerfMon with calculate average and receive (59*0 + 1*120)/60 = 2ms per write latency. sys.dm_io_virtual_file_stats at the same time will report 120 ms / write. And indeed, when I calculated average write latency from PerfMon but took into account only samples with non-zero number of writes, results became very close to what sys.dm_io_virtual_file_stats showed.&lt;/p&gt;  &lt;p&gt;So who is right in this story and should we do something with the disk? Both are right – they just calculate different numbers. sys.dm_io_virtual_file_stats calculates average latency per IO but doesn’t provide any indication whether load is linear or contains peaks and idle periods. Peaks have much more influence – because peaks in latency are usually accompanied by peaks in number of IO operations, so their relative weight is higher. Perfmon on another hand, when it comes to calculate average over period (for example, &lt;a href="http://pal.codeplex.com/"&gt;PAL&lt;/a&gt; tool does it automatically), smoothens peaks because for it every sample – the one with zero writes and the one with 10K writes per second – has same relative weight.&lt;/p&gt;  &lt;p&gt;Here is how our latency graph looks:&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/michael_zilberstein/image_24BA5D57.png"&gt;&lt;img style="background-image:none;border-right-width:0px;padding-left:0px;padding-right:0px;display:inline;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="image" border="0" alt="image" width="722" height="414" src="http://sqlblog.com/blogs/michael_zilberstein/image_thumb_6122AC34.png"&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;My first thought was: “Checkpoint is background process, shouldn’t really affect user experience – storage guy is right, leave the disk alone”. But then I decided to check read latencies – indeed, when it is so busy that write takes 1.8 seconds, there is no reason to suppose that reads fill any better. As I said in the beginning, overall average read latency was 9ms. But when separated for write-idle and checkpoint periods, results were completely different. For write-idle periods, read latency was 1.5ms. But during checkpoint it jumped to 120ms. So user experience is affected – read queries that go to disk can run very slow or even appear as stuck during checkpoints. Beware &lt;img style="border-bottom-style:none;border-left-style:none;border-top-style:none;border-right-style:none;" class="wlEmoticon wlEmoticon-smile" alt="Smile" src="http://sqlblog.com/blogs/michael_zilberstein/wlEmoticon-smile_69E28EB3.png"&gt;.&lt;/p&gt;</description></item></channel></rss>