<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx</link><description>It’s very easy to get in the habit of imagining the way that a query should work based on the Logical Order of query processing – the idea that the FROM clause gets evaluated first, followed by the WHERE clause, GROUP BY, and so on – finally ending with</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>re: The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx#46884</link><pubDate>Sun, 30 Dec 2012 10:28:50 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46884</guid><dc:creator>Martin Smith</dc:creator><description>&lt;p&gt;In the seek version performance is much more predictable. It always has to perform the same work.&lt;/p&gt;
&lt;p&gt;With the scan version the worst case scenario can be much more expensive than estimated even with perfect statistics due to the modelling assumption that the matching rows will be evenly distributed with respect to the other column. If in fact there is some correlation such that the matching rows for one column value all happen to be at one end of the range of values for the other column then a full scan is needed which can be much worse.&lt;/p&gt;
&lt;p&gt;Example of that here&lt;/p&gt;
&lt;p&gt;&lt;a rel="nofollow" target="_new" href="http://stackoverflow.com/questions/7481818/sql-why-is-select-count-mincol-maxcol-faster-then-select-mincol-max/7482342"&gt;http://stackoverflow.com/questions/7481818/sql-why-is-select-count-mincol-maxcol-faster-then-select-mincol-max/7482342&lt;/a&gt;&lt;/p&gt;</description></item><item><title>re: The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx#46885</link><pubDate>Sun, 30 Dec 2012 10:41:58 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46885</guid><dc:creator>Rob Farley</dc:creator><description>&lt;p&gt;Hi Martin,&lt;/p&gt;
&lt;p&gt;There are also plenty of times when predicates with low selectivity cause a large number of rows to be returned by a Seek, so I would suggest your argument could work on both sides.&lt;/p&gt;
&lt;p&gt;I would suggest it's an example which suits making index choices based on business knowledge of the data model, if you are worried about the statistics being insufficient for good choices to be made.&lt;/p&gt;
&lt;p&gt;Rob&lt;/p&gt;
</description></item><item><title>re: The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx#46887</link><pubDate>Sun, 30 Dec 2012 13:53:27 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46887</guid><dc:creator>Martin Smith</dc:creator><description>&lt;p&gt;Hi Rob,&lt;/p&gt;
&lt;p&gt;I'm not saying that the scan plan is definitely worse (at least assuming perfect statistics) just that it can have more variable performance when the rows are not in fact exactly evenly distributed.&lt;/p&gt;
&lt;p&gt;Say there are 1,000,000 rows. 1,000 match the seek predicate. Under the even distribution assumption SQL Server will assume that 1,000 (1 million / 1 thousand) rows need to be scanned.&lt;/p&gt;
&lt;p&gt;For the seek plan the best, worst, and estimated case are all 1,000 rows&lt;/p&gt;
&lt;p&gt;For the scan plan the best, worst, and estimated case are (1, 999,000, 1,000) and if the statistics are not perfect and in fact no rows match at all then the real worst case would be 1 million rows.&lt;/p&gt;
&lt;p&gt;If the predicate is made less selective so 10,000 rows now match&lt;/p&gt;
&lt;p&gt;For the seek plan the best, worst, and estimated case are all 10,000 rows&lt;/p&gt;
&lt;p&gt;For the scan plan the best, worst, and estimated case are (1, 990,000, 100)&lt;/p&gt;</description></item><item><title>re: The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx#46888</link><pubDate>Sun, 30 Dec 2012 14:46:28 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46888</guid><dc:creator>Martin Smith</dc:creator><description>&lt;p&gt;Just pointing it out as a potential issue to be considered. &lt;/p&gt;
&lt;p&gt;In the real world there may be all sorts of correlations that don't necessarily occur to one when writing queries.&lt;/p&gt;
&lt;p&gt;SELECT MIN(OrderDate), MAX(OrderDate)&lt;/p&gt;
&lt;p&gt;FROM Orders&lt;/p&gt;
&lt;p&gt;WHERE ProductId = @ProductId&lt;/p&gt;
&lt;p&gt;It is highly unlikely that productIds will be evenly distributed throughout the Orders table as new products get launched and old ones get discontinued for example.&lt;/p&gt;</description></item><item><title>re: The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx#46889</link><pubDate>Sun, 30 Dec 2012 20:45:07 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46889</guid><dc:creator>Rob Farley</dc:creator><description>&lt;p&gt;Your points are all valid, and again I'll say that designing indexes around business knowledge is important. Many business scenarios would consistently be much closer to the best case, and it would be foolish to settle for the &amp;quot;least bad worst case&amp;quot; alternative.&lt;/p&gt;
&lt;p&gt;Obviously the queries can be forced into either plan, once the developers have considered their options.&lt;/p&gt;
&lt;p&gt;And your query example doesn't work. If you need both MIN and MAX you'd need to approach from both ends, and your equality predicate causes a simple composite index most effective. &lt;/p&gt;
</description></item><item><title>re: The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx#46890</link><pubDate>Sun, 30 Dec 2012 21:37:06 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46890</guid><dc:creator>Martin Smith</dc:creator><description>&lt;p&gt;Actually the link in my first post does show sometimes SQL Server will generate such a plan calculating the MIN and MAX separately (in that case it also has added lookups which make things worse.)&lt;/p&gt;</description></item><item><title>re: The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx#46895</link><pubDate>Mon, 31 Dec 2012 00:09:40 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46895</guid><dc:creator>Ian Yates</dc:creator><description>&lt;p&gt;Great post! I really liked the clear explanations 😊 I think I shall use them in real life when trying to explain some of this behaviour to others (or just point them to this blog)&lt;/p&gt;</description></item><item><title>re: The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx#46896</link><pubDate>Mon, 31 Dec 2012 00:14:40 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46896</guid><dc:creator>Rob Farley</dc:creator><description>&lt;p&gt;Martin: Yes - if it thinks the range is wide enough and low enough selectivity it can decide to go from both sides. But in that scenario you can solve it easily with a composite index, because the predicate is an equality.&lt;/p&gt;
</description></item><item><title>re: The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx#46897</link><pubDate>Mon, 31 Dec 2012 00:16:34 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46897</guid><dc:creator>Rob Farley</dc:creator><description>&lt;p&gt;Ian: Thanks &lt;/p&gt;
</description></item><item><title>re: The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx#46922</link><pubDate>Tue, 01 Jan 2013 12:51:53 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46922</guid><dc:creator>Simon Sabin</dc:creator><description>&lt;p&gt;Having so few rows in the table make this a little contrite. The indexes only have 4 leaf pages to read. All data is found on one page and so its a question in both of scanning a single page for a value.&lt;/p&gt;
&lt;p&gt;It would be helpful to show this with more data where the impact is significant.&lt;/p&gt;</description></item><item><title>re: The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx#46934</link><pubDate>Wed, 02 Jan 2013 02:09:46 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46934</guid><dc:creator>Rob Farley</dc:creator><description>&lt;p&gt;Ok Simon... how about in AdventureWorksDW, which has 60K records in dbo.FactInternetSales? I get it's still not huge, but it shows that it's easy to have a bad plan come out.&lt;/p&gt;

&lt;p&gt;CREATE INDEX ix1 ON dbo.FactInternetSales(OrderDateKey) INCLUDE (UnitPrice);&lt;/p&gt;

&lt;p&gt;CREATE INDEX ix2 ON dbo.FactInternetSales(UnitPrice) INCLUDE (OrderDateKey);&lt;/p&gt;

&lt;p&gt;SELECT MIN(OrderDateKey)&lt;br&gt;
FROM dbo.FactInternetSales&lt;br&gt;
WHERE UnitPrice between 0 and 100;&lt;br&gt;
--Prefers ix1. 20 reads&lt;/p&gt;

&lt;p&gt;SELECT MIN(OrderDateKey)&lt;br&gt;
FROM dbo.FactInternetSales&lt;br&gt;
WHERE UnitPrice between 600 and 700;&lt;br&gt;
--Prefers ix2. 2 reads.&lt;/p&gt;

&lt;p&gt;SELECT MIN(OrderDateKey)&lt;br&gt;
FROM dbo.FactInternetSales WITH (INDEX(ix2))&lt;br&gt;
WHERE UnitPrice between 0 and 100;&lt;br&gt;
--Forced ix2. 150 reads.&lt;/p&gt;

&lt;p&gt;Of course, with a correlation between the two fields, it could be possible to show an example of ix1 being particularly nasty as well.&lt;/p&gt;</description></item><item><title>re: The SQL Query Optimizer – when Logical Order can get it wrong</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/30/the-sql-query-optimizer-when-logical-order-can-get-it-wrong.aspx#46980</link><pubDate>Thu, 03 Jan 2013 13:48:02 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46980</guid><dc:creator>Chris Adkin</dc:creator><description>&lt;p&gt;Rob,&lt;/p&gt;
&lt;p&gt;You may be alluding to what the optimizer team refer to as the ascending key problem, whereby data gets added to a table in ascending key order, for large tables ( opposite of the 'Small' table you mentioned - I know ), this can be mitigated against via trace flag 2371. You may also want to look at connect item 676224, another popular database engine includes the ability for hints to be used on statements that specify:&lt;/p&gt;
&lt;p&gt;1. That data in the relevant table(s) should be sampled in order to produce better plans.&lt;/p&gt;
&lt;p&gt;2. How aggresively the sampling should be&lt;/p&gt;
&lt;p&gt;Refer to connect item 676224 &lt;/p&gt;</description></item></channel></rss>