<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Joe Chang : Optimizer</title><link>http://sqlblog.com/blogs/joe_chang/archive/tags/Optimizer/default.aspx</link><description>Tags: Optimizer</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>Run-time Execution Plan Options</title><link>http://sqlblog.com/blogs/joe_chang/archive/2009/02/21/run-time-execution-plan-options.aspx</link><pubDate>Sat, 21 Feb 2009 18:11:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:12066</guid><dc:creator>jchang</dc:creator><slash:comments>3</slash:comments><comments>http://sqlblog.com/blogs/joe_chang/comments/12066.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/joe_chang/commentrss.aspx?PostID=12066</wfw:commentRss><description>&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;FONT face="Times New Roman" size=3&gt;What are the top core SQL Server engine performance issues today, after all the improvements that have gone into 2005 and 2008? (I am excluding matters beyond the power of Microsoft, like eliminating bad developers.)&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;FONT face="Times New Roman" size=3&gt;SQL Server already has statement level recompile for multi-statement stored procedures. But there is no option to change a plan for a single statement in the course of execution. SQL Server also it has the OPTIMIZE FOR hint and Plan Guide options. This allows for a single plan for a particular query. What if there needs to be multiple plans for a single query?&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;B style="mso-bidi-font-weight:normal;"&gt;&lt;FONT size=3&gt;&lt;FONT face="Times New Roman"&gt;Runtime Plan Change.&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;FONT face="Times New Roman" size=3&gt;Consider a single SQL query, with multiple table access steps in the execution plan. It is not always possible to estimate row count at each step in the execution plan. This is why it is called an estimate. (I know some complain when the estimate is off by 10%, Lubor once said he is interested in estimates off by 2X, but I consider catastrophic errors tend to be orders of magnitude, 10-1000X or more). Suppose the SQL Server engine starts the first steps of execution plan. The estimated row count was low, so the subsequent steps employ index seeks with nested loops joins to complete the query. In actuality, the row count is very large, so table scans with hash or merge joins are much better. The opposite scenario is a very high estimate row count that turns out to be low. The remaining steps of the execution plan involve table scans with hash or merge joins, when index seeks with nested loops joins would be far more efficient. Now there are two options. One is to retain the current partial results, but change subsequent steps, to account for the new row count. In some cases, it will be much better to discard the current progress and generate a completely new plan based on the new row count. If this could be implemented, my thoughts are that this should only be enabled explicitly for known problem queries instead of a system wide on/off switch.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;B style="mso-bidi-font-weight:normal;"&gt;&lt;FONT size=3&gt;&lt;FONT face="Times New Roman"&gt;Multiple Execution Plans for a single SQL&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;FONT face="Times New Roman" size=3&gt;Now consider a single SQL query, involving multiple search arguments. Depending on the input parameters, the row count involved for each argument varies drastically. So the Optimize For and Plan Guide options can only optimize for one particular scenario. I have handled this by making two or more copies of the original stored procedure, then having the original procedure check actual data distribution, and then call one of the copies. The sub-procedures are identical, but copy A is only called when parameter 1 is high, parameter 2 is low, and Copy 2 is called when parameter 1 is low and 2 is high. It is not necessary to have one copy for each possibility, only if that possibility requires a significantly different execution plan. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;FONT face="Times New Roman" size=3&gt;My thought on handling this automatically is to extend the OPTIMIZE FOR hint to from specifying a single parameter value to multiple. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;FONT face="Times New Roman" size=3&gt;Example: instead of &lt;/FONT&gt;&lt;/P&gt;&lt;o:p&gt;&lt;FONT face="Times New Roman" size=3&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;SPAN style="COLOR:blue;"&gt;OPTION&lt;/SPAN&gt; (OPTIMIZE &lt;SPAN style="COLOR:blue;"&gt;FOR&lt;/SPAN&gt; @P1 = 1, @P2 = ‘A’, …) &lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;New:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;SPAN style="COLOR:blue;"&gt;OPTION&lt;/SPAN&gt; (OPTIMIZE &lt;SPAN style="COLOR:blue;"&gt;FOR&lt;/SPAN&gt; @P1 = 1 or 2, @P2 = ‘A’ or ‘B’, …) &lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;FONT face="Times New Roman" size=3&gt;The presumption above is that @P1 values 1 and 2 have drastically different distributions that really need separate execution plans. So the SQL Server engine would generate a plan for each option. At run time, depending on whether the input parameter has distribution closer option 1 or 2, the appropriate plan is selected. Of course, this is a rather sophisticated hint that should not be used by anyone without detailed understanding of the execution plan. Also consider that as data distribution changes over time (or on statistics recomputed) the Optimize For values might be rendered ineffective. The SQL Server optimizer could also check whether each Optimizer For option actually leads to a significantly different plan, and discard irrelevant options. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;FONT face="Times New Roman" size=3&gt;Microsoft might take the position that statement level recompile is the right solution for this. My argument only stands if the cost of recompile is high relative to the cost of execute for the desired plan. So maybe an alternative is I want to use the&amp;nbsp;OPTIMIZE FOR in the low row count scenario, but a RECOMPILE if the actual parameter value has high row count.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;FONT face="Times New Roman" size=3&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;&lt;FONT face="Times New Roman" size=3&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;B style="mso-bidi-font-weight:normal;"&gt;Correlated Column Statistics&lt;o:p&gt;&lt;/o:p&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;I posted this a comment in another blog, but am moving here. I do not consider a top issue, but desirable. The reason SQL Server needs correlated column statistics (it is in at the vector level, but not at the histogram). &lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;Consider the query:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;SPAN style="COLOR:blue;"&gt;SELECT&lt;/SPAN&gt; * &lt;SPAN style="COLOR:blue;"&gt;FROM&lt;/SPAN&gt; Women &lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;SPAN style="COLOR:blue;"&gt;WHERE&lt;/SPAN&gt; Attractiveness &amp;gt; 9 &lt;SPAN style="COLOR:#999999;"&gt;AND&lt;/SPAN&gt; MonthlyShoppingExpenses &amp;lt; $3000 &lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;Or: &lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;SPAN style="COLOR:blue;"&gt;SELECT&lt;/SPAN&gt; * &lt;SPAN style="COLOR:blue;"&gt;FROM&lt;/SPAN&gt; Men&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;SPAN style="COLOR:blue;"&gt;WHERE&lt;/SPAN&gt; FinancialStability &amp;gt; 9 &lt;SPAN style="COLOR:#999999;"&gt;AND&lt;/SPAN&gt; Fidelity &amp;gt; 9 &lt;SPAN style="COLOR:#999999;"&gt;AND&lt;/SPAN&gt; IsSingle &amp;gt; 50%&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;The individual column statistics may indicate 10% on each column condition, leading one to think that the combined AND conditions yields 1%, which will still have a respectable row count output. But in fact, if correlation is taken into account, a drastically&amp;nbsp;lower row count is the result. Yes, I know: get a life, and I have been called MCP before the MS had the MCP program.&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;Additional Items&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;B style="mso-bidi-font-weight:normal;"&gt;Index Seek with Key Lookup versus Table/Clustered Index Scan&lt;o:p&gt;&lt;/o:p&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;In previous blogs, I also complained about the static approximate 4:1 (1350:320) ratio in table scan versus index seek followed key lookup. That is, the cross-over point from an index seek with key lookup to table scan occurs when the number of rows requiring key lookups reaches approximately one-fourth the number of pages. SQL Server 2008 has the hint &lt;SPAN style="COLOR:blue;"&gt;OPTION&lt;/SPAN&gt; (&lt;SPAN style="COLOR:blue;"&gt;TABLE HINT&lt;/SPAN&gt; (table,&lt;SPAN style="COLOR:blue;"&gt;FORCE SEEK&lt;/SPAN&gt;)), which pushes the index seek with key lookup regardless of the number of rows.&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;I commented that this was rather like giving us a sledge hammer when a chisel was preferred. My preference is to have a hint that changes the cross-over ratio from 4:1 to 1:1.&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;B style="mso-bidi-font-weight:normal;"&gt;Hash and Sort spill to tempdb&lt;o:p&gt;&lt;/o:p&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;Back when SQL Server 2005 was in late beta, I looked at SQL Server 2000 and 2005 execution plans with Hash (Join and Match) and Sort operations. I noted that the point at which hash and sort operations incurred an IO cost was function of the size of the intermediate table and system memory. Naturally AWE memory in 32-bit system does not count, i.e. only directly addressable memory counts. I did not verify that this was the actual point at which a hash or sort spooled to tempdb,&amp;nbsp;and I did not notice any performance issues on my systems with adequate storage performance. Of course, many people have really crappy storage performance. Should there be a tunable parameter for handling this? Especially now that full 64-bit systems with huge (128GB) memory (and crappy storage performance) are common? Or does this fall under matters beyond the power of Microsoft?&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;B style="mso-bidi-font-weight:normal;"&gt;Default Settings for Cost Threshold for Parallelism and Max Degree of Parallelism&lt;o:p&gt;&lt;/o:p&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;Oh yeah,&amp;nbsp;probably a tier 2 item. It is high time Microsoft change the default settings for these. A default of 0 or all available is potentially silly and severely adverse on systems with multi-core processors. The current cost threshold for parallelism of 5 is also really silly. I recall seeing queries that run in 50ms (non-parallel) get a parallel plan. This is way too low considering the overhead of setting up parallel execution. At minimum, the cost threshold should be recalibrated so that parallelism is not initiated until true CPU (on Core 2 or later architecture) is around 300-500 milli-sec. This is almost getting like how Oracle left the default buffer cache setting to 8-16MB, when system memory typically reached GBs, and the "DBA" neglected to change the default.&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;B style="mso-bidi-font-weight:normal;"&gt;Multi-level Cost Threshold and Max Degree of Parallelism&lt;o:p&gt;&lt;/o:p&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;I am thinking this should really be a bi-level or even multi-level setting. That is, a plan cost of 50-1000 allows MAXDOP 2-4, and plan cost 1000-10000 targets MAXDOP 8-16, and so on. (I am winging these numbers. If considered, then I can a more serious study on proper values.)&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;&lt;B style="mso-bidi-font-weight:normal;"&gt;New Parallel Execution Plan Formulas&lt;o:p&gt;&lt;/o:p&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN:0in 0in 0pt;"&gt;I have previously griped about the formulas used in parallel execution plans (OK, so you name it, and I have complained about it!). Anyways, the current execution plan cost model is preponderantly IO weighted, and IO costs are not reduced in parallel execution plans (except for large hash and sort operations). This has the effect of inhibiting parallel execution when in fact parallel execution would be beneficial, especially when almost all hot data is memory, so the IO argument does not hold. One more complaint/request. The bitmap filter operation in large parallel hash operations really help. In some cases, the parallel operation must be inhibited, but it would still be nice to employ the bitmap filter, even though large hash operations should be parallel.&lt;/P&gt;&lt;/FONT&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=12066" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/joe_chang/archive/tags/Performance/default.aspx">Performance</category><category domain="http://sqlblog.com/blogs/joe_chang/archive/tags/Optimizer/default.aspx">Optimizer</category><category domain="http://sqlblog.com/blogs/joe_chang/archive/tags/Query+Hints/default.aspx">Query Hints</category></item></channel></rss>