THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Benjamin Nevarez

Fooling the Query Optimizer

Did you ever wanted to know which execution plans the Query Optimizer would generate for your queries should your tables have millions of records? You can actually generate those plans by using the undocumented ROWCOUNT and PAGECOUNT options of the UPDATE STATISTICS statement. These options can be used on small or empty tables and can be helpful for testing in some scenarios where you may not want to spent time or disk space creating big tables.

 

By using this method you are tricking the Query Optimizer as it will generate execution plans using cardinality estimations as if the table really had millions of records. Note that this option, available since SQL Server 2005, only helps in creating the execution plan for your queries. Actually running the query will use the real data and of course will execute faster than a table with millions of records.

 

UPDATE STATISTICS WITH ROWCOUNT, PAGECOUNT does not change the table statistics, only the counter of number of rows and pages of a table. But the Query Optimizer uses this information to estimate the cardinality of queries as I will show later. Also keep in mind that these are undocumented and unsupported options and should not be used in any production environment.

 

Let us see an example. Run the following query to create a new table on the AdventureWorks database

 

select * into dbo.Address

from Person.Address

Inspect the number of rows by running the following queries. It must show 19,614 rows.

select * from sys.partitions

where object_id = object_id('dbo.Address')

 

select * from sys.dm_db_partition_stats

where object_id = object_id('dbo.Address')

 

Run the following query

 

select * from dbo.Address

where city = 'London'

 

Running this query will create new statistics for the city column and will show the following plan. Note that the estimated number of rows is 434 and it is using a simple Table Scan operator

 

 clip_image002

 

We can see where the Query Optimizer is getting the estimated number of rows by inspecting the statistics object. Run this query to see the name of the statistics object

 

select * from sys.stats

where object_id = object_id('dbo.Address')

 

Then use the displayed statistics object name in the following statement (the name may be different in your case)

 

dbcc show_statistics('dbo.Address', _WA_Sys_00000004_46136164)

 

By looking at the histogram you can find the value 434 on EQ_ROWS for the RANGE_HI_KEY value ‘London’ (Statistics and histograms are explained on previous posts in this blog)

 

clip_image004

 

Now run the UPDATE STATISTICS WITH ROWCOUNT, PAGECOUNT (you can specify any other value for rowcount and pagecount)

 

update statistics dbo.Address with rowcount = 1000000, pagecount = 100000

 

If you inspect the number of rows again from sys.partitions or sys.dm_db_partition_stats, as shown previously, it will now show 1,000,000 rows. sys.dm_db_partition_stats also shows the new number of pages. Clear the plan cache and run the query again

 

dbcc freeproccache

go

select * from dbo.Address

where city = 'London'

 

Note that the estimated number of rows has changed from 434 to 22,127.1 and that a different plan was generated using this new cardinality estimation. The Query Optimizer decided to parallelize this plan. But this is a very simple query, more dramatic plan changes can happen with more complex queries.

 

 

clip_image006

 

After execution the actual number of rows obviously is still is 434 but the Query Optimizer is not able to see this value.

 

If you look at the statistics object again, using DBCC SHOW_STATISTICS as shown before, the histogram has not changed. One way to obtain the estimated number of rows shown in the new execution plan is calculating the percentage or fraction of rows for the value ‘London’ from the statistics sample, which in this case is 19,614, as shown on the header of the statistics object. So the fraction is 434 / 19,614 or 022127052. Then obtain the same percentage from the new “current” number of rows which is 1,000,000 calculated as 1,000,000 * 0.022127052 and we get 22,127.1 which is the estimated number of rows displayed in the plan.

 

Finally, drop the table you just created

 

drop table dbo.Address

 

Published Wednesday, January 13, 2010 9:57 PM by Ben Nevarez
New Comments to this post are disabled

About Ben Nevarez

Benjamin Nevarez is a database professional based in Los Angeles, CA, and author of "Inside the SQL Server Query Optimizer". He has also contributed to other SQL Server books including "SQL Server 2012 Internals". Benjamin has 20 years of experience with relational databases and has been working with SQL Server since version 6.5. He holds a Master’s Degree in Computer Science and has been a speaker at many SQL Server conferences, including the PASS Summit and SQL Server Connections. Benjamin's blog is at http://www.benjaminnevarez.com and can be reached by e-mail at admin at benjaminnevarez dot com and on twitter at @BenjaminNevarez.
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement