THE SQL Server Blog Spot on the Web

Welcome to - The SQL Server blog spot on the web Sign in | |
in Search

Kalen Delaney

Geek City: Accessing Distribution Statistics

Distribution statistics are one of the most important sources of information that the Query Optimizer uses to determine a good query plan. In this post, I’m not going to tell you everything about distribution statistics. I’m just going to show you a few tricks for getting access to the statistics.

If you want a deeper understanding of what the statistics keep track of, and you don’t have any of my SQL Server Internals books handy, check out this whitepaper: Statistics Used by the Query Optimizer in Microsoft SQL Server 2008 

Microsoft does provide us a tool called DBCC SHOW_STATISTICS for examining the distribution statistics.

Microsoft has gradually been making more of the more of the old DBCC commands available as DMVs, even some undocumented ones. For example, one of my favorites, DBCC IND, has now been replaced in SQL Server 2012  by sys.dm_db_database_page_allocations.

I have been wishing for several versions that Microsoft would make the DBCC SHOW_STATISTICS information available as a DMV. But it hasn’t happened yet, and I’m tired of waiting, so I decided to do something about it.

My solution is not quite as easy to use as a DMV might be, but it allows you to get the information that DBCC SHOW_STATISTICS provides into a set of three tables that can then be saved into a more permanent location of your choice, and/or queried as desired.

DBCC SHOW_STATISTICS returns three sets of information, with different columns in the output, so three different tables are needed. The DBCC SHOW_STATISTICS command can be called with an argument that specifies that you just want one of the three sets returned. The options are

WITH STAT_HEADER – returns basic info such as last update date, and number of rows in the table/index. Also reports number of steps returned for HISTOGRAM section.

WITH DENSITY_VECTOR – returns density info for each left-based subset of columns in the index. For example, an index on (lastname, firstname, city) would have a density value for (lastname), for (lastname, firstname), and for (lastname, firstname, city). Each density value is a single number representing the average number of occurrences and depends on the number of distinct values. For example, if there are only 2 possible values in the column, the density would be 0.5. Multiplying density by the number of rows in the STAT_HEADER section would give the average expected rowcount if a query was executed looking for an equality on the specified column(s).

WITH HISTOGRAM – returns a set of ordered values from the first column of the index, creating a histogram. This histogram provides the optimizer with selectivity information for specific values or ranges of values in the first column of the index.

To collect this info, I will use one of my favorite tricks, which is to create a table in the master database with a name starting with sp_. (I’ve written about this trick several times, including in this earlier blog post.) Once I have the table(s) created, I can access them from any database. So here are the three tables:

USE Master;

IF  (SELECT object_id('sp_stat_header')) IS NOT NULL
  DROP TABLE sp_statsheader;
CREATE TABLE sp_stat_header
(   Name sysname,
    Updated datetime,
    Rows bigint,
    Rows_sampled bigint,
    Steps smallint,
    Density numeric (10,9),
    Average_key_length smallint,
    String_index char(3),
    Filter_expression nvarchar(1000),
    Unfiltered_rows bigint);

IF  (SELECT object_id('sp_density_vector')) IS NOT NULL
  DROP TABLE sp_density_vector;
CREATE TABLE sp_density_vector
(  all_density numeric(10,8),
   average_length smallint,
   columns nvarchar(2126) );

IF  (SELECT object_id('sp_histogram')) IS NOT NULL
  DROP TABLE sp_histogram;
CREATE TABLE sp_histogram
(   RANGE_HI_KEY sql_variant,
    RANGE_ROWS bigint,
    EQ_ROWS bigint,
    AVG_RANGE_ROWS bigint);

The second trick is to use INSERT … EXEC to execute a DBCC statement and populate the tables. I will build the DBCC  command dynamically, after capturing the schema, table and index names in variables. You of course could take this code and turn it into a stored procedure, for which the schema, table and index names are passed as parameters. I’ll use as an example a table in the AdventureWorks2008 sample database, just so you can try running the code, and I can verify that it actually works!

I will use the table Sales.SalesOrderDetail and the index IX_SalesOrderDetail_ProductID. So the object name (@oname) is SalesOrderDetail, the schema name (@sname) is Sales, and the index name (@iname) is IX_SalesOrderDetail_ProductID.

USE AdventureWorks2008;
DECLARE @oname sysname,  @iname sysname, @sname sysname

SELECT @oname = 'SalesOrderDetail',  @sname = 'Sales', @iname = 'IX_SalesOrderDetail_ProductID';
-- Update the object name to include the schema name, because that is the format the DBCC command expects
SELECT @oname = @sname +'.' + @oname;

TRUNCATE TABLE sp_stat_header;
INSERT INTO sp_stat_header
    EXEC ('DBCC SHOW_STATISTICS(['+ @oname + '],' + @iname +') WITH STAT_HEADER');

TRUNCATE TABLE sp_density_vector;
INSERT INTO sp_density_vector
    EXEC ('DBCC SHOW_STATISTICS(['+ @oname + '],' + @iname +') WITH DENSITY_VECTOR');

TRUNCATE TABLE sp_histogram;
INSERT INTO sp_histogram
    EXEC ('DBCC SHOW_STATISTICS(['+ @oname + '],' + @iname +') WITH HISTOGRAM');

So now you can look at the values collected and filter or query in any way, or use SELECT INTO to save them into another table, so the sp_ tables can be used the next time you want to capture distribution statistics information.

SELECT * FROM sp_stat_header;

SELECT * FROM sp_density_vector;

SELECT * FROM sp_histogram;


Let me know if you find this useful, and especially if you embellish it to create a procedure or an automated process of your own!



Published Friday, January 18, 2013 1:08 PM by Kalen Delaney
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS



sfibich said:

I’m guessing you already know but SQL 2008 R2 introduces a new DMV sys.dm_db_stats_properties that has almost all of the same properties as the DBCC show_statistics WITH STAT_HEADER returns.  The DMV is missing the density which I believe is no longer used by the optimizer, string index, average key length, and filtered expression.  The filtered expression column can be picked up from sys.stats filter definition column.  It’s not a perfect fit but if you can do with the average key length and the string index flag it’s a nice alternative.  You can also pull in the average record length for the 0 level index of a statistic if it has one from the sys.dm_db_index_physical_stats DMV using the DETAILED setting.  It’s not exactly the same thing but it may give some insight to the statistic.  This combination has been getting me through an high level look at statistics for a database when I’m having performance problems.

I like your solution as it solves the issue for getting some of the more detailed statistics information.

something like...






sys.dm_db_stats_properties (object_id('FactInternetSales'), 2) as ddsp

inner join

sys.stats ss


ddsp.object_id =ss.object_id


ddsp.stats_id =ss.stats_id

January 22, 2013 10:14 AM

Kalen Delaney said:

Thanks sfibich... I actually just found out about the new DMV when this issue of programmatically accessing stats info was being discussed on a private forum. Someone asked why we couldn't use sys.dm_db_stats_properties, and that was the first time I saw it! But as you say, it doesn't have everything.

Thanks again


January 22, 2013 1:08 PM

Kalen Delaney said:

  I just started using a new DMV (one that’s actually an ‘F’ not a ‘V’, as in Function) that gives

April 9, 2013 9:39 PM

Kalen Delaney said:

I just started using a new DMV (one that’s actually an ‘F’ not a ‘V’, as in Function) that gives us more

April 9, 2013 9:40 PM

Jeff Humphreys said:

Thank you! I forgot about the INSERT INTO preceding a command feature.

This is really helpful for dynamically identify cardinality for columns, as well as identifying the top 200 most common values in each column. Since I have auto statistics on, I just pass the statistic name into DBCC and I get the generated stats on every column of every table.

November 13, 2014 9:49 AM

Kalen Delaney said:

Hi Jeff, I'm glad you found this useful. Keep in mind that the 200 steps in the histogram are very unlikely to be the 200 most common values. They are sample points spread throughout the ordered key set that give us useful information on the intervals.



November 13, 2014 7:11 PM

Lori Edwards said:

Kalen,  This is great!  I'll definitely using it.  Thanks!

April 28, 2015 1:24 PM

Steve Sarre said:

Hi Kalen

Thanks for that

(Just 2 years out of date)

I ran something like this on all of our databases on a server running

Microsoft SQL Server 2008 R2 (SP1)

and found that some columns needed to be int, rather than smallint


[average_length] [int]


[Average_key_length] [int]

I also found that where there are statistics on timestamp columns, you cannot use the SQL_VARIANT data type, you have to use varchar(max)


[RANGE_HI_KEY] [varchar] (max)



London England

May 1, 2015 9:23 AM

Kalen Delaney said:

Thanks Steve... who would thunk average key length needed to be more than 32K bytes... but then of course this can keep track of any stats, not just stats on indexed columns. So we could have stats on a varchar(max) or some such.

Thanks again!


May 4, 2015 9:08 PM

king said:



January 16, 2018 11:30 PM

aaaa said:

mt0417 mt0417

April 16, 2018 8:51 PM

kanker paru paru bisa sembuh tanpa operasi said:

June 28, 2018 6:41 PM

kesehatan said:

July 4, 2018 10:58 PM

obat meningkatkan kecerdasan otak said:

July 13, 2018 9:44 PM

obat keputihan berlebihan dan berbau said:

July 15, 2018 8:52 PM

linying123 said:



July 16, 2018 8:24 PM

dongdong8 said:



July 23, 2018 11:47 PM

obat diabetes said:

August 1, 2018 11:18 PM

obat bronkitis anak secara alami said:

August 7, 2018 7:49 PM

obat sinusitis said:

August 10, 2018 7:23 PM

yaoxuemei said:


August 15, 2018 2:15 AM

masker wajah alami untuk menghilangkan jerawat said: Masker wajah alami untuk menghilangkan jerawat Obat liver alami aman dan terbaik Obat herbal sinusitis kronis terbaik Obat bronkitis anak secara alami Obat Kanker Payudara terbaru 2018 Cara meningkatkan nafsu makan secara alami Obat herbal diabetes menurunkan gula darah Obat herbal kusta terbaik Obat benjolan di leher tanpa operasi Obat Herbal Kanker Otak tuntas hingga akar Obat TBC alami aman tanpa efek samping Obat radang amandel sembuh tanpa operasi Obat asam urat alami tanpa efek sampinfg Obat tetes Sariawan secara alami Cara mengobati mata bengkak dengan cepat Obat tradisional kanker lambung terbaik Obat kanker paru paru Obat usus buntu alami tanpa operasi Obat asam lambung naik cara alami

August 20, 2018 8:59 PM

obat herbal koresterol terbaik said:

August 23, 2018 10:58 PM

obat usus buntu alami tanpa operasi said:

August 24, 2018 12:21 AM

obat benjolan di tubuh said:

August 30, 2018 7:40 PM

obat wasir said:

September 4, 2018 8:26 PM

obat katarak said:

September 16, 2018 5:46 PM

obat maag said:

September 17, 2018 7:11 PM

xiaojun said:

20180928 junda

September 27, 2018 10:58 PM

chenqiuying said:


October 10, 2018 6:33 PM

xiaozhengzheng said:

2018-11-20 xiaozhengzheng

November 19, 2018 8:31 PM

Leave a Comment


This Blog


Favorite Non-technical Sites or Blogs

Privacy Statement