THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Tibor Karaszi

Is statistics over non-indexed columns updated by index rebuild?

This blog has moved! You can find this content at the following new location:

http://sqlblog.karaszi.com/is-statistics-over-non-indexed-columns-updated-by-index-rebuild/

Published Thursday, August 9, 2007 6:18 PM by TiborKaraszi

Comments

 

James Luetkehoelter said:

Nice to point out - I find it amazing how often the two (indexes and statistics) are misunderstood. Also, when indexes or statistics are not updated - doing mass data loads, etc.

Well done, a simple question with a clear answer, as well as a more technical discussion of why. Great post.

August 10, 2007 12:06 PM
 

Alejandro Mesa said:

Hi Tibor,

Good question, short answer, great explanation and test script.

Microsoft should be more precise describing STATS_DATE function. The [index_id] could be also the [stats_id]. As you know, previous to 2005, they use to be in [sysindexes], so I guess that is why they continue mentioning just [index_id].

select [name], stats_date([object_id], index_id) as [stats_date]

from

(

select [name], [object_id], index_id from sys.indexes where [name] = 'x1' and [object_id] = object_id('t')

union all

select [name], [object_id], stats_id from sys.stats where [name] = 's1' and [object_id] = object_id('t')

) as t

Cheers,

AMB

August 22, 2007 3:15 PM
 

TiborKaraszi said:

Thanks for the kind words, James and Alejandro. :-)

Alejandro:

I did consider what you say above before I posted the blog. But, MS seems to have goofed on this. The problem is that the STATS_DATE function only takes two parameters: table_id and index_id. And, you can find a row in sys.stats with the same "index id" as a row in sys.indexes. So, how to tell these apart? Basically, STATS_DATE() needs a 3:rd parameter like "index" or "statistics". Try below (you might need to try a few tables before you find "duplicates"):

USE Adventureworks

DECLARE @o_id int

SET @o_id = OBJECT_ID('Production.Product')

SELECT 'Index' AS "type", [name] AS "Index or stats name", index_id AS "Index or stats id" FROM sys.indexes WHERE [object_id] = @o_id

UNION ALL

SELECT 'Stats' AS "type", [name] AS "Index or stats name", stats_id AS "Index or stats id" FROM sys.stats WHERE [object_id] = @o_id

ORDER BY "Index or stats id"

On my machine, for id 1 - 4, we have both statistics and indexes. And for id 5 - 10 only statistics. An interesting thing is whether STATS_DATE accepts an id for which we *dont* have an index?

SELECT STATS_DATE(OBJECT_ID('Production.Product'), 7)

Hmm, yes it does and probably reports the datetime for that statistics. I don't like this, at all. Makes me wonder what STATS_DATE reports if we have both an index and statistics with the same id? I would guess the index, but even if a test would show that, we can't say for sure. Time for a connect entry...

August 23, 2007 3:13 AM
 

TiborKaraszi said:

August 23, 2007 3:25 AM
 

Alejandro Mesa said:

Hi Tibor,

I noticed that the rows matching between sys.indexes ans sys.stats, for the same [object_id], are really related, in other words, the statistics of the indexes. Let us tweak your last query a little bit:

USE Adventureworks

DECLARE @o_id int

SET @o_id = OBJECT_ID('Production.Product')

select

*

from

(

SELECT [object_id], 'Index' AS "type", [name] AS "Index or stats name", index_id AS "Index or stats id"

FROM sys.indexes

WHERE [object_id] = @o_id

) as si

inner join

(

SELECT [object_id], 'Stats' AS "type", [name] AS "Index or stats name", stats_id AS "Index or stats id"

FROM sys.stats

WHERE [object_id] = @o_id

) as ss

on si.[object_id] = ss.[object_id]

and si.[Index or stats id] = ss.[Index or stats id]

go

If we can find matching rows, where the statistic is not related to the index, then we have a case.

What do you think?

Regards,

AMB

August 23, 2007 7:54 AM
 

TiborKaraszi said:

Mon dieu! But of course... you are correct!

(Sorry, I've been watching too much Poirot lately ;-) ).

A certain value for an index/stats id refers to the *same* index/statistics, not different. I didn't even pay attention to the index/stats name from my earlier query...

I guess I was stuck in 2000-mode somehow, and figured a UNION ALL would basically give us old sysindexes. Not the case.

So, no problem with the STATS_DATE function, then. It is just a matter of cleaning up the BOL description. I'll update my connect entry and post a new BOL feedback.

Thanks Alejandro for sticking with me and sorting it out... :-)

So, no problem with the STATS_DATE function, then. It is just a matter of cleaning up the BOL description. I'll update my connect entry.

August 23, 2007 10:48 AM
 

Alejandro Mesa said:

As you mentioned, It just a matter of cleaning up BOL description. BOL should refer to [stats_id], instead [index_id], for STATS_DATE function. Personally, I think that the name of the function implies it.

Cheers,

AMB

August 23, 2007 12:14 PM
 

Denis Gobo said:

A year in review, The 21 + 1 best blog posts on SQLBlog Best posts according to me, it might have been

December 27, 2007 4:11 PM
 

Kal said:

Hi Tibor,

This behaviour seems to be different when you use DBCC DBREINDEX instead of ALTER INDEX ... REBUILD. When I used DBCC DBREINDEX ('Sales.CustomerAddress') on AdventureWorks, STATS_DATE showed me most recent date for all statistics, including the auto-generated stats (_WA_Sys..) on this table.

Have you tested your theory using DBCC DBREINDEX?

February 4, 2008 6:15 PM
 

TiborKaraszi said:

Hi Kal,

Hmm, yes indeed. DBCC DBREINDEX does update statistics over the non-indexed column, something which ALTER INDEX ALL ... REBUILD do not do. Thanks for catching that, I would have assumed that these two behaves the same...

February 12, 2008 6:14 AM
New Comments to this post are disabled
Privacy Statement