THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Tibor Karaszi

Is statistics over non-indexed columns updated by index rebuild?

This blog has moved! You can find this content at the following new location:

http://sqlblog.karaszi.com/is-statistics-over-non-indexed-columns-updated-by-index-rebuild/

Published Thursday, August 9, 2007 6:18 PM by TiborKaraszi

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

James Luetkehoelter said:

Nice to point out - I find it amazing how often the two (indexes and statistics) are misunderstood. Also, when indexes or statistics are not updated - doing mass data loads, etc.

Well done, a simple question with a clear answer, as well as a more technical discussion of why. Great post.

August 10, 2007 12:06 PM
 

Alejandro Mesa said:

Hi Tibor,

Good question, short answer, great explanation and test script.

Microsoft should be more precise describing STATS_DATE function. The [index_id] could be also the [stats_id]. As you know, previous to 2005, they use to be in [sysindexes], so I guess that is why they continue mentioning just [index_id].

select [name], stats_date([object_id], index_id) as [stats_date]

from

(

select [name], [object_id], index_id from sys.indexes where [name] = 'x1' and [object_id] = object_id('t')

union all

select [name], [object_id], stats_id from sys.stats where [name] = 's1' and [object_id] = object_id('t')

) as t

Cheers,

AMB

August 22, 2007 3:15 PM
 

TiborKaraszi said:

Thanks for the kind words, James and Alejandro. :-)

Alejandro:

I did consider what you say above before I posted the blog. But, MS seems to have goofed on this. The problem is that the STATS_DATE function only takes two parameters: table_id and index_id. And, you can find a row in sys.stats with the same "index id" as a row in sys.indexes. So, how to tell these apart? Basically, STATS_DATE() needs a 3:rd parameter like "index" or "statistics". Try below (you might need to try a few tables before you find "duplicates"):

USE Adventureworks

DECLARE @o_id int

SET @o_id = OBJECT_ID('Production.Product')

SELECT 'Index' AS "type", [name] AS "Index or stats name", index_id AS "Index or stats id" FROM sys.indexes WHERE [object_id] = @o_id

UNION ALL

SELECT 'Stats' AS "type", [name] AS "Index or stats name", stats_id AS "Index or stats id" FROM sys.stats WHERE [object_id] = @o_id

ORDER BY "Index or stats id"

On my machine, for id 1 - 4, we have both statistics and indexes. And for id 5 - 10 only statistics. An interesting thing is whether STATS_DATE accepts an id for which we *dont* have an index?

SELECT STATS_DATE(OBJECT_ID('Production.Product'), 7)

Hmm, yes it does and probably reports the datetime for that statistics. I don't like this, at all. Makes me wonder what STATS_DATE reports if we have both an index and statistics with the same id? I would guess the index, but even if a test would show that, we can't say for sure. Time for a connect entry...

August 23, 2007 3:13 AM
 

TiborKaraszi said:

August 23, 2007 3:25 AM
 

Alejandro Mesa said:

Hi Tibor,

I noticed that the rows matching between sys.indexes ans sys.stats, for the same [object_id], are really related, in other words, the statistics of the indexes. Let us tweak your last query a little bit:

USE Adventureworks

DECLARE @o_id int

SET @o_id = OBJECT_ID('Production.Product')

select

*

from

(

SELECT [object_id], 'Index' AS "type", [name] AS "Index or stats name", index_id AS "Index or stats id"

FROM sys.indexes

WHERE [object_id] = @o_id

) as si

inner join

(

SELECT [object_id], 'Stats' AS "type", [name] AS "Index or stats name", stats_id AS "Index or stats id"

FROM sys.stats

WHERE [object_id] = @o_id

) as ss

on si.[object_id] = ss.[object_id]

and si.[Index or stats id] = ss.[Index or stats id]

go

If we can find matching rows, where the statistic is not related to the index, then we have a case.

What do you think?

Regards,

AMB

August 23, 2007 7:54 AM
 

TiborKaraszi said:

Mon dieu! But of course... you are correct!

(Sorry, I've been watching too much Poirot lately ;-) ).

A certain value for an index/stats id refers to the *same* index/statistics, not different. I didn't even pay attention to the index/stats name from my earlier query...

I guess I was stuck in 2000-mode somehow, and figured a UNION ALL would basically give us old sysindexes. Not the case.

So, no problem with the STATS_DATE function, then. It is just a matter of cleaning up the BOL description. I'll update my connect entry and post a new BOL feedback.

Thanks Alejandro for sticking with me and sorting it out... :-)

So, no problem with the STATS_DATE function, then. It is just a matter of cleaning up the BOL description. I'll update my connect entry.

August 23, 2007 10:48 AM
 

Alejandro Mesa said:

As you mentioned, It just a matter of cleaning up BOL description. BOL should refer to [stats_id], instead [index_id], for STATS_DATE function. Personally, I think that the name of the function implies it.

Cheers,

AMB

August 23, 2007 12:14 PM
 

Denis Gobo said:

A year in review, The 21 + 1 best blog posts on SQLBlog Best posts according to me, it might have been

December 27, 2007 4:11 PM
 

Kal said:

Hi Tibor,

This behaviour seems to be different when you use DBCC DBREINDEX instead of ALTER INDEX ... REBUILD. When I used DBCC DBREINDEX ('Sales.CustomerAddress') on AdventureWorks, STATS_DATE showed me most recent date for all statistics, including the auto-generated stats (_WA_Sys..) on this table.

Have you tested your theory using DBCC DBREINDEX?

February 4, 2008 6:15 PM
 

TiborKaraszi said:

Hi Kal,

Hmm, yes indeed. DBCC DBREINDEX does update statistics over the non-indexed column, something which ALTER INDEX ALL ... REBUILD do not do. Thanks for catching that, I would have assumed that these two behaves the same...

February 12, 2008 6:14 AM

Leave a Comment

(required) 
(required) 
Submit
Privacy Statement