THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Tibor Karaszi

Heaps, forwarding pointers, ALTER TABLE REBUILD and non-clustered indexes

Let's start with some background on forwarding pointers:

Forwarding pointers in heaps can be a mess to get rid of. A forwarding pointer happens when you modify a row so that the row doesn't fit on the same page anymore. SQL Server moves the row to a new page and leaves a forwarding pointer where the old row used to be. This means that non-clustered indexes are not affected by the moving of the row - it can still point to the old location. So the modification is less costly than if all non-clustered indexes needed to reflect the row move. But having forwarding pointers can be bad for performance when you read data. A perhaps less known fact is that a scan over a table needs to follow forwarding pointers - essentially "jumping back and forth" a lot if you have many forwarding pointers. That can be really bad for performance (if you have table scans, of course). So, how do we get rid of forwarding pointers? Well, we can shrink the database file, but that is a little like curing a headache by jumping into a big meat grinder. We can export all data and re-import it - not very practical.

Greg Linwood reminded me that in SQL Server 2008, we can do ALTER TABLE ... REBUILD. Now, I knew about this option, and every time I tell about it in class I've been thinking silently for myself "I need to test whether this is a way to get rid of fwd pointers". (You generally talk about ALTER TABLE ... REBUILD when you talk about enabling compression on a heap.) So, doing a REBUILD of a table using ALTER TABLE sounds promising. Will it get rid of forwarding pointers? Will it also rebuild all non-clustered indexes?

Quick answer for those who don't care reading the TSQL script:
ALTER TABLE ... REBUILD will remove forwarding pointers, but for some strange reason it will also rebuild all non-clustered indexes on that table.

See the TSQL code below. It was adapted from a challenge by Linchi Shea to produce a data loading script resulting in worst performance (where I immediately thought of forwarding pointers). See for instance http://sqlblog.com/blogs/linchi_shea/archive/2009/06/07/t-sql-exercise-simplest-data-loading-script-to-produce-worst-query-performance-june-7th-update.aspx.

USE tempdb
GO

IF EXISTS (SELECT FROM sysobjects WHERE name 'test')
   
DROP TABLE test
GO
CREATE TABLE test (
   
INT NOT NULL
  ,
x2 INT NOT NULL
  ,
CHAR(10) NOT NULL DEFAULT ('')
  ,
CHAR(10) NOT NULL DEFAULT('')
)
 
DECLARE @rows INT = 666666@toKeep INT = 400000@diff INT
 
INSERT 
test (xx2)
SELECT TOP(@rows)
       
ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS r
      
,ROW_NUMBER() OVER (ORDER BY (SELECT 1)) % 10 AS s
  
FROM master..spt_values a CROSS JOIN master..spt_values b
GO
ALTER TABLE test ALTER COLUMN CHAR(892)
ALTER TABLE test ALTER COLUMN CHAR(100)
GO
 
DECLARE @rows INT = 666666@toKeep INT = 400000@diff INT
DELETE TOP
(@rows @toKeep
  
FROM test WHERE x2 IN(2468)
GO

CREATE INDEX x1 ON test(x)
CREATE INDEX x2 ON test(x2)

SELECT FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('test'), NULL, NULL, 'DETAILED')
--First  run (no nc indexes ): 387157 fwd records (out of 400000 rows), 96104 pages
--Second run (two nc indexes): 387157 fwd records (out of 400000 rows), 96105 pages

CHECKPOINT
DBCC 
DROPCLEANBUFFERS
DECLARE @t time SYSDATETIME()
ALTER TABLE test REBUILD
SELECT DATEDIFF(ms@tCAST(SYSDATETIME() AS time))
--First run, no non-clustered indexes, three subsequent executions (ms): 19351, 20683, 20275
--Second run, with two non-clustered indexes, three subsequent executions (ms): 31803, 35065, 37511

SELECT FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('test'), NULL, NULL, 'DETAILED')
--First  run, heap = "index" 0 (no nc indexes ): 0 fwd records, 50002 pages
--Second run, heap = "index" 0 (two nc indexes): 0 fwd records, 50003 pages
--But: avg_page_space_used_in_percent changes for the nc indexes after the rebuild.
--That I take as nc indexes are also rebuilt.

Published Friday, August 28, 2009 1:15 PM by TiborKaraszi

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

GilaMonster said:

It probably rebuilds the nonclusters because, in the process of removing the forwarding pointers, it could move rows to different pages. Since the nonclusters refer to the RID when the underlying structure is a heap, they'll have to be modified or they'd be pointing to rows where they no longer are.

August 28, 2009 7:07 AM
 

TiborKaraszi said:

Hehe, yes of course. I was in some strange kind of thinking. like "nc indexes aren't affcted by row movements, so why do the nc indexes need to be rebuild if I rebuild the table". D'oh. Thanks! :-)

August 28, 2009 7:14 AM
 

Greg Linwood said:

Excellent - another chance to rave about heaps! (c:

First, re forwarding pointers - they're only possible at all on tables with variable width columns. Next, they only actually occur on tables with variable width columns that actually get updated such that the rows are widened & then only for specific rows that no longer fit on their existing page. This scenario exists far more in the minds of theorists than in real systems. It certainly does happen, but on a fairly small scale in real practise.

Second, NCIXs fragment FAR faster than heaps as inserts into an underlying heap (table) is typically added to the end of the extent chain, but the corresponding maintenace of NCIXs is nearly always random. Hence it's NCIXs that typically fragment fast & need more frequent maintenance, not heaps.

You might recall that on much older versions of SQL Server (6.5 & earlier) NCIXs worked the same way on CIXs as on HEAPs. The change to CIX keys was made in 7.0 & the rationale given at the time was that this allowed CIXs to be rebuild in isolation of NCIXs, which is silly when you realise that its NCIXs that fragment much more frequently than CIXs or HEAPs.

August 28, 2009 9:16 AM
 

Uri Dimant said:

Hi Tibor

Yes , I was referencing also tho this blog when I want to compress in HEAP

http://blogs.msdn.com/sqlserverstorageengine/archive/2008/12/20/enabling-compression-on-a-heap.aspx

August 30, 2009 1:35 AM
 

Tad Huck said:

I know that Cognos hardly qualifies as a "real system" but from my experience: Cognos = Heap table fragmentation

June 12, 2012 12:57 PM

Leave a Comment

(required) 
(required) 
Submit
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement