
SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus

When heap tables don't recover deleted pages

Today I solved an issue for a customer that was a little bit unexpected, at least for me.

The scenario: a table with a few records is periodically updated: a few records are inserted (with a bulk insert) and a few records are deleted (the older ones). It's a sort of circular buffer. The table didn't have any indexes or a primary key. I know, this is not a best practice, but since the table was very small, I didn't realize the impact of the missing clustered index...
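For clarity, here is a minimal sketch of the pattern; the table and column names are hypothetical, since the original object names are not in the post:

-- Hypothetical reconstruction of the scenario; names are illustrative
CREATE TABLE dbo.RecentEvents (
    EventDate DATETIME NOT NULL,
    Payload   VARCHAR(500) NULL
)   -- no primary key, no indexes: this is a heap
GO

-- Periodic load: new rows arrive (originally through a bulk insert)...
INSERT INTO dbo.RecentEvents (EventDate, Payload)
VALUES (GETDATE(), 'new snapshot row')

-- ...and the oldest rows are removed
DELETE FROM dbo.RecentEvents
WHERE EventDate < DATEADD(DAY, -7, GETDATE())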

The problem: after the table had been in production for a few months, the performance of queries joining it degraded to an unacceptable level. I was astonished when I discovered that the table had only a few rows... but something like 2 million pages were allocated to it, resulting in 16 GB of space for a handful of records! In this situation, a table scan required reading the whole table.
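A query like the following, against sys.dm_db_partition_stats (available in SQL Server 2005 and later), is one way to spot the mismatch between row count and allocated pages; the table name is the hypothetical one from the sketch above:

-- Compare row count with allocated pages for the heap (hypothetical table name)
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       ps.row_count,
       ps.reserved_page_count,
       ps.reserved_page_count * 8 / 1024 AS reserved_mb
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID('dbo.RecentEvents')
  AND ps.index_id = 0   -- index_id = 0 means the table is a heap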

The solution: I created a non-unique clustered index on a field representing the date of the row.
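Assuming the hypothetical names used above, the fix looks like this; building the clustered index rewrites the rows into a compact structure and releases the empty pages:

-- Non-unique clustered index on the date column (illustrative names)
CREATE CLUSTERED INDEX IX_RecentEvents_EventDate
    ON dbo.RecentEvents (EventDate)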

The explanation: SQL Server doesn't release the pages of a heap table, and doesn't reuse the free space in previously used pages (this was the really unexpected part), unless you use a table-level lock when you delete data. Details are explained in KB 913399.
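KB 913399 suggests taking a table-level lock on the delete; a sketch of that workaround, again with the hypothetical names:

-- Table-level lock on the delete, as described in KB 913399 (illustrative names)
DELETE FROM dbo.RecentEvents WITH (TABLOCK)
WHERE EventDate < DATEADD(DAY, -7, GETDATE())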

Best practice: take care of any heap table you have in a production database if data is deleted and inserted. A standard maintenance plan doesn't work well in this case, because a simple reindex operation doesn't compact a heap table.
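As a starting point, something like this query lists the heaps in the current database so they can be reviewed (index_id = 0 in sys.indexes identifies a heap):

-- List the heaps (user tables without a clustered index) in the current database
SELECT OBJECT_NAME(i.object_id) AS table_name
FROM sys.indexes AS i
JOIN sys.objects AS o ON o.object_id = i.object_id
WHERE i.index_id = 0     -- index_id = 0 identifies a heap
  AND o.type = 'U'       -- user tables only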

Published Wednesday, August 27, 2008 4:25 PM by Marco Russo (SQLBI)


Comments

 

Scott R. said:

Marco,

Thanks for this post!

I had heard that heap tables do not release free space as regularly as clustered tables, but I did not know the specific reasons behind it until now.  There is a Microsoft white paper (http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/clusivsh.mspx) comparing clustered versus heap tables that suggests the same, but did not have backing references such as the KB article you referenced.

The performance impacts of more widely scattered rows in heap tables (as you mentioned) and potential fragmentation impacts (or the DB maintenance time and effort to resolve these issues more frequently) may allow folks considering heap tables in specific application design situations to look at the big picture before assuming that “heap tables are faster / better than clustered tables”.

I guess that gives the pro-clustered table folks one more provable reason.

Scott R.

August 27, 2008 10:39 AM
 

Marco Russo (SQLBI) said:

Scott,

I can understand the limitation in recovering pages, but what I can't understand is why there is no way to "compact" a heap table without creating a clustered index...

I'd like to know if this is only my ignorance..

Marco

August 27, 2008 11:01 AM
 

Greg Linwood said:

There's not much value reading anything into how heaps behave without any form of index structure as this really violates the most fundamental principles of physical modelling. The fact that the table didn't even have any form of key (primary or unique) means that it was never meant to be queried. You might have solved the performance problem by adding an index to the table, but there are obviously still gaping problems in relation to data integrity & query result predictability.

It's also worth mentioning that Microsoft has added a REBUILD option in SQL Server 2008 in the ALTER TABLE command so that heaps can be rebuilt without having to add a CIX.

August 28, 2008 12:03 AM
 

Marco Russo (SQLBI) said:

Greg,

I know very well that heap tables are not a good choice in a "regular" database. That said, as part of an ETL process, or as part of a sort of "logging" system that takes snapshots of external, non-validated and non-normalized data, they can be a useful tool (or at least I thought so).

In my case, I added a non-unique clustered index and solved the issue - however, with a much higher number of records than in my case, this solution might not be ideal from a performance point of view.

REBUILD option in SQL2008: I will run some tests to understand whether it can be used for this purpose even if you don't change anything (like the compression setting) in the table. Have you already tested it?

Thanks for your comment!

Marco

August 28, 2008 3:05 AM
 

Marcus said:

One way to rebuild a heap (again, not a best practice) would be to create an exact copy of the heap with a different name, insert all the records into it, drop the original, and rename the new table.

Also, if you are concerned about write speeds, you can create the clustered index, then drop it.  This will rebuild the table as well.
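A sketch of this second approach, reusing the hypothetical table from the post above (the index name is illustrative):

-- Compact the heap by building a clustered index, then drop it to go back to a heap
CREATE CLUSTERED INDEX IX_Compact ON dbo.RecentEvents (EventDate)
DROP INDEX IX_Compact ON dbo.RecentEvents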

September 26, 2008 10:31 AM
 

Mostafa Elmasry said:

Thanks for your help!

You can also REBUILD this heap table with the following code:

ALTER TABLE Table_Name REBUILD WITH (ONLINE = ON)

GO

This rebuild will reduce the logical reads.

June 1, 2013 9:45 AM
 

Marco Russo (SQLBI) said:

Mostafa,

Probably this syntax was not available in SQL Server 2005 when I wrote the blog post, and I never updated it - thanks for the comment!

Marco

June 2, 2013 1:49 AM
 

BD said:

Does this apply to SQL Server 2008 and higher? The KB article says it applies only to SQL Server 7, 2000 and 2005.

Thanks

September 2, 2013 7:47 AM
 

Marco Russo (SQLBI) said:

It's probably no longer an issue on SQL Server 2008 and higher, because I never encountered this problem again and the KB article doesn't apply to versions newer than 2005.

September 2, 2013 9:07 AM


About Marco Russo (SQLBI)

Marco Russo is a consultant, writer and trainer specialized in Business Intelligence with Microsoft technologies. He runs the SQLBI.COM website, which is dedicated to distributing resources useful for BI developers, such as Integration Services components, Analysis Services models, tools, technical information, and so on. Marco is certified as MCT, MCDBA, MCSD.NET, MCSA, MCSE+I.
