THE SQL Server Blog Spot on the Web

Welcome to - The SQL Server blog spot on the web Sign in | |
in Search

SELECT Hints, Tips, Tricks FROM Hugo Kornelis WHERE RDBMS = 'SQL Server'

The table scan from hell

This blog has moved! You can find this content at the following new location:

Published Friday, November 3, 2006 7:55 PM by Hugo Kornelis

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS



Alex Kuznetsov said:

Hi Hugo,

Very interesting. I think you are making good point, but in fact, there is a way to get rid of fragmentation for a heap table:

------ populate with 10000 rows
SELECT Number, CAST('s' AS VARCHAR(890)) AS v
FROM Numbers
UPDATE TEST SET v = 'shrunk'
--- now it is horribly fragmented
DBCC results for 'Test'.
There are 10000 rows in 1272 pages for object 'Test'.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
DBCC results for 'Test'.
There are 10000 rows in 29 pages for object 'Test'.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
November 3, 2006 2:43 PM

Hugo Kornelis said:

Hi Alex,

You are right, creating and dropping a clustered index will indeed remove all fragmentation from a heap. I had not thought aboout that.

Of course, the overhead of creating and removing a clustered index on a big table is enormous - all data has to be moved, and all nonclustered indexes have to be redone as well - and all this twice!! Unless your database has a very long maintenance window, there's no way to do this without significantly impacting performance.

Best, Hugo
November 4, 2006 3:20 PM

Greg Linwood said:

Hi Hugo

Thanks for the response to my blog - it's good to get some public discussion going in a forum

others can contribute to or learn from. I'd also like to point out that I hadn't "forgotten" to

discuss fragmentation - I simply hadn't got around to that topic yet. Here are some thoughts in

response to this thread, as spilled out on a lazy sunday afternoon...

First, the overhead of creating and removing a CIX is certainly more than simply rebuilding a CIX

but it's not as enormous or impossible as you've stated. The main overhead with the approach

described by Alex above isn't the burden of rebuilding associated NCIXs. Whilst there certainly is

overhead associated with this, by far the bigger burden is usually the sort operation that's

required to sort the un-ordered Heap storage into the order of the CIX key/s. Depending on the size

of the Heap, this can be very significant. Rebuilding NCIXs is usually relatively less effort,

simply due to their much smaller size. The ratio of importance between the CIX sort effort & NCIX

rebuilds is a factor of how many NCIXs the table has, so there are no golden rules but in my

experience, by far the bigger factor is usually the CIX sort.

Second, CIXs can be created online with SQL 2005 EE, so I have no idea how you came to the

conclusion that building a CIX on a Heap will significantly impacting performance. Whilst there is

a performance overhead associated with indexing online, it shouldn't *significantly* impact

performance. Sure, not everybody can afford EE, but most who run systems that truely need to be

available 24x7 should be able to.

Third & most important of all, you only need to defrag tables if your queries are scanning those

tables. If you provide good indexes in the first place, your queries shouldn't need to scan your

tables other than smaller range scans. Larger range scans should be serviced by dedicated NCIXs.

Assuming you have decent indexes & your queries arne't scanning your tables, table defrags

shouldn't be required very frequently. I was actually planning to write about this in my next few

blog entires & will do so more fully when time permits..

Fourth, whilst the points you've made about forwarding pointers are theoretically correct, the

example you've provided really is an unrealistically cook up. Your script performs en-mass widening

updates in a way that doesn't happen in any real world systems that I've seen. Can you provide any

real-world examples of where such behaviour actually occurs? In nearly all real world systems, the

occassional widening updates can be easily managed via fill-factor to provide reasonably long

maintenance windows. Whilst your do describe what can potentially happen if you deliberately set

out to abuse the technology, I can't imagine many systems suffering these hypothetical effects.

When you consider that fragmentation in Heaps can be easily managed with fill-factor, defrag'd with

the approach provided by Alex or rebuilt online, your points about fragmentation effects from Heaps

aren't significantly important. I'm always surprised about how hysterical people seem to get about

table fragmentation when the real performance killer is index fragmentation.

Lastly, CIX bookmark lookups are definitely a major performance killer in real-world systems & this

is important to take into consideration if you susbecribe to the view that every table should have

a CIX. Sure, the best answer to avoiding bookmark lookups is to provide good indexes in the first

place, but bookmark lookups worsen the penalty for those who don't provide the right indexes (&

designing good indexes is an advanced skill). You stated in your post "A bookmark lookup is used

when a nonclustered index can first be used to narrow down the number of rows to read" but you

didn't mention the far worse scenarios where bookmark lookups are used to narrow the number of rows

to read. When filters are evaluated accross bookmarks lookups, the penalty is huge but would be

significantly lessened if no CIX existed to force the bookmark lookups. This is a very common

performance killer scenario - the CIX bookmark lookups usually multiply the IO (& locking) involved

by a factor of 2 to 3 times what would be needed if the table wasn't implemented on a CIX.

I think it's also interesting to take into consideration that SQL Server is the only major DBMS

platform which doesn't currently provide some form of "direct-to-block" I/O method, providing only

bookmark or RowID lookups instead. In my opinion, this is a substantial over-sight from MS which

has contributed to SQL Server falling significantly behind in TPC-C to other platforms which do

provide these more efficient lookup mechanisms. Note that Oracle achieved 1.6M tpm in TPC-C on HALF

the number of CPUs to SQL Server's 1.2M tpm. Note also that Oracle were using their more efficient

hash cluster technology in achieving this rather than their equiavlent technology to SQL Server's

CIX (Index Organised Tables). I reckon much of that extra CPU used in SQL Server's TPC-C result was

thrown away processing (reading & locking) inefficient bookmark lookups. If you've got a better

idea about what's going on there, I'm all ears..

IMO, Microsoft needs to work on adding more sophisticated lookup methods to SQL Server. We've been

working with the current tools for too long now & they haven't changed substantially since 1998.

I'm definitely a big fan of SQL Server, but in this area I believe MS needs to get their act

together with some new block / page access technology & stop encouraging excessive over-use of

their inefficient Clustered Index technology.

Greg Linwood
November 4, 2006 8:15 PM

Mattford said:

What about on a mulit-column clustered index?  My experience has been exceptional in that M$ has definitely had their stuff in gear.  Best practices indicate the almost never do we have a single column clustered index "key".  At least that's my experience and my $0.02.  Have not run the metrics yet, but very impressed with in-place production examples on large volume db's in both SQL 2000 and 2005 versions.  Of course, my production systems rarely exceed 20m records.
November 5, 2006 8:58 PM

Denis the SQL Menace said:

And we all know that in SQL Server 2000 there is no  REBUILD WITH (ONLINE=ON) option for rebuilding indexes. Why do you think most people are rebuilding indexes at 2AM Sunday morning?

November 6, 2006 7:36 AM

Louis Davidson said:

Excellent information.  I am doing some work with the dynamic management views, and you can really see this with the dm_db_index_operational_stats object.  After running your first script, I ran:

select cast(object_name(ddios.object_id) as varchar(30)) as objectName,


from   sys.dm_db_index_operational_stats(db_id(),object_id('dbo.Persons'),null,null) as ddios

And it returned:

objectName                     forwarded_fetch_count

------------------------------ ---------------------

Persons                        0

Then I ran the fragmenting script, and reran and got:

objectName                     forwarded_fetch_count

------------------------------ ---------------------

Persons                        3204

What is most disturbing though, is after running a simple:

Select *

from   dbo.Persons

objectName                     forwarded_fetch_count

------------------------------ ---------------------

Persons                        4676

And again:

objectName                     forwarded_fetch_count

------------------------------ ---------------------

Persons                        6148

That's 1472 extra operations, just for a simple select * from a 19972 row table!

June 9, 2007 3:03 PM

Louis Davidson said:

This object provides very useful stats on how many times an index has been used, locked, waited on, etc.

August 26, 2007 4:12 PM

Aviv said:


We are getting up to 99% fragmentation in only bulk insert (24X7) for non clustered indexes in clustered indexe's table any way to reduce it?

September 27, 2007 12:07 AM

Paul White said:


Defragmenting a heap by creating and dropping a clustered index.

The entire table must be rewritten twice.  Once for the sort and once for the new table.  COnsider offline bulk copy out and in instead.

No LOB types are permitted for an online rebuild.  So no MAX types, XML, and so on.  Quite a restriction.

Nonclustered indexes must be maintained, requiring yet more time and space, including the extra mapping table.

If nonclustered indexes are dropped first, then in what sense is the operation online?

Online index builds almost always result in greater remaining fragmentation than offline builds.

Building a non-unique clustered index will result in uniqueifiers being added to any nonclustered indexes temporarily.

Buildinding a unique clustered index will abort if a user inserts a duplicate value which the online process does not process first.

Online index building is not for free.  In real-life production systems with a heavy OLTP load, online is not practical.

These are some of the reasons that "building a clustered index on a heap" will significantly affect performance.  There may be others.

Your point about avoid scans on heaps is a good one, as is the remark about widening updates.  However, real life systems are not always as well-designed as we might like.  Missing non-clustered indexes, 100% fillfactors, widening updates and so on are all things that will happen.  A lot of this is down to old code which was written in a hurry, is complex and hard to maintain, and frankly people are too scared to touch it.  In situations like these, adding a clustered index to the heap permanently will make the problems go away.  Bear it mind that it has been shown over and over again that heaps are inferior when it comes to INSERT, UPDATE and DELETE performance - the lifeblood of OLTP.

As for a real-life example of widening updates, please feel free to pop across the ditch to see how my workplace uses transactional replication, and how often widening updates occur.  This is on a system serving 60M web page impressions per day.  Setting a low fill factor to work around the weakness reduces data density - less real data fits in the available pages in the buffer pool for example.

The heap versus clustered index debate was over a long time ago.  Clustered indexes by default, heaps in very rare circumstances, following extensive testing and justification.  Sure, both have their weaknesses, but the list is very much longer on one side.  Attempting to dismiss the technical inadequacies by saying they are not important or significant is churlish, borderline arrogant.  There will be systems out there where these issues are very important indeed, on a daily basis.

The point about overhead on key lookups versus RID lookups is interesting, but ultimately flawed.  The extra operations will always be on data already in memory, so the only overhead we need concern ourselves with is CPU.  This is cheap and getting cheaper.  For the very very small number of instances where CPU is a problem, there must be so many key lookups that it seems the clustered index must be on the wong column.  Range seek on the clustered index for very large tables - it is what it is good at.  On smaller tables, clustering on an identity column is pretty good practice, overall.

In your comparison with Oracle - I'm afraid it left me comparing the list prices and considering the value of benchmarks.  I note that you fail to mention the much larger page size used by Oracle in those tests.

Apologies for the long reply.  You're a smart guy Greg, with lots of good insights, knowledge and experience, but this holy war on clustered indexes does you no good at all I'm sad to say.



April 27, 2009 7:35 AM

zeus said:

thank you very much kaydesim..

April 7, 2010 5:06 AM

Tibor Karaszi said:

Never? You are not alone. None of the maintenance solutions I use includes functionality to rebuild a

March 6, 2014 6:57 AM

Leave a Comment


About Hugo Kornelis

Hugo is co-founder and R&D lead of perFact BV, a Dutch company that strives to improve analysis methods and to develop computer-aided tools that will generate completely functional applications from the analysis deliverable. The chosen platform for this development is SQL Server. In his spare time, Hugo likes to visit the SQL Server newsgroups, in order to share and enhance his knowledge of SQL Server.
Privacy Statement