THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Paul White: Page Free Space

A technical SQL Server blog from New Zealand.

A Tale of Two Index Hints

If you look up Table Hints in Books Online, you’ll find the following statement:

If a clustered index exists, INDEX(0) forces a clustered index scan and INDEX(1) forces a clustered index scan or seek.
If no clustered index exists, INDEX(0) forces a table scan and INDEX(1) is interpreted as an error.

The interesting thing there is that both hints can result in a scan.  If that is the case, you might wonder if there is any effective difference between the two.  This blog entry explores that question, and highlights an optimizer quirk that can result in a much less efficient query plan when using INDEX(0).  I’ll also cover some stuff about ordering guarantees.

Test Environment

As usual, we need to create a test table:

USE     tempdb;
SET NOCOUNT ON;
GO
IF OBJECT_ID(N'Test.Orders', N'U')
IS NOT NULL
DROP TABLE Test.Orders;
GO
IF SCHEMA_ID(N'Test')
IS NOT NULL
DROP SCHEMA Test;
GO
CREATE SCHEMA Test

CREATE TABLE Orders
(
order_id INTEGER NOT NULL,
product_id INTEGER NOT NULL,
quantity INTEGER NOT NULL,
padding CHAR(980) NOT NULL DEFAULT (SPACE(0)),

CONSTRAINT [PK Test.Orders order_id]
PRIMARY KEY CLUSTERED (order_id ASC)
);
GO
CREATE NONCLUSTERED INDEX [IX Test.Orders quantity]
ON Test.Orders (quantity ASC)
INCLUDE (padding);
GO
DECLARE @Counter INTEGER = 1,
@RowCount INTEGER = 64 * 8;

WHILE @Counter <= @RowCount
BEGIN
INSERT Test.Orders
(order_id, product_id, quantity)
VALUES (
@RowCount - @Counter,
@RowCount - @Counter,
@RowCount - @Counter
);

SET @Counter += 1;
END;
GO

The test table has 512 rows, padded so that eight rows fit on each 8KB data page.  The table is clustered on the order_id column, and there is a nonclustered index on the quantity column, with the padding column INCLUDEd to again ensure that eight rows fit on each nonclustered index page.

Table-Contents

The script also deliberately generates a very high level of logical fragmentation, which we can see with the following query:

SELECT  SI.name,
IPS.index_type_desc,
IPS.avg_fragmentation_in_percent,
IPS.fragment_count,
IPS.page_count
FROM sys.dm_db_index_physical_stats
(
DB_ID(),
OBJECT_ID(N'Test.Orders', N'U'),
DEFAULT,
DEFAULT,
DEFAULT
) IPS
JOIN sys.indexes SI
ON SI.[object_id] = IPS.[object_id]
AND SI.index_id = IPS.index_id
ORDER BY
IPS.index_id;
GO

The reason for the fragmentation will become apparent later on.  These are the results I saw on my machine:

Index-Physical-Stats

Background Information on Scans

A scan of a table or index processes every row, whereas a seek returns one or more rows from one or more ranges of the index. 

Scan Strategies

When the SQL Server optimizer produces a plan containing a scan of an index or clustered table, the Storage Engine may be able to choose from two different strategies: scanning the pages in logical order, or the order in which they were allocated.  In the first case, it follows the doubly-linked list at the leaf level of the index.  In the second case, it can use the Index Allocation Map (IAM) pages to drive the scan.

A scan of a heap always uses the IAM pages of course (there is no clustered index to provide a logical page order), but this post is concerned mainly with indexes and clustered tables, so I won’t talk much more about heaps.

The advantage of an IAM-driven scan is that pages will tend to be read in physical order on the disk, which generally results in sequential I/O, and promotes effective read-ahead.  How efficient scanning the pages in logical order is depends on the level of logical fragmentation: if fragmentation is high, the scan will tend to result in a high proportion of random I/Os, and read-ahead may be much less effective.

The downside to the IAM-driven scan is that it is not guaranteed to return results in logical key order (in fact it usually won’t), and there is an extra start-up cost in accessing the IAM pages, and planning the scan.

IAM-driven scans

The Storage Engine may choose an IAM-driven scan if:

  1. The index contains 64 or more pages (giving the method a chance to recover the start-up cost);
  2. The optimizer indicates that an ordered scan is not required for correct results; and
  3. Either the data cannot change during the scan, or the query indicates that incorrect results are acceptable.

Hopefully the first point makes sense on it’s own.  The second point refers to a flag that the optimizer can set on a scanning iterator in a query plan.  This flag shows up as an “Ordered:True” or “Ordered:False” property of the iterator.  When set to true, it indicates that the Storage Engine must perform a scan in logical key order for the plan to produce correct results.  When set to false, it means that the order does not matter.  In the latter case, the Storage Engine has the choice of performing either type of scan.  The query plan does not show which type of scan was chosen.

The third point comes in two parts, but both relate to the fact that an IAM-driven scan does not take the locks necessary to prevent other concurrent operations from changing the data in the table, or moving the pages around (perhaps as the result of a page split).

The simplest way to satisfy the first part (to ensure that the data cannot be changed or moved) is to take a shared table lock.  The second part (acceptable incorrect results) refers to transaction isolation level.  If the scan runs at an effective isolation level of READ UNCOMMITTED (either explicitly or through the use of a NOLOCK hint) we are effectively saying that we don’t care about reading consistent data.  More than that, we imply that we are happy to read the same data twice, or not at all.

The Reason for the Fragmentation

Our test table has been created with a high level of fragmentation.  This allows us to see whether the Storage Engine used an IAM-driven scan or not.  If the results come back in logical key order, we can be sure (in these specific examples) that the scan was performed by following the doubly-linked list at the leaf level of the index.  If the results come back in allocation order, they will look very different, and we can infer that an IAM-driven scan was performed.


Important: Please don’t take any of that to mean that you can rely on query results to come back in a particular order in more general cases.  The very simple tests shown here are carefully crafted to encourage the ordering behaviour I have just described, but there are no guarantees.  Never rely on plan details to provide ordering – always use an ORDER BY clause if you require ordered results.


Scanning Examples

Let’s look first at two simple queries that return all columns and all rows from our test table:

-- Query 1
SELECT T.*
FROM Test.Orders T;

-- Query 2
SELECT T.*
FROM Test.Orders T
ORDER BY
T.order_id ASC;

Both queries produce very similar execution plans, but the first specifies “Ordered:False” while the second specifies “Ordered:True”.  The second example avoids an expensive explicit Sort iterator by requiring the Storage Engine to return rows from the scan in logical key order (the clustered index is on order_id, remember).

Example-1

Both queries return records ordered by the clustering key; Query 1 because the Storage Engine cannot perform an IAM-driven scan (we haven’t met all the conditions); and Query 2 because an ordered scan was requested by the optimizer.

Query 1 is not guaranteed to return records in that order, but Query 2 (almost) is.  In case you are wondering what might cause Query 1 to return records in a different order, consider the effect of a parallel plan, or Enterprise Edition’s Advanced Scan feature.  For anyone that follows that second link, I should mention that Advanced Scan is only possible when the Ordered flag is set to False.

An IAM-driven Scan

Query 1 above meets two of the conditions for an IAM-driven scan: the index contains at least 64 pages, and the optimizer did not specify Ordered:True.  We can meet the third condition by adding a WITH (TABLOCK) hint:

SELECT  T.*
FROM Test.Orders T WITH (TABLOCK);

We receive the following result set:

Query-1-with-TABLOCK

Each group of eight records (each 8KB page) comes back ordered by the clustering key, but the overall order of pages is in the order the pages were written.  This is the result of an IAM-driven scan: pages in allocation order, records within the pages in clustered key order.  As you may have guessed, neither of these two observed orderings can be relied on either – I mention them for pure geek interest value.

Adding the TABLOCK hint to Query 2 does not result in an IAM-driven scan, because the optimizer continues to specify Ordered:True on the scan (to avoid the sort that would be necessary to honour the ORDER BY clause).

For Query 2, the optimizer does still consider a plan with an unordered scan followed by a sort, but rejects it as being more expensive than an ordered scan.  For very large tables, the optimizer may calculate that an IAM-driven scan might save more time than would be consumed by the extra sort, and a plan featuring the unordered scan + sort would be chosen.  This is a heuristic optimization: the optimizer knows nothing about the actual fragmentation level of an index (though it does know how many pages it occupies).

Index Hints

Let’s now run the Query 1 test again, this time specifying INDEX(0) or INDEX(1) hints, as well as choosing to specify TABLOCK or not:

SELECT  T.* FROM Test.Orders T WITH (INDEX(0));
SELECT T.* FROM Test.Orders T WITH (INDEX(1));
SELECT T.* FROM Test.Orders T WITH (INDEX(0), TABLOCK);
SELECT T.* FROM Test.Orders T WITH (INDEX(1), TABLOCK);

In all cases, the results are the same: if we specify TABLOCK, we meet the conditions for an IAM-driven scan, and the records comes back in physical order.  If we omit TABLOCK, we happen to get results in clustered-key order – but we know that is just coincidental.  All four query plans look exactly the same as the simple clustered index scans shown earlier, and all feature “Ordered:False”.

Let’s try the same thing with the ORDER BY clause (Query 2):

SELECT  T.* FROM Test.Orders T WITH (INDEX(0)) ORDER BY T.order_id;
SELECT T.* FROM Test.Orders T WITH (INDEX(0), TABLOCK) ORDER BY T.order_id;
SELECT T.* FROM Test.Orders T WITH (INDEX(1)) ORDER BY T.order_id;
SELECT T.* FROM Test.Orders T WITH (INDEX(1), TABLOCK) ORDER BY T.order_id;

All four queries produce the same results, in the same order, and are guaranteed to do so thanks to the ORDER BY clause.  The execution plans are rather different, however:

INDEX-hints

Both INDEX(1) plans produce the plan shown on the left, with the scan showing “Ordered:True”, and a cost of 0.05.
Both INDEX(0) plans produce the plan shown on the right, with the scan showing “Ordered:False”, and a cost of 0.07.

Optimizer Bug, Limitation, or “By Design”?

It seems that INDEX(0) forces “Ordered:False” on the scan.  As a consequence, we get an explicit Sort iterator when the ORDER BY clause is present.  Why should INDEX(0) force an unordered scan?

One can imagine a line of reasoning that says we would normally specify INDEX(0) only on a heap table, where an ordered scan is not possible.  The flaw in this argument is that INDEX(1) does not enable us to force a scan of a clustered table – it will result in a seek if at all possible.

INDEX(0) and Ordered:True

Nevertheless, it is possible to produce “Ordered:True” when INDEX(0) is specified, in an INSERT, UPDATE or DELETE query.  Let’s look at three alternative formulations of a trivial UPDATE query on our test table:

UPDATE-Examples

All three plans feature the “Ordered:True” flag, but the one with INDEX(0) includes a sort on order_id.  That sort is probably redundant, since the Clustered Index Scan (with “Ordered:True”) ought to be guaranteed to produce rows in that same order.  Not only will that sort slow down your query, it will require a memory grant, and might even spill to tempdb.

A Data Modification Optimization

When performing our UPDATE query, the optimizer takes into account the effect of ordering.  If rows arrive at the update iterator in clustered index order, the pattern of updates will likely result in sequential I/O.  If rows are in a different order, the updates will likely cause random I/O as the index is updated row by row.

Often, the optimizer may choose to pre-sort the records on the clustered index key to optimize for sequential I/O.  On other occasions, it may decide that the cost of the sort would exceed the savings made – this is seen when the number of rows to be updated is very small, and the rows can be located most efficiently from a nonclustered index.

When the rows are sourced from the clustered index, the optimizer always adds the “Ordered:True” flag to maximize the chances for sequential I/O, regardless of the estimated number of rows.  This explains how the Clustered Index Scan above manages to acquire the “Ordered:True” flag when INDEX(0) was specified.  It seems that the optimizer misses the fact that by adding the ordering flag, the sort is now redundant.

Final Thoughts

Although I have concentrated very much on clustered indexes in this entry, I want to stress that the two scanning strategies apply to both clustered and non-clustered indexes.  I created a nonclustered index on the test table so you can verify for yourself that IAM-driven scans are possible with nonclustered indexes, and the rules are exactly the same.

It is also not necessary to specify TABLOCK explicitly to get an IAM-driven scan: if the index in question happens to have row and page locks disabled (using ALTER INDEX (…) SET ALLOW_ROW_LOCKS = OFF, ALLOW_PAGE_LOCKS = OFF) the Storage Engine will recognise that a table lock has to be taken.  Other circumstances like the database being read-only will also meet the underlying requirement that the data cannot change during the scan.

Beware of using NOLOCK when the other conditions for an IAM-driven scan are met.  Not only might you read data that has not been committed yet, you can also read committed data twice, or not at all.  Your query might also fail completely with the dreaded error: “Could not continue scan with NOLOCK due to data movement”.

Finally, watch out for unnecessary sorts in plans that use the INDEX(0) table hint – consider using INDEX(1) instead, or rewriting the query to avoid the hint altogether.

Paul White
Email: SQLkiwi@gmail.com
Twitter: @SQL_Kiwi

Further Reading

Clustered Index Scans – Itzik Ben-Gan
Advanced Scanning – Books Online
When IAM Scans Can Be Used – SQL Server Storage Engine Team
IAM Pages – Books Online
Beware the NOLOCK hint – Itzik Ben-Gan

Published Thursday, September 23, 2010 11:44 AM by Paul White

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Brad Schulz said:

Wow! Every new post is like another new treasure.

Thanks for another great post.  I'm always learning something... I never realized that an IAM scan was even available... yet another reason that it's a big mistake to make any kind of assumption on the order of a result set (when you don't specify an ORDER BY).

I marvel at the complexity and depth of the optimizer... and I marvel at your continued excellent output.

Thanks, Paul!

--Brad

September 22, 2010 7:37 PM
 

Paul White said:

Thank you, Brad!

September 23, 2010 6:41 AM
 

mjswart said:

After last T-SQL Tuesday I thought I was indexed out. But this is a great post. It must have taken a lot of investigation.

Good job

September 25, 2010 10:18 AM
 

Paul White said:

Hey Michael,

This was going to be my first contribution to T-SQL Tuesday, but circumstances conspired against me.  I'll try to be better organised for the next one.

Thanks for the kind words.

Paul

September 25, 2010 7:30 PM
 

J.Wilson Lòpez M. said:

GOOD JOB! ...from COLOMBIA

October 2, 2010 1:32 AM
 

ManishKumar said:

Great Article :)

May 21, 2013 6:29 PM
 

Julian said:

index(0) is heap, index(1) cluster index

August 13, 2013 11:23 AM

Leave a Comment

(required) 
(required) 
Submit
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement