THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Kalen Delaney

Geek City: Ordered Seeks and Scans

I got a couple of really great questions during my SSWUG Workshop this morning, as I was discussing seeks and scans, and since the answers to the two questions are very related, I decided to address both of them in more detail in this blog post.

Most people think about a seek operation as retrieving just one or a few rows, and a scan as accessing the entire table  or index leaf level.  And that usually is what I mean when I am just describing SQL Server behavior. But there is also the possibility that SQL Server could perform a 'partial scan', starting at some row in the table or index leaf, and then scanning only to a particular ending row. However, there is no operator in a query plan called 'partial scan'.  So how can you tell when you have one?

Before I specifically answer that, let me tell you about the first question in the workshop this morning. I had been showing the Properties window (available from View/Properties Window), pointing out that when you click on one of the icons in a graphical query plan, the Properties window shows the same information that is in the yellow box that pops up, plus additional information. One of the values shown in one called 'Ordered', and when 'Ordered' has the value true, then there is also a value called 'Scan Direction', which might make you think that ordering is only relevant to scans.  But you will see True values for 'Ordered' even when the operation is a seek. This is usually an indication that a partial scan is being performed.

And because I was talking about heaps being unordered sets of rows, one of the delegates asked: "Are index seeks against a heap always unordered?"

The answer is, of course, "it depends".  If you have a covered query, the index access could be an ordered seek of the covering index, even though the underlying table is a heap. Here is an example from the AdventureWorks database. I make a copy of the Sales.SalesOrderHeader table, so it is a heap, and then build a nonclustered index on the Freight column.

IF EXISTS (SELECT * FROM dbo.SalesOrderHeader)
   DROP TABLE dbo.SalesOrderHeader;
GO
SELECT * INTO dbo.SalesOrderHeader
FROM Sales.SalesOrderHeader;
GO
CREATE INDEX ix_Freight ON dbo.SalesOrderHeader(Freight);
GO

Then I look at the plan for a covered query using the ix_Freight index:

SELECT AVG(Freight)
FROM dbo.SalesOrderHeader
WHERE Freight < 100;

And here is the plan showing the index seek:

image

And the Properties sheet for the Index Seek shows that Ordered - True and Scan Direction = FORWARD.

image

This is an example of an ordered seek on a nonclustered index, or what we can think of as a partial scan, or range scan.

The other question during the workshop was how to tell if the seek was really a range scan. Part of the answer is to look for the Ordered = True value, as I just illustrated. Another thing to notice in the graphical query plan is the thickness of the line moving rows to the left from the Index Seek, indicating that LOTS of rows are being returned from the Seek. That's another clue that we have a partial scan.

We can also get partial scans with clustered indexes. I'll build a clustered index on my new table.

CREATE CLUSTERED INDEX clus_OrderID ON dbo.SalesOrderHeader(SalesOrderID);
GO

There are 31465 rows in the dbo.SalesOrderHeader table, with SalesOrderID values ranging from 43659 to 75123. I'll SELECT a small range of data:

SELECT * FROM dbo.SalesOrderHeader
WHERE SalesOrderID < 44000

The plan shows a Clustered Index Seek, and the Properties shows Ordered = True. 

If I make the range bigger, and look for SalesOrderID < 60000, I still get an Index Seek, and the same when using an upper limit of 75000. If I run a query that returns ALL the rows, I still get an Index Seek.

image

Both queries are using an Index Seek, but notice the difference in the thickness of the arrows.

It turns out that the only time the optimizer will tell us it is doing a scan is when it knows absolutely that it will be looking at every row in the table or index leaf. And the only way it can know this is if there is no filter condition (NO WHERE clause) or no index on the filter condition. With the index on SalesOrderID, there is no way the optimizer can be absolutely sure that it will have to scan the whole table. Someone might have inserted a value of 100001 since the last time statistics were updated, and we wouldn't want to perform a scan and return a value that is not in the specified range. So the Index Seek in the plan really just means that SQL Server will access the data using the index, but not with a complete scan. There is something in the query that give a starting point, a stopping point or both.

I've got another Workshop for SSWUG on Index Internals, coming up on February 25th. Hopefully, I'll get a lot more great questions during that one!

Thanks!

~Kalen

Published Friday, January 21, 2011 4:28 PM by Kalen Delaney

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Alexander Kuznetsov said:

Hi Kalen,

Can you please explain what do you mean by "the covering index is a heap". Thanks!

January 21, 2011 7:11 PM
 

Kalen Delaney said:

Sorry, that should have been "the covering index is ON a heap", but I reworded the sentence completely.

Thanks for the heads up!

~Kalen

January 21, 2011 7:19 PM
 

Mark said:

Hi Kalen,

Very interesting article.  I got to play around with this quite a bit more.

I found that I get a seek vs a scan when I specify a search predicate greater than the data type allows.  It is the same behavior when I add a constraint and then set my WHERE clause to match the constraint definition.  So even when a table scan could be inferred by the optimiazer based on the value limit of a data type or a constarint, it still shows the seek.

January 22, 2011 5:35 PM
 

Kalen Delaney said:

Thanks Mark. It seems the optimizer is very stingy about actually saying it is going to do a scan, even if it actually does end up doing the same work as a scan. So that just means you need to be careful when looking at plans and determining where there might be room for improvement. An index seek is not always a good thing.

January 22, 2011 6:56 PM
 

Chris Wood said:

Kalen,

If you are tracking the index_usage and missing_indexes DMV's would this show as a user_seek or a user_scan? Would it showup in the missing_indexes DMV counts or not?

Thanks

Chris

January 24, 2011 10:36 AM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Favorite Non-technical Sites or Blogs

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement