THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Paul White: Page Free Space

A technical SQL Server blog from New Zealand. See also my articles on SQLperformance.com

So…is it a Seek or a Scan?

Scan Seek TreeYou’re probably most familiar with the terms ‘Seek’ and ‘Scan’ from the graphical plans produced by SQL Server Management Studio (SSMS).  The image to the left shows the most common ones, with the three types of scan at the top, followed by four types of seek.  You might look to the SSMS tool-tip descriptions to explain the differences between them:

Scan and Seek ToolTips

Not hugely helpful are they?  Both mention scans and ranges (nothing about seeks) and the Index Seek description implies that it will not scan the index entirely (which isn’t necessarily true).

Recall also yesterday’s post where we saw two Clustered Index Seek operations doing very different things.  The first Seek performed 63 single-row seeking operations; and the second performed a ‘Range Scan’ (more on those later in this post).  I hope you agree that those were two very different operations, and perhaps you are wondering why there aren’t different graphical plan icons for Range Scans and Seeks?  I have often wondered about that, and the first person to mention it after yesterday’s post was Erin Stellato (twitter | blog):

Erin Tweet

Before we go on to make sense of all this, let’s look at another example of how SQL Server confusingly mixes the terms ‘Scan’ and ‘Seek’ in different contexts.  The diagram below shows a very simple heap table with two columns, one of which is the non-clustered Primary Key, and the other has a non-unique non-clustered index defined on it.  The right hand side of the diagram shows a simple query, it’s associated query plan, and a couple of extracts from the SSMS tool-tip and Properties windows.

Example 1

Notice the ‘scan direction’ entry in the Properties window snippet.  Is this a seek or a scan?  The different references to Scans and Seeks are even more pronounced in the XML plan output that the graphical plan is based on.  This fragment is what lies behind the single Index Seek icon shown above:

XML Seek Scan

You’ll find the same confusing references to Seeks and Scans throughout the product and its documentation.

Making Sense of Seeks

Let’s forget all about scans for a moment, and think purely about seeks.  Loosely speaking, a seek is the process of navigating an index B-tree to find a particular index record, most often at the leaf level.  A seek starts at the root and navigates down through the levels of the index to find the point of interest:

Seek Diagram

Singleton Lookups

The simplest sort of seek predicate performs this traversal to find (at most) a single record.  This is the case when we search for a single value using a unique index and an equality predicate.  It should be readily apparent that this type of search will either find one record, or none at all.  This operation is known as a singleton lookup.  Given the example table from before, the following query is an example of a singleton lookup seek:

Singleton Seek

Sadly, there’s nothing in the graphical plan or XML output to show that this is a singleton lookup – you have to infer it from the fact that this is a single-value equality seek on a unique index.  The other common examples of a singleton lookup are bookmark lookups – both the RID and Key Lookup forms are singleton lookups (an RID lookup finds a single record in a heap from the unique row locator, and a Key Lookup does much the same thing on a clustered table).  If you happen to run your query with STATISTICS IO ON, you will notice that ‘Scan Count’ is always zero for a singleton lookup.

Range Scans

The other type of seek predicate is a ‘seek plus range scan’, which I will refer to simply as a range scan.  The seek operation makes an initial descent into the index structure to find the first leaf row that qualifies, and then performs a range scan (either backwards or forwards in the index) until it reaches the end of the scan range.

The ability of a range scan to proceed in either direction comes about because index pages at the same level are connected by a doubly-linked list – each page has a pointer to the previous page (in logical key order) as well as a pointer to the following page.  The doubly-linked list is represented by the green and red dotted arrows in the index diagram presented earlier.  One subtle (but important) point is that the notion of a ‘forward’ or ‘backward’ scan applies to the logical key order defined when the index was built.  In the present case, the non-clustered primary key index was created as follows:

CREATE  TABLE 
        dbo.Example
        (
        key_col INTEGER     NOT NULL,
        data    INTEGER     NOT NULL,
        
        CONSTRAINT [PK dbo.Example key_col]
            PRIMARY KEY NONCLUSTERED (key_col ASC)
        )
;

Notice that the primary key index specifies an ascending sort order for the single key column.  This means that a forward scan of the index will retrieve keys in ascending order, while a backward scan would retrieve keys in descending key order.  If the index had been created instead on key_col DESC, a forward scan would retrieve keys in descending order, and a backward scan would return keys in ascending order.

A range scan seek predicate may have a Start condition, an End condition, or both.  Where one is missing, the scan starts (or ends) at one extreme end of the index, depending on the scan direction.  Some examples might help clarify that: the following diagram shows four queries, each of which performs a single seek against a column holding every integer from 1 to 100 inclusive.  The results from each query are shown in the blue columns, and relevant attributes from the Properties window appear on the right:

Range Scan Examples

Query 1 specifies that all key_col values less than 5 should be returned in ascending order.  The query plan achieves this by seeking to the start of the index leaf (there is no explicit starting value) and scanning forward until the End condition (key_col < 5) is no longer satisfied (SQL Server knows it can stop looking as soon as it finds a key_col value that isn’t less than 5 because all later index entries are guaranteed to sort higher).

Query 2 asks for key_col values greater than 95, in descending order.  SQL Server returns these results by seeking to the end of the index, and scanning backwards (in descending key order) until it comes across a row that isn’t greater than 95.  Sharp-eyed readers may notice that the end-of-scan condition is shown as a Start range value.  This is a bug in the XML show plan which bubbles up to the Properties window – when a backward scan is performed, the roles of the Start and End values are reversed, but the plan does not reflect that.  Oh well.

Query 3 looks for key_col values that are greater than or equal to 10, and less than 15, in ascending order.  This time, SQL Server seeks to the first index record that matches the Start condition (key_col >= 10) and then scans forward through the leaf pages until the End condition (key_col < 15) is no longer met.

Query 4 performs much the same sort of operation as Query 3, but requests the output in descending order.  Again, we have to mentally reverse the Start and End conditions because of the bug, but otherwise the process is the same as always: SQL Server finds the highest-sorting record that meets the condition ‘key_col < 25’ and scans backward until ‘key_col >= 20’ is no longer true.

One final point to note: seek operations always have the Ordered: True attribute.  This means that the operator always produces rows in a sorted order, either ascending or descending depending on how the index was defined, and whether the scan part of the operation is forward or backward.  You cannot rely on this sort order in your queries of course (you must always specify an ORDER BY clause if order is important) but SQL Server can make use of the sort order internally.  In the four queries above, the query optimizer was able to avoid an explicit Sort operator to honour the ORDER BY clause, for example.

Multiple Seek Predicates

As we saw yesterday, a single index seek plan operator can contain one or more seek predicates.  These seek predicates can either be all singleton seeks or all range scans – SQL Server does not mix them.  For example, you might expect the following query to contain two seek predicates, a singleton seek to find the single record in the unique index where key_col = 10, and a range scan to find the key_col values between 15 and 20:

SELECT  key_col 
FROM    dbo.Example 
WHERE   key_col = 10
OR      key_col BETWEEN 15 AND 20
ORDER   BY key_col ASC
;

In fact, SQL Server transforms the singleton seek (key_col = 10) to the equivalent range scan, Start:[key_col >= 10], End:[key_col <= 10].  This allows both range scans to be evaluated by a single seek operator.  To be clear, this query results in two range scans: one from 10 to 10, and one from 15 to 20.

Final Thoughts

That’s it for today – tomorrow we’ll look at monitoring singleton lookups and range scans, and I’ll show you a seek on a heap table.
Yes, a seek.  On a heap.  Not an index!

If you would like to run the queries in this post for yourself, there’s a script below.  Thanks for reading!

IF      OBJECT_ID(N'dbo.Example', N'U')
        IS NOT NULL
BEGIN
        DROP TABLE dbo.Example;
END
;
-- Test table is a heap
-- Non-clustered primary key on 'key_col'
CREATE  TABLE 
        dbo.Example
        (
        key_col     INTEGER     NOT NULL,
        data        INTEGER     NOT NULL,
        
        CONSTRAINT [PK dbo.Example key_col]
            PRIMARY KEY NONCLUSTERED (key_col)
        )
;
-- Non-unique non-clustered index on the 'data' column
CREATE  NONCLUSTERED INDEX 
        [IX dbo.Example data]
ON      dbo.Example (data)
;
-- Add 100 rows
INSERT  dbo.Example 
        WITH (TABLOCKX)
        (
        key_col, 
        data
        )
SELECT  key_col = V.number, 
        data = V.number
FROM    master.dbo.spt_values AS V
WHERE   V.[type] = N'P'
AND     V.number BETWEEN 1 AND 100
;
-- ================
-- Singleton lookup
-- ================
;
-- Single value equality seek in a unique index
-- Scan count = 0 when STATISTIS IO is ON
-- Check the XML SHOWPLAN
SELECT  E.key_col 
FROM    dbo.Example AS E
WHERE   E.key_col = 32
;
-- ===========
-- Range Scans
-- ===========
;
-- Query 1
SELECT  E.key_col 
FROM    dbo.Example AS E
WHERE   E.key_col <= 5 
ORDER   BY 
        E.key_col ASC
;
-- Query 2
SELECT  E.key_col 
FROM    dbo.Example AS E
WHERE   E.key_col > 95 
ORDER   BY 
        E.key_col DESC
;
-- Query 3
SELECT  E.key_col 
FROM    dbo.Example AS E
WHERE   E.key_col >= 10 
AND     E.key_col < 15 
ORDER   BY 
        E.key_col ASC
;
-- Query 4
SELECT  E.key_col 
FROM    dbo.Example AS E
WHERE   E.key_col >= 20 
AND     E.key_col < 25 
ORDER   BY 
        E.key_col DESC
;
-- Final query (singleton + range = 2 range scans)
SELECT  E.key_col 
FROM    dbo.Example AS E
WHERE   E.key_col = 10
OR      E.key_col BETWEEN 15 AND 20
ORDER   BY 
        E.key_col ASC
;
-- === TIDY UP ===
DROP    TABLE dbo.Example;

© 2011 Paul White

email: SQLkiwi@gmail.com
twitter: @SQL_Kiwi

Published Thursday, February 17, 2011 1:34 AM by Paul White
Filed under: , ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Alexander Kuznetsov said:

Hi Paul,

I am totally confused. You are saying that the root page is on the left, and the leafs are on the right, correct?

However, Michelle Ufford states that the root node is on the top, and the leafs are below it (follow the link and scrill to Fig 2):

http://www.simple-talk.com/sql/learn-sql-server/effective-clustered-indexes/

IMO having leaf pages at the bottom should be more stable, with less torn pages and less fragmentation. On top of that, index seeks from top to bottom should be faster.

What do you think?

February 16, 2011 11:49 AM
 

tobi said:

Would it make sense to implement a special kind of seek which ignores the first level of an index? Say you have a table T(FirstName, LastName, ...) and you have an index on (FirstName, LastName) and a predicate LastName = @p0. The index cannot be used with this predicate. But what if there are only 10 distinct FirstNames? Then SQL Server could just go to the first FirstName, then seek down to the LastName. Next it goes up a level and seeks to the next FirstName which is greater than the one just processed. And so on. Would it make sense to have such a "skip seek" in SQL Server? It would mean that many existing indexes suddenly become applicable to new queries.

February 16, 2011 5:01 PM
 

Paul White said:

Hi Alex,

The diagram is drawn that way just to make it fit on the page better :)

Paul

February 16, 2011 7:56 PM
 

Paul White said:

Hi Tobi,

Don't confuse the levels of an index with the keys in that index.  An index on (FirstName, LastName) would not have FirstNames in one level of the index and LastName in a lower one, for example.

Paul

February 16, 2011 8:15 PM
 

Allen kinsel said:

Thanks for the post, great detail.

Keep it up!

February 16, 2011 10:20 PM
 

Paul White: Page Free Space said:

So far in this mini-series on seeks and scans, we have seen that a simple ‘seek’ operation can be much

February 17, 2011 2:24 AM
 

Paul White said:

Thank you, Allen!

February 17, 2011 6:45 AM
 

tobi said:

I mean the following:

var last_FirstName = null

while true:

fn = (select top 1 FirstName from Index where (last_FirstName is null or FirstName > last_FirstName ) order by FirstName)

select * from Index where FirstName = fn and LastName = @p0

That would work in any index structure.

February 18, 2011 10:44 AM
 

Muqadder said:

Woukd it really matter whether the B tree structure is "displayed" top to bottom or Left to right?? From a memory management perspective, there is no such notion of top-down or left-right storage of data as I recollect. All we are doing is reading a bunch of memory addresses, either the "pointers" in a memory location or the data in that memory location itself. B Tree is just a

data-structure (collection of mem addresses) linked to each other (doubly linked in case of a Doubly Linked List). I hope I didn't make it more confusing for someone reading this post :)

February 18, 2011 5:37 PM
 

Paul White said:

Tobi,

Sorry it's taken me a day or two to get to this.  You are describing a potential optimization that SQL Server does not currently implement - the ability to seek directly to the next distinct element in an index, skipping duplicates at that level.

It is possible to simulate this facility using a recursive CTE - though to be more efficient than the current implementation requires a fair number of duplicates.  It took me a while to find it, but here's a link to a demo script I posted on SSC in October last year, demonstrating how much faster an rCTE can be, over the index scan alternative:

http://www.sqlservercentral.com/Forums/FindPost1013407.aspx

Paul

February 20, 2011 6:34 AM
 

Paul White said:

Muqadder,

No, I don't think it matters at all - though I regret any confusion caused by breaking with convention in my B-tree diagram ;)

Paul

February 20, 2011 6:39 AM
 

tobi said:

Very cool, thanks for sharing. Probably the most creative misuse of functionality I have ever seen ;-) I a good way.

February 20, 2011 9:55 AM
 

Todd Everett said:

Great article Paul - very helpful in understanding this terminology.  As an old DB2 DBA, we always referred to an index operation as a scan.  The explain would tell you how many matching columns.  So a "seek" in SQL Server terms we would call a "matching index scan" with match columns of 1, or 2, or whatever it was.  Match columns 0 meant the whole index was being scanned from front to back - what we called "index screening."  So there was never any confusion.  Moving into the SQL Server world has been tough for me given my seeing "scan" in a show plan and thinking the old DB2 matching index scan.  Maybe, maybe not.  Thanks again for this great explanation.  I've book marked it!

April 6, 2011 10:03 AM
 

Martin said:

Hi Paul,

Have you any insight into why the IO for a range scan is not simply depth of index + number of leaf pages?

For example

   CREATE  TABLE WideKey

   (

   Id char(900) primary key,

   Filler CHAR(1)

   )

   INSERT INTO WideKey(Id)

SELECT TOP 10000 ROW_NUMBER() over (order by (select 0))

FROM master..spt_values V1,  master..spt_values V2

   SELECT index_depth,page_count,index_level

   FROM

   sys.dm_db_index_physical_stats (DB_ID(),OBJECT_ID('WideKey'), DEFAULT,DEFAULT, 'DETAILED')

   /*

index_depth page_count           index_level

----------- -------------------- -----------

6           1250                 0

6           312                  1

6           77                   2

6           19                   3

6           4                    4

6           1                    5

   */

   SET STATISTICS IO ON

   SELECT COUNT(*) FROM WideKey WHERE Id >''

   SET STATISTICS IO OFF

   /*Table 'WideKey'. Scan count 1, logical reads 1567*/

So why 1567 reads and not 1255?

Thanks!

May 6, 2011 2:37 PM
 

Paul White said:

Hi Martin,

Yes!

SQL Server doesn't know that the leaf pages are already in the buffer pool, so it uses read-ahead to pre-fetch them.

It does this by scanning the linked list of pages in the upper levels of the index; while one thread is returning leaf pages to the query processor, another is scanning ahead in levels above the leaf and issuing asynchronous I/O if it finds the page not in the data cache.

So, the read-ahead mechanism is responsible for the extra logical I/O - but notice that 'read ahead reads' are counted by STATISTICS IO only if a read from disk actually needs to be scheduled (because the page is not already in memory).

If you turn off read-ahead with trace flag 652, you'll see 1255 reads as expected.  Use DBCC TRACEON (652) to do that, but beware it disables read-ahead for the whole instance, so be careful, and be sure to turn it off afterward with DBCC TRACEOFF (652).

Thanks for a great question.

Paul

May 6, 2011 2:53 PM
 

Martin said:

Ah, That explains it!

Thanks very much for the illuminating response and congratulations on the MVP award.

Martin

May 6, 2011 5:32 PM
 

Quassnoi said:

Hi Paul,

There was a discussion on Stack Overflow about the question @Martin just asked:

http://stackoverflow.com/questions/5900703/logical-reads-for-seeks-on-a-non-unique-clustered-index

https://connect.microsoft.com/SQLServer/feedback/details/667095/extra-logical-read-in-a-clustered-index-seek

There is one more thing not clear to me: at which point exactly the read-ahead thread decides to look at the key page?

Here's the setup I've posted on Microsoft Connect

CREATE TABLE accounts

       (

       id INT NOT NULL,

       filler CHAR(1000)

       )

CREATE CLUSTERED INDEX

       ix_accounts_id

ON     accounts (id)

INSERT

INTO    accounts (id)

SELECT c

FROM    (

       VALUES

       (4),

       (6),

       (7)

       ) vals (c)

CROSS JOIN

       (

       VALUES

       (1),

       (2),

       (3),

       (4),

       (5),

       (6),

       (7)

       ) nums (n)

SELECT id, p.*

FROM    accounts

CROSS APPLY

       sys.fn_physloccracker(%%physloc%%) p

SET STATISTICS IO ON;

SELECT TOP 6 * FROM accounts WHERE id = 4

SELECT TOP 7 * FROM accounts WHERE id = 4

SELECT TOP 8 * FROM accounts WHERE id = 4

SET STATISTICS IO OFF;

(6 row(s) affected)

Table 'accounts'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(7 row(s) affected)

Table 'accounts'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(7 row(s) affected)

Table 'accounts'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

, all 7 records with id = 4 (and only them) being located on the first page.

The extra logical read only happens when we issue TOP 7 and not TOP 6. So what exactly makes the engine to perform the read-ahead? How 7 is different from 6 if the number of records on the page is 7 and the engine knows that?

Thanks in advance!

May 6, 2011 5:46 PM
 

Paul White said:

Hi Quassnoi,

There are some details I cannot disclose here.  Sorry about that.

I will say though that different versions (and builds) of SQL Server will exhibit different detailed behaviours and, though the details are interesting, they are not of any practical use, and cannot be relied upon in any case.

Paul

May 6, 2011 6:25 PM
 

Quassnoi said:

OK, thanks nevertheless!

Let me word the question differently.

To me, there is no obvious reason to make the extra read with TOP 7 but not with TOP 6.

Was this extra read implemented on purpose (whatever it might be), or this is just a side effect and serves no real purpose?

Thanks again!

May 6, 2011 6:45 PM
 

Paul White said:

If you play around with TF652 a bit, and think creatively, you should be able to satisfy your curiosity on this one.  You might come to the conclusion that read-ahead reads the first non-leaf page when the last row on the first page has been read, who knows? ;c)

May 7, 2011 7:51 PM
 

Paul White: Page Free Space said:

Just a quick blog entry today, leading up to the next one (which is going to be *awesome*).&#160; A little

July 28, 2011 11:38 AM
 

robnoi said:

Regarding the discussion about the "skip scan" above, there is now a Microsoft Connect item: https://connect.microsoft.com/SQLServer/feedback/details/695044/implement-index-skip-scan

Please vote for it.

October 15, 2011 10:32 AM

Leave a Comment

(required) 
(required) 
Submit
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement