Paul White: Page Free Space

A technical SQL Server blog from New Zealand. See also my articles on SQLperformance.com

When is a Seek not a Seek?

The following script creates a single-column clustered table containing the integers from 1 to 1,000 inclusive.

IF      OBJECT_ID(N'tempdb..#Test', N'U')
        IS NOT NULL
        DROP TABLE #Test
;
GO
CREATE  TABLE #Test
        (
        id  INTEGER PRIMARY KEY CLUSTERED
        );
;
INSERT  #Test (id)
SELECT  V.number
FROM    master.dbo.spt_values AS V
WHERE   V.[type] = N'P'
AND     V.number BETWEEN 1 AND 1000
;

Let’s say we need to find the rows with values from 100 to 170, excluding any values that divide exactly by 10.  One way to write that query would be:

SELECT  T.id
FROM    #Test AS T
WHERE   T.id IN 
        (
        101,102,103,104,105,106,107,108,109,
        111,112,113,114,115,116,117,118,119,
        121,122,123,124,125,126,127,128,129,
        131,132,133,134,135,136,137,138,139,
        141,142,143,144,145,146,147,148,149,
        151,152,153,154,155,156,157,158,159,
        161,162,163,164,165,166,167,168,169
        )
;

That query produces a pretty efficient-looking query plan:

Seek 1

Knowing that the source column is defined as an INTEGER, we could also express the query this way:

SELECT  T.id
FROM    #Test AS T
WHERE   T.id >= 101
AND     T.id <= 169
AND     T.id % 10 > 0
;

We get a similar-looking plan:

Seek 2

If you look closely, you might notice that the line connecting the two icons is a little thinner than before.  The first query is estimated to produce 61.9167 rows – very close to the 63 rows we know the query will return.  The second query presents a tougher challenge for SQL Server because it doesn’t know how to predict the selectivity of the modulo expression (T.id % 10 > 0).  Without that last line, the second query is estimated to produce 68.1667 rows – a slight overestimate.  Adding the opaque modulo expression results in SQL Server guessing at the selectivity.  As you may know, the selectivity guess for a greater-than operation is 30%, so the final estimate is 30% of 68.1667, which comes to 20.45 rows.
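The arithmetic behind that final estimate is easy to check outside SQL Server. Here is a minimal Python sketch (the variable names are mine; the figures come from the plans above, and 30% is the optimizer's fixed guess for an inequality predicate it cannot estimate):

```python
# Figures taken from the two estimated plans above.

# First query: the IN list is estimated from the column statistics.
in_list_estimate = 61.9167       # actual row count is 63

# Second query: the range alone is estimated at 68.1667 rows; the
# opaque modulo predicate then falls back to the fixed 30% guess
# used for inequality predicates.
range_estimate = 68.1667
inequality_guess = 0.30

final_estimate = range_estimate * inequality_guess
print(round(final_estimate, 2))  # 20.45
```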

The second difference is that the Clustered Index Seek is costed at 99% of the estimated total for the statement.  For some reason, the final SELECT operator is assigned a small cost of 0.0000484 units; I have absolutely no idea why this is so, or what it models.  Nevertheless, we can compare the total cost for both queries: the first one comes in at 0.0033501 units, and the second at 0.0034054.  The important point is that the second query is costed very slightly higher than the first, even though it is expected to produce many fewer rows (20.45 versus 61.9167).

If you run the two queries, they produce exactly the same results, and both complete so quickly that it is impossible to measure CPU usage for a single execution.  We can, however, compare the I/O statistics for a single run by running the queries with STATISTICS IO ON:

Table '#Test'. Scan count 63, logical reads 126, physical reads 0.
Table '#Test'. Scan count 01, logical reads 002, physical reads 0.

The query with the IN list uses 126 logical reads (and has a ‘scan count’ of 63), while the second query form completes with just 2 logical reads (and a ‘scan count’ of 1).  It is no coincidence that 126 = 63 * 2, by the way.  It is almost as if the first query is doing 63 seeks, compared to one for the second query.

In fact, that is exactly what it is doing.  There is no indication of this in the graphical plan, or in the tool-tip that appears when you hover your mouse over the Clustered Index Seek icon.  To see the 63 seek operations, you have to click on the Seek icon and look in the Properties window (press F4, or right-click and choose from the menu):

Expanded Seek Properties

The Seek Predicates list shows a total of 63 seek operations – one for each of the values from the IN list contained in the first query.  I have expanded the first seek node to show the details; it is seeking down the clustered index to find the entry with the value 101.  Each of the other 62 nodes expands similarly, and the same information is contained (even more verbosely) in the XML form of the plan.

Each of the 63 seek operations starts at the root of the clustered index B-tree and navigates down to the leaf page that contains the sought key value.  Our table is just large enough to need a separate root page, so each seek incurs 2 logical reads (one for the root, and one for the leaf).  We can see the index depth using the INDEXPROPERTY function, or by using a DMV:

SELECT  S.index_type_desc,
        S.index_depth
FROM    sys.dm_db_index_physical_stats
            (
            DB_ID(N'tempdb'),
            OBJECT_ID(N'tempdb..#Test', N'U'),
            1,
            1,
            DEFAULT
            ) AS S
;
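With an index depth of 2, the STATISTICS IO numbers above fall out directly. A quick Python sketch of the arithmetic (the names are mine):

```python
index_depth = 2          # one root page plus one leaf page

# IN-list plan: 63 singleton seeks, each traversing root -> leaf.
singleton_seeks = 63
in_list_reads = singleton_seeks * index_depth
print(in_list_reads)     # 126 logical reads, matching the first query

# Range plan: a single traversal; the qualifying rows all sit on the
# leaf page reached by that one seek.
range_seek_reads = 1 * index_depth
print(range_seek_reads)  # 2 logical reads, matching the second query
```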

Let’s look now at the Properties window when the Clustered Index Seek from the second query is selected:

Expanded Seek Properties 2

There is just one seek operation, which starts at the root of the index and navigates the B-tree looking for the first key that matches the Start range condition (id >= 101).  It then continues to read records at the leaf level of the index (following links between leaf-level pages if necessary) until it finds a row that does not meet the End range condition (id <= 169).  Every row that meets the seek range condition is also tested against the Residual Predicate highlighted above (id % 10 > 0), and is only returned if it matches that as well.
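Outside the engine, that single-seek-plus-residual strategy can be mimicked with a sorted list. This Python sketch (the function and names are mine, not anything SQL Server exposes) seeks the start of the range with a binary search, scans forward while the end condition holds, and applies the residual test to each row:

```python
from bisect import bisect_left

def range_seek(keys, start, end, residual):
    """Seek to the first key >= start, scan while key <= end, and
    return only the rows that pass the residual predicate."""
    pos = bisect_left(keys, start)                # the single B-tree 'seek'
    out = []
    while pos < len(keys) and keys[pos] <= end:   # ordered range scan
        if residual(keys[pos]):                   # residual predicate
            out.append(keys[pos])
        pos += 1
    return out

leaf_level = list(range(1, 1001))                 # clustered index keys 1..1000
rows = range_seek(leaf_level, 101, 169, lambda k: k % 10 > 0)
print(len(rows))   # 63 - the same rows both queries return
```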

You will not be surprised that the single seek (with a range scan and residual predicate) is much more efficient than 63 singleton seeks.  It is not 63 times more efficient (as the logical reads comparison would suggest), but it is around three times faster.  Let’s run both query forms 10,000 times and measure the elapsed time:

DECLARE @i  INTEGER, 
        @n  INTEGER = 10000, 
        @s  DATETIME = GETDATE()
;        
SET     NOCOUNT ON;
SET     STATISTICS XML OFF;
;
WHILE   @n > 0
BEGIN
        SELECT  @i = 
                T.id
        FROM    #Test AS T
        WHERE   T.id IN 
                (
                101,102,103,104,105,106,107,108,109,
                111,112,113,114,115,116,117,118,119,
                121,122,123,124,125,126,127,128,129,
                131,132,133,134,135,136,137,138,139,
                141,142,143,144,145,146,147,148,149,
                151,152,153,154,155,156,157,158,159,
                161,162,163,164,165,166,167,168,169
                )
        ;
        SET     @n -= 1;
END
;
PRINT   DATEDIFF(MILLISECOND, @s, GETDATE())
;
GO
DECLARE @i  INTEGER, 
        @n  INTEGER = 10000, 
        @s  DATETIME = GETDATE()
;        
SET     NOCOUNT ON
;
WHILE   @n > 0
BEGIN
        SELECT  @i = 
                T.id
        FROM    #Test AS T
        WHERE   T.id >= 101
        AND     T.id <= 169
        AND     T.id % 10 > 0
        ;
        SET     @n -= 1;
END
;
PRINT   DATEDIFF(MILLISECOND, @s, GETDATE())
;

On my laptop, running SQL Server 2008 build 4272 (SP2 CU2), the IN form of the query takes around 830ms and the range query about 300ms.  The main point of this post is not performance, however – it is meant as an introduction to the next few parts in this mini-series that will continue to explore scans and seeks in detail.

When is a seek not a seek?  When it is 63 seeks.

© Paul White 2011

email: SQLkiwi@gmail.com
twitter: @SQL_kiwi

Published Wednesday, February 16, 2011 2:30 AM by Paul White


Comments

Amit Banerjee said:

The scan count in the STATISTICS IO output tells you the number of traversals required to fetch the actual data. A seek is not always the best option :)

I had written about this sometime back. Ref: http://blogs.msdn.com/b/sqlserverfaq/archive/2010/09/14/scan-count-meaning-in-set-statistics-io-output.aspx

February 15, 2011 8:41 AM

Paul White said:

Thanks Amit and a nice explanation of scan count there, too!

Paul

February 15, 2011 10:10 AM

tobi said:

Could you speed up the IN query by adding "AND ID between min and max" for appropriately chosen min and max values?

February 15, 2011 4:32 PM

tobi said:

And because I never commented here before: This blog is in the top 5 of blogs I subscribed to that I enjoy most. I have subscribed to about 100 quality blogs. Your content is just great.

February 15, 2011 4:34 PM

Paul White said:

Hey Tobi,

Great question - you would indeed hope that adding an AND range to the IN query would help the optimizer, but sadly it just isn't bright enough to understand what you are hoping for.

If you add a BETWEEN 101 AND 169 clause, it just uses that range to restrict the IN list - it does not turn the 63 seeks into a single range seek.  Try it :)

As is so often the case, we have to play tricks with the syntax to get SQL Server to do what we want:

SELECT  T2.id
FROM    (
        SELECT  TOP (69)
                T.id
        FROM    #Test AS T
        WHERE   T.id BETWEEN 101 AND 169
        ) AS T2
WHERE   T2.id IN
        (
        101,102,103,104,105,106,107,108,109,
        111,112,113,114,115,116,117,118,119,
        121,122,123,124,125,126,127,128,129,
        131,132,133,134,135,136,137,138,139,
        141,142,143,144,145,146,147,148,149,
        151,152,153,154,155,156,157,158,159,
        161,162,163,164,165,166,167,168,169
        );

That results in the BETWEEN being applied as a single range seek, followed by the TOP and a separate Filter for the IN list.  Without the TOP, the optimizer collapses the filter back into the seek, and we are back where we started :(

Thanks for the kind words in your second comment!

Cheers,

Paul

February 15, 2011 8:43 PM

Paul White: Page Free Space said:

You’re probably most familiar with the terms ‘Seek’ and ‘Scan’ from the graphical plans produced by SQL

February 16, 2011 7:34 AM

jchang said:

I am not entirely happy that the SQL Server query optimizer converts the elements of an IN list into individual seeks (when a suitable index exists). Ok, it might be the best execution plan for that query, but the execution plan cannot be used for any other IN list, and it bloats the procedure cache. Compare the plan size for a single seek and 63 seeks.

My thought is the IN list should be treated as a row source, which loop joins to the table. For all practical purposes, we are still doing an index seek for each element, but the plan is now both compact and reusable for other lists.

If the optimizer were really clever, it would have a branch that elects the loop join + index seek plan when the list is fewer than x elements, and a scan when more than x elements. But I also believe in Santa Claus.

February 16, 2011 9:07 AM

Paul White said:

Hey Joe,

I think this came about because people used to complain (a long time ago) that queries with a predicate like 'WHERE x = 3 OR x = 5' used to result in a scan.  Surely that should be two seeks, right?

So, IN lists (which are just a rewritten OR) get transformed to multiple seeks.  The problem is that it isn't costed properly.  Look at the cost estimates in the main text - they are nonsense.

There is an arbitrary threshold at which the optimizer refuses to do any more singleton seeks.  In current versions, that is 64.  Any more than 64 IN/OR values, and you will get a Constant Scan driving a nested loops join with a seek.  This is even less efficient than the multiple seeks in a single operator, of course.

What might be nice would be if the optimizer could reason about types - we know that IN (5,6,7,8,9) is the same as BETWEEN 5 AND 9 for integer types, but it doesn't do that at the moment.

Thanks for the comment!

Paul

February 16, 2011 9:31 AM

jchang said:

ah, I did not know that 64/65 was the cross-over point from multiple seeks to the constant scan + loop join.

Ok, so I have the plan cost for your query with 63 elements at 0.0033878 (mostly from the 0.003125 I/O cost for a single random I/O).

The plan cost for 65 elements in the IN clause is 4x higher, at 0.0137384; in essence, the loop join is costed at more than one I/O.

Also, the cached plan size for 63 elements is 56(K?) and 32 for 65 elements.

Running your test loop, but using dm_exec_query_stats to collect the elapsed and worker times (the same, here), I get:

342773 - between

1155273 - 63 elements, single index seek

985351 - 65 elements, constant scan

so what it comes down to is that the execution plan cost is not a reliable indicator, and I prefer the compact plan.

Of course, I could say that to me, Santa Claus bears a resemblance to Conor C, and the Easter bunny to Boris B.

February 16, 2011 10:12 AM

Paul White said:

Agreed - especially the last line :)

February 16, 2011 10:34 AM

jchang said:

I did one more test: I ran the query with WHERE t.ID IN (list) and an INDEX(0) hint.

This brought the logical I/O down from 126 to 4 (the size of table #Test), versus 2 for the WHERE t.ID BETWEEN form.

The forced scan took 8700ms, so clearly not good.

And this is additional evidence that logical I/O is a worthless metric: there is no consistency of value between LIO in different operations, and there are many things that have cost but zero LIO.

My thinking is that the cost of the IN list is not just the LIO: even if the optimizer recognised that we could have range scanned from 101 to 169 for 2 LIO, the cost of evaluating the IN list would be substantial. Presumably there is no guarantee that the list is in sorted order, which would negate the need to evaluate the full list for each row?

February 16, 2011 10:35 PM

Paul White: Page Free Space said:

So far in this mini-series on seeks and scans, we have seen that a simple ‘seek’ operation can be much

February 17, 2011 2:24 AM

Paul White said:

Hi Joe,

The number of predicates that can be pushed into the scan operator is limited to 15.  For 16 or more (otherwise pushable) predicates, the predicates are split out into a separate filter.  This 'magic number' is significantly lower than the 64 that can be pushed into a seek.

It is much more efficient to evaluate a predicate inside the scan operator as a residual.  With a pushed predicate, the engine can latch the data page, then read rows and evaluate the predicate until one matches.  Once a match is found (or the end of the rowset is reached) the latch is released and the row is emitted.

With a separate filter, the scan latches the page, reads a row, releases the latch and passes the row on to the filter.  The filter then checks the row against each predicate - if it doesn't match, it asks the scan for another row, repeating the whole process.

Aside from anything else, there are potential CPU cache inefficiencies there as different code is loaded to execute the filter and scan operations.

If you run the test with 15 or fewer IN list entries (which are de-duplicated and sorted at optimization time) you should see a filter-less plan.

You are right though that scanning requires many more comparisons.  Each of the 1,000 rows is compared against as many as 63 values, and the test repeats all that 10,000 times.  No wonder it chews up some CPU ;c)

Final point - yes INDEX(0) produces an unordered scan, as shown by the Ordered:False property of the scan operator.  I don't know whether the Filter operator includes any optimizations for a sorted stream.

Logical IO is a highly flawed performance metric, yes.

Paul

February 17, 2011 3:10 AM

ALZDBA said:

All very nice stuff, Paul.

What I find even more frightening is that it (still) generates a Clustered Index Scan for a SELECT MAX(id) AS Max_id FROM #Test, whilst this could be solved in the CLIX B-tree with just a read of the last leaf-level page.

But that is another story ;-)

March 2, 2011 7:07 AM

Paul White said:

Hi ALZDBA,

Regarding the MAX/MIN scan - notice that it is an ordered scan, and the following TOP operator is a TOP (1).

The 'scan' doesn't perform a full scan at all, it just retrieves one row from one 'end' of the index or other, at which point the TOP has all the rows it needs, and execution stops.

This is an optimizer trick (which is broken for partitioned structures) that takes advantage of the fact that MAX(col1) is the same as TOP (1) col1 ... ORDER BY col1 DESC, for example.

Run the MIN or MAX query with STATISTICS IO ON, and a cold buffer cache, and notice how many reads you see :)

Cheers!

Paul

March 2, 2011 8:48 AM

hallidayd said:

Teeny one Paul:

"Final point - yes INDEX(0) produces an unordered scan, as shown by the Ordered:False property of the scan operator."

I thought that meant the scan could be in allocation order, but might not be. In other words, if Ordered property is true the scan must be ordered. If false then the scan can be ordered or not depending on cost. In terms of cost, I thought that there had to be 64 extents or more to be scanned otherwise the start up cost of a scan by allocation was prohibitively expensive so a logical ordered scan is used.

Incorrect?

June 27, 2011 4:06 AM

Paul White said:

Hi,

Yes, Ordered:False just means the QP doesn't insist that SE produce rows in index order.  There are a number of conditions for SE to choose an IAM-driven scan, one being that the scan must be a minimum size.  INDEX(0) produces an unordered scan from the QP's point of view because there is no ordering guarantee.

Paul

June 27, 2011 9:24 PM

Paul White: Page Free Space said:

This is a post for T-SQL Tuesday #43 hosted by my good friend Rob Farley . The topic this month is Plan

June 11, 2013 5:00 AM
