A little while back, I posted a short series on seeks and scans, and one of the things I highlighted was the difference between a singleton seek and a range scan. You can find that post here, if you want a refresher.
Anyway, the broad point is that a singleton equality seek always retrieves exactly one row, and is guaranteed to do so because a unique index exists to enforce it. A range scan, on the other hand, seeks down the B-tree to a starting (or ending) point, and scans forward (or backward) from that point using the next or previous page pointers. The point of today’s short post is to show how much faster a singleton seek is, compared with a range scan, even when both return exactly the same number of records.
Pretty simple test rig today; a 5 million row table with a single BIGINT NOT NULL column with all the values from 1 to 5,000,000:
CREATE TABLE dbo.SeekTest
(
col1 BIGINT NOT NULL
);
GO
CREATE CLUSTERED INDEX cx ON dbo.SeekTest (col1);
GO
-- 5 million rows numbered 1 to 5,000,000
INSERT dbo.SeekTest WITH (TABLOCKX)
(col1)
SELECT TOP (5000000)
ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM
master.sys.columns AS c,
master.sys.columns AS c2,
master.sys.columns AS c3;
Notice that the clustered index is not defined as unique. There will not be any uniquifiers (in case you think I am cheating) because SQL Server only adds them when necessary, and there are no duplicates in this example. The test query simply joins the table to itself, forcing a nested loops join so we get 5 million seek operations:
SELECT
COUNT_BIG(*)
FROM dbo.SeekTest AS st WITH (TABLOCK)
INNER LOOP JOIN dbo.SeekTest AS st2 WITH (TABLOCK) ON
st2.col1 = st.col1
OPTION (MAXDOP 1);
And here’s the query plan:

On my machine, typical results are:

Notice the Scan Count. We get one scan for the Clustered Index Scan iterator in the plan, and one scan for every seek operation. This is one way to see that we are getting range scans here – the lack of a unique index means that SQL Server cannot guarantee that only one row will be returned from each seek, so a range scan is performed to pick up any extra rows.
Now let’s create a second table, where the only difference is that the index is declared as UNIQUE:
CREATE TABLE dbo.SeekTestUnique
(
col1 BIGINT NOT NULL
);
GO
CREATE UNIQUE CLUSTERED INDEX cuq ON dbo.SeekTestUnique (col1);
GO
-- 5 million rows numbered 1 to 5,000,000
INSERT dbo.SeekTestUnique WITH (TABLOCKX)
(col1)
SELECT TOP (5000000)
ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM
master.sys.columns AS c,
master.sys.columns AS c2,
master.sys.columns AS c3;
As before, we join this table to itself using loops join (the TABLOCKs are just to reduce locking overheads):
SELECT
COUNT_BIG(*)
FROM dbo.SeekTestUnique AS stu WITH (TABLOCK)
INNER LOOP JOIN dbo.SeekTestUnique AS stu2 WITH (TABLOCK) ON
stu2.col1 = stu.col1
OPTION (MAXDOP 1);
And we get the same plan:

But very different results:

Now there is only one scan, because the seeks are all singleton seeks, and execution time has improved from 9472ms to 6096ms.
There are lots of good reasons to enforce uniqueness where you can (and a few not to). For further information see:
Is it a Seek or a Scan?
Sneaky Reads with Unique Indexes
Unique Indexes with GROUP BY (Rob Farley)
Scan Count Meaning in STATISTICS IO Output – Part 1 (Amit Banerjee)
Scan Count Meaning in STATISTICS IO Output – Part 2 (Amit Banerjee)
© 2011 Paul White
email: SQLkiwi@gmail.com
twitter: @SQL_Kiwi