In my last post, I showed how using a index unique could speed up equality seeks by around 40%.
For today’s entry, I’m going to use the same tables as last time (single BIGINT column, one table with a non-unique clustered index, and one table with a unique clustered index):
CREATE TABLE dbo.SeekTest
(
col1 BIGINT NOT NULL,
);
GO
-- Non-unique clustered index
CREATE CLUSTERED INDEX cx
ON dbo.SeekTest (col1)
GO
CREATE TABLE dbo.SeekTestUnique
(
col1 BIGINT NOT NULL
);
GO
-- Unique clustered index
CREATE UNIQUE CLUSTERED INDEX cuq
ON dbo.SeekTestUnique (col1)
Test Data
This time, instead of filling the tables with all the numbers from 1 to 5 million, we’ll add just the even numbers from 1 to 10 million. At the risk of stating the slightly obvious, this results in tables with the same number of rows as previously, just no odd numbers:
INSERT dbo.SeekTest WITH (TABLOCKX)
(col1)
SELECT TOP (5000000)
2 * ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM
master.sys.columns AS c,
master.sys.columns AS c2,
master.sys.columns AS c3;
GO
INSERT dbo.SeekTestUnique WITH (TABLOCKX)
(col1)
SELECT TOP (5000000)
2 * ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM
master.sys.columns AS c,
master.sys.columns AS c2,
master.sys.columns AS c3;
Test Run: Non-Unique Index
The test is the same as before: join the test table to itself using a nested loops join, and count the rows returned.
Don’t be fooled by the simplistic nature of the test; I realize you rarely use loops join to self join all the rows in a table. The point here is to check how long it takes to perform 5 million row joins – something that probably happens quite often in most production systems, either as 5 million lookups into a single very much larger table, or perhaps by running a query that does 50,000 row-joins 100 times. Anyway, here’s the query:
SELECT
COUNT_BIG(*)
FROM dbo.SeekTest AS st WITH (TABLOCK)
INNER LOOP JOIN dbo.SeekTest AS st2 WITH (TABLOCK)
ON st.col1 = st2.col1
OPTION (MAXDOP 1);
And the ‘actual’ query plan:

I get these results:

Notice the 5,000,001 scan count showing that we are performing 5 million range scans (not singleton seeks). As far as performance goes, running this query with all the data in cache uses 9,251ms of CPU.
Test Run: Unique Index
Cool. Now we know from last time that we can improve on this result by making the index unique, and performing singleton seeks instead of range scans. Let’s do that then:
SELECT
COUNT_BIG(*)
FROM dbo.SeekTestUnique AS stu WITH (TABLOCK)
INNER LOOP JOIN dbo.SeekTestUnique AS stu2 WITH (TABLOCK)
ON stu.col1 = stu2.col1
OPTION (MAXDOP 1);
Actual plan:

And the results are:

As expected, the scan count confirms we are now doing singleton seeks, and the CPU time has improved to 16,005ms.
Er, Hang On…
If you think that going from 9,251ms of CPU to 16,005ms of CPU is not exactly an improvement, you’d be right. There’s no typo there, no inadvertent switching of the unique and non-unique examples, and no tricks.
Making the index unique really has slowed down this query by around 70%.
The explanation is so interesting, it deserves a full post of its own…so stay tuned for that. Feel free to speculate about the cause in the meantime… in the comments below, by email, or on Twitter.
There is some evidence coming through in the comments that this behaviour is specific to x64 installations. I’d love to get some more validation of this if you have time to test on x86 and/or x64. Thanks.
Disclaimer:
In general, you will want to specify an index as UNIQUE wherever you can. Many queries will benefit from a unique index rather than a non-unique one. This post is very much to show that “It Always Depends” and to set the stage for the next post in this series. Don’t run with scissors. Warning filling may be hot. May contain nuts… 
© 2011 Paul White
email: SQLkiwi@gmail.com
twitter: @SQL_Kiwi