
Paul White: Page Free Space

A technical SQL Server blog from New Zealand.

Sorting, Row Goals, and the TOP 100 Problem

When you write a query to return the first few rows from a potential result set, you’ll often use the TOP clause.  To give a precise meaning to the TOP operation, it will normally be accompanied by an ORDER BY clause.  Together, the TOP…ORDER BY construction can be used to precisely identify which top ‘n’ rows should be returned.
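
If it helps to have a concrete picture, here is a minimal example (the Orders table and its columns are invented purely for illustration):

SELECT TOP (10)         -- hypothetical example: the ten most recent orders
    O.OrderID,
    O.OrderDate,
    O.CustomerID
FROM dbo.Orders O
ORDER BY
    O.OrderDate DESC;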

The ‘Top N’ Sort

Thinking about how this requirement might be implemented in an executable query plan, we might expect to see a Sort iterator followed by a Top.  In reality, the query optimizer can often collapse these two related operations into a single iterator: a Sort iterator running in Top N Sort mode:

[Figure: a Sort iterator running in Top N Sort mode]

That’s an idea you might find familiar if you read my previous post on Row Goals and Grouping.  In that entry, we saw how a Sort followed by a Stream Aggregate can sometimes be collapsed into a Sort iterator running in Sort Distinct mode.

The General Sorting Algorithm

SQL Server’s normal sorting algorithms are suited to a very wide range of ordering requirements.  They work extremely well regardless of the data types involved, the size of data to be sorted, or the number of sort keys specified.  They also make good use of available memory resources, and can spill to tempdb if required.

It is a common misconception that SQL Server will try to perform a sort entirely in memory if it can.  In fact, the algorithms used are much more complex: they aim to strike a balance between memory usage and average response time, while maintaining high levels of resource concurrency.  Memory is a precious resource in the server, so SQL Server may spill a sort to tempdb even if sufficient main memory is available.

Sorting with a Row Goal

As sophisticated and highly-tuned as the main sorting algorithms are, they do assume that the full set of sorted rows is always required.  When we use a TOP expression, the FAST ‘n’ query hint, or even an EXISTS clause, we indicate that we would prefer a plan (or part of a plan) optimized to produce the first ‘n’ rows quickly.
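
As a quick sketch of the hint form (using the dbo.TestData table defined later in this post), the following asks the optimizer to prefer a plan that delivers the first 10 rows quickly, even though all qualifying rows are eventually returned:

SELECT TD.RowNum,       -- sketch: a row goal of 10 set via the FAST hint;
       TD.SomeId        -- unlike TOP, every qualifying row is still returned,
FROM dbo.TestData TD    -- only the optimizer's costing is affected
ORDER BY
    TD.SomeId ASC
OPTION (FAST 10);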

Say we have a million-row table, and we just want the first ten rows (in some particular order).  It seems wasteful to fully sort all one million rows when we know that a maximum of ten rows will ultimately be needed.  SQL Server addresses this problem by providing a second algorithm, optimized to return a small number of rows quickly.  This second method must still examine all candidate rows of course, but it knows only to keep the ‘n’ highest-ranked rows as it goes.

It is important to realize that the two sorting approaches are complementary.  The second method only works well in a fairly narrow set of circumstances, and is particularly unsuitable for large sorts since it cannot spill to tempdb.  This alternative algorithm can only be used by a Sort iterator running in Top N Sort mode – normal Sort iterators always use the default sorting method.  To be clear, the Top N Sort iterator may use either algorithm.  Sadly, the query plan does not currently expose any information to identify which algorithm was used in a given query execution.

Achieving the Right Balance

As you might imagine, determining the optimal algorithm to use in a particular circumstance depends on some very complex considerations.  You might further imagine that SQL Server performs a number of fairly hairy calculations to determine the optimal choice, and those calculations might depend on heuristics as well as statistical information.  No doubt, you may reason, many extremely clever people have given the best years of their careers to find a great way to make this important choice.

You would be quite wrong.  SQL Server always uses the alternative algorithm when 100 rows or fewer are requested.  In all other cases, the normal sorting routines are used.  It really is as simple as that.

In many cases, this simple heuristic works well, though it is not at all difficult to engineer situations where it comes unstuck.

Coming Unstuck - 1

This first example is based on a question originally asked on the MSDN forums in December 2009.  At the end of this post, you will find links to the original forum thread (with important contributions from Brad Schulz among others), and subsequent blogs on the subject by Adam Haines and Fabiano Amorim.

As usual, we will need a test table and some sample data:

[Figure: the dbo.TestData table (RowNum, SomeId, SomeCode)]

The table consists of a clustered primary key on the identity column RowNum, an integer SomeId column containing random numbers, and a CHAR(1000) padding column named SomeCode.  Here’s the code necessary to create the above table and populate it with 50,000 rows:

IF OBJECT_ID(N'dbo.TestData', N'U') IS NOT NULL
    DROP TABLE dbo.TestData;
GO
CREATE TABLE dbo.TestData
(
    RowNum      INTEGER IDENTITY NOT NULL,
    SomeId      INTEGER NOT NULL,
    SomeCode    CHAR(1000)
                COLLATE LATIN1_GENERAL_CI_AS
                NOT NULL
                DEFAULT (REPLICATE('X', 1000)),

    CONSTRAINT [PK dbo.TestData RowNum]
        PRIMARY KEY CLUSTERED (RowNum)
);
GO
INSERT TOP (50000)
    dbo.TestData WITH (TABLOCKX)
    (SomeId)
SELECT SomeId = CHECKSUM(NEWID())
FROM master.sys.columns C1,
     master.sys.columns C2,
     master.sys.columns C3;
GO
UPDATE STATISTICS dbo.TestData ([PK dbo.TestData RowNum]) WITH FULLSCAN;
CREATE STATISTICS [stats dbo.TestData SomeId] ON dbo.TestData (SomeId) WITH FULLSCAN;
CREATE STATISTICS [stats dbo.TestData SomeCode] ON dbo.TestData (SomeCode) WITH FULLSCAN;
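
As a quick sanity check (my addition, not part of the original script), the table should now contain exactly 50,000 rows:

SELECT COUNT_BIG(*) AS RowCnt   -- verify the TOP (50000) insert
FROM dbo.TestData;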

We are asked to take a range of rows (based on RowNum) and return the first ‘n’ rows in SomeId sorted order.  We’ll run two tests, which are identical except that one specifies TOP (100) and the other TOP (101):

CHECKPOINT;
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;
GO
-- Test 1
DECLARE @RowNum   INTEGER,
        @SomeId   INTEGER,
        @SomeCode CHAR(1000);

SELECT TOP (100)
    @RowNum = TD.RowNum,
    @SomeId = TD.SomeId,
    @SomeCode = TD.SomeCode
FROM dbo.TestData TD
WHERE TD.RowNum < 30000
ORDER BY
    TD.SomeId ASC
OPTION (MAXDOP 1);
GO
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;
GO
-- Test 2
DECLARE @RowNum   INTEGER,
        @SomeId   INTEGER,
        @SomeCode CHAR(1000);

SELECT TOP (101)
    @RowNum = TD.RowNum,
    @SomeId = TD.SomeId,
    @SomeCode = TD.SomeCode
FROM dbo.TestData TD
WHERE TD.RowNum < 30000
ORDER BY
    TD.SomeId ASC
OPTION (MAXDOP 1);
GO

Both produce identical query plans, which use a range seek to find the rows, and a Top N Sort to perform the sort:

[Figure: the shared query plan – a clustered index range seek feeding a Top N Sort]

Despite this, we find that the two methods have very different performance:

[Figure: Profiler results comparing the TOP (100) and TOP (101) runs]

In this particular test, the TOP (101) query is over six times slower than the TOP (100) version.

Analysis

One clear difference between the two runs is that the 101-row example spilled to tempdb (as indicated by the sort warning), whereas the TOP (100) version (using the alternative algorithm) did not.  In fact, the TOP (101) example will always spill to tempdb with the sample data set, regardless of the amount of memory available to SQL Server.

This behaviour is by design.  As I mentioned earlier, SQL Server does not always attempt to grant enough memory to a sort to ensure it occurs entirely in memory.  The behaviour seen is a result of a careful compromise between the performance of one individual query, and the needs of the SQL Server instance as a whole.
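
If you want to see this compromise at work on your own instance, one option (a sketch, not part of the original post) is to look at the server-wide workspace memory accounting exposed by the sys.dm_exec_query_resource_semaphores DMV:

SELECT RS.resource_semaphore_id,    -- sketch: workspace memory targets,
       RS.target_memory_kb,         -- grants in flight, and any queries
       RS.available_memory_kb,      -- currently waiting for a grant
       RS.grantee_count,
       RS.waiter_count
FROM sys.dm_exec_query_resource_semaphores RS;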

The exact formula used to compute the memory grant for a sort operation is quite complex (and undocumented, so the details may change at any time).  I’ll explore the precise formula currently employed by SQL Server in my next post.

It is interesting that if we change the test to use a CHAR(100) padding column instead of CHAR(1000), the spill to tempdb does not occur, and the performance difference between TOP (100) and TOP (101) disappears.

Workarounds

Since it seems that spilling to tempdb is the problem, we might think of artificially modifying the query to fool the server into giving it a larger memory grant.  This is usually a misguided approach in my view.

From the actual execution plan, we can see that the intermediate result set flowing into the Sort iterator is 29MB (30,000 rows of 1015 bytes each).  Just twenty concurrent executions of that query would require 580MB of (non-AWE) memory.  If you enjoy debugging RESOURCE_SEMAPHORE waits and unpredictable query performance, larger memory grants might be the option for you :)

That aside, it turns out that SQL Server gives even the standard-algorithm query a memory grant of almost 11MB.  This seems excessive considering the task at hand: we are only sorting on 30,000 integer SomeId column values, which ought to fit within 120KB or so.  This implies that the normal sort algorithms require a memory grant at least as big as the entire data set (not just the sort keys) to avoid spilling to disk.
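
To check grants like these for yourself (a sketch, not from the original tests), run one of the test queries in one session and query the memory grant DMV from another:

SELECT MG.session_id,               -- sketch: requested vs granted workspace
       MG.requested_memory_kb,      -- memory for currently executing queries
       MG.granted_memory_kb,
       MG.max_used_memory_kb
FROM sys.dm_exec_query_memory_grants MG;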

Adding Lookups

By excluding the large padding column from the sort, we can reduce the memory grant to just 2.75MB.  We can modify our query to execute in two stages:

  1. Find the top ‘n’ rows sorted by SomeId
  2. Join back to the table to fetch the padding column values for just those rows (using the primary key values from step 1)

In the sample implementation below, I have used a common table expression to find the 101 rows:

WITH Top101 AS
(
    -- Top 101 rows, excluding the padding
    SELECT TOP (101)
        TD.RowNum,
        TD.SomeId
    FROM dbo.TestData TD
    WHERE TD.RowNum < 30000
    ORDER BY
        TD.SomeId ASC
)
SELECT Top101.RowNum,
       Top101.SomeId,
       TD2.SomeCode
FROM Top101
JOIN dbo.TestData TD2   -- Fetch the padding column for just 101 rows
    ON TD2.RowNum = Top101.RowNum
ORDER BY
    Top101.SomeId ASC
OPTION (MAXDOP 1);

The query plan produced shows a Top N Sort of the key values, followed by a Nested Loops lookup to fetch the SomeCode column values:

[Figure: query plan with a Top N Sort on the key columns followed by a Nested Loops lookup]

There are many ways to rewrite the query following the same general idea (using APPLY, a subquery, or a ranking function like ROW_NUMBER).  Brad Schulz suggested one particularly compact approach (though with slightly different semantics that are not important in this case):

SELECT TOP (101)
    TD.RowNum,
    TD.SomeId,
    SomeCode =
    (
        SELECT TD2.SomeCode
        FROM dbo.TestData TD2
        WHERE TD2.RowNum = TD.RowNum
    )
FROM dbo.TestData TD
WHERE TD.RowNum < 30000
ORDER BY
    TD.SomeId ASC
OPTION (MAXDOP 1);
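
For completeness, here is one possible APPLY-based form (my sketch, not from the original post).  Since the lookup joins on the primary key, the result is the same, though whether the optimizer defers the lookup until after the Top is not guaranteed:

SELECT TOP (101)
    TD.RowNum,
    TD.SomeId,
    SC.SomeCode
FROM dbo.TestData TD
CROSS APPLY
(
    -- Fetch the padding column for the current row only
    SELECT TD2.SomeCode
    FROM dbo.TestData TD2
    WHERE TD2.RowNum = TD.RowNum
) AS SC
WHERE TD.RowNum < 30000
ORDER BY
    TD.SomeId ASC
OPTION (MAXDOP 1);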

Avoiding the Sort

A further solution is to create an index to avoid the sort altogether:

CREATE UNIQUE INDEX [IX dbo.TestData SomeId, RowNum]
ON dbo.TestData (SomeId, RowNum);

This is a very compact index – just four bytes (plus overhead) to store each SomeId value.  The original TOP (101) query now produces this sort-free plan:

[Figure: the sort-free query plan using the new index]

Rows are acquired from the index in SomeId order, and the RowNum predicate is applied as a residual.  The first 101 rows that match result in a Key Lookup to fetch the padding column, and we’re done.

Each of these solutions has its advantages and disadvantages, so each might be considered ‘best’ depending on the wider circumstances.  For example, the index-based query is the only version which does not require a memory grant; if the design goal is predictable (rather than absolute best) performance, only the last version can guarantee that it will not wait on a memory grant.

Coming Unstuck – 2

It seems only fair to present a case where TOP (100) performs significantly worse than TOP (101).

It turns out that the alternative sort algorithm’s worst case performance occurs with long similar sort keys which are presented in exactly reversed order.  To demonstrate this, we’ll use the CHAR(100) version of the test rig:

CREATE TABLE dbo.TestData
(
    RowNum      INTEGER IDENTITY NOT NULL,
    SomeId      INTEGER NOT NULL,
    SomeCode    CHAR(100)
                COLLATE LATIN1_GENERAL_CI_AS
                NOT NULL
                DEFAULT (REPLICATE('X', 100)),

    CONSTRAINT [PK dbo.TestData RowNum]
        PRIMARY KEY CLUSTERED (RowNum)
);

The test queries are modified to reflect the change to CHAR(100), and to sort by SomeCode ascending, and then by RowNum descending.  This provides the long keys needed, and ordering by RowNum descending helps ensure that the rows arrive at the sort in exactly reversed order:

DECLARE @RowNum   INTEGER,
        @SomeId   INTEGER,
        @SomeCode CHAR(100);

SELECT TOP (100)
    @RowNum = TD.RowNum,
    @SomeId = TD.SomeId,
    @SomeCode = TD.SomeCode
FROM dbo.TestData TD
WHERE TD.RowNum < 30000
ORDER BY
    TD.SomeCode ASC,
    TD.RowNum DESC
OPTION (MAXDOP 1);

Again, both queries produce exactly the same query plan:

[Figure: the identical query plan shared by both reversed-order test queries]

But the Profiler results are very different:

[Figure: Profiler results for the reversed-order tests]

This time, the TOP (100) query is more than ten times slower than TOP (101).

As an aside, notice that the cost of key comparisons is crucial.  I deliberately chose a compound key which differs only in the final few bytes, and which requires comparison using the expensive Windows collation rules.
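
As a side experiment (my sketch, not part of the original tests), forcing a binary collation in the ORDER BY reduces each comparison to a simple byte-wise operation, which should soften the worst case considerably:

DECLARE @RowNum   INTEGER,
        @SomeId   INTEGER,
        @SomeCode CHAR(100);

SELECT TOP (100)                    -- sketch: same test, but with cheap
    @RowNum = TD.RowNum,            -- binary-collation comparisons on the
    @SomeId = TD.SomeId,            -- padding column
    @SomeCode = TD.SomeCode
FROM dbo.TestData TD
WHERE TD.RowNum < 30000
ORDER BY
    TD.SomeCode COLLATE Latin1_General_BIN ASC,
    TD.RowNum DESC
OPTION (MAXDOP 1);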

Summary

SQL Server provides a secondary sort algorithm which is used by the Top N Sort iterator when 100 or fewer rows are requested.  This alternate algorithm uses less memory than a full sort, but can perform poorly if the sort keys are large and similar, and if the keys happen to be presented to the Sort iterator in reverse sorted order.  The nature of the algorithm makes it unsuitable for larger or more general sorting needs.

Naturally, absolute best sort performance is obtained when the sort can be performed entirely in main memory, regardless of which algorithm is used.  However, large sorts (or many concurrent smaller sorts) can quickly consume all of a SQL Server’s available query execution memory.

System designers may choose to create indexes to avoid some sorting operations altogether.  They may also choose to reduce the average row size of intermediate result sets that pass through a sort iterator, to minimize memory grant requirements – though this may require careful query design, and may incur extra logical reads and higher CPU utilization on average.  For many systems, good, predictable performance under high concurrency will be preferred over a design with a wide spread between best-case and worst-case execution times and relatively poor resource concurrency.

Next time, I’ll look more deeply into memory grants, and specifically at the details of the current algorithms used by SQL Server to calculate those grants.

Paul White
Email: SQLkiwi@gmail.com
Twitter: @SQL_Kiwi

Related Reading

The original MSDN thread
Blog entry by Adam Haines
Blog entry by Fabiano Amorim
A related Connect item
Books Online: Top N Sort Iterator

Published Friday, August 27, 2010 3:34 AM by Paul White

Comments

 

Roy Ernest said:

This is a great Blog. I have added your blogs to my Fav.

August 26, 2010 10:04 AM
 

Paul White said:

Thanks Roy, I'm glad you enjoyed it.

Paul

August 27, 2010 4:17 AM
 

jt1 said:

Really helpful post! thanks

Jag

August 27, 2010 6:14 PM
 

Mike Lutz said:

Great article Paul!  Thanks for your work and detailed analysis.

August 29, 2010 10:32 AM
 

Paul White said:

Thank you Robert Cook (sqlmashup), Jag Thind, and Mike Lutz.  I appreciate the encouragement :)

Paul

August 29, 2010 5:21 PM
 

Brad Schulz said:

Excellent, as always.

Thanks especially for the very clear explanation as to WHY this Top 101 phenomenon (slowdown) occurs.

That MSDN thread last December was a lot of fun.

--Brad

August 31, 2010 7:07 PM
 

Martin said:

Hello,

Did you ever get around to posting the mentioned future article on memory grants?

I can't seem to locate one?

Thanks!

January 26, 2012 6:58 AM
 

Paul White said:

Hi Martin,

No, not yet.  It's not interesting enough on its own for a whole post, so I need to tie it in with something else at some stage.

Paul

January 26, 2012 7:18 AM
 

Paul White: Page Free Space said:

This is a post for T-SQL Tuesday #43 hosted by my good friend Rob Farley. The topic this month is Plan

June 11, 2013 5:00 AM
 

shailendra said:

Hi paul,

thanks a lot to keep it here..really valuable

April 23, 2014 11:37 AM
