Background
One of the core assumptions made by the SQL Server query optimiser’s model is that clients will consume all of the rows produced by a query. This results in plans that favour the overall execution cost, though it may take longer to begin producing rows. Let’s look at an example:

The optimiser chooses to perform the logical join using a Hash Match physical iterator, resulting in a plan with a total estimated cost of around 1.4 units. By forcing alternative physical joins using a query hint, we see that a plan based on Sort-Merge would have an estimated cost of just under 10, and using Nested Loops would cost over 18 units. All these cost estimates are based on the assumption that all rows are required.
Hash Match
As detailed in a previous post, the Hash Match iterator starts by consuming all rows produced by its build input (the Product table) in order to build a hash table. This makes Hash Match a semi-blocking iterator: it can only start producing output rows once the build phase is complete. If we need the first few rows from the query quickly, this join type may not be optimal.
Sort-Merge
How quickly a Sort-Merge iterator starts producing rows depends on whether any of its inputs need to be explicitly sorted. If Sort-Merge can acquire ordered rows on both inputs (perhaps from an index) without sorting, it can stream rows to its parent without a start-up delay.
The Sort iterator is always a blocking operation, since we can’t know which row will sort without examining them all. In the example query, the rows from the history table do need to be explicitly sorted, so Sort-Merge would also not be a good choice to get the first rows quickly.

Our third join option is Nested Loops, which is always a streaming iterator (no start-up delay), so the first matching rows are returned quickly. Unfortunately, it also has the highest all-rows estimated cost.

The Nested Loops iterator executes its inner input once for every row produced by its outer input, making its overall cost proportional to the product of the number rows received from its child inputs. The Hash and Sort-Merge strategies, by contrast, scan each of their inputs once, resulting in a cost which is proportional to the sum of its inputs’ rows.
The TOP Clause
Returning the first few rows produced by a query is a problem easily solved from the query-writer’s perspective: just add a TOP clause. Imagine that we are just interested in seeing one row of output from our query:
SELECT TOP (1) *
FROM AdventureWorks.Production.TransactionHistory H
JOIN AdventureWorks.Production.Product P
ON H.ProductID = P.ProductID;
The original query returns 113,443 rows, and this small modification returns just one – and does so very quickly. So far so very obvious, you might be thinking – but all is not what it seems.
A Brief Digression Concerning TOP and Deterministic Results
Before I go into that, I should mention that the TOP (1) query above contains a ‘bug’: There is no ORDER BY clause to determine which row qualifies as ‘top’. In principle, the row produced by that query could be selected at random. In practice, the results depend on the query plan and internal implementation details in SQL Server. Since either of those things could change at any moment, you should almost always specify a deterministic ORDER BY clause when using TOP.
If you’re wondering what can be non-deterministic about an ORDER BY clause, consider what might happen if we added “ORDER BY P.ProductID” to the query. A difficulty arises since ProductID is not unique for each row of the result set (the transaction history table may contain many rows for each product). As a concrete example, assume that product #1 is the lowest ProductID, and there are 75 history records for that product. These 75 records all sort equally on ProductID, so which record will be returned by the TOP operation?
There are two ways to make the operation deterministic. First, we could rewrite the ORDER BY to include a combination of columns that form a unique key over the output rows; adding TransactionID from the history table would achieve that here. Second, we could add WITH TIES to the TOP specification, though this would produce more than one row.
I have omitted the ORDER BY on our sample query because I truly do not mind which row we get, and I need to avoid changing the semantics of the original query to demonstrate a point.
The TOP (1) Plan
The TOP (1) query plan looks like this:

That looks very different from the original query plan, which featured a hash join. Clearly, when we add a TOP clause to a query, the optimiser does much more than just add a TOP iterator to the root of the plan it would have produced normally.
For comparison purposes, here is a query plan produced for the TOP (1) query when a hash join is forced:

The important thing to notice is that the build input to the hash join must consume all 504 rows in the Product table, before it can start probing with rows from the TransactionHistory table to find a match. Like all hash operations, this join also requires a memory grant before the plan can start executing. The estimated cost of the hash join plan is 0.1; the loops join plan selected by the optimiser has an estimated cost of 0.01 units.
Row Goals
The challenges involved in producing an optimised query plan for row-limited queries, while retaining good general optimisation performance for full-result queries, are more complex than simply replacing hash or merge join iterators with nested loops.
It would be reasonably straightforward to cater for queries with a TOP at the root of a plan, using specific code designed to recognise specific scenarios. However, that approach would miss wider opportunities for plan optimisation in more general cases.
The TOP clause can be used multiple times, in multiple places in a query declaration: in the outermost scope (as in the example); in a subquery; or in a common table expression – all of which may be arbitrarily complex. The FAST ‘n’ query hint can also be used to ask the optimiser to prefer a plan which will produce the first ‘n’ rows quickly, while not restricting the total number of rows returned overall, as is the case with TOP. As a final example, consider that a logical semi-join (such as a subquery introduced with EXISTS) shares the overall theme: it should be optimised to find the first matching row quickly.
The SQL Server query optimiser provides a way to meet all these requirements by introducing the concept of a ‘row goal’, which simply establishes a number of rows to ‘aim for’ at a particular point in the plan.
Framework
The optimiser changes required to enable the row goal feature are very largely self-contained, sitting in a thin layer on top of the normal optimisation sub-systems. This functional separation from common components (such as the costing engine and the cardinality & selectivity estimation subcomponent) reduces the risk of breaking changes, minimises development effort, and simplifies testing.
The design allows the optimiser to re-use existing facilities to explore the space of possible plans without special handling for regions constrained by a row goal. The plan choices for the sub-tree below the row goal are generated using the common exploration rules used for full-result queries. For example, optimisation of our simple query will consider all three physical join implementations, different access paths to the physical tables (indexes), and alternative orders for the join inputs.
Costing and Cardinality
An important difference arises when the optimiser comes to decide which candidate plan is ‘best’. The normal routines produce iterator costs and inter-iterator cardinality estimates based on the query running to completion. In our example query, this results in a plan using a hash join, with the Product table chosen as the build input.
Normal cardinality & selectivity estimation starts at the plan leaves (an index scan, for example) and works up the plan modifying the statistics (frequencies and histograms) and cardinality estimates as it encounters iterators. Some iterators (like a Filter) have a simple effect on statistics and cardinalities, whereas others (such as joins, unions) may require complex calculations and histogram adjustments.
When a row goal is present, the optimiser subsequently works down the plan (from the row goal toward the plan leaves) modifying the calculated iterator costs and cardinality (row count) estimates using basic statistical information inserted in the memo structure by previous activities.
An example will help clarify the concept, so let’s run our test query again, but this time specifying TOP (50):

Working back from the Top iterator, notice that the cardinality estimate is 50, as required by the row goal. The estimate for the number of rows from each execution of the Index Seek is 1 (the ProductId column is the Primary Key). The estimated cardinality of the other join input is more interesting and requires a little more explanation.
Reverse Cardinality Estimation
One more query plan, to show the equivalent plan without a row goal:
![TOP-50-Comparison[4] TOP-50-Comparison[4]](http://sqlblog.com/blogs/paul_white/TOP50Comparison4_thumb_54B754E4.png)
The regular cardinality estimation routine (without a row goal) determines that the 113,443 TransactionHistory rows, when joined to the Product table, should produce 92,284.3 rows. Leaving aside the fact that this calculation is wrong (and ignores the PK-FK relationship between the two tables), we can see that 113,443 / 92,284.3 = 1.22928 (the calculation uses floating-point arithmetic).
That factor is used in the row goal calculation when working back past the loops join iterator. The output cardinality is 50 (the row goal), so the outer input to the loop join is required to produce 50 * 1.22928 = 61.46400 rows. That matches the estimated number of rows in the row goal plan, and illustrates the working-backwards nature of row goal cardinality estimation.
It is possible for cardinality estimation errors to produce a backwards-working cardinality estimation which is greater than the number of rows available from the source. This condition is detected by the routine, and cardinality estimations are limited to the known row count in that case.
Interesting Side-Effects
You might be wondering if a similar calculation applies to the inner side of the loops join. It does, but it is reported differently (as normal for loops join). In the row-goal plan, the inner input cardinality is estimated to be 1, but the estimated number of executions is a row-goal modified value:

Notice also that the estimated I/O and CPU costs are not scaled – they show the per-execution cost as if a row goal were not in effect. The estimated operator cost is scaled to reflect the row goal, and that value is used in the subtree cost calculations. The differences are easier to see on the outer input:

The Estimated I/O and CPU costs reflect the cost of returning all rows. The Estimated Operator and Subtree costs have been modified by the row goal. This feature of row goal plans often causes confusion, but next time someone asks you how the operator cost can be so much lower that its two components, you’ll have an answer for them!
The operator and subtree costs are used to determine which of the candidate plans is ‘best’. By performing these simple modifications, the row goal feature ensures that a plan is produced which reflects the row goal requirement.
Final Thoughts
Faced with seemingly contradictory requirements to optimise for full-query execution and early termination based on a specified row goal, the SQL Server query optimisation team have produced an elegant solution.
Queries which use a row goal (TOP, FAST N, EXISTS and so on) benefit from all the transformation and optimising tricks at SQL Server’s disposal, while still producing a cost-based plan. To illustrate that last point, our test query chooses a loops join up to TOP (97). When TOP (98) is specified, the optimiser chooses a hash join on cost grounds. (The exact tipping point may be different on different versions of the product, and is dependent on statistics).
Hopefully this post has given the reader some interesting insights into the internal workings of this very useful feature.
Paul White
email: SQLkiwi@gmail.com
twitter: @SQL_Kiwi