SQL Server uses an extensible architecture for query optimisation and execution, using ‘iterators’ as basic building-blocks.
Iterators are probably most familiar in their graphical showplan representation, where each icon represents a single iterator. They also show up in XML query plan output as RelOp nodes.
Each iterator performs a single simple function, such as applying a filtering condition, or performing an aggregation. It can represent a logical operation, a physical operation, or (most often) both.
For example, ‘Aggregate’ is a logical operation, and Stream Aggregate and Hash Aggregate are physical operations. Similarly, ‘Inner Join’ is a logical operation; Nested Loops a physical one. Both graphical plans and XML showplan output show these properties:
Logical iterators can be combined in complex ways to satisfy any requirement that can be expressed in SQL. The physical iterators combine to produce an executable tree used by the execution engine to produce results.
The logical iterators represent the original query translated into relational algebra; the physical iterators represent a physical algebra: the concrete algorithms used by the SQL Server storage engine and query processor to perform a logical task.
Those of you that followed my series of posts on optimiser internals may remember that implementation rules in the query optimiser convert a logical requirement to one of the possible physical implementations. Each round of exploration rules expand the plan search space by applying transformations, before the implementation rules run to map in the possible physical iterator forms. Once that’s done, cost-based decisions can be made to select a final plan for execution.
A Common Interface and Design
Iterators are code objects, which share a common interface of methods and properties. The most frequently used of these are the methods used to process rows, set and retrieve property information, and produce a cost estimate for the optimiser to use.
These common features (and the abstractness of the design) enable easy extensibility as new database and language features are introduced. It also allows iterators to be connected in a tree structure without any of them needing to know anything about what other iterators it might be connected to.
The executable code that forms the core of the iterator’s implementation is also written to avoid hard-wired dependencies on the parent and child iterators in the plan tree. This avoids the need to recompile the iterator base code each time it is incorporated in a new plan. Those of you that have some programming experience will recognise this arrangement as one that uses “function pointers”.
All iterators also require a small fixed amount of memory to store state (think local variables) while they are executing. This memory forms part of the compiled plan, so a formal memory grant is not required each time a plan is executed. The cached size of a plan includes this memory allocation.
The common methods most relevant to the present discussion are those used to process a set of rows. These are the Open, GetNext, and Close methods. (These names might give you a clue as to why query plan nodes are known as ‘iterators’).
Let’s look at a simple plan (the query itself isn’t important):
Have you ever wondered exactly how a plan like that is executed? Where does it start, and how does it know when it has finished? Judging from the lines and arrows that connect the iterators in a graphical plan, we might assume that execution starts with the data-producing iterators (the scans) and continues until they run out of rows.
The arrow heads on the connecting lines do indicate the direction of data flow; the iterator execution order is quite different, however. Query execution always begins with the root iterator – the hash join in the diagram above.
The execution engine only ever directly interacts with this root iterator. It calls its Open method to prepare it to return data, and then repeatedly calls its GetNext method to retrieve one row of query output at a time. As soon as the GetNext method call fails to return a new row, the engine calls the Close method, and query execution ends. Simple as.
The Ultimate Cursors
Ok, so there’s a bit more to it than that. Let’s see what happens when the Open method is called on the hash join iterator:
- The hash join calls the Open method on its build input.
- The Open method of the Scan iterator prepares the storage engine to start producing rows from the Product table. Control then returns to the hash join (which is still running its Open method, remember).
- The hash join repeatedly calls the GetRow method on the Scan iterator.
- Each GetRow call returns a single row from the Scan to the hash join. These rows are used to build the join hash table.
- As soon as a call to GetRow on the Product table Scan reports that there are no more rows, the hash join calls the Close method on the Scan.
- The Scan iterator performs its clean-up tasks, and control again returns to the hash join.
- The hash join now calls Open on its inner input (the probe input).
- The Inventory table Scan iterator prepares its connection to the storage engine, and returns control to the hash join.
- The hash join returns from its Open method call (finally!) and control returns to the execution engine.
The execution engine now repeatedly calls GetRow on the hash join, to return the query results:
- The hash join calls GetRow on its probe input.
- The Scan’s GetRow method returns a single row from the Inventory table.
- The hash join performs the join using its hash table.
- If the row does not join, go back to step 1
- If the row does join, return it to the engine
This process continues until the probe input Scan reports that no more rows are available. The hash join then also returns from its GetRow method without producing a row. When that happens, the execution engine knows that the query is complete, and calls the Close method on the hash join, to allow it to clean up:
- The hash join calls Close on its probe input.
- The Scan performs its clean-up tasks and returns control to the hash join.
- The hash join deallocates its hash table, and returns from the Close method.
This row-by-row processing (calling the GetNext method repeatedly) is why query plan nodes are referred to as iterators: they iterate over a set of input rows, returning one row at a time to its parent.
Execution of a query plan starts at the root, and flows down the plan via a series of nested method calls. Data is pulled up the plan in the reverse direction: from the leaves toward the root, a row at a time. Query plans do run backward :)
Inside an Iterator
Exactly what is done in the Open, GetNext, and Close methods depends on the type of iterator, and the mode in which it is running. The following table gives a broad outline of the work performed by different iterators when these methods are called:
There is a lot more to say about query plan execution and iterators, so I’ll have to return to this topic in a future post.
© Paul White