THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Joe Chang

Bushy Joins

A great session by Adam Machanic at SQL Saturday Boston the previous weekend on methods to influence the query optimizer while still letting it do its task. The gist of this is that while SQL Server has what are called Query Hints, there are adverse consequences. The Join Hints (Loop, Hash and Merge) "specify that the query optimizer enforce a join strategy between two tables," but also results in the query optimizer not bothering to investigate the different join orders, even though only the join type was specified. Hence the general advice is that one should not use the SQL Server Query/Join Hints unless one is prepared to completely override the query optimizer, which is essentially to say, one should almost never use join hints. Microsoft's advice is: "we recommend that hints, including , be used only as a last resort by experienced developers and database administrators." Adams' session investigated an alternative method of providing advice to the query optimizer without causing it to otherwise shutdown.

Now that we have said that the Loop, Hash and Merge Join Hints should almost never be used, and without recommending the use of hints, consider the question of how to use hints in the case of a last resort situation. Given the fact that the query optimizer disables join order optimization when hints are applied, the task is to reconstruct a good join order. It is explained elsewhere the general preference regarding join order. See either my articles on www.qdpma.com/ on the Query Optimizer (mostly I just examine the formulas, without bothering on the explanation), articles by Paul White, Benjamin Nevarez and others. Here will only examine the technique of join ordering.

In a two table join, there is only one shape, one table as the outer source in the upper right of the execution plan and the second table as the inner source in the lower left of the execution plan as in the diagram below.

Natural

We can reverse the order of the join, or change the type of join, but there is only one shape.

In a three table join, there are two possible shapes. One is linear: the first table is the outer source, joins to the second table (inner source), and then the output of this is the outer source for the final join with the third table as the inner source.

Linear

The second possible shape is that one table is the outer source in one join to another table. The output of this is now the inner source in the other join with the third table as the outer source.

Bush

From these two basic shapes, we can assemble almost any possible execution plan (sorry, but I do not have examples with the spool operation, if any one would like to comment on these).

Until a few years ago, I had always been under the impression that it was necessary to write out the full sub-query expression in order to force a bushy join, example below.

SELECT xx
FROM A
JOIN (
  SELECT xx
  FROM B
  JOIN C ON xx
) ON xx

The both join shape and order are forced with either a join hint or the OPTION (FORCE ORDER) clause. In a complex query with a long SELECT list, this style of expresssion quickly becomes cumbersome. Then one day, I needed to relax, so I read one of Itzik Ben-Gan's books and saw a style of SQL expression on joins that I had never seen before.

SELECT xx
FROM A
JOIN (
  B JOIN C ON xx
) ON xx

There is no SELECT in the sub-expression!

My heart skipped a beat.

What would be the execution plan join shape be if there were a join hint or force order hint on this expression?

Below is an SQL query example from Adam's session.

query

The execution plan for this query is below. Note that the join order is different than in the SQL.

Natural

If we forced a hash join, we would get the linear plan below.

Linear

Note that the join order is the same as in the SQL.

We could write the SQL in the form below.

query

But without a hint, the execution plan is the same as the original (natural) plan.

Now if we were to force the join order in the new SQL, as below

query

we do indeed get the bush shape with the join type.

Bush

We now have the basic techniques for writing SQL with the objective of forcing a particular join shape and order, to which we could apply join hints that also override much of the query optimizer.

Again, this is not an endorsement of using join hints. Do not use join hints without understanding that it has the effect of overriding the query optimizer on join ordering, and the implications. I do not accept any consequences on the use of join hints unless I was the consultant engaged. OK, so I just gave you a loaded gun while saying don't blame me for its improper use.

Appendix
Search Microsoft Technet for the terms Advanced Query Tuning Concepts, Understanding Nested Loops Joins, Understanding Merge Joins, and Understanding Hash Joins. I seem to have forgotten that role reversal was a feature in hash joins?

ps

Note that Adam's session is the "Gentle Art ..."
Join hints and force order is definitely the bulldozer and burn approach

Published Wednesday, April 02, 2014 3:09 PM by jchang

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Adam Machanic said:

Wait -- you were in my session? And you didn't say hi?!?

April 2, 2014 5:25 PM
 

Adam Machanic said:

Three weeks from now I'll be visiting Chicago to participate in SQL Saturday. I'll offer two different

April 2, 2014 5:40 PM
 

jchang said:

I am going to SQL Sat Atlanta. I also put in for Redmond, Oslo and Paris. Three times I have been involved with Chicago ISVs, all of them are SQL disasters. What's sad is that these could be have fixed had anyone had a basic knowledge of SQL Server.

Anyways, what I posted here is the dangerous thing, as you said in your session. But I am curious to hear what other people are doing in this off the reservation topic.

Oh yeah, we can also influence the optimizer with fake statistics, if it were not for that pesky checksum. Your idea would work too, but requires a table possibly with significant data.

April 2, 2014 6:21 PM
 

Rory said:

I think that this is really awesome - I've been reading Dan Tow's book on SQL tuning and have been looking for some tips on how to force certain execution plans without using hints.

April 3, 2014 7:45 AM
 

Rob Farley said:

Hi Joe - the parentheses in

SELECT xx

FROM A

JOIN (

  B JOIN  C ON xx

) ON xx

aren't actually necessary...

You can do:

SELECT xx

FROM

A

JOIN

 B

 JOIN  

 C

 ON xx

ON xx

Of course, you might find they help readability, but they don't provide any functionality. The nesting story comes down to finding the first ON, and working back to identify the block it defines. Then repeating as necessary for each subsequent ON. Of course you don't spilt blocks once you've identified them, as they're then sets that form the left and right sides of other joins.

Rob

April 9, 2014 4:07 AM
 

jchang said:

thanks Rob. I should add that prior to this I had never written a RIGHT JOIN. I have always placed the outer join table with LEFT JOIN. It makes no difference in SQL, but if we are forcing the join order, then we would use a right join in certain situations.

April 9, 2014 9:03 AM

Leave a Comment

(required) 
(required) 
Submit

About jchang

Reverse engineering the SQL Server Cost Based Optimizer (Query Optimizer), NUMA System Architecture, performance tools developer - SQL ExecStats, mucking with the data distribution statistics histogram - decoding STATS_STREAM, Parallel Execution plans, microprocessors, SSD, HDD, SAN, storage performance, performance modeling and prediction, database architecture, SQL Server engine

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement