In this entry, we will explore a little oddity related to how we join two sets of data together. As you may already know, you can join two tables together in a variety of ways, three of which are shown here.
SELECT *
FROM Orders
WHERE OrderID IN (SELECT OrderID FROM [Order Details] Where ProductID = 6)
SELECT *
FROM Orders O
INNER JOIN (SELECT OrderID FROM [Order Details] Where ProductID = 6) OL
ON O.OrderID = OL.OrderID
SELECT *
FROM Orders O
INNER JOIN [Order Details] OL
ON O.OrderID = OL.OrderID
WHERE OL.ProductID = 6
These will all return the same results, but the last one (the one not using derived tables) actually performs worse than the other two, which perform equally (and even use the same execution plan). It has to do with the way the query optimizer decides to attack the query. So, knowing this, you might suspect the following two queries to work the same behind the scenes.
SELECT *
FROM Orders O
INNER JOIN
(SELECT CAST(ListValue AS int) AS OrderID FROM dbo.fnSplit2Table(@sOrderIDs, @sDelimeter)) OL
ON O.OrderID = OL.OrderID
SELECT *
FROM Orders O
WHERE OrderID IN (SELECT CAST(ListValue AS int) FROM dbo.fnSplit2Table(@sOrderIDs, @sDelimeter))
Alas, it isn't true, and the former, which uses the INNER JOIN, actually performs better than its counterpart.
This supports my assertion that an INNER JOIN will have equal or better performance than an equivalently stated WHERE clause using an IN clause in the criteria. There have been many other cases I have found this to be the case. The use of the table-valued UDF brings out the performance difference quite well here.