THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | Join | Help
in Search

Aaron Bertrand

Aaron is a senior consultant for SQL Sentry, Inc., makers of performance monitoring and event management software for SQL Server, Analysis Services, and Windows. He has been blogging here at sqlblog.com since 2006, focusing on manageability, performance, and new features; has been a Microsoft MVP since 1997; tweets as @AaronBertrand; and speaks frequently at user group meetings and SQL Saturday events.

T-SQL Tuesday #23 : Well, that's a kick in the JOIN...

This month's T-SQL Tuesday is a week early due to the upcoming PASS Summit. It is being hosted by Stuart Ainsworth (blog | @codegumbo) and the topic is a general one: JOINs. I had an idea for a very brief post showing the typical join operations we use to solve common queries, and a more interesting but less common way we can solve the same queries with less code.

I often see people performing explicit outer or implicit anti semi joins to return "all of some set, except for this other set." Sometimes the query can get quite convoluted, leading to self-joins and CTEs - especially when aggregates are involved.

As a fictitious example, let's say we wanted all of the object_id values from sys.objects, except for those objects that also have a column named "id" - hey, I didn't say this was a real-world example, it's just a very simple way to demonstrate using objects I know you have installed. So you may see this type of explicit outer join:

 

SELECT o.[object_id]
    
FROM sys.objects AS o
   
LEFT OUTER JOIN sys.columns AS c
    
ON o.[object_id] = c.[object_id]
    
AND c.[name] = N'id'
    
WHERE c.[object_id] IS NULL;

You may also see these NOT EXISTS or NOT IN variants:

-- NOT EXISTS:

SELECT [object_id]
    
FROM sys.objects AS o
    
WHERE NOT EXISTS
    (
      
SELECT 1
        
FROM sys.columns
        WHERE [name] = N'id'
        
AND [object_id] = o.[object_id]
    
);


-- NOT IN:

SELECT [object_id]
    
FROM sys.objects
    
WHERE [object_id] NOT IN
    
(
      
SELECT [object_id]
        
FROM sys.columns
        WHERE [name] = N'id'
    
);

As a side note, be very careful when choosing between a correlated EXISTS and a direct NOT IN - if the subject column is nullable, the latter will produce unexpected results. To avoid confusion and to program consistently, I always choose EXISTS even when it means I need to add a clause to correlate it to the outer query.

However, since SQL Server 2005, we've been able to use EXCEPT to write a slightly simpler version of this query:

SELECT [object_id]
    
FROM sys.objects
EXCEPT
SELECT
[object_id]
    
FROM sys.columns
    
WHERE [name] = N'id';

Sure, we've only saved a line or two, but I find the syntax much less complex - even on a trivial query like this. It doesn't look or smell like a join, does it? Well, it certainly behaves like one. In this case it also performs like one (though you should always test performance when weighing queries that produce the same results - readability should only be one of your criteria).

Let's generate the actual execution plans from within SQL Sentry Plan Explorer. The runtime results are nearly identical for all four queries:


The graphical plans are almost identical. The outer join version looks like this:


While the other three are all performed using one less filter, but otherwise the same plan:

 

In a future blog post, I'll go into more detail about EXCEPT and I'll also talk about how INTERSECT can make certain inner join queries more intuitive. These are definitely two operators that should be in your toolkit if you are writing a lot of joins...
 

Published Tuesday, October 04, 2011 1:19 PM by AaronBertrand
Filed under: , ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Blue Ray Plus - Latest Technology News said:

Thank you for submitting this cool story - Trackback from Blue Ray Plus - Latest Technology News

October 5, 2011 10:12 AM
 

Semi-joins and Anti-joins | Dale Burnett said:

October 6, 2011 6:42 PM

Leave a Comment

(required) 
(optional)
(required) 
Submit

About AaronBertrand

...about me...

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement