THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

John Paul Cook

UNION, UNION ALL and ORDER BY

At every client where I've worked, there have always been questions about UNION and UNION ALL. They may return the same results or they may not. It all depends on the data. If you don't have any duplicates, you'll get the same number of rows returned whether you use UNION or UNION ALL. If you have duplicates, UNION will return fewer rows than UNION ALL.

No matter who your database vendor is, a UNION statement follows the rules of set theory. Set theory tells us that a set only contains unique elements. In other words, there are no duplicates in a set. If you use a UNION statement, you will not see any duplicate values in your result set. To toss out the duplicates, the database engine has to do some type of sort or merge operation to identify the duplicates. The side effect of this is that your result set tends to be ordered. Notice I said tends. This is a big misconception about UNION statements. You may have always seen the results of UNION statements to be perfectly ordered. Just because you observe a query returning sorted results without an ORDER BY statement doesn't mean you can always count on sorted results. If you require the results to be ordered, you must use an ORDER BY clause. Without it, you're gambling.

I've provided some sample code that will work anywhere for you to play with and learn. Select the first three lines and execute them by themselves to see in what order the results appear. Then execute everything. Examine the output and remember what you learned for the next time when it is a real coding assignment.

select-- Select only the first three lines
UNION     -- and execute them by themselves.
select-- See what row is returned first.
UNION
select
3
UNION
select
2

 

select 7
UNION ALL
select 2
UNION ALL
select 3
UNION ALL
select 2

 

Published Friday, February 08, 2013 9:32 PM by John Paul Cook

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Ian Yates said:

I'm in the habit of using UNION ALL unless I explicitly want the dedupe semantics of UNION specifically because it's faster (as you allude to).

February 8, 2013 11:56 PM

Leave a Comment

(required) 
(required) 
Submit

About John Paul Cook

John Paul Cook is both a Registered Nurse and a Microsoft SQL Server MVP experienced in Microsoft SQL Server and Oracle database application design, development, and implementation. He has spoken at many conferences including Microsoft TechEd and the SQL PASS Summit. He has worked in oil and gas, financial, manufacturing, and healthcare industries. Experienced in systems integration and workflow analysis, John is passionate about combining his IT experience with his nursing background to solve difficult problems in healthcare. He sees opportunities in using business intelligence and Big Data to satisfy healthcare meaningful use requirements and improve patient outcomes. John graduated from Vanderbilt University with a Master of Science in Nursing Informatics and is an active member of the Sigma Theta Tau nursing honor society. Contributing author to SQL Server MVP Deep Dives and SQL Server MVP Deep Dives Volume 2.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement