THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

John Paul Cook

Aggregating Aggregates and UNION vs. UNION ALL

Sometimes you need to aggregate the results of multiple queries. There are several ways to accomplish this (a CTE is particularly elegant). I decided on the following solution because it provides instruction in aggregates, aliases, and when to use UNION vs. UNION ALL.

For this example, the COUNT aggregate function is used to find the number of rows in each of several tables. To find the total number of rows for all of the tables, you need to SUM the results of all of the COUNTs. In other words, the result sets from all of the queries need to become rows in a single table. To get the desired grand total, use the SUM aggregate on this single table.

The following two queries return the total number of customers and the total number of employees:

select COUNT(*) from Sales.Customer
select COUNT(*) from HumanResources.Employee

-----------
19820

(1 row(s) affected)

-----------
290

(1 row(s) affected)

Because the queries are identical in the number of columns and data types returned, they can be combined using the UNION operator. Simplistic and naïve testing suggests that this may work:

select COUNT(*) from Sales.Customer
UNION
select COUNT(*) from HumanResources.Employee


-----------
290
19820

(2 row(s) affected)

UNION is not the correct operator to use as this test case reveals:

select COUNT(*) from HumanResources.Employee
UNION
select COUNT(*) from HumanResources.Employee

-----------
290

(1 row(s) affected)

UNION contains an implied DISTINCT operator that causes identical rows to disappear. This can lead to miscounts when SUM is used later. The correct operator to use is UNION ALL:

select COUNT(*) from HumanResources.Employee
UNION ALL
select COUNT(*) from HumanResources.Employee

-----------
290
290

(2 row(s) affected)

Now we know that we need to build on the following query:


select
COUNT(*) from Sales.Customer
UNION ALL
select COUNT(*) from HumanResources.Employee

We need to SUM the results of our UNION ALL query, but SUM requires a column name. That means aliases (nrows) must be used.

select COUNT(*) as nrows from Sales.Customer
UNION ALL
select COUNT(*) as nrows from HumanResources.Employee

nrows
-----------
19820
290

(2 row(s) affected)

To query from our new result set, we once again must use an alias (counts) for our tabular result set to make the SELECT * FROM work:

select * from
(select COUNT(*) as nrows from Sales.Customer
UNION ALL
select COUNT(*) as nrows from HumanResources.Employee) as counts

nrows
-----------
19820
290

(2 row(s) affected)

To finish up, replace the wildcard with the SUM aggregate. Another alias (GrandTotal) is used (but not required) to provide a descriptive column header.

select SUM(nrows) as GrandTotal from
(select COUNT(*) as nrows from Sales.Customer
UNION ALL
select COUNT(*) as nrows from HumanResources.Employee) as counts

GrandTotal
-----------
20110

(1 row(s) affected)

Published Tuesday, September 23, 2008 11:39 AM by John Paul Cook

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

marc said:

Works fine with newest databases.. but select in subselect won't work with every database

January 12, 2012 11:08 AM
 

Orion said:

Great, very usefull, thanks

September 12, 2012 9:49 AM
 

Murlidhar said:

Very useful... and helped me improving my procedure performance... thanks

November 11, 2012 1:14 PM
 

Katlego said:

Perfect solution here

January 22, 2014 8:52 AM
 

programmerwoman said:

Tahnk You for saving my time!

March 12, 2014 5:53 AM

Leave a Comment

(required) 
(required) 
Submit

About John Paul Cook

John Paul Cook is a Technology Solutions Professional for Microsoft's data platform and works out of Microsoft's Houston office. Prior to joining Microsoft, he was a Microsoft SQL Server MVP. He is experienced in Microsoft SQL Server and Oracle database application design, development, and implementation. He has spoken at many conferences including Microsoft TechEd and the SQL PASS Summit. He has worked in oil and gas, financial, manufacturing, and healthcare industries. John is also a Registered Nurse who graduated from Vanderbilt University with a Master of Science in Nursing Informatics and is an active member of the Sigma Theta Tau nursing honor society. He volunteers as a nurse at safety net clinics. Contributing author to SQL Server MVP Deep Dives and SQL Server MVP Deep Dives Volume 2.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement