THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Linchi Shea

Checking out SQL Server via empirical data points

Multiple independent results from a single SELECT statement

You cannot output multiple independent results from a single SELECT statement. But sometimes I wish that could be done. An extremely simple case is when you need to build the initial dimensions from a very large table.

Typically, these dimension columns have very low selectivity and are often not indexed. When you want to get all the distinct values from each of these columns, you may want to do the following:

SELECT DISTINCT c1 FROM VeryLargeTable; 

SELECT DISTINCT c2 FROM VeryLargeTable; 

SELECT DISTINCT c30 FROM VeryLargeTable; 

The problem is that this would require scanning the table 30 times (assuming you have 30 such columns to query). And if each scan takes 30 minutes, these queries would 30*30 = 900 minutes or 15 hours to finish.

Wouldn’t it be nice to be able to say something like (I’m making up the syntax) the following?

(SELECT DISTINCT c1) ||

(SELECT DISTINCT c2) ||

(SELECT DISTINCT c30)

  FROM VeryLargeTable; 

And have it produce 30 independent results, one for each column, with a single scan through the large table. This could be done in 30 minutes instead of 15 hours.

Yes, there are tricks you can play with to reduce the amount of time in getting these 30 resultsets, but none of which is as elegant as producing the results with a single scan.

By the way, I didn't just make up the above example. Scanning the very large table multiple times, each for a separate column, is how SQL Server Analysis Server does the initial dimension processing (although there are options to tweak it).

Published Monday, December 12, 2011 4:39 PM by Linchi Shea
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

AlexK said:

Linchi, how about this:

WITH    VeryLargeTable

         AS ( SELECT   1 AS c1 ,

                       2 AS c2

              UNION ALL

              SELECT   3 ,

                       4

              UNION ALL

              SELECT   1 ,

                       2

              UNION ALL

              SELECT   3 ,

                       4

            ),

       Unpivoter

         AS ( SELECT   1 AS ColNum

              UNION ALL

              SELECT   2

            )

   SELECT DISTINCT CASE WHEN ColNum = 1 THEN c1

                WHEN ColNum = 2 THEN c2

           END AS ColValue ,

           ColNum

   FROM    VeryLargeTable

           CROSS JOIN Unpivoter

ColValue    ColNum

----------- -----------

1           1

2           2

3           1

4           2

December 12, 2011 4:36 PM
 

Linchi Shea said:

AlexK;

Sorry, I'm not sure I follow. Can you clarify?

December 12, 2011 4:55 PM
 

AlexM said:

Linchi, AlexK's example does what you wish to accomplish in one go through manual unpivoting. The only thing is I think it'll work only if all columns you  are unpivoting can be cast to one common data type.

December 16, 2011 11:31 AM
 

Gokhan Varol said:

What about

Select distinct columname, columnvalue

From tablea a

Cross apply( select columnname, columnvalue

From(values('columna', columna), ('columnb', columnb) ,...

)d(columnname, columnvalue))b

If all colums are similar datatype above is fine is not cast to sqlvariant.

you will scan the table once

December 19, 2011 1:20 PM

Leave a Comment

(required) 
(required) 
Submit

About Linchi Shea

Checking out SQL Server via empirical data points

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement