THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Tibor Karaszi

Why can't we have column alias in ORDER BY?

You probably think something like "What! Of course we can!". And so did I when I read the title for a post about this in a forum. Let us first establish that you can have a column alias in ORDER BY:

SELECT
p.ProductID AS pid
,p.ListPrice AS pr
FROM Production.Product AS p
ORDER BY pr

The problem was that a table name was also in the ORDER BY clause. And this is when it becomes weird. The problem isn't the column alias, it is the table alias. Basically the desire is to write something like:

SELECT
p.ProductID AS pid
,p.ListPrice AS pr
FROM Production.Product AS p
ORDER BY p.pr

Above was apparently allowed in SQL Server 2005 2000 (I never did it myself, however). (Update: Erland Sommarskog caught my error, this was allowed in 2000, not 2005.) In order to understand why above is backwards, we need to consider the logical query flow. The first part of a query which is executed is the FROM clause (logically, the optimizer can do as it pleases as long as we get the same result). In above example, I renamed the table (for the rest of the query) from "Production.Product" to "p". That means that we cannot refer to "Product" or "Production.Product" anymore in the query. However, SQL has imperfections and the topic of this blog is an attempt from MS to rectify such earlier imperfections (some of them), i.e., improving the product.

Continuing the logical query flow (WHERE, GROUP BY, HAVING), we towards the end find the SELECT clause. Here is where you can rename the columns (a.k.a. column alias) for the rest of the query. After the SELECT list, it is meaningless to talk about different tables (where the data originally comes from), all we have is the result of the query (so far). This is why, we can in the ORDER BY refer to a column alias (generally); ORDER BY comes logically after the SELECT list.

However, the original problem was that one also wanted to include a table alias, which shouldn't be allowed. So, why is it allowed, in the first place - consider it makes no sense from a logical standpoint? For that we need to look as somethining weird, which is allowed (both in T-SQL and ANSI SQL) and that can be somewhat useful:

SELECT
ProductID AS pid
,ListPrice AS pr
FROM Production.Product
ORDER BY ModifiedDate

We don't need to return something to the client app in order to sort over it. Sure, that can be useful, but from a logical perspective, this is spooky. We sort over something which doesn't exist! So, the imperfections of the SQL language is the reason things starts to break down. Basically, what we do is to sort over something as if we would have included it in the SELECT list. We are saying, in the ORDER BY clause something like

"Hey, dbms! Can you please pretend we included Product.ModifiedDate in the SELECT list so we can sort over it. Bur don't return it to the client app, I don't want to see that column."

Where it really breaks down is when we combine table and column alias. Let me repeat the problematic query:

SELECT
p.ProductID AS pid
,p.ListPrice AS pr
FROM Production.Product AS p
ORDER BY p.pr

The problem is that it is meaningless to specify a table alias here, the column does exist in the SELECT list. The only reason we can refer to something table-something at all, is to let the DBMS pretend we included something additional from the base tables, so we could sort over it. Such as (allowed): 

SELECT
ProductID AS pid
,ListPrice AS pr
FROM Production.Product AS p
ORDER BY p.ModifiedDate

One could of course argue that below would be more logical (not allowed):

SELECT
ProductID AS pid
,ListPrice AS pr
FROM Production.Product AS p
ORDER BY Product.ModifiedDate

Arguing that one form of two above is more logical than the other is pretty irrelevant, houever. Both are weird constructs, and the language is what it is. Anyone feel like checking the ANSI SQL standard which of above two (first, second or both) are allowed/preferred? No me, I'm off to grab some breakfast!

 

Published Sunday, October 03, 2010 12:43 PM by TiborKaraszi
Filed under:

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Paul White said:

Hi Tibor,

Do you know which build of 2005 allowed ORDER BY p.pr?  It doesn't work for me in 2005 or 2008.

Using ORDER BY p.ListPrice, ORDER BY ListPrice, and ORDER BY pr all do though.

I'm sure I must be missing something here.

Paul

October 3, 2010 8:19 AM
 

TiborKaraszi said:

Hi Paul,

Apparently the behavior change was between 2000 and 2005, not 2005 and 2008 (Erland corrected me in http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/10e920a0-6d85-45ac-af77-b3a1ce58a7d1/). So, dig up a VM with 2000 or older and give it a spin.

I'm too lazy and just take what people say for granted, leading to these post-corrections... :-)

October 3, 2010 8:35 AM
 

Paul White said:

Ah right - thanks.  I see from Erland's comment that it was a bug in 2000.  Makes sense now, and reminds me a bit of:

CREATE TABLE #fish (a int);

DROP TABLE SoLongAndThanksForAllThe.#fish;

...which works in 2005 and 2008 :)

That's probably a bug too.

October 3, 2010 8:45 AM
 

TiborKaraszi said:

I didn't know you could do that, Paul. Maybe I came across it in some forum. Sometimes I wonder where all these strange uses cases comes from... One if bugs in ones T-SQL code, of course - that I can totally relate to.

October 4, 2010 3:45 AM
 

Alejandro Mesa said:

Tibor,

> but from a logical perspective, this is spooky

Good timing for this post, since "Halloween" celebration is almost here.

I had the same reaction when I first read the title of this post, but now it makes sense to me.

Thanks for sharing it.

--

AMB

October 4, 2010 9:26 AM
 

Thomas Eyde said:

I so disagree with you regarding the logical aspects. I don't pretend to know much about databases, being a C# developer, but from a pure logical standpoint, I should be able to use aliases everywhere.

The underlying problem seems to be leaky abstraction in the sql compiler, if we can call it that. Somehow we are forced to consider technical aspects when we design our query, when the compiler should be very capable to keep track of our aliases.

Also, I could join more tables where it would be natural to give the same aliases to columns in different tables. Now it do make sense to use both table and column aliases.

In my opinion, the change from 2000 to 2005 solved the wrong problem.

October 11, 2010 2:57 AM
 

TiborKaraszi said:

Thomas,

This has nothing (little) to do with compiler or parsing. It has all to do with how the SQL language is defined. The SQL language is defined as evaluation order is FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY and then TOP (TOP is MS extension). If you want to change that, you change the meaning of the language, which is no small undertaking. We recently had a huge debate over this in the MSDN TSQL forum, btw. Hugo C. (I beelieve it was) posted a simple example of how a query would not mean the same thing if we were allowed to use the column alias "prior" to the SELECT list:

SELECT ProductID, LEFT(Name, 20) AS Name, ProductNumber

FROM  Production.Product

WHERE Name IN ('Front Derailleur Linkage', 'HL Hub', 'Internal Lock Washer 7', 'Road-650 Black, 60');

If you want to "dig in", check out this thread: http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/4ad14cad-1e61-43a0-96f8-70a506106a00

October 11, 2010 5:35 AM
 

paul said:

I use ordinals, its easier

November 22, 2010 5:21 PM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement