THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Alexander Kuznetsov

Defensive database programming: qualifying column names.

If you do not qualify columns in your query, which means that you do not specify from which tables your columns come, you may have problems if the database schema changes. For example, consider the following sample tables and a select query:
CREATE TABLE Data.Shipments(Barcode VARCHAR(30), SomeOtherData VARCHAR(100))
GO
INSERT INTO Data.Shipments(Barcode, SomeOtherData)
SELECT '123456', '123456 data' UNION ALL
SELECT '123654', '123654 data'
GO
CREATE TABLE Data.ShipmentItems(ShipmentBarcode VARCHAR(30), Description VARCHAR(100))
GO
INSERT INTO Data.ShipmentItems(ShipmentBarcode, Description)
SELECT '123456', 'Some cool widget' UNION ALL
SELECT '123456', 'Some cool stuff for some gadget'
GO
SELECT Barcode, Description
  FROM data.Shipments s JOIN data.ShipmentItems i
  ON s.Barcode = i.ShipmentBarcode

At the time of writing this select query, there is no need to specify that Barcode column comes from data.Shipments table, because there is no ambiguity. However, if you add Barcode column to data.ShipmentItems  table, as follows:
ALTER TABLE Data.ShipmentItems ADD Barcode VARCHAR(30) NULL
Your query breaks:
Msg 209, Level 16, State 1, Line 1
Ambiguous column name 'Barcode'.

Actually, this is not too bad, because the database engine has warned you that you have a problem. Because you already know what the problem is, your troubleshooting is quick and easy. It could be much worse – your query could silently start working differently without raising any errors. Usually such subtle errors are more difficult to troubleshoot. Let me restore the original schema first:
ALTER TABLE Data.ShipmentItems DROP COLUMN Barcode
Consider the following query:
SELECT Barcode,
 (SELECT COUNT(*) FROM Data.ShipmentItems WHERE ShipmentBarcode = Barcode) AS NumItems
FROM Data.Shipments

Again, there is no immediate need to qualify Barcode column in the subquery. There is no ambiguity, and the query correctly works as is:
123456    2
123654    0
Let me try to introduce ambiguity:
ALTER TABLE Data.ShipmentItems ADD Barcode VARCHAR(30) NULL
The query still runs without errors, but it returns different results:
123456    0
123654    0
This is very bad – your query is broken and you don’t know about it. Why are the results different? Is this some bug in the database engine? In fact, this is expected behavior. When the query is parsed, the engine will first try to resolve the unqualified column Barcode in the subquery in the scope of that subquery before looking it up in scope of the whole query. So when the query is run the first time, the Barcode column name cannot be resolved in the scope of the subquery, so the query is equivalent to the following one:
SELECT s.Barcode,
 (SELECT COUNT(*) FROM Data.ShipmentItems i WHERE i.ShipmentBarcode = s.Barcode) AS NumItems
FROM Data.Shipments s

However, the next time query is run, the unqualified column Barcode  in the subquery can be resolved in the scope of that subquery, and as such the query is equivalent to a very different one:
SELECT s.Barcode,
 (SELECT COUNT(*) FROM Data.ShipmentItems i WHERE i.ShipmentBarcode = i.Barcode) AS NumItems
FROM Data.Shipments s

Note that the subquery is no longer correlated, the whole query silently gets a different meaning, and you do not know about it, you do not get an error. Typically troubleshooting such subtle bugs takes more time.
Preventing such problems is easy – just qualify your column names up front.

Next post:

Defensive database programming: adding ESCAPE clauses.

Published Saturday, October 25, 2008 2:51 PM by Alexander Kuznetsov

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Aaron Bertrand said:

When developing stored procedures, there seems to be a lot of emphasis on "get it done fast." Which means

October 31, 2008 4:53 PM
 

Alexander Kuznetsov said:

The following pattern is quite common in database programming: IF EXISTS(some query) BEGIN DO SOMETHING;

November 27, 2008 10:29 PM
 

Alexander Kuznetsov said:

It is well known that UPDATE ... FROM command does not detect ambiguities. Also it well known that ANSI

December 8, 2008 11:04 AM
 

Alexander Kuznetsov said:

Suppose that you need to implement the following logic: IF(row exists) Update the row ELSE Insert a new

December 14, 2008 8:37 PM
 

Denis Gobo said:

Wow, it has been already a year since I wrote A year in review, The 21 + 1 best blog posts on SQLBlog

December 31, 2008 10:38 AM
 

Alexander Kuznetsov said:

Comparing SET vs. SELECT is a very popular topic, and much of what I have to say has been said before.

January 25, 2009 5:57 PM
 

Alexander Kuznetsov said:

I have been posting examples of defensive database programming for some time now. I am by no means done

March 8, 2009 9:49 PM
 

Alexander Kuznetsov said:

I have written up two examples when a SET ROWCOUNT command breaks a seemingly working stored procedure

March 21, 2009 11:05 PM
 

Alexander Kuznetsov said:

There are three kinds of triggers: those which blow up and those which fail silently ;). Seriously, there

May 11, 2009 9:19 PM
 

Alexander Kuznetsov said:

My query used to work, but it blows up after I have added an index? The following query is not safe:

July 11, 2009 11:14 PM
 

Alexander Kuznetsov said:

You cannot assume that the conditions in your WHERE clause will evaluate in the left-to-write order -

July 16, 2009 5:41 PM
 

Alexander Kuznetsov said:

The behavior of NOT IN clause may be confusing and as such it needs some explanations. Consider the following

October 9, 2009 4:34 PM

Leave a Comment

(required) 
(required) 
Submit

About Alexander Kuznetsov

Alex Kuznetsov has been working with object oriented languages, mostly C# and C++, as well as with databases for more than a decade. He has worked with Sybase, SQL Server, Oracle and DB2. He regularly blogs on sqlblog.com, mostly about database unit testing, defensive programming, and query optimization. Alex has written a book entitled "Defensive Database Programming with Transact-SQL" and several articles on simple-talk.com and devx.com. Currently he works as an agile developer.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement