If you do not qualify columns in your query, which means that you do not specify from which tables your columns come, you may have problems if the database schema changes. For example, consider the following sample tables and a select query:
CREATE TABLE Data.Shipments(Barcode VARCHAR(30), SomeOtherData VARCHAR(100))
GO
INSERT INTO Data.Shipments(Barcode, SomeOtherData)
SELECT '123456', '123456 data' UNION ALL
SELECT '123654', '123654 data'
GO
CREATE TABLE Data.ShipmentItems(ShipmentBarcode VARCHAR(30), Description VARCHAR(100))
GO
INSERT INTO Data.ShipmentItems(ShipmentBarcode, Description)
SELECT '123456', 'Some cool widget' UNION ALL
SELECT '123456', 'Some cool stuff for some gadget'
GO
SELECT Barcode, Description
FROM data.Shipments s JOIN data.ShipmentItems i
ON s.Barcode = i.ShipmentBarcode
At the time of writing this select query, there is no need to specify that Barcode column comes from data.Shipments table, because there is no ambiguity. However, if you add Barcode column to data.ShipmentItems table, as follows:
ALTER TABLE Data.ShipmentItems ADD Barcode VARCHAR(30) NULL
Your query breaks:
Msg 209, Level 16, State 1, Line 1
Ambiguous column name 'Barcode'.
Actually, this is not too bad, because the database engine has warned you that you have a problem. Because you already know what the problem is, your troubleshooting is quick and easy. It could be much worse – your query could silently start working differently without raising any errors. Usually such subtle errors are more difficult to troubleshoot. Let me restore the original schema first:
ALTER TABLE Data.ShipmentItems DROP COLUMN Barcode
Consider the following query:
SELECT Barcode,
(SELECT COUNT(*) FROM Data.ShipmentItems WHERE ShipmentBarcode = Barcode) AS NumItems
FROM Data.Shipments
Again, there is no immediate need to qualify Barcode column in the subquery. There is no ambiguity, and the query correctly works as is:
123456 2
123654 0
Let me try to introduce ambiguity:
ALTER TABLE Data.ShipmentItems ADD Barcode VARCHAR(30) NULL
The query still runs without errors, but it returns different results:
123456 0
123654 0
This is very bad – your query is broken and you don’t know about it. Why are the results different? Is this some bug in the database engine? In fact, this is expected behavior. When the query is parsed, the engine will first try to resolve the unqualified column Barcode in the subquery in the scope of that subquery before looking it up in scope of the whole query. So when the query is run the first time, the Barcode column name cannot be resolved in the scope of the subquery, so the query is equivalent to the following one:
SELECT s.Barcode,
(SELECT COUNT(*) FROM Data.ShipmentItems i WHERE i.ShipmentBarcode = s.Barcode) AS NumItems
FROM Data.Shipments s
However, the next time query is run, the unqualified column Barcode in the subquery can be resolved in the scope of that subquery, and as such the query is equivalent to a very different one:
SELECT s.Barcode,
(SELECT COUNT(*) FROM Data.ShipmentItems i WHERE i.ShipmentBarcode = i.Barcode) AS NumItems
FROM Data.Shipments s
Note that the subquery is no longer correlated, the whole query silently gets a different meaning, and you do not know about it, you do not get an error. Typically troubleshooting such subtle bugs takes more time.
Preventing such problems is easy – just qualify your column names up front.
Next post:
Defensive
database programming: adding ESCAPE clauses.