It is well known that UPDATE ... FROM command does not
detect ambiguities. Also it well known that ANSI standard UPDATE may perform very
poorly and may be difficult to maintain, because it does not adhere to the
fundamental DRY (Don’t Repeat Yourself) principle. I will tweak UPDATE ... FROM
in three different ways so that you can detect or handle ambiguities without
having to repeat one and the same subquery over and over again.
Prerequisites
The following tables and sample data are used in this post:
CREATE Schema
Data AUTHORIZATION dbo;
GO
CREATE TABLE
Data.Prices(ID INT NOT NULL, Price FLOAT NOT NULL);
GO
CREATE TABLE
Data.PricesStaging(ID
INT NOT NULL, Price FLOAT NOT NULL);
GO
INSERT INTO
Data.Prices(ID, Price)
SELECT 1,
10 UNION ALL SELECT 2, 10 UNION ALL SELECT 3, 10;
INSERT INTO
Data.PricesStaging(ID, Price)
SELECT 1,
14 UNION ALL SELECT 1, 15 UNION ALL SELECT 2, 16;
GO
SELECT 1 AS
Number INTO Data.Numbers
UNION ALL SELECT 2;
Problems with ANSI standard updates
Consider the following ANSI standard update:
UPDATE Data.Prices
SET Price = (SELECT s.Price FROM Data.PricesStaging
s WHERE Data.Prices.ID=s.ID)
WHERE EXISTS(SELECT * FROM Data.PricesStaging s WHERE
Data.Prices.ID=s.ID);
In order to comply with ANSI standard, I had to repeat my
subquery. Clearly this is poor programming practice. On top of that, this
subquery is also executes twice, which drags performance. However, this update
detected an ambiguity and errored out, which is usually very good:
Msg 512, Level 16, State 1,
Line 1
Subquery returned more than
1 value. This is not permitted when the subquery follows =, !=, <, <= ,
>, >= or when the subquery is used as an expression.
The statement has been
terminated.
Updating a subquery does not detect
ambiguities
Because ANSI standard UPDATE is very inconvenient to work
with, RDBMS vendors provide proprietary extensions which are more convenient.
Transact-SQL provides several ways to write an UPDATE without having to repeat
yourself. However, these extensions do not detect ambiguities the way ANSI
standard UPDATEs do. For instance, the following update silently succeeds:
-- erase
previous changes
UPDATE Data.Prices SET Price=10;
WITH p AS(SELECT p.ID, p.Price, s.Price AS NewPrice
FROM Data.Prices p JOIN Data.PricesStaging s ON p.ID=s.ID)
UPDATE p SET
Price=NewPrice;
It is very tempting to use such UPDATE commands, because
they are more maintainable. However, in many cases you have to detect ambiguities.
There are several ways to accomplish that, I will demonstrate some in the next
three sections.
However, I need to mention that SQL Server MVP Hugo Kornelis suggests that UPDATE .. FROM syntax is hopeless and as such
should be deprecated altogether
Hugo recommends using MERGE statement instead, and that is definitely a good option if you are already on 2008. Yet many systems out there are still on 2005, and there are a few approaches to get you going before you can upgrade to 2008.
Mimicking ANSI standard UPDATE’s
behavior.
The following UPDATE detects an ambiguity and fails:
DECLARE @ambiguityDetector INT;
-- erase
previous changes
UPDATE Data.Prices SET Price=10;
WITH p AS(SELECT p.ID, p.Price, s.Price AS NewPrice,
COUNT(*)OVER(PARTITION BY p.ID) AS NumVersions
FROM Data.Prices p JOIN Data.PricesStaging s ON p.ID=s.ID)
UPDATE p SET
Price=NewPrice,
@ambiguityDetector =
(SELECT Number FROM Data.Numbers WHERE Number<=NumVersions);
SELECT ID,
Price FROM Data.Prices;
But this UPDATE succeeds if there are no ambiguities.
Accepting unambiguous changes, silently
ignoring ambiguities.
The following UPDATE modifies the row with ID=2 because
there is no ambiguity, and it does not modify the row with ID=1:
UPDATE Data.Prices SET Price=10;
WITH p AS(SELECT p.ID, p.Price, s.Price AS NewPrice,
COUNT(*)OVER(PARTITION BY p.ID) AS NumVersions
FROM Data.Prices p JOIN Data.PricesStaging s ON p.ID=s.ID)
UPDATE p SET
Price=NewPrice
WHERE
NumVersions = 1;
SELECT ID,
Price FROM Data.Prices;
ID Price
-----------
----------------------
1 10
2 15
3 10
Detecting an ambiguity so that you
can handle it yourself
The following code detects an ambiguity without raising an
error - that allows you to provide your own error handling:
DECLARE @expectedRowcount INT, @actualRowcount INT;
UPDATE Data.Prices SET Price=10;
WITH p AS(SELECT p.ID, p.Price, s.Price AS NewPrice,
COUNT(*)OVER() AS NumUpdates
FROM Data.Prices p JOIN Data.PricesStaging s ON p.ID=s.ID)
UPDATE p SET
Price=NewPrice,
@expectedRowcount =
NumUpdates;
SELECT @actualRowcount = @@ROWCOUNT;
IF (@actualRowcount
<> @expectedRowcount) BEGIN
-- do your error
handling here
PRINT 'Ambiguity detected';
END;
This post continues my series on defensive database programming.
The next post is:
Stress
testing UPSERTs
Here are the previous posts from the series:
Defensive database programming: eliminating IF statements.
Defensive database programming: fun with changing column widths.
Avoid mixing old and new styles of error handling.
Defensive database programming: adding ESCAPE clauses.
Defensive database programming: qualifying column names.
Defensive database programming: rewriting queries with NOT IN().