THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Alexander Kuznetsov

Defensive database programming: fun with UPDATE.

It is well known that UPDATE ... FROM command does not detect ambiguities. Also it well known that ANSI standard UPDATE may perform very poorly and may be difficult to maintain, because it does not adhere to the fundamental DRY (Don’t Repeat Yourself) principle. I will tweak UPDATE ... FROM in three different ways so that you can detect or handle ambiguities without having to repeat one and the same subquery over and over again.

 

Prerequisites

 

The following tables and sample data are used in this post:

CREATE Schema Data AUTHORIZATION dbo;

GO

CREATE TABLE Data.Prices(ID INT NOT NULL, Price FLOAT NOT NULL);

GO

CREATE TABLE Data.PricesStaging(ID INT NOT NULL, Price FLOAT NOT NULL);

GO

INSERT INTO Data.Prices(ID, Price)

SELECT 1, 10 UNION ALL SELECT 2, 10 UNION ALL SELECT 3, 10;

INSERT INTO Data.PricesStaging(ID, Price)

SELECT 1, 14 UNION ALL SELECT 1, 15 UNION ALL SELECT 2, 16;

GO

SELECT 1 AS Number INTO Data.Numbers

  UNION ALL SELECT 2;

 

Problems with ANSI standard updates

 

Consider the following ANSI standard update:

 

UPDATE Data.Prices

  SET Price = (SELECT s.Price FROM Data.PricesStaging s WHERE Data.Prices.ID=s.ID)

  WHERE EXISTS(SELECT * FROM Data.PricesStaging s WHERE Data.Prices.ID=s.ID);

 

In order to comply with ANSI standard, I had to repeat my subquery. Clearly this is poor programming practice. On top of that, this subquery is also executes twice, which drags performance. However, this update detected an ambiguity and errored out, which is usually very good:

Msg 512, Level 16, State 1, Line 1

Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.

The statement has been terminated.

 

 

Updating a subquery does not detect ambiguities

 

Because ANSI standard UPDATE is very inconvenient to work with, RDBMS vendors provide proprietary extensions which are more convenient. Transact-SQL provides several ways to write an UPDATE without having to repeat yourself. However, these extensions do not detect ambiguities the way ANSI standard UPDATEs do. For instance, the following update silently succeeds:

-- erase previous changes

UPDATE Data.Prices SET Price=10;

WITH p AS(SELECT p.ID, p.Price, s.Price AS NewPrice

  FROM Data.Prices p JOIN Data.PricesStaging s ON p.ID=s.ID)

UPDATE p SET Price=NewPrice;

 

It is very tempting to use such UPDATE commands, because they are more maintainable. However, in many cases you have to detect ambiguities. There are several ways to accomplish that, I will demonstrate some in the next three sections.

However, I need to mention that SQL Server MVP Hugo Kornelis suggests that UPDATE .. FROM syntax is hopeless and as such

should be deprecated altogether

Hugo recommends using MERGE statement instead, and that is definitely a good option if you are already on 2008. Yet many systems out there are still on 2005, and there are a few approaches to get you going before you can upgrade to 2008.

 

 

Mimicking ANSI standard UPDATE’s behavior.

 

The following UPDATE detects an ambiguity and fails:

 

DECLARE @ambiguityDetector INT;

-- erase previous changes

UPDATE Data.Prices SET Price=10;

WITH p AS(SELECT p.ID, p.Price, s.Price AS NewPrice,

           COUNT(*)OVER(PARTITION BY p.ID) AS NumVersions

  FROM Data.Prices p JOIN Data.PricesStaging s ON p.ID=s.ID)

UPDATE p SET Price=NewPrice,

  @ambiguityDetector = (SELECT Number FROM Data.Numbers WHERE Number<=NumVersions);

SELECT ID, Price FROM Data.Prices;

 

 

But this UPDATE succeeds if there are no ambiguities.

 

Accepting unambiguous changes, silently ignoring ambiguities.

 

The following UPDATE modifies the row with ID=2 because there is no ambiguity, and it does not modify the row with ID=1:

 

UPDATE Data.Prices SET Price=10;

WITH p AS(SELECT p.ID, p.Price, s.Price AS NewPrice,

           COUNT(*)OVER(PARTITION BY p.ID) AS NumVersions

  FROM Data.Prices p JOIN Data.PricesStaging s ON p.ID=s.ID)

UPDATE p SET Price=NewPrice

  WHERE NumVersions = 1;

SELECT ID, Price FROM Data.Prices;

 

ID          Price

----------- ----------------------

1           10

2           15

3           10

 

 

 

Detecting an ambiguity so that you can handle it yourself

 

The following code detects an ambiguity without raising an error - that allows you to provide your own error handling:

 

DECLARE @expectedRowcount INT, @actualRowcount INT;

UPDATE Data.Prices SET Price=10;

WITH p AS(SELECT p.ID, p.Price, s.Price AS NewPrice,

           COUNT(*)OVER() AS NumUpdates

  FROM Data.Prices p JOIN Data.PricesStaging s ON p.ID=s.ID)

UPDATE p SET Price=NewPrice,

  @expectedRowcount = NumUpdates;

SELECT @actualRowcount = @@ROWCOUNT;

IF (@actualRowcount <> @expectedRowcount) BEGIN

  -- do your error handling here

  PRINT 'Ambiguity detected';

END;

 

This post continues my series on defensive database programming.

The next post is:

Stress testing UPSERTs

 

Here are the previous posts  from the series:

 

Defensive database programming: eliminating IF statements. Defensive database programming: fun with changing column widths. Avoid mixing old and new styles of error handling. Defensive database programming: adding ESCAPE clauses. Defensive database programming: qualifying column names. Defensive database programming: rewriting queries with NOT IN().
Published Monday, December 08, 2008 10:56 AM by Alexander Kuznetsov

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Adam Machanic said:

December 8, 2008 1:38 PM
 

Alexander Kuznetsov said:

Adam,

I agree that a link to Hugo's post should be added, it's done. But I disagree with Hugo's opinion. I don't believe in writing ANSI standard SQL - tried to do that for a few years, and IMO it just does not work out in the real world. Even if you write ANSI standard SQL, it is in fact not fully portable:

http://www.devx.com/dbzone/Article/32852

Also sometimes I need to just ignore ambiguities, to implement an approach similar to IGNORE_DUP_KEY.

December 8, 2008 2:08 PM
 

Adam Machanic said:

Fair enough :-)

I myself also don't write ANSI Standard SQL; what I meant to point out via Hugo's post was the part about how to use MERGE to mimic row value constructors in an UPDATE.  That's certainly a useful trick to keep around for SQL Server 2008 projects...

December 8, 2008 2:28 PM
 

Alexander Kuznetsov said:

I agree MERGE is quite useful on 2008. Yet I do not always want to use it, to error out of the whole update. Suppose you have a lot of incoming data in a staging table and a slight chance of ambiguity. You can search for duplicates up front, but that's expensive. You can abandon a 500K update because of a single ambiguity, but that is also expensive. You can just accept 499,999 unambiguous values into the database right away and deal with the duplicates later - that may be faster.

December 8, 2008 2:50 PM
 

IgorMikhalyev said:

Here are my 2 cents...  According to my humble experience when you need some complex and tricky logic there is a problem with the database design.  In this particular case it looks like something wrong with the structure of Data.PricesStaging table.  I guess this table needs some additional field to define which prices actually must be used.

December 10, 2008 4:23 PM
 

Alexander Kuznetsov said:

Igor,

In this case you need tricks because built in tools are not up to the task. The ANSI standard update is neither performant nor maintainable. The proprietary update tolerates ambiguities.

I agree, you can detect and filter out ambiguities before feeding them to the update, but you should not have to - the database should be smart enough to detect them (although in some cases it is easier and or faster to detect ambiguities up front). Similarly, you do not have to eliminate duplicates before inserting - a PK or UNIQUE constraint should detect duplicates and error out, or an index with IGNORE_DUP_KEY should skip them although in some cases it is easier and or faster to detect duplicates up front). Makes sense?

December 11, 2008 3:44 PM
 

Alexander Kuznetsov said:

Suppose that you need to implement the following logic: IF(row exists) Update the row ELSE Insert a new

December 14, 2008 8:37 PM
 

Alexander Kuznetsov said:

Comparing SET vs. SELECT is a very popular topic, and much of what I have to say has been said before.

January 25, 2009 5:57 PM
 

Alexander Kuznetsov said:

I have been posting examples of defensive database programming for some time now. I am by no means done

March 8, 2009 9:49 PM
 

Alexander Kuznetsov said:

I have written up two examples when a SET ROWCOUNT command breaks a seemingly working stored procedure

March 21, 2009 11:05 PM
 

Alexander Kuznetsov said:

There are three kinds of triggers: those which blow up and those which fail silently ;). Seriously, there

May 11, 2009 9:19 PM
 

Alexander Kuznetsov said:

My query used to work, but it blows up after I have added an index? The following query is not safe:

July 11, 2009 11:14 PM
 

Alexander Kuznetsov said:

You cannot assume that the conditions in your WHERE clause will evaluate in the left-to-write order -

July 16, 2009 5:41 PM
 

Alexander Kuznetsov : Defensive database programming: eliminating IF statements. said:

October 9, 2009 4:45 PM

Leave a Comment

(required) 
(required) 
Submit

About Alexander Kuznetsov

Alex Kuznetsov has been working with object oriented languages, mostly C# and C++, as well as with databases for more than a decade. He has worked with Sybase, SQL Server, Oracle and DB2. He regularly blogs on sqlblog.com, mostly about database unit testing, defensive programming, and query optimization. Alex has written a book entitled "Defensive Database Programming with Transact-SQL" and several articles on simple-talk.com and devx.com. Currently he works as an agile developer.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement