
Alexander Kuznetsov

Defensive database programming: eliminating IF statements.

The following pattern is quite common in database programming:

IF EXISTS(some query) BEGIN
    DO SOMETHING;
END

When such code runs in high concurrency situations, it may not work as expected. I will provide a repro in which such logic fails about 40% of the time. The following scripts create a test table and a stored procedure that attempts to implement optimistic locking using a rowversion column:

 

CREATE TABLE dbo.TwoINTs(ID INT NOT NULL, i1 INT NOT NULL, i2 INT NOT NULL, version ROWVERSION);
GO
SET NOCOUNT ON;
DECLARE @i INT;
SET @i = 0;
WHILE @i < 1000000 BEGIN
    INSERT INTO dbo.TwoINTs(ID, i1, i2) VALUES(@i, 0, 0);
    SET @i = @i + 1;
END
GO
CREATE UNIQUE INDEX UNQ_TwoInts_ID ON dbo.TwoInts(ID);
GO

 

CREATE PROCEDURE dbo.UpdateTwoINTs(@ID INT, @i1 INT, @i2 INT, @version ROWVERSION)
AS
BEGIN
    SET NOCOUNT ON;
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
    DECLARE @ret INT;
    BEGIN TRANSACTION;
    -- Race window: another session can modify the row (changing its version)
    -- between this check and the UPDATE below; the UPDATE then matches no rows,
    -- yet the procedure still reports success (@ret = 0).
    IF EXISTS(SELECT 1 FROM dbo.TwoINTs WHERE ID = @ID AND version = @version) BEGIN
        UPDATE dbo.TwoINTs SET i1 = @i1, i2 = @i2 WHERE ID = @ID AND version = @version;
        SET @ret = 0;
    END ELSE BEGIN
        SET @ret = 1;
    END;
    COMMIT;
    RETURN @ret;
END
GO

A naïve test that does not involve high concurrency appears to succeed. You can run such a test yourself. In one tab, run the following code, which mimics a user who reads the row and, after thinking for 10 seconds, saves a change to column i2:

 

DECLARE @i1 INT, @i2 INT, @version ROWVERSION, @count INT, @ret INT;
SELECT @i1 = i1, @i2 = i2 + 1, @version = version FROM dbo.TwoInts WHERE ID = 5;
WAITFOR DELAY '00:00:10';
EXEC @ret = dbo.UpdateTwoINTs 5, @i1, @i2, @version;
SELECT @ret AS ret, i1, i2 FROM dbo.TwoInts WHERE ID = 5;

 

Immediately start the following script in another tab; it mimics a user who modifies the other column, i1, also after a 10-second delay:

 

DECLARE @i1 INT, @i2 INT, @version ROWVERSION, @count INT, @ret INT;
SELECT @i1 = i1 + 1, @i2 = i2, @version = version FROM dbo.TwoInts WHERE ID = 5;
WAITFOR DELAY '00:00:10';
EXEC @ret = dbo.UpdateTwoINTs 5, @i1, @i2, @version;
SELECT @ret AS ret, i1, i2 FROM dbo.TwoInts WHERE ID = 5;

 

Note that only one of the two updates succeeded, which is expected - the procedure detected a rowversion mismatch and did not overwrite the first update. However, a more realistic test on my laptop loses about 40% of the updates. First of all, some cleanup:

 

UPDATE dbo.TwoINTs SET i1=0, i2=0 WHERE ID=5;

 

 Here is the script to run in one tab:

 

DECLARE @i1 INT, @i2 INT, @version ROWVERSION, @count INT, @ret INT;
SET @count = 0;
WHILE @count < 100000 BEGIN
    SELECT @i1 = i1 + 1, @i2 = i2, @version = version FROM dbo.TwoInts WHERE ID = 5;
    EXEC @ret = dbo.UpdateTwoINTs 5, @i1, @i2, @version;
    SET @count = @count + 1 - @ret;
END;

 

Here is the script to run in another tab:

 

DECLARE @i1 INT, @i2 INT, @version ROWVERSION, @count INT, @ret INT;
SET @count = 0;
WHILE @count < 100000 BEGIN
    SELECT @i1 = i1, @i2 = i2 + 1, @version = version FROM dbo.TwoInts WHERE ID = 5;
    EXEC @ret = dbo.UpdateTwoINTs 5, @i1, @i2, @version;
    SET @count = @count + 1 - @ret;
END;

 

 If no updates were lost, I would expect the values in both i1 and i2 to be 100000. Unfortunately, this is not the case:

 

SELECT i1, i2 FROM dbo.TwoINTs WHERE ID = 5;

i1          i2
----------- -----------
57871       54473

 

As you have seen, more than 40% of the updates were lost. There are several ways to fix the problem. Of course, you can simply raise the isolation level to REPEATABLE READ or even SERIALIZABLE. Unfortunately, under higher isolation levels code like this is very prone to deadlocks; you can alter the stored procedure, rerun the loops, and see for yourself (a sketch of that variant appears after the improved procedure below). There is a much simpler and more robust way to implement optimistic locking. Let me clean up the data first and improve the procedure:

 

UPDATE dbo.TwoINTs SET i1 = 0, i2 = 0 WHERE ID = 5;
GO
ALTER PROCEDURE dbo.UpdateTwoINTs(@ID INT, @i1 INT, @i2 INT, @version ROWVERSION)
AS
BEGIN
    SET NOCOUNT ON;
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
    DECLARE @ret INT;
    -- The existence check and the modification are now one atomic statement:
    -- if the version no longer matches, zero rows are updated and @ret becomes 1.
    UPDATE dbo.TwoINTs SET i1 = @i1, i2 = @i2 WHERE ID = @ID AND version = @version;
    SET @ret = 1 - @@ROWCOUNT;
    RETURN @ret;
END

 

When you rerun those two loop scripts, counting to 100000 again, you can see that this time not a single update was lost:

 

SELECT i1, i2 FROM dbo.TwoINTs WHERE ID = 5;

i1          i2
----------- -----------
100000      100000

Note that you did not get any deadlocks at all.
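
For comparison, the higher-isolation alternative dismissed above would look roughly like the following. This is only a sketch - the original procedure with the isolation level raised - and, as noted earlier, this shape is prone to deadlocks when the two loop scripts run concurrently:

ALTER PROCEDURE dbo.UpdateTwoINTs(@ID INT, @i1 INT, @i2 INT, @version ROWVERSION)
AS
BEGIN
    SET NOCOUNT ON;
    -- Under SERIALIZABLE the shared locks taken by the IF EXISTS check are held
    -- until COMMIT, which closes the race window but makes concurrent callers
    -- deadlock when they both try to convert those locks for the UPDATE.
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    DECLARE @ret INT;
    BEGIN TRANSACTION;
    IF EXISTS(SELECT 1 FROM dbo.TwoINTs WHERE ID = @ID AND version = @version) BEGIN
        UPDATE dbo.TwoINTs SET i1 = @i1, i2 = @i2 WHERE ID = @ID AND version = @version;
        SET @ret = 0;
    END ELSE BEGIN
        SET @ret = 1;
    END;
    COMMIT;
    RETURN @ret;
END
GO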

As you have seen, the following pattern is very prone to errors in high concurrency OLTP environments:

 

IF EXISTS(SELECT 1 FROM dbo.SomeTable WHERE <Some Conditions>) BEGIN
    DO Something;
END ELSE BEGIN
    DO Something else;
END;
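
In many cases the fix is the same as the one shown above: perform the guarded modification unconditionally and branch on @@ROWCOUNT afterwards. A rough sketch of that rewrite (the table, column, and conditions are placeholders):

-- Sketch only: dbo.SomeTable, SomeColumn and <Some Conditions> are placeholders.
UPDATE dbo.SomeTable
SET    SomeColumn = @NewValue
WHERE  <Some Conditions>;

IF @@ROWCOUNT > 0 BEGIN
    -- The check and the modification happened in a single atomic statement.
    DO Something;
END ELSE BEGIN
    -- No row matched; handle the "does not exist" case here.
    DO Something else;
END;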

 

 

If you use this pattern in your procedures and expect to run them in high concurrency OLTP environments, you need to stress test your procedures properly. While it is OK to just run scripts from SSMS tabs to demonstrate the problem, as I did in this post, in real life there are more convenient ways to stress test. For instance, I described a C# test harness here.

 

This post continues my series on defensive database programming.

The next post is:

Defensive database programming: fun with UPDATE.

 

Here are the previous posts from the series:

Defensive database programming: fun with changing column widths.
Avoid mixing old and new styles of error handling.
Defensive database programming: adding ESCAPE clauses.
Defensive database programming: qualifying column names.
Defensive database programming: rewriting queries with NOT IN().

 

Published Thursday, November 27, 2008 6:27 PM by Alexander Kuznetsov


Comments

 

Bernd Eckenfels said:

I use this pattern often for optimistic locking, and I always wonder what to do if the rowcount is different from 0 or 1. Sure, it should not happen, and sure, I can log an alert if it happens. But what does my program logic do, fail or proceed? The first might be a problem for availability, the second for consistency... but I think I am worrying only hypothetically :)

November 28, 2008 1:43 AM
 

Alexander Kuznetsov said:

Bernd,

In my example there is a UNIQUE INDEX UNQ_TwoInts_ID ON dbo.TwoInts(ID), so my rowcount can only be 0 or 1. Can you enforce uniqueness in your situation?

November 28, 2008 9:21 AM
 

Bernd Eckenfels said:

Yes, it is enforced; that's why I wonder what to do if rowcount > 1 :)

November 29, 2008 8:09 PM
 

Alexander Kuznetsov said:

Bernd,

This sounds interesting. Can you post a repro script?

November 29, 2008 9:36 PM
 

Richard Howells said:

I’m not sure I understand this article properly.

At the point when it claims to stop losing updates, that's not really true. It just keeps trying until an update succeeds. When I added another counter I discovered that counting to 10,000 required about 17,000 attempts. Updates are still being lost, just not reported.

It seems to me that the whole idea is wrong.  IMO this is what locking hints are for.  I think the SELECT outside the stored procedure should add WITH (UPDLOCK) and that select should be in a transaction that includes the SP call.  This allows the outer transaction to be confident that the selected values CAN’T change until the outer transaction commits.  Now I can count to 10,000 in 10,000 attempts.  This solution is also significantly faster.  The article solution takes about 10 seconds on my system.  Using the locking hint reduces it to 8 – about 20% faster.

November 30, 2008 8:55 AM
 

The Noble Savage said:

Richard - care to share your code with the crew? Your idea sounds interesting.

November 30, 2008 10:45 AM
 

Richard Howells said:

@The Noble Savage - Sure. Here's my outer loop; it makes no difference which version of the SP is used.

DECLARE @i1 INT, @i2 INT, @version ROWVERSION, @count INT, @ret INT;
SET @count = 0;
WHILE @count < 10000 BEGIN
    BEGIN TRANSACTION;
    SELECT @i1 = i1 + 1, @i2 = i2, @version = version FROM dbo.TwoInts WITH (UPDLOCK) WHERE ID = 5;
    EXEC @ret = dbo.UpdateTwoINTs 5, @i1, @i2, @version;
    SET @count = @count + 1 - @ret;
    COMMIT TRANSACTION;
END;

November 30, 2008 12:05 PM
 

Alexander Kuznetsov said:

Richard,

To understand my article, you need to know the difference between optimistic and pessimistic locking. Suppose you are making a hotel reservation: you read a row, open up a screen form, and think about whether you want to reserve this room. If the application read the row WITH (UPDLOCK), then other users cannot read this row until you are done with it. This is called pessimistic locking, and it is what you are suggesting. It is rarely used in high concurrency OLTP because it usually results in very poor throughput - in real life there could be several minutes between the SELECT and the subsequent EXEC. I am considering a different approach in this article.

On the other hand, if the application just reads a row without holding locks, then other users can read and modify this row while you are making your decision. At the time you save your changes, you may then overwrite other people's changes. This is called a lost update. This is why, at the time your changes are being saved, the system needs to detect whether the row was modified by somebody else and report an error if it was. This approach is called optimistic locking, and that's what I am saying at the very beginning: "implement optimistic locking using rowversion", with a strong emphasis on "optimistic". Each individual update increments either i1 or i2 by one, so when the scripts complete, both i1 and i2 should be 100000 unless some updates were lost.

HIH

November 30, 2008 1:31 PM
 

Chris said:

Or you could just check @@ROWCOUNT. If someone updated the row to a new version between the IF EXISTS and the UPDATE, the version would have changed and the update would fail. Even better, you probably don't need the EXISTS check at all...

November 30, 2008 3:43 PM
 

Alexander Kuznetsov said:

Chris,

The whole point of this article is precisely this: in most cases you do not need the IF EXISTS check at all.

November 30, 2008 3:51 PM
 

Uri Dimant said:

Hi Alex. Great post.

Tony also described a similar technique on his web site.

http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/06/30/855.aspx

December 1, 2008 12:07 AM
 

Alexander Kuznetsov said:

Hi Uri,

Yes, I like Tony's posts too; I read them all. In this case mutexes are not needed - the single-row UPDATE itself under READ COMMITTED works as a mutex.

December 1, 2008 9:32 AM
 

Chris Falter said:

In the original SP, many updates got lost because 2 threads both satisfied the IF EXISTS condition and then attempted competing updates.  

Given the new SP, if 2 threads are competing, one of them will fail the WHERE clause, and no update will occur.

The important thing to note, and I don't think you emphasized it enough, is that in the improved design you *must* check @@ROWCOUNT to make sure the update succeeded, and if not the system *must* take corrective action.  IMHO the best way to make sure corrective action is taken is for the improved SP to call RAISERROR if @@ROWCOUNT = 0.  I would be reluctant to trust all client code that calls my SP to check the value of @ret and take corrective action; expecting the client code to implement that correctly is an invitation to bug city.  Using RAISERROR, though, means that the client code is forced to take corrective action.  Sure, your script checks the value of @ret and essentially re-tries if it fails, but I don't know anyone who writes real systems with such simplistic error handling.
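
For illustration only, the shape I have in mind inside the improved proc would be something like this (just a sketch):

UPDATE dbo.TwoINTs SET i1 = @i1, i2 = @i2 WHERE ID = @ID AND version = @version;
IF @@ROWCOUNT = 0
    -- force the caller to handle the failed optimistic check
    RAISERROR('Optimistic concurrency check failed for ID %d.', 16, 1, @ID);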

I agree that in the UPDATE case that you have addressed, mutexes are not needed.  However, in the insert case, Tony's solution looks very appealing.

December 2, 2008 5:47 PM
 

SJ said:

I am a novice in database scripting.

I tried out your example and found that around 45% of the updates were lost.

But I could not figure out the reason; please offer some insight!

Thanks!

December 3, 2008 1:35 AM
 

Alexander Kuznetsov said:

Chris,

I am not with you on this "client code is forced to take corrective action" suggestion. Because any database call can raise an exception, it should be wrapped in a try block anyway. A trivial catch block, such as the following:

catch (Exception e) {
    Console.Write(e);
}

will catch an error without any corrective action whatsoever. I don't think that you can force the client to do the right thing just by throwing exceptions. More to the point, instead of the Tester-Doer pattern I used a perfectly legitimate TryParse pattern, both described here:

http://blogs.msdn.com/kcwalina/archive/2005/03/16/396787.aspx

Also, I disagree with your distrust of client code. I think most client languages, such as C# and C++, are far superior to T-SQL, and it is easier to handle errors properly in them.

December 3, 2008 3:14 PM
 

Alexander Kuznetsov said:

SJ,

Can you be more specific? Which example did you run? I provided several ones.

December 3, 2008 3:15 PM
 

JKG said:

Interesting. I've worked on more than a few high-demand, overworked databases where locking is an issue. They are usually doing a lot of reads and few writes, so it's less of a concern, but I have to admit I use IF EXISTS a lot.

I mostly use it because I like to collapse "Create" and "Update" sprocs, so it looks something like...

IF EXISTS(SELECT 1 FROM dbo.SomeTable WHERE <Some Conditions>) BEGIN
    UPDATE dbo.SomeTable SET .. WHERE SomeTableID = @SomeTableId;
END ELSE BEGIN
    INSERT INTO dbo.SomeTable ...
END;

Any better way to handle this situation, short of forcing the client code to do it?

I suppose doing the UPDATE first and then checking the row count would work, but the code seems ugly.

December 4, 2008 7:35 PM
 

Alexander Kuznetsov said:

JKG,

On SQL Server 2008 you can use the new MERGE command. Prior to that, it is more involved than I could fit in a short answer; let me write it up over the weekend as a separate post. One thing is clear - moving the logic to the client will not handle race conditions, you will encounter the same problems.
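
For the 2008 route, the shape would be roughly this (just a sketch - I am making up the columns of dbo.SomeTable for illustration):

-- Sketch only: SomeTableID and SomeColumn are made-up columns.
-- The HOLDLOCK hint is commonly added so that concurrent upserts
-- do not race between the matching step and the insert.
MERGE dbo.SomeTable WITH (HOLDLOCK) AS target
USING (SELECT @SomeTableId AS SomeTableID, @SomeValue AS SomeColumn) AS source
    ON target.SomeTableID = source.SomeTableID
WHEN MATCHED THEN
    UPDATE SET SomeColumn = source.SomeColumn
WHEN NOT MATCHED THEN
    INSERT (SomeTableID, SomeColumn) VALUES (source.SomeTableID, source.SomeColumn);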

December 4, 2008 8:42 PM
 

Chris Falter said:

Wow, where do I start?  First of all, I didn't say I did not trust clients to catch exceptions.  I in fact advocated the throwing of an exception from the stored proc precisely because exception handling is far better design for clients than expecting client-side code to pass around (and eventually evaluate/handle) failure codes.

Secondly, you did not implement the TryParse pattern correctly.  For starters, you would need to name your stored proc something like "TRY_UpdateTwoInts" to make it clear to the client that the client must not expect that standard exception handlers will handle the possible error condition.  In addition, when you implement the pattern you must also, according to Cwalina, implement a paired exception-throwing member for clients that do not want to make use of the TryParse pattern.

Third, I seriously doubt the wisdom of using the TryParse pattern in the vast majority of circumstances.  In typical production scenarios, it is extraordinarily rare that a race condition would yield an error on more than a few percent of attempted updates, so the performance advantages of TryParse do not outweigh the associated disadvantage of design complexity.  I am addressing the general type of situation that the reader of your blog might actually face in his/her server-side code, not an artificial circumstance of a blogger's sample code which is written with the expressed intention of manifesting a race condition.

February 10, 2009 9:02 AM
 

Alexander Kuznetsov said:

Chris,

I agree that TryUpdateTwoInts is a better name.

February 10, 2009 11:28 AM
 


About Alexander Kuznetsov

Alex Kuznetsov has been working with object oriented languages, mostly C# and C++, as well as with databases for more than a decade. He has worked with Sybase, SQL Server, Oracle and DB2. He regularly blogs on sqlblog.com, mostly about database unit testing, defensive programming, and query optimization. Alex has written a book entitled "Defensive Database Programming with Transact-SQL" and several articles on simple-talk.com and devx.com. Currently he works at DRW Trading Group in Chicago, where he leads a team of developers, practicing agile development, defensive programming, TDD, and database unit testing.
