THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Alexander Kuznetsov

T-SQL Tuesday #002: patterns that do not work as expected.

Neither

UPDATE … IF (@@ROWCOUNT = 0) INSERT

nor

IF EXISTS(...) UPDATE ELSE INSERT

patterns work as expected under high concurrency. Both may fail. Both may fail very frequently. MERGE is the king - it holds up much better.Let us do some stress testing and see for ourselves.

Here is the table we shall be using:

CREATE TABLE dbo.TwoINTs
    
(
      
ID INT NOT NULL PRIMARY KEY,
      
i1 INT NOT NULL ,
      
i2 INT NOT NULL ,
      
version ROWVERSION
    
) ;
GO

INSERT  INTO dbo.TwoINTs
        
IDi1i2 )
VALUES  10) ;    

 

 

IF EXISTS(…) THEN pattern frequently fails under high concurrency.

 

Let us insert or update rows in a loop using the following simple logic: if a row with given ID exists, update it, and otherwise insert a new one. The following loop implements this logic. Cut and paste it into two tabs, switch into text mode in both tabs, and run them simultaneously.

-- hit Ctrl+T to execute in text mode

SET NOCOUNT ON ;

DECLARE @ID INT ;

SET @ID ;
WHILE @ID > -100000
    
BEGIN ;
        
SET @ID SELECT  MIN(ID)
                    
FROM    dbo.TwoINTs
                  
) - ;
        BEGIN TRY ;

            
BEGIN TRANSACTION ;
            
IF EXISTS ( SELECT  *
                        
FROM    dbo.TwoINTs
                        
WHERE   ID @ID )
                
BEGIN ;
                    
UPDATE  dbo.TwoINTs
                    
SET     i1 1
                    
WHERE   ID @ID ;
                
END ;
            
ELSE
                BEGIN 
;
                    
INSERT  INTO dbo.TwoINTs
                            
IDi1i2 )
                    
VALUES  @ID0) ;
                
END ;
            
COMMIT 
        
END TRY
        
BEGIN CATCH ;
            
ROLLBACK 
            
SELECT  error_message() ;
       
END CATCH ;
    
END 

When we run this script simultaneously in two tabs, we shall immediately get a huge amount of primary key violations in both tabs. This demonstrates how unreliable the IF EXISTS pattern is when it executes under high concurrency.

Note: this example also demonstrates that it is not safe to use SELECT MAX(ID)+1 or SELECT MIN(ID)-1  as the next available unique value if we do it under concurrency.

 

UPDATE … IF (@@ROWCOUNT = 0) BEGIN pattern is also unreliable.

 

Another common approach is to update a row first, and if no row was updated, insert it. It is also unreliable. Before demonstrating that, let us delete the rows inserted by the previous example:

DELETE  FROM dbo.TwoINTs WHERE   ID < 1 ;

Let us modify the loop which we ran in the previous example, as follows:

-- hit Ctrl+T to execute in text mode

SET NOCOUNT ON ;
DECLARE @ID INT ;
SET @ID ;
 
WHILE @ID > -100000
    
BEGIN ;
        
SET @ID SELECT  MIN(ID)
                    
FROM    dbo.TwoINTs
                  
) - ;
        BEGIN TRY ;
            
            
BEGIN TRANSACTION ;
      
            
UPDATE  dbo.TwoINTs
            
SET     i1 1
            
WHERE   ID @ID ;
            
IF @@ROWCOUNT )
                
BEGIN ;
                    
INSERT  INTO dbo.TwoINTs
                            
IDi1i2 )
                    
VALUES  @ID0) ;
                
END ;
            
COMMIT 
        
END TRY
        
BEGIN CATCH ;
            
SELECT  error_message() ;  
            
ROLLBACK ;     
        
END CATCH ;

    END ;   

 

When we run this script simultaneously from two tabs, we are getting lots of primary key violations, just like when we ran our previous example.

As we have seen, the UPDATE … IF (@@ROWCOUNT = 0) pattern is unreliable under high concurrency too.

 

MERGE holds up perfectly well

 

 Again, let us delete the rows inserted by the previous example:

DELETE  FROM dbo.TwoINTs WHERE   ID < 1 ;

 Rerun the loop using MERGE:

-- hit Ctrl+T to execute in text mode
SET NOCOUNT ON ;
DECLARE @ID INT ;
SET @ID ;
 
WHILE @ID > -100000
    
BEGIN ;
        
SET @ID SELECT  MIN(ID)
                    
FROM    dbo.TwoINTs
                  
) - ;
        
BEGIN TRY ;
       
            
MERGE dbo.TwoINTs AS target
                USING
                    
SELECT    @ID ,
                                
,
                                
0
                    
AS source IDi1i2 )
                
ON target.ID source.ID )
                
WHEN MATCHED
                    
THEN
        UPDATE           SET
                
i1 1
                
WHEN NOT MATCHED
                    
THEN 
          INSERT    
IDi1i2 )
                         
VALUES
                    
@ID ,
                      
,
                      
0
                    
) ;
 
        
END TRY
        
BEGIN CATCH ;
            
SELECT  error_message() ;       
        
END CATCH ;
    
END ;
  

In the context of our loop, MERGE always completes without a single error!

As we have seen, in this particular case MERGE holds up under high concurrency perfectly well. Of course, this does not mean that we can always use this new command without stress testing. Yet we should at least consider using it whenever we insert or update under high concurrency, with usual precautions and after thorough testing.

It is fun to be blogging again. I kind of neglected blogging for some time, because I am writing a book on defensive programming, and that requires a huge amount of time. Anyway, it feels good to be back. The book should be finished soon, and I'll start contributing here more.

 

 This post participates in the second SQL Tuesday

Published Tuesday, January 12, 2010 9:38 AM by Alexander Kuznetsov

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Adam Machanic said:

Hi Alex,

You seem to have dismissed the two most important keywords in all of SQL: "BEGIN TRANSACTION". Surely that would fix the issue with both approaches? Doesn't MERGE work better here only because it does all operations as a single transaction?

January 12, 2010 10:17 AM
 

Alexander Kuznetsov said:

Hi Adam.

No, BEGIN TRANSACTION will not fix anything, not under READ COMMITTED. I have modified my post. Thanks for the feedback!

January 12, 2010 10:43 AM
 

Alexander Kuznetsov said:

Adam,

I see that my changes do not show up yet.

I reran the loops under SERIALIZABLE - in that case they complete without errors.

January 12, 2010 10:52 AM
 

Alexander Kuznetsov said:

However, if in one tab I run the first loop (IF EXISTS) under SERIALIZABLE,

and in another tab I run the second loop (UPDATE ... IF @@ROWCOUNT),

I am getting a lot of deadlocks.

January 12, 2010 10:54 AM
 

noeldr said:

Sorry to spoil you conclusion but MERGE *DO* have the same issues if locking hints aren't applied properly under high concurrency.

Take a look:

http://weblogs.sqlteam.com/dang/archive/2009/01/31/UPSERT-Race-Condition-With-MERGE.aspx

January 12, 2010 11:31 AM
 

Alexander Kuznetsov said:

Noel,

My conclusion was not a blanket statement: "in this particular case MERGE holds up under high concurrency perfectly well. Of course, this does not mean that we can always use this new command without stress testing".

I made sure to be specific: "in this particular case", and in this case no hints were needed.

You can just run all the loops yourself and see how it works out for you in your environment.

Of course, any recommendations should be used with usual precautions and after thorough testing.

January 12, 2010 11:53 AM
 

noeldr said:

Alex,

I didn't mean to make it sound so bad. My apologies,I was just trying to convey the idea that MERGE has been found not to be as reliable as many of us would think.

I totally agree with you that each case is different and they should be tested as such.

January 12, 2010 4:04 PM
 

Paul White said:

Inside the transaction:

INSERT  dbo.TwoINTs (ID, i1, i2)

SELECT  @ID, 0 ,0

WHERE   NOT EXISTS

       (

       SELECT  *

       FROM    dbo.TwoINTs WITH (UPDLOCK, HOLDLOCK)

       WHERE   ID = @ID

       );

UPDATE  dbo.TwoINTs

SET     i1 = 1

WHERE   ID = @ID

AND     @@ROWCOUNT = 0;

...doesn't seem to deadlock or result in primary key violations.  Am I missing anything?

January 25, 2010 4:36 AM
 

Alexander Kuznetsov said:

Hey Paul,

Yes, if both loops run the your query with these hints there are no errors and no deadlocks. However, just like with SERIALIZABLE, you may get deadlocks if two different patterns run in two loops.

Thanks!

January 26, 2010 11:48 AM
 

Paul White said:

You're welcome Alex :c)

The key difference between the UPDLOCK, HOLDLOCK hints and the SERIALIZABLE example is that the update lock prevents deadlocking due to lock conversion, and the HOLDLOCK hint applies SERIALIZABLE locking semantics to the *specified table only* - which makes a big difference to the potential for deadlocks.

I cannot see an easy way to get this code to deadlock - even using one of the other examples in a competing session...any insight?

Thanks for another thought-provoking blog entry!

January 28, 2010 12:09 AM
 

Adam Machanic said:

According to T-SQL Tuesday rules as ratified by me in the first and second T-SQL Tuesday posts, the T-SQL

February 8, 2010 2:15 PM
 

Naomi said:

Adam,

Your last post is truncated and no more comments are seen.

January 6, 2011 7:16 PM
 

Naomi said:

Paul/Alex,

So, what is the solution for SQL 2005 to use for such situation?

Thanks in advance.

April 5, 2012 12:32 PM
 

AlexK said:

Naomi,

I don't have SQL 2005 running, so I cannot verify, but can you try out sp_getapplock, as described here: http://www.simple-talk.com/sql/t-sql-programming/developing-modifications-that-survive-concurrency/

If this does not work for you, you can use something like this:

CREATE TABLE dbo.Toggle(ID INT NOT NULL PRIMARY KEY, TOGGLE INT NOT NULL);

GO

INSERT INTO dbo.Toggle(ID, toggle) values(1,1)

-- in your procedure

SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;

BEGIN TRAN ;

UPDATE dbo.toggle SET toggle = 1- toggle ;

-- do your modifications here

COMMIT;

April 5, 2012 2:49 PM

Leave a Comment

(required) 
(required) 
Submit

About Alexander Kuznetsov

Alex Kuznetsov has been working with object oriented languages, mostly C# and C++, as well as with databases for more than a decade. He has worked with Sybase, SQL Server, Oracle and DB2. He regularly blogs on sqlblog.com, mostly about database unit testing, defensive programming, and query optimization. Alex has written a book entitled "Defensive Database Programming with Transact-SQL" and several articles on simple-talk.com and devx.com. Currently he works as an agile developer.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement