Neither
UPDATE … IF (@@ROWCOUNT = 0) INSERT
nor
IF EXISTS(...) UPDATE ELSE INSERT
patterns work as expected under high concurrency. Both may fail. Both may fail very frequently. MERGE is the king - it holds up much better.Let us do some stress testing and see for ourselves.
Here is the table we shall be using:
CREATE TABLE dbo.TwoINTs
(
ID INT NOT NULL PRIMARY KEY,
i1 INT NOT NULL ,
i2 INT NOT NULL ,
version ROWVERSION
) ;
GO
INSERT INTO dbo.TwoINTs
( ID, i1, i2 )
VALUES ( 1, 0, 0 ) ;
IF EXISTS(…) THEN pattern frequently fails under high concurrency.
Let us insert or update
rows in a loop using the following simple logic: if a row with given
ID exists, update it, and otherwise insert a new one. The following
loop implements this logic. Cut and paste it into two tabs, switch into
text mode in both tabs, and run them simultaneously.
-- hit Ctrl+T to execute in text mode
SET NOCOUNT ON ;
DECLARE @ID INT ;
SET @ID = 0 ;
WHILE @ID > -100000
BEGIN ;
SET @ID = ( SELECT MIN(ID)
FROM dbo.TwoINTs
) - 1 ;
BEGIN TRY ;
BEGIN TRANSACTION ;
IF EXISTS ( SELECT *
FROM dbo.TwoINTs
WHERE ID = @ID )
BEGIN ;
UPDATE dbo.TwoINTs
SET i1 = 1
WHERE ID = @ID ;
END ;
ELSE
BEGIN ;
INSERT INTO dbo.TwoINTs
( ID, i1, i2 )
VALUES ( @ID, 0, 0 ) ;
END ;
COMMIT ;
END TRY
BEGIN CATCH ;
ROLLBACK ;
SELECT error_message() ;
END CATCH ;
END ;
When we run this script
simultaneously in two tabs, we shall immediately get a huge amount of
primary key violations in both tabs. This demonstrates how unreliable
the IF EXISTS pattern is when it executes under high concurrency.
Note: this
example also demonstrates that it is not safe to use SELECT
MAX(ID)+1 or SELECT
MIN(ID)-1 as the next
available unique value if we do it under concurrency.
UPDATE … IF (@@ROWCOUNT = 0) BEGIN pattern is also unreliable.
Another common approach
is to update a row first, and if no row was updated, insert it. It is
also unreliable. Before demonstrating that, let us delete the rows inserted
by the previous example:
DELETE FROM dbo.TwoINTs WHERE ID < 1 ;
Let us modify the
loop which we ran in the previous example, as follows:
-- hit Ctrl+T to execute in text mode
SET NOCOUNT ON ;
DECLARE @ID INT ;
SET @ID = 0 ;
WHILE @ID > -100000
BEGIN ;
SET @ID = ( SELECT MIN(ID)
FROM dbo.TwoINTs
) - 1 ;
BEGIN TRY ;
BEGIN TRANSACTION ;
UPDATE dbo.TwoINTs
SET i1 = 1
WHERE ID = @ID ;
IF ( @@ROWCOUNT = 0 )
BEGIN ;
INSERT INTO dbo.TwoINTs
( ID, i1, i2 )
VALUES ( @ID, 0, 0 ) ;
END ;
COMMIT ;
END TRY
BEGIN CATCH ;
SELECT error_message() ;
ROLLBACK ;
END CATCH ;
END ;
When we run this script simultaneously from two tabs, we are getting lots of primary key
violations, just like when we ran our previous example.
As we have seen, the UPDATE
… IF (@@ROWCOUNT = 0) pattern is
unreliable under high concurrency too.
MERGE holds up perfectly well
Again, let us delete the rows inserted
by the previous example:
DELETE FROM dbo.TwoINTs WHERE ID < 1 ;
Rerun the loop using MERGE:
-- hit Ctrl+T to execute in text mode
SET NOCOUNT ON ;
DECLARE @ID INT ;
SET @ID = 0 ;
WHILE @ID > -100000
BEGIN ;
SET @ID = ( SELECT MIN(ID)
FROM dbo.TwoINTs
) - 1 ;
BEGIN TRY ;
MERGE dbo.TwoINTs AS target
USING
( SELECT @ID ,
0 ,
0
) AS source ( ID, i1, i2 )
ON ( target.ID = source.ID )
WHEN MATCHED
THEN
UPDATE SET
i1 = 1
WHEN NOT MATCHED
THEN
INSERT ( ID, i1, i2 )
VALUES
( @ID ,
0 ,
0
) ;
END TRY
BEGIN CATCH ;
SELECT error_message() ;
END CATCH ;
END ;
In the
context of our loop, MERGE always completes without a single error!
As we have seen, in
this particular case MERGE holds up under high concurrency perfectly
well. Of course, this does not mean that we can always use this new
command without stress testing. Yet we should at least consider using
it whenever we insert or update under high concurrency, with usual precautions
and after thorough testing.
It is fun to be blogging again. I kind of neglected blogging for some time, because I am writing a book on defensive programming, and that requires a huge amount of time. Anyway, it feels good to be back. The book should be finished soon, and I'll start contributing here more.
This post participates in the second SQL Tuesday