The following pattern is quite common in database
programming:
IF EXISTS(some query) BEGIN
DO SOMETHING;
END
When such code runs in high concurrency situations, it may
not work as expected. I will provide a repro when such logic fails 40% of the
time. The following script provides a test table and attempts to implement
optimistic locking using rowversion columns:
CREATE TABLE dbo.TwoINTs(ID INT NOT NULL, i1 INT NOT NULL, i2 INT NOT NULL, version ROWVERSION)
GO
SET NOCOUNT ON;
DECLARE @i INT;
SET @i=0;
WHILE @i<1000000 BEGIN
INSERT INTO dbo.TwoINTs(ID, i1, i2) VALUES(@i,0,0);
SET @i = @i+1;
END
GO
CREATE UNIQUE INDEX
UNQ_TwoInts_ID ON dbo.TwoInts(ID);
GO
CREATE PROCEDURE dbo.UpdateTwoINTs(@ID INT, @i1 INT, @i2 INT, @version ROWVERSION)
AS
BEGIN
SET NOCOUNT
ON;
SET TRANSACTION
ISOLATION LEVEL
READ COMMITTED;
DECLARE @ret INT;
BEGIN TRANSACTION
IF EXISTS(SELECT 1 FROM dbo.TwoINTs WHERE ID=@ID AND version = @version) BEGIN
UPDATE dbo.TwoINTs SET i1=@i1, i2=@i2 WHERE ID=@ID AND version = @version;
SET @ret=0;
END ELSE
BEGIN
SET @ret=1;
END;
COMMIT
RETURN @ret;
END
GO
A naïve test not involving high concurrency succeeds all
right. You can run a naïve test yourself. In one tab, run the following code,
which mimics a user who read a row and modified column i1 in 10 seconds:
DECLARE @i1 INT, @i2 INT, @version ROWVERSION, @count INT, @ret INT;
SELECT @i1=i1, @i2=i2+1, @version=version FROM dbo.TwoInts WHERE ID=5;
WAITFOR DELAY '00:00:10'
EXEC @ret=dbo.UpdateTwoINTs 5, @i1, @i2, @version;
SELECT @ret AS ret, i1, i2 FROM dbo.TwoInts WHERE ID=5;
Immediately start another script, which mimics a user who
modified another column, i2, also after 10 seconds of delay:
DECLARE @i1 INT, @i2 INT, @version ROWVERSION, @count INT, @ret INT;
SELECT @i1=i1+1, @i2=i2, @version=version FROM dbo.TwoInts WHERE ID=5;
WAITFOR DELAY '00:00:10'
EXEC @ret=dbo.UpdateTwoINTs 5, @i1, @i2, @version;
SELECT @ret AS ret, i1, i2 FROM dbo.TwoInts WHERE ID=5;
Note that only one of two updates succeeded, which is
expected – the procedure detected a rowversion mismatch and did not overwrite
the first update. However, a more realistic test on my laptop fails 40% of the
time. First of all, some cleanup:
UPDATE dbo.TwoINTs SET i1=0, i2=0 WHERE ID=5;
Here is the script to
run in one tab:
DECLARE @i1 INT, @i2 INT, @version ROWVERSION, @count INT, @ret INT;
SET @count = 0;
WHILE @count<100000 BEGIN
SELECT @i1=i1+1, @i2=i2, @version=version FROM dbo.TwoInts WHERE ID=5;
EXEC @ret=dbo.UpdateTwoINTs 5, @i1, @i2, @version;
SET @count = @count + 1 - @ret;
END;
Here is the script to run in another tab:
DECLARE @i1 INT, @i2 INT, @version ROWVERSION, @count INT, @ret INT;
SET @count = 0;
WHILE @count<100000 BEGIN
SELECT @i1=i1, @i2=i2+1, @version=version FROM dbo.TwoInts WHERE ID=5;
EXEC @ret=dbo.UpdateTwoINTs 5, @i1, @i2, @version;
SET @count = @count + 1 - @ret;
END;
If no updates were
lost, I would expect the values in both i1 and i2 to be 100000. Unfortunately,
this is not the case:
SELECT i1, i2 FROM dbo.TwoINTs WHERE ID=5
57871 54473
As you have seen, more than 40% updates were lost. There are
several ways to fix the problem. Of course, you can just up the isolation level
to REPEATABLE READ or even to SERIALIZABLE. Unfortunately, under higher
isolation levels code like this is very prone to deadlocks. You can alter the
stored procedure, rerun the loops, and see for yourself. There is a much
simpler and more robust way to implement optimistic locking. Let me clean up the
data first and improve the procedure:
UPDATE dbo.TwoINTs SET i1=0, i2=0 WHERE ID=5;
GO
ALTER PROCEDURE dbo.UpdateTwoINTs(@ID INT, @i1 INT, @i2 INT, @version ROWVERSION)
AS
BEGIN
SET NOCOUNT
ON;
SET TRANSACTION
ISOLATION LEVEL
READ COMMITTED;
DECLARE @ret INT;
UPDATE dbo.TwoINTs SET i1=@i1, i2=@i2 WHERE ID=@ID AND version = @version;
SET @ret= 1 - @@ROWCOUNT;
RETURN @ret;
END
When you rerun those two scripts with loops from 1 to 100000
again, you can see that this time not a single update was lost:
SELECT i1, i2 FROM dbo.TwoINTs WHERE ID=5
100000 100000
Note that you did not get any deadlocks at all.
As you have seen, the following pattern is very prone to
errors in high concurrency OLTP environments:
IF EXISTS(SELECT 1 FROM dbo.SomeTable WHERE <Some
Conditions>) BEGIN
DO Something;
END ELSE
BEGIN
DO Something else;
END;
If you use this pattern in your procedures and expect to run
them in high concurrency OLTP environments, you need to properly stress test
your procedures. While it is OK to just run scripts from SSMS tabs to
demonstrate the problem, as I did in this post, I think that in real life there are more convenient ways
to stress test. For instance, I was describing a C# test harness
here.
This post continues my series on defensive database programming. Here are the previous posts from the series:
Defensive database programming: fun with changing column widths.
Avoid mixing old and new styles of error handling.
Defensive database programming: adding ESCAPE clauses.
Defensive database programming: qualifying column names.
Defensive database programming: rewriting queries with NOT IN().