THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Alexander Kuznetsov

Reads involving UDFs under READ_COMMITTED_SNAPSHOT may seem inconsistent.

When a select uses scalar or multi-statement UDFs under READ_COMMITTED_SNAPSHOT, we might not get consistent results as of the time our select began - I will provide simple repro scripts.

At the time of this writing MSDN clearly states the following: "Read committed isolation using row versioning provides statement-level read consistency". However, if our select invokes scalar or multi-statement UDFs, this consistency is ensured per a statement in the body of the UDF, not per the whole statement that invokes the UDF(s).

Prerequisites

Before we run the repro scripts, we need the following setup:

 ALTER DATABASE MyTest SET READ_COMMITTED_SNAPSHOT ON ;
GO

USE MyTest ;
GO

CREATE TABLE dbo.Messages
    
(
      
MessageID INT NOT NULL
                    
CONSTRAINT PK_Messages PRIMARY KEY ,
      
MessageBody VARCHAR(50) NOT NULL
    ) ;
GO

INSERT  INTO dbo.Messages
        
( MessageID, MessageBody )
VALUES  ( 1, 'Hello world' ) ;
GO

Getting inconsistent results from a multi-statement UDF

First, we need to create this UDF:

CREATE FUNCTION dbo.AllMessagesTwice ( )
RETURNS @ret TABLE
    
(
      
MessageBody VARCHAR(50) NOT NULL ,
      
SelectedAt DATETIME2(5) NOT NULL
    )
AS
    BEGIN
;
        
INSERT  @ret
                
( MessageBody ,
                  
SelectedAt
                
)
                
SELECT  MessageBody ,
                        
GETDATE()
                
FROM    dbo.Messages ;
        
DECLARE @d DATETIME2(5), @i INT ;
        
SET @d = DATEADD(SECOND, 30, GETDATE()) ;
        
WHILE GETDATE() < @d
            
BEGIN ;
              
SET @i = 1;
            
END ;
        
INSERT  @ret
                
( MessageBody ,
                  
SelectedAt
                
)
                
SELECT  MessageBody ,
                        
GETDATE()
                
FROM    dbo.Messages ;
        
RETURN ;
    
END ;
GO
 

Next, let us open a tab and cut and paste the following script, but not run it yet:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;
SELECT * FROM dbo.AllMessagesTwice ( )

DELETE FROM dbo.Messages WHERE MessageId = 2 ;
 

In another tab, cut and paste the following script, but not run it either:

INSERT  INTO dbo.Messages
        
( MessageID, MessageBody )
VALUES  ( 2, 'Hi there' ) ;

Run the select script in the first tab - it takes 30 seconds to complete. Immediately return to the second tab and insert the 'Hi there' message.

When the select in the first tab completes, we shall clearly see the 'Hi there' message, which definitle was inserted after the select began, as well as the timestamp that shows when the row was selected, 30 seconds after the beginning of the statement:

Hello world 2011-08-02 20:56:44.70000
Hello world 2011
-08-02 20:57:14.74000
Hi there    2011
-08-02 20:57:14.74000

As we have seen, a multi-statement UDF can return data that was committed after the time when the select using the UDF began.

Getting inconsistent results from a scalar UDF

Here is the scalar UDF we shall be using:

 CREATE FUNCTION dbo.MessageWithLastId ( )
RETURNS VARCHAR(50)
AS
    BEGIN
;
        
DECLARE @d DATETIME2(5) ,
            
@i INT ;
        
SET @d = DATEADD(SECOND, 30, GETDATE()) ;
        
WHILE GETDATE() < @d
            
BEGIN ;
                
SET @i = 1 ;
            
END ;
        
RETURN (SELECT TOP(1) MessageBody FROM dbo.Messages
        
ORDER BY MessageID DESC) ;
    
END ;
GO

When we have created the scalar UDF, let us cut, paste, and run the following script:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;

DELETE FROM dbo.Messages WHERE MessageId = 2 ;
PRINT 'Data as of the beginning of the dbo.MessageWithLastId ( ) below'
SELECT MessageBody FROM dbo.Messages
        
ORDER BY MessageID ;

PRINT 'Data returned by dbo.MessageWithLastId ( )'
        
SELECT dbo.MessageWithLastId ( ) ;

DELETE FROM dbo.Messages WHERE MessageId = 2 ;

In the second tab we need to immediately run the following, we have 30 seconds to do so:

INSERT  INTO dbo.Messages
        
( MessageID, MessageBody )
VALUES  ( 2, 'Hi there' ) ;

When the select in the first tab completes, we shall see that the scalar UDF returned the value that was not in the table when the select has started:

Data AS OF the beginning OF the dbo.MessageWithLastId ( ) below
MessageBody
--------------------------------------------------
Hello world

(1 row(s) affected)

Data returned BY dbo.MessageWithLastId ( )

--------------------------------------------------
Hi there
 

As we have seen, a scalar UDF can read data that was committed after the time when the select using the UDF began.

Snapshot isolation guarantees consistency

We can rerun both previous examples under snapshot isolation. All we need to do is add the following line immediately before the select that invokes our UDF:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT ;

In both cases we shall see that under snapshot isolation our selects return consistent data as of the time the select started.

Conclusion

As we have seen, if we are using scalar or multi-statement UDFs, and we need statement level consistency, we need to use snapshot isolation - READ_COMMITTED_SNAPSHOT may fail to provide the consistency we need.

Naturally, inline UDFs are esentially macroes - they are flattened into the select they are invoked from. As such, READ_COMMITTED_SNAPSHOT does provide statement level consistency as long as we are only using inline UDFs, and do not use scalar and multi-statement ones.

I would like to thank Erland Sommarskog, whose feedback inspired this post - I am posting this content with his permission.

Published Tuesday, August 02, 2011 9:33 PM by Alexander Kuznetsov

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Paul White said:

This is an excellent post, Alex!  Another reason to avoid functions that perform data access - what a delightfully subtle side-effect.

Paul

August 3, 2011 8:41 AM
 

Alexander Kuznetsov said:

Glad you liked it, Paul!

August 3, 2011 10:44 PM

Leave a Comment

(required) 
(required) 
Submit

About Alexander Kuznetsov

Alex Kuznetsov has been working with object oriented languages, mostly C# and C++, as well as with databases for more than a decade. He has worked with Sybase, SQL Server, Oracle and DB2. He regularly blogs on sqlblog.com, mostly about database unit testing, defensive programming, and query optimization. Alex has written a book entitled "Defensive Database Programming with Transact-SQL" and several articles on simple-talk.com and devx.com. Currently he works at DRW Trading Group in Chicago, where he leads a team of developers, practicing agile development, defensive programming, TDD, and database unit testing.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement