THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Louis Davidson

How the rowversion datatype works when adding and deleting columns

For years, I had thought (and was probably taught in SQL.AlongTimeAgoInAPlaceFarFarAway) that the timestamp column (well before rowversion was a thing,) was not guaranteed to be an ever increasing value. But this is not the case.

In BOL (https://docs.microsoft.com/en-us/sql/t-sql/data-types/rowversion-transact-sql)  it states:      

"The rowversion data type is just an incrementing number..."

This makes it useful for determining rows that have changed, because it it automatic and the user cannot override the value in the column. However, there is a major concern when you use rowversions, and that is what happens when you change the structure of the table, and expect the consumer to see that change. The problem is that when you change the structure of the table, the rowversion will not be set (except when adding a new rowversion column.)

So let's create a new database and a set of tables:     

CREATE DATABASE TestRowVersion;
GO
USE TestRowVersion;
GO      
       
Now, lets check the status of the two functions that we can use to see  the status of the rowversion values: 

SELECT @@DBTS AS DBTS, --the last rowversion that has been used
       --the next rowversion that will be used
       MIN_ACTIVE_ROWVERSION() AS MIN_ACTIVE_ROWVERSION;

This returns and will always on a new database, at least always has every time I have ever built a new db and checked (which over the years is more often than I care to remember.)

     
DBTS               MIN_ACTIVE_ROWVERSION
------------------ ---------------------
0x00000000000007D0 0x00000000000007D1

Next, let us create a few new tables, two with rowversion columns:

CREATE TABLE TestRowversion1
(
    TestRowversion1 int CONSTRAINT PKTestRowversion1 PRIMARY KEY,
    RowVersion rowversion NOT NULL
);
CREATE TABLE TestRowversion2
(
    TestRowversion2 int CONSTRAINT PKTestRowversion2 PRIMARY KEY,
    RowVersion rowversion NOT NULL
);      
CREATE TABLE TestNoRowversionYet
(
    TestNoRowversionYet int CONSTRAINT PKTestNoRowversionYet PRIMARY KEY
);     
GO      
      
The rowversion values have not changed (only using the @@DBTS from now on because it is usually what we want to see):      
 
SELECT @@DBTS AS DBTS;

You will see no change from the first execution.

Now, lets add a few rows to the table with rowversions.     

INSERT INTO dbo.TestRowversion1(TestRowversion1)
VALUES(1),(2),(3);
GO
INSERT INTO dbo.TestRowversion2(TestRowversion2)
VALUES(10),(20),(30);     
GO      
 
And now check the values:      
 
SELECT *
FROM    dbo.TestRowversion1;
SELECT *
FROM    dbo.TestRowversion2;

SELECT @@DBTS AS DBTS;

This will return:
      
TestRowversion1 RowVersion
--------------- ------------------
1               0x00000000000007D1
2               0x00000000000007D2
3               0x00000000000007D3

TestRowversion2 RowVersion
--------------- ------------------
10              0x00000000000007D4
20              0x00000000000007D5
30              0x00000000000007D6

DBTS              
------------------
0x00000000000007D6

Next step, let's add a new column to both tables to see what the effect is. In the first table, the column will be NOT NULL, with a default, and the second will be null.

ALTER TABLE dbo.TestRowversion1
    ADD NewNotNullColumn varchar(20) NOT NULL
        CONSTRAINT DFLTTestRowversion1_NewNotNullColumn DEFAULT ('Not Null'); 
    
ALTER TABLE dbo.TestRowversion2
    ADD NewNullColumn varchar(20) NULL;

    
Now, let's look at the data:

SELECT *
FROM    dbo.TestRowversion1;
SELECT *
FROM    dbo.TestRowversion2;

SELECT @@DBTS AS DBTS; 

You will see the following:      

TestRowversion1 RowVersion         NewNotNullColumn
--------------- ------------------ --------------------
1               0x00000000000007D1 Not Null
2               0x00000000000007D2 Not Null
3               0x00000000000007D3 Not Null

TestRowversion2 RowVersion         NewNullColumn
--------------- ------------------ --------------------
10              0x00000000000007D4 NULL
20              0x00000000000007D5 NULL
30              0x00000000000007D6 NULL


DBTS               MIN_ACTIVE_ROWVERSION
------------------ ---------------------
0x00000000000007D6 0x00000000000007D7


Note that, from a RowVersion standpoint, the values have not changed.

So, if you are using this value in some form of ETL, this is something you will need to be aware of, particularly for columns that are declared as NOT NULL. You may need to tweak the rowversion values, using a query such as:      
      
UPDATE dbo.TestRowversion1
SET    NewNotNullColumn = NewNotNullColumn;            
      
Note that if your software keeps the lastrowversion per table, and not at the full database level, you might just set that value back to the start of time 0x0000000000000000, which will perform a lot better! In fact, for a very large table, you may need to do a "chunked" update, just updating rows where the rowversion value is still NULL.

Checking the data:     

SELECT *
FROM    dbo.TestRowversion1;      
       
You will see the rowversion values have now been incremented:

TestRowversion1 RowVersion         NewNotNullColumn
--------------- ------------------ --------------------
1               0x00000000000007D7 Not Null
2               0x00000000000007D8 Not Null
3               0x00000000000007D9 Not Null

Now, let's drop the new column, and see if there is any change:     
 
ALTER TABLE dbo.TestRowversion1
    DROP DFLTTestRowversion1_NewNotNullColumn;

ALTER TABLE dbo.TestRowversion1
     DROP COLUMN NewNotNullColumn;      
       
Now, check the table:      
 
SELECT *
FROM   dbo.TestRowversion1;       
       
And you will see that the rowversions still match:

TestRowversion1 RowVersion
--------------- ------------------
1               0x00000000000007D7
2               0x00000000000007D8
3               0x00000000000007D9

This is, from a performance standpoint, expectable. You would not expect that they would want to change every single row in the table when adding or deleting a column. Especially for a nullable column I was not surprised, that the rowversion stayed the same. It is however, just something you need to realize when using rowversions for ETL (it would be the same if you created your own time based ETL datetime value as well).

Finally, what about when you add the RowVersion column to the table? This one is a bit more obvious than the previous case, since it obviously needs to grab a value to add it, but it never hurts to check it out.      
      
SELECT @@DBTS;      
      
INSERT INTO dbo.TestNoRowversionYet(TestNoRowversionYet)
VALUES(1),(2),(3);     

SELECT @@DBTS;      

Which returns the same value twice, since there is no change:

     
------------------
0x00000000000007D9

------------------
0x00000000000007D9

Now we add the rowversion:     
 
ALTER TABLE dbo.TestNoRowversionYet
    ADD RowVersion rowversion NOT NULL; --you can declare NULL, but it still behaves the same      

And it has changed for the database, and the columns have rowversion values:

SELECT @@DBTS;

SELECT *
FROM   dbo.TestNoRowversionYet;     

 

------------------
0x00000000000007DC

TestNoRowversionYet RowVersion
------------------- ------------------
1                   0x00000000000007DA
2                   0x00000000000007DB
3                   0x00000000000007DC

Using rowversion is a very easy way to build a fast, and reliable method of change detection, but it is not without its caveats, which could bite you hard if you do not realize what is occurring.      

Published Tuesday, September 26, 2017 7:49 PM by drsql
Filed under: , ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

No Comments

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Links to my other sites

Archives

Privacy Statement