THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Kalen Delaney

Geek City: Lost Identity

Way long ago, when Sybase first added the IDENTITY property, it took people quite a while to get the hang of using it. Along with being able to automatically generate sequential numbers for each row, we had to tune a configuration option with the geekiest of names: “Identity Burning Set Factor”.  And figuring out how to use this value seemed to be one of the dark arts, so almost everyone left it at the default. (If you really want to know how to use the burning set, you can read the Sybase docs here.) The question of WHY we needed to even think about a burning set was never even an issue for most people.

So when Microsoft included IDENTITY columns with no requirement of defining a burning set, it was a big relief. This isn’t to say that there were no issues with using IDENTITY columns at all. I still had to explain to students over and over that even with IDENTITY there was no guarantee of no gaps and no guarantee that you would never get duplicates. Basically, gaps could happen if you rolled back INSERT operations (or explicitly deleted rows) and duplicates could occur if you set the table property IDENTITY_INSERT to ON.  But once people understood this, everything else was pretty straightforward. Until now, with SQL Server 2012.

I had actually started hearing strange reports of unexpected gaps in IDENTITY values quite a while ago, but the cases seemed to be not reproducible and certainly not predictable. But just recently, I came across the explanation and realized the reason it seemed so unpredictable was because there is a combination of conditions that both have to occur to see this particular behavior.

It is true that there is a change of IDENTITY behavior in SQL Server 2012, to allow for greater throughput in heavy insert environments. In earlier versions, as each identity value was generated, it had to be logged separately. This meant that even for minimally logged operations, like SELECT INTO, every row inserted had to be logged separately. I blogged about this behavior here. But in 2012, identity values are generated in batches and only the maximum value of the IDENTITY column in the batch is logged. So minimally logged operations on tables with IDENTITY columns could be more efficient, but I still haven’t got around to actually testing that yet. But a related change is that if there is a server failure, SQL Server has to be sure it won’t reuse an IDENTITY value. If SQL Server was stopped without a CHECKPOINT, on restart the engine will add to the maximum value of the IDENTITY that it is aware of to set a new starting point for each table, thus potentially leaving a gap in the sequence. The size of the gap is dependent on the data type of the column. The value will be increased by 10 for tinyint, 100 for smallint, 1000 for int, and 10000 for bigint. Numeric and decimal increase their values based on the precision, but it seems the maximum is 10,000.

If SQL Server is stopped ‘gracefully’, with a CHECKPOINT performed, there should be no gap. Ideally, a checkpoint is always issued for a service shutdown, unless you use the T-SQL command SHUTDOWN WITH NOWAIT, or if there is a crash, but apparently, there is a bug. It seems that stopping your SQL Server using Configuration Manager, or even the service control in SQL Server Management Studio will not perform a checkpoint. To make sure you get the checkpoint you can use SQL Server’s SHUTDOWN command (without the WITH NOWAIT option!) or of course, manually issue a CHECKPOINT before shutting down. So only crashes could cause these gaps.

---------------------------------------

The gaps on restart only happen because SQL Server assigns IDENTITY values in blocks. You can choose to revert to the old behavior that allocated IDENTITY values one at a time, and logged each row even with minimally logged operations. To do this, you need to add trace flag 272  to your SQL Server startup parameters, set through the Configuration Manager.  Unofficial reports state that you have to use a lower case t to specify the flag, i.e. –t272 and NOT –T272.

enable traceflag

 

 

If you want to observe this gappy behavior yourself, you can run the following code. I tested it on SQL Server 2012 SP1, running on Windows Server 2008R2 SP1. I noticed that I didn’t get any gaps with only a single row in the table, after I restarted. I needed at least two rows. I used a database called testdb, but you can use any testing database you have.

 

USE testdb;
SET NOCOUNT ON;
GO

IF OBJECT_ID('ident1') IS NOT NULL DROP TABLE ident1;
GO
IF OBJECT_ID('ident2') IS NOT NULL DROP TABLE ident2;
GO
IF OBJECT_ID('ident3') IS NOT NULL DROP TABLE ident3;
GO
IF OBJECT_ID('ident4') IS NOT NULL DROP TABLE ident4;
GO
IF OBJECT_ID('ident5') IS NOT NULL DROP TABLE ident5;
GO

-- Create 5 tables with different data types for the identity column
CREATE TABLE ident1
(col_tinyint tinyint IDENTITY);
GO
CREATE TABLE ident2
(col_smallint smallint IDENTITY);
GO
CREATE TABLE ident3
(col_int int IDENTITY);
GO
CREATE TABLE ident4
(col_bigint bigint IDENTITY);
GO
CREATE TABLE ident5
(col_numeric30 numeric (30,0) IDENTITY);
GO

-- Insert 2 rows into each table
INSERT INTO ident1 DEFAULT VALUES;
INSERT INTO ident2 DEFAULT VALUES;
INSERT INTO ident3 DEFAULT VALUES;
INSERT INTO ident4 DEFAULT VALUES;
INSERT INTO ident5 DEFAULT VALUES;
GO
INSERT INTO ident1 DEFAULT VALUES;
INSERT INTO ident2 DEFAULT VALUES;
INSERT INTO ident3 DEFAULT VALUES;
INSERT INTO ident4 DEFAULT VALUES;
INSERT INTO ident5 DEFAULT VALUES;
GO

-- uncomment the checkpoint for the second test;
-- CHECKPOINT;

--- Shutdown your SQL Server using the SHUTDOWN WITH NOWAIT command, the configuration manager, or SQL Server Management Studio.

-- Restart your SQL Server

INSERT INTO ident1 DEFAULT VALUES;
INSERT INTO ident2 DEFAULT VALUES;
INSERT INTO ident3 DEFAULT VALUES;
INSERT INTO ident4 DEFAULT VALUES;
INSERT INTO ident5 DEFAULT VALUES;
GO

SELECT * FROM ident1;
SELECT * FROM ident2;
SELECT * FROM ident3;
SELECT * FROM ident4;
SELECT * FROM ident5;
GO

And here are the results I got when I restarted after shutting down using SHUTDOWN WITH NOWAIT:

Gaps

 

You can run the test a second time, running a checkpoint right before shutdown to see that no gaps are left.

Of course, the issue of WHY a checkpoint is automatically performed before a controlled service stop is another issue, and hopefully there will be a fix for that soon.

But even once it’s fixed, uncontrolled shutdowns will still leave gaps and you need to be aware of this and not expect your IDENTITY values to always be perfectly contiguous.

 

I hope this is useful!

~Kalen

Published Tuesday, June 17, 2014 5:14 PM by Kalen Delaney
Filed under: , ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Chris Wood said:

Ran into this situation not long ago. Had a process use the identity and then save it as a key. Unfortunately the key in the table only allowed 9999 so jumping by 1000 caused a lot of concern. Changed the process to look at the last saved key and then add 1 to that value for the new row.

Chris

June 17, 2014 9:35 PM
 

Adam Machanic said:

@Chris

I hope you didn't introduce a potential concurrency issue (or race condition) with that solution. To properly implement it and avoid the race condition you need to be very careful with locks and serialization. This might inhibit the ability to scale your workload.

Of course if you only have 9999 values, as I think you've indicated, then maybe it's a nonissue. But just wanted to put it out there as a "word to the wise."

--Adam

June 18, 2014 12:45 PM
 

Sergey Berezniker said:

I see the same behavior of IDENTITY next value jumping 1000 (int) for AlwaysON Availability Group manual failover. I would expect the same result for auto failover as well. I'm wondering if it's intended behavior or a bug?

June 19, 2014 7:32 AM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Favorite Non-technical Sites or Blogs

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement