THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Linchi Shea

Checking out SQL Server via empirical data points

A Race Condition in the Simple Recovery Mode: Why is my log autogrowing?

“My database is in the simple recovery mode. No, I don’t have any large, long-running, or replication transactions, and yes, I have enough number of VLFs to go around, about 250 of them to be precise. Why does my transaction log keep autogrowing?”

 

As documented in SQL Server Books Online, when a database is the simple recovery mode, its transaction log is truncated (i.e. all the virtual log files that do not contain any active transaction record are marked for reuse) whenever the checkpoint thread is run. SQL server wakes up the checkpoint process when, among other things, (1) there are enough log records that the recovery interval threshold is crossed, (2) you manually run the checkpoint command, or (3) the transaction log gets 70% full.

 

Even without any large, long-running, or replication transactions, it is quite possible for a database in the simple recovery mode to run out of the log space, or cause the log to autogrow if it’s enabled, when the checkpoint process can’t be woken up fast enough to truncate the log in time.

 

For instance, even though there are not enough changes to cross the recovery interval threshold, but if the changes after the log is 70% used are fast, SQL Server may not have enough time to run a checkpoint and complete the log truncation before the log space is fully used up. In a way, this is a race condition in that whether the log space will be used up (or autogrow) depends on which one gets there first:

 

  1. If SQL Server can truncate the log first, the log space will be freed up and reused, and there will be no autogrow or failed transaction,
  2. But it is also possible that the data changes are fast enough to use up the log space before the log is truncated. 

Let’s illustrate this behavior with a script. First run the following setup script to create a test table and populate some data in a test database (1) that is already set to the simple recovery mode, (2) the log file size is 100MB, (3) and the log file autogrow is disabled (it's just easier to see the out-of-log-space error than check for autogrowth):

 

set nocount on

go

drop table test

go

create table test(i int, c char(1000))

go

declare @i int

set @i = 1

while @i < 10000

begin

      insert test values(@i, 'abc')

      set @i = @i + 1

end

 

To see the race condition, run the following script;

 

set nocount on

go

declare @change_size int

set @change_size = 100  -- adjust this value

declare @i int

set @i = 1

 

while @i < 100

begin

     if @i % 2 = 0

       update test set c = replicate('a', @change_size)

     else

       update test set c = replicate('b', @change_size)

     select @i = @i + 1

end

 

You may need to adjust the value of variable @change_size several times. Start a small value (say 40) that enables the script to run to completion, then pick a large value (say 400) that causes the script to fail. And work towards to the middle. In some of my tests, when I set @change_size 100, I can run the above script to completion without running out of the log space. But if I change it to 120, I get an Msg 9002 error telling me that the transaction log for the database is full.

 

When I set @change_size to 116, sometimes the script will complete, and other times it'll fail. But this is a rough threshold, it’s not precise or stable across tests or environments. For instance, in my other tests, the threhold can as low as 50. The point is that sometimes SQL Server can truncate the log fast enough, and other times it can't.

 

Note that even though each of the two UPDATE statements modifies 9999 rows, the amount of the log space used by the modification is modest at about 2~3MB or 2~3% of the total log space. They are not tiny transactions, but they are not large, long-running transactions either.

 

What are the implications for databases that are in the simple recovery mode?

 

First of all, it’s good to leave ample space in the transaction log. This may give SQL Server a chance to wake up the checkpoint process to truncate the log in time. Secondly, you may have to issue BACKUP LOG WITH NO_LOG manually to help it out.

 

Note that in SQL Server 2008, BACKUP LOG WITH NO_LOG and BACKUP LOG WITH TRUNCATE_ONLY are removed. If you must help truncate the log, you probably would have to resort to issuing a checkpoint which does much more than just truncating the log.

Published Sunday, April 12, 2009 12:51 PM by Linchi Shea
Filed under:

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Mike Walsh said:

Great Post, Linchi -

I am a big fan of properly sizing objects at their onset. Better on a lot of file systems, no cost of growth (even with Instant File Initialization, which doesn't help transaction log growth anyway) and better planning always leads to a better design and less reactionary problems. Good illustration of how growth can still happen.

I wonder if even in 2005 it would be fine to just do a checkpoint instead of a truncate? Yes Checkpoint is doing more and checkpoints aren't "free" but the backup log truncate_only or no log isn't free either, right? Ideally you have enough space and the detrimental growth doesn't happen but it still can occur.

April 12, 2009 2:47 PM
 

Linchi Shea said:

In an earlier post , I showed that you could run out the log space or force the log file to autogrow

May 21, 2009 12:50 AM
 

jackson christian said:

cool article ....

June 19, 2009 3:40 AM
 

Csaba Toth said:

The cause of log growth can be open transaction, and in some cases leftover CDC objects:

http://www.mssqltips.com/sqlservertip/3003/manual-cleanup-change-data-capture-for-a-sql-server-database/

February 14, 2014 1:08 PM

Leave a Comment

(required) 
(required) 
Submit

About Linchi Shea

Checking out SQL Server via empirical data points

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement