SQLblog.com: The SQL Server Blog Spot on the Web
Kalen Delaney

Did You Know: Specifying Checkpoint Duration

I realize probably no one reads Books Online cover to cover, so there are probably lots of little details that easily slip by you. I was introduced to one of the hidden 'features' just last week. Here it is:

CHECKPOINT can take a parameter!

It's right there in BOL:

 CHECKPOINT [ checkpoint_duration ]
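As a minimal sketch (the 30-second value here is purely illustrative), the parameter is simply appended to the command:

```sql
-- Ask SQL Server to spread the checkpoint work over roughly 30 seconds.
-- The value is a target, not a guarantee; SQL Server adjusts its resource
-- use to try to finish in about that time.
CHECKPOINT 30;

-- With no argument, SQL Server chooses the checkpoint duration itself.
CHECKPOINT;
```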

This parameter lets you specify how much time (in seconds) SQL Server should take to complete the checkpoint. If you specify a smaller value, SQL Server applies more resources to finish the checkpoint sooner; if you specify a larger value, it applies fewer. At first glance, this seems like a very cool feature. But then I realized there is very little practical use for it, for at least these reasons:

1) There is no way to know how much time a checkpoint should normally take, so giving it a number to use more or fewer resources is just shooting in the dark.

2) This only applies to the manual checkpoint, not SQL Server's automatic checkpoint. So if you're experiencing a slight decline in throughput every minute or so when the automatic checkpoint starts running, you can't just ask SQL Server to slow down and release some of the resources.

Maybe I'm missing something, but this seems like one of those features that was added 'just because they could'.  I'd be interested in hearing your ideas regarding the potential usefulness of this capability.

But really, this post was about the fact that I'd never looked at the page for the CHECKPOINT command, and never realized there were any options at all.

 

~Kalen

Published Friday, August 29, 2008 8:55 PM by Kalen Delaney


Comments

 

Karol Papaj said:

Kalen, the first time I heard about tuning the checkpoint process was while reading the whitepaper "High Availability for SQL Server 2005 using array-based replication and host-based mirroring technologies" produced by HP.

You can read it at

http://wp.bitpipe.com/resource/org_1000733242_857/HP_High%20Availability%20for%20SQL%20Server_Asset%209.pdf

August 30, 2008 6:17 AM
 

jchang said:

I suppose this is for the command queuing capability of storage systems. The more that is dumped into the queue, the higher the throughput, up to a point, at the expense of higher latency. The trick is then to not swamp the storage system with such high read latency that transactions are effectively shut down or degraded beyond a tolerable point. I thought SQL 2005 and later throttled this automatically; an MS rep told me this was in 2000 as well, but if so, it was badly implemented.

One problem with trying to throttle based on latency is that the throughput-vs-latency curve might have a sharp knee, so if you are over a little too far, latency incurs a huge spike. My thought is that the parameter should be queue depth. The simple rule: set queue depth to 1X the number of disks for minimum transaction impact but poor checkpoint speed, 2X for some impact but decent checkpoint speed, and 4X for good checkpoint speed but significant transaction impact.

Another possibility is to make the parameter in MB, i.e., set it to just below the size of the storage system's write cache. Then SQL Server knows it can dump that much, let the RAID controller write it, and dump more. Give me the parameter options and I can measure which is more effective.

August 30, 2008 2:12 PM
 

James Luetkehoelter said:

Hi Kalen,

Would you agree that using the CHECKPOINT command, or changing the recovery interval, should generally be avoided unless you're dealing with a specific issue?

That's what I usually tell people, especially those with an Oracle background where the temptation to "tune" checkpointing is very common.

August 30, 2008 9:24 PM
 

Kalen Delaney said:

Karol and Joe...thanks for the comments! I had thought since you could only specify the parameter with a manual checkpoint, and since automatic checkpoint normally ran every minute, there wasn't much point. But if you change your recovery interval to something much less frequent, and then create a job to run manual checkpoint with a long duration, you can really spread the work of checkpoint out.
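[A sketch of that approach. The one-hour recovery interval and 300-second duration below are illustrative values, not recommendations, and changing 'recovery interval' requires the advanced-options setting and appropriate permissions.]

```sql
-- Lengthen the recovery interval so automatic checkpoints run far less often.
-- 'recovery interval' is specified in minutes and is an advanced option.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'recovery interval', 60;
RECONFIGURE;

-- Then, from a scheduled job, issue a manual checkpoint that is allowed
-- to spread its work over roughly five minutes.
CHECKPOINT 300;
```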

James, I think this is true for most configuration options. Leave them at the default, unless you have a reason to change them!

~Kalen

August 31, 2008 9:30 PM
 

Marcos Kirchner said:

Hi Kalen,

I have already seen this parameter in BOL; it's documented starting with SQL Server 2005.

I believe it was already around in SQL 2000, since I saw some older TPC-C disclosure reports and remember seeing something like

CHECKPOINT 1740

TPC-C tests require that at least 4 checkpoints complete during the measurement interval (usually 2 hours), and they used the parameter to throttle the process.

Also, they set recovery interval to the maximum allowed value, so that automatic checkpoint wouldn't kick in.

I'm not sure about the newer TPC-E tests, but I think the manual checkpoint is used there as well.

September 1, 2008 7:26 AM
 

Gaurav Bindlish said:

This parameter should be used with caution.

In one of my internal tests, with a high load of short-running transactions, I noticed that specifying an interval of 60 seconds pegged the CPU for 60 seconds and increased latency.

However, when I used a 1-second interval, the checkpoint finished much more quickly and caused latency issues no worse than in the 60-second scenario; the difference was that the checkpoint, and hence the latency issues, lasted for much less time.

September 2, 2008 2:13 PM
