THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Jamie Thomson

This is the blog of Jamie Thomson, a freelance data mangler in London

Why I don't use SSIS checkpoint files

In a recent discussion in regard to general ETL best practises the subject of checkpoint files as a means for package restartability came up and I stated that I was dead against using them. For anyone that may care, here is why:

  • Configuring them is distinctly unintuitive (that's a matter of opinion but if you follow the link I'll wager that you will agree)
  • they don't make any allowance for loop iterations
  • they cannot store variables of type Object
  • they are limited in ability. There are many scenarios where you may want to execute certain containers regardless of whether the package is started from a checkpoint file but the current usage model does not allow for this.
  • they are ignored by eventhandlers, which wouldn't be so bad if there were a way to toggle this behaviour
  • in certain scenarios they dont work properly

I'll expand on the last bullet point. I have encountered situations where the behaviour for tasks executing concurrently is unpredictable. That is, sometimes the completion of a task that executes concurrently with a failed/failing task will make it into the checkpoint file and sometimes it won't. This is near-impossible to reproduce but it does happen as my good friend John Welch will hopefully concur (if he is reading).

Is anyone out there making successful use of checkpoint files within SSIS? I would be interested in knowing about that if so.

@Jamiet

Published Thursday, March 10, 2011 5:27 PM by jamiet

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Doug Lane said:

I gave checkpoint files a try several months ago and the deal breaker for me was its failure to integrate well with event handlers.  I did a lot of searching for how to get the two to function together but found no success stories anywhere.  I gave up on the checkpoint files, figuring it was better to have to restart the package altogether than to guess where it went wrong.

March 10, 2011 1:06 PM
 

Todd McDermid said:

I'd be interested to know - and I'm sure the SSIS team would be too.  It'd be much better to deprecate that feature entirely if it can't be made full-featured and robust in order to remove the incorrect perception of those adopting SSIS that it is...

March 10, 2011 1:07 PM
 

Steve said:

Yes, we use them successfully, in a few areas. One in particular is where we have an ETL that is just step by step by step in a package, no branching, etc. Works well, we use it with retries up to 10 times and the fail as the process we are doing is dependent on an external file/exe in some cases. Works well.

Other areas, use it to retry on failure and pick up at the place we left off, seems to work well for a few of the ETL's we have.

March 10, 2011 1:11 PM
 

Sudeep Raj said:

I was just thinking of posting a discussion on SSIS MSDN forum on the very same topic. I do not see much use of Check point as it cannot cater to all the scenarios. And for production jobs it is not really worth while.

March 10, 2011 1:16 PM
 

John Welch said:

Completely agree with everything Jamie posted above, and I can collaborate his experiences with concurrent tasks.

March 10, 2011 1:37 PM
 

Rob Farley said:

If I use checkpoint files (and it's got to be very specific), then I never use CheckpointUsage set to IfExists. I set it to be either Never or Always (set at run-time).

Never for the first attempt of the night. Always for subsequent ones. If the package finished happily, I don't want it to restart on the retries. Also, if it didn't finish properly the day before, I (generally) want it to start from the beginning.

It's the one setting that I feel is okay to use with /SET.

But yes, they need work to make them a proper feature.

March 10, 2011 6:13 PM
 

Eric Wisdahl said:

I'll also agree with Jamie.  For a long time I advocated checkpoint usage.  Now, after working several projects where there was just not any way to get them to work, I can't say that I would recommend using them at all without some major changes to the core of how it is that they work.  Of course, the main projects that used them and failed miserably had complex logic, with branching, object variables, and loops.  For simple ETL they may work for you.  But, in my opinion, you don't want to standardize on something that only works in some of the time.

March 11, 2011 8:58 AM
 

andyleonard said:

+1 Jamie,

  I find checkpoint files a cool idea, but implementing them is tricky. If you have a single-thread, straightforward Control Flow, they work well. Anything parallel or looping, results are unpredictable.

:{>

March 12, 2011 9:53 AM
 

Andy P said:

I have also had the same experience as Jamie, i.e. Checkpoints behave unpredictably with concurrent tasks or loops. Like Jamie and others above, I decided simply to avoid ever using them.

March 15, 2011 7:02 AM
 

P Williams said:

Agreed Jamie.

When I first saw checkpoint files in SSIS I thought what a great feature and used them a lot. Then as my packages started to become more complex and larger I found they didn't work as I expected, particularly with regards to transactions and looping.

Some of the functionality is completely unintuitive and I soon stopped using them and haven't used them since.

This is a shame as the concept particularly for large data loads is a good idea but MS really need to re-evaluate how they work within SSIS (someone wrote a wonderful article a while back for sqlservercentral.com I believe which highlighted a lot of the drawbacks).

March 18, 2011 6:49 AM
 

Matt said:

Agree with Steve, seems to work well for those packages that execute a single sequence of steps - nothing in parallel.  In that situation, they are really useful.  Hopefully a better implementation will be included in the next release...  I certainly would not use them in any other type of package given the current limitations.

June 3, 2011 10:31 AM
 

Martina White said:

I came across a good article on a workaround for using Checkpoints with Sequence Containers.  If you want to restart the container if any of the tasks within the container fail (including concurrent tasks) you can  get around the funky Checkpoint behaviour by using a For Loop that has only one iteration.

http://www.sqlservercentral.com/articles/Integration+Services+(SSIS)/65809/

June 3, 2011 2:14 PM
 

Munish Bansal said:

Also behaviour is completely unpredictable while using transactions along with checkpoints. For example, it does not allow using transactionoption as Required on Package level while using checkpoints.

Also it restarts the package from the very begining if any of the task having transactionoption as required fails. Again very hard to reproduce all these.

July 29, 2011 12:46 AM
 

SSIS Junkie said:

In December 2012 Microsoft published a whitepaper entitled SSIS Operational and Tuning Guide which you

January 4, 2013 8:28 AM
 

naresh said:

can any one pls answer my question alternative of checkpoints

April 29, 2013 3:41 AM
 

Jamie Thomson said:

On 31st March 2014 I released version 1.2.0.0 of SSIS Reporting Pack , my open source project that aims

April 12, 2014 5:28 PM
 

Jamie Thomson said:

I have maintained a watching brief on what I refer to as “cloud ETL”, that is the ability build ETL routines

April 24, 2014 7:42 AM
 

Azure Automation ??? the beginning of cloud ETL on Azure? | TGS said:

May 5, 2014 3:19 PM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement