In a recent discussion about general ETL best practices, the subject of checkpoint files as a means of package restartability came up, and I stated that I was dead against using them. For anyone who may care, here is why:
- configuring them is distinctly unintuitive (that's a matter of opinion, but if you follow the link I'll wager that you will agree)
- they don't make any allowance for loop iterations
- they cannot store variables of type Object
- they are inflexible. There are many scenarios where you may want to execute certain containers regardless of whether the package is restarted from a checkpoint file, but the current usage model does not allow for this
- they are ignored by event handlers, which wouldn't be so bad if there were a way to toggle this behaviour
- in certain scenarios they don't work properly
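To make the point about loop iterations concrete, here is a toy model in Python. It is not SSIS code and does not use any SSIS API; the function, the container names and the failure point are all made up for illustration. It assumes, as I understand the behaviour, that a loop container is only recorded in the checkpoint once the whole loop has finished, so a restart cannot pick up at the iteration that failed.

```python
# Toy model only: this is not the SSIS API. All names here are hypothetical.

def run_package(containers, checkpoint, fail_at=None):
    """Run top-level containers in order, skipping those already checkpointed.

    containers: list of (name, iterations) pairs; iterations > 1 models a loop.
    checkpoint: set of names of containers that completed in an earlier run.
    fail_at:    optional (name, iteration) at which to simulate a failure.
    Returns the list of (name, iteration) units of work actually executed.
    """
    executed = []
    for name, iterations in containers:
        if name in checkpoint:
            continue  # finished in a previous run, so it is skipped on restart
        for i in range(1, iterations + 1):
            if fail_at == (name, i):
                return executed  # stop here; the loop is NOT recorded as complete
            executed.append((name, i))
        checkpoint.add(name)  # recorded only once the whole container finishes
    return executed


containers = [("Extract", 1), ("LoadFilesLoop", 5), ("Cleanup", 1)]
checkpoint = set()

# First run fails on the fourth iteration of the loop.
first_run = run_package(containers, checkpoint, fail_at=("LoadFilesLoop", 4))

# Restart from the checkpoint: "Extract" is skipped, the loop starts over.
restart = run_package(containers, checkpoint)
```

In this sketch the restart skips "Extract" but repeats iterations 1 to 3 of the loop before it reaches the iteration that failed, which is exactly what you do not want if each iteration has already loaded a file.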
I'll expand on the last bullet point. I have encountered situations where the behaviour for tasks executing concurrently is unpredictable. That is, when a task runs concurrently with a failed or failing task, sometimes its completion makes it into the checkpoint file and sometimes it doesn't. This is near-impossible to reproduce, but it does happen, as my good friend John Welch will hopefully concur (if he is reading).
Is anyone out there making successful use of checkpoint files within SSIS? If so, I would be interested to hear about it.