THE SQL Server Blog Spot on the Web


Andy Leonard

Andy Leonard is an author and engineer who enjoys building and automating data integration solutions. Andy is co-host of the Data Driven podcast. Andy is no longer updating this blog. His current blog is AndyLeonard.blog.

Introducing Change Data Capture, SSIS, and SQL Server 2008 CTP5 (Nov 2007)

This blog has moved! You can find this content at the following new location:

http://andyleonard.blog/2008/02/09/introducing-change-data-capture-ssis-and-sql-server-2008-ctp5-nov-2007/

Published Saturday, February 9, 2008 6:57 AM by andyleonard


Comments

 

Tod McKenna said:

Great post, Andy. CDC is a critical part of data integration, and it is good to see that SQL Server 2008 has methods to implement it. As MS has been bragging about this for a while, it's pretty cool to see an actual example using the technology. I'll have to try this out later...

For years, I've used an approach that relies heavily on staging tables, SQL, and even the file system. I have posted a lot of this code/theory on my blog Tod means Fox (blog.todmeansfox.com). It will be quite interesting for me to go back and revisit some of my old techniques to see how the new CDC abilities of SQL Server 2008 can help. The reality is that source systems tend not to be SQL Server, so it may be necessary to first import all source data into a SQL Server staging database and then process CDC between the stage and the destination. In some of my scenarios, this would be an extra step.

Anyway, thanks for the time/info/example!

-Tod

February 9, 2008 8:20 AM
 

Shan said:

Have you managed to get this to run on RC0 yet? MS seems to have changed the SP names, etc.

July 9, 2008 1:42 AM
 

andyleonard said:

Hi Shan,

  Not yet. I am working on it this week.

  I'm not surprised they renamed the stored procedures. They were renamed between CTP5 and CTP6 too.

:{> Andy

July 9, 2008 5:31 AM
 

Dodiet said:

I want to ask you: what if we have a three-column primary key on our table (AccountNo, Month, Year)? If there is a row with the same AccountNo, Month, and Year, SSIS should update it; if there is a row with the same AccountNo but a different Month and Year, it should insert. How do I write the Conditional Split condition?

Thx

Dodiet

August 4, 2008 3:26 AM
 

andyleonard said:

Hi Dodiet,

  Compound primary keys like the one you describe are common in source data. Although I do not demonstrate it here, the principle for detection and loading is the same.

  Together, these fields comprise the "business key" and must be treated - together - like the source's ContactID in this post. There are a few ways to accomplish this. The quickest (and easiest to understand) is to convert the values to varchar and concatenate the values, separated by a delimiter (I usually use a pipe character "|").

  You would then use this field to uniquely identify rows in the source and destination: left joining on <AccountNo>|<Month>|<Year>. You would also leave these columns out of the update statement's SET clause. You do not update business keys; you update everything except business keys. This is true in the example: note that the ContactID is used in the WHERE clause but not in the list of fields in the SET clause.

  One thing to note: concatenation is the simplest method for handling compound unique keys to understand - but it is also the worst-performing method. For larger data loads you will want to explore other means, such as hashing the data. I encourage you to search for articles about data profiling, as this is one of its uses.
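  The pipe-delimited business key and the update-vs-insert split described above can be sketched roughly as follows (a minimal illustration in Python rather than SSIS; the table contents and amounts are hypothetical):

```python
import hashlib

def business_key(account_no, month, year, sep="|"):
    """Concatenate the compound-key columns into one pipe-delimited string."""
    return sep.join(str(v) for v in (account_no, month, year))

# Hypothetical source rows (AccountNo, Month, Year, Amount) and a
# destination keyed on the concatenated business key.
source = [(1001, 1, 2008, 500.0), (1002, 2, 2008, 75.0)]
destination = {business_key(1001, 1, 2008): 450.0}

inserts, updates = [], []
for acct, month, year, amount in source:
    key = business_key(acct, month, year)
    if key in destination:
        updates.append((key, amount))   # same AccountNo|Month|Year: update
    else:
        inserts.append((key, amount))   # new key combination: insert

print(inserts)   # [('1002|2|2008', 75.0)]
print(updates)   # [('1001|1|2008', 500.0)]

# For larger loads, a hash of the concatenated key can stand in for
# comparing the raw strings:
digest = hashlib.md5(business_key(1001, 1, 2008).encode()).hexdigest()
```

  Note how the business-key columns only ever participate in the lookup, never in the values being written, mirroring the rule that the SET clause excludes them.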

Hope this helps,

Andy

August 4, 2008 9:40 AM
 

Andrei Ignat said:

See also CDC Helper  - a GUI for CDC at

http://www.codeplex.com/CDCHelper

August 6, 2008 4:36 AM
 

Andrew said:

I have a similar setup to this; however, I am using an XML file as my input prior to the Conditional Split. In my case the XML rows are pre-sorted into inserts, updates, and deletes, and then bulk-fired through the conditions. This has an undesired effect: if I insert a record, delete that record, and then reinsert it, the SSIS package deals with all inserts first and then all deletes, meaning you lose the final insert. My question is: can you instruct it to process by row and not re-sort into groups?

July 2, 2009 5:52 AM
 

Nirmal Kumar said:

I need a sample .dtsx package for the package described above so that I can check where I went wrong.

Regards,

Nirmal

June 14, 2011 9:12 AM
 

HsInAz said:

I am planning to use this excellent post for a current project I am working on. My platform is still going to be SQL Server 2008, but given that this post was written in 2008, I am wondering if all of this is still valid and up to date.

June 1, 2012 4:51 PM
 

Gul said:

I have a similar situation. For example, I have Table A, which is the source.

If Table A has new records, I want something like an email notification when the package fails or runs; that is, when new data arrives in the source, I receive an email saying new data was inserted.

I hope my question is clear.

Thanks & Regards

January 14, 2015 9:25 PM
