THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Jorg Klein

Microsoft Data Platform MVP from the Netherlands

SSIS - Package design pattern for loading a data warehouse - Part 2

You can read this blog post at this link: https://jorgklein.com/2010/02/02/ssis-package-design-pattern-for-loading-a-data-warehouse-part-2

This blog has moved to www.jorgklein.com There will be no further posts on sqlblog.com. Please update your feeds accordingly.

You can follow me on twitter: http://twitter.com/jorg__klein

Published Sunday, February 28, 2010 6:36 PM by jorg

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Marco Russo (SQLBI) said:

Jorg, did you take a look at the SQLBI Methodology (http://www.sqlbi.com/sqlbimethodology.aspx)?

We use the concept of OLTP Mirror as a first stage of data extraction and the staging is just a temporary area for processing data. We don't necessarily use SSIS to extract data, but for example we use SqlBulkTool (http://www.sqlbi.com/sqlbulktool.aspx - it's a free tool provided with source code), which is faster and handles partitioning too.

I'd like to get your feedback if you had a chance to read it.

Marco

March 1, 2010 2:55 AM
 

jorg said:

Hi Marco,

That looks very promising. I will definitely take a look at it and give you my feedback.

Thanks,

-Jorg

March 1, 2010 3:43 AM
 

Jorg Klein : SSIS ??? Package design pattern for loading a data warehouse said:

March 11, 2010 8:46 AM
 

Andre Vella said:

Hi Jorg,

Very interesting article. I found it very useful and I am designing an ETL process I am currently responsible of carrying out around this principle. Just one question or rather a different approach with regards to this second part...(or maybe i just misunderstood you completely)

What if instead of creating a seperate package for each source table, you use constraints in the appropriate and necessary manner to process some of the dimensions or fact tables which require the same source tables in a sequential manner. One might argue that it removes the 'parallelism feature' but I guess its just a different approach which I am considering.

If for example dimPlayer and dimCountry both require source table tPlayerDetails you set a dataflow in the EXTRACT container for the dimPlayer table and a sequence constraint to dimCountry. You omit the extract from tPlayerDetails in the dimCountry package  completely since it was carried out in the dimPlayer package.

Best Regards,

Andre

July 8, 2010 10:45 AM
 

jorg said:

Hi Andre,

Good to hear that you benefit from this approach!

I think your approach works fine when executing the complete ETL process but what if you want to unit test dimCountry, how do you extract the data from tPlayerDetails?

If I understand what you are saying this will not be possible without running the extract of dimPlayer by hand before running dimCountry. Of course this isn't a big problem, it's just a detail and your approach will work fine when deploying your solution to a production environment!

--Jorg

July 15, 2010 5:57 AM
 

Vinod said:

What is the purpose of staging data into tables, when you can directly Extract, data cleansing, Transformations and Load to Dims / Facts in a single Data Flow Task ?

December 31, 2010 6:17 AM
 

jorg said:

@Vinod:

Because you usually want to minimize the pressure on the source systems and want to keep the load window as small as possible. Next to these reasons it's usually easier to extract only the changed records when using a staging area that's identical to the source system.

But in some cases you can do everything in a single DFT, in other cases you want to use one or more staging areas. It just depends.

January 4, 2011 7:18 AM
 

Haseeb said:

I design Package which have many dimension whos data is very less can we club them into single package is that will be good approach.

Thanks,

Haseeb

February 27, 2012 1:09 PM
 

_steviec said:

I've been thinking about Vinod's comment about the purpose of going to the trouble of creating staging tables since I'm considering that too.

As a developer I'm really keen on TDD in that it's a documentation of intent and is a protection against future breaking changes. I think that both of those are hugely important in ETL and staging tables would allow a controlled set of data to be created and then transformed and loaded to a destination. The destination could then be inspected to see that the developers intentions had been met.

For instance, a slowly changing dimension could be tested by loading the staging tables, executing the T and L parts of a package, change the staging data and then re-running the package. A simple Sql script could inspect the target to ensure that the data has been loaded correctly.

Steve

May 1, 2012 10:48 AM
 

Roger said:

If your load into the DWH needs to update rows it will cause fragmentation in the destination table. Better is to prepare the data in the staging and merge source and destination together into a new table or partition which will be at the end 'switched in'.

November 25, 2012 11:34 AM
 

db042188 said:

As reported to our CIO who saw Jorg's solution:

His solution is intriguing and supposedly recommended by Kimball.  If Kimball agrees, we need to re read that chapter in the tool kit and perhaps take Jorg's recommendation seriously.

Unwieldiness will be a concern in real life scenarios.  Economies of scale too.  

Our ultimate goal of completely automating ETL code generation seems less (or at least equally "un") attainable in this approach.  But it seems that as Jorg gets more and more feedback, he is moving in a traditional direction anyway.

One concern here is a paradigm in which dims are extracted just like but separately from facts. Dims are a byproduct of facts when it comes to relevance and etl.  They are only business keys in the extract phase of etl.  They are easily extracted with new (and old if necessary) facts.  

It appears that maintenance is not diminished nor is the challenge involved when more than 1 person is making a change.  

March 5, 2013 1:04 PM
 

SImon said:

Do you have updated template with the new source tables package for me look at please.

February 26, 2014 3:59 PM
 

Perl said:

I have to voice my passion for your kindness giving support to those people that should have guidance on this important matter.

https://www.youtube.com/channel/UCbkuvjG855iiTjXioavxV4Q

March 16, 2017 10:32 AM
 

best apk download said:

thanks

April 8, 2017 3:38 AM
 

UFC 210 Live said:

April 8, 2017 3:59 PM
 

hdpunjab said:

April 10, 2017 7:31 AM
 

sony said:

April 19, 2017 12:45 AM
 

mothers said:

With a mom so generous and loving as you how could we ask for more! Thanks for always giving us the best. also check http://ihappymothersday2017.com/

April 20, 2017 10:06 AM
 

Eid Mubarak said:

Best Outfits For Young ladies , Womens  And Girls

We Have Some For You In Your Budget For more…..

<a href="https://eidmubarak.uk/"> Eid Mubarak </a>

April 25, 2017 7:22 AM
 

gREAT said:

<a href="http://ipliccchampionstrophy2017schedule.in/">ICC Champions Trophy 2017 Schedule</a>

<a href="http://ipliccchampionstrophy2017schedule.in/">Champions Trophy Timetable 2017</a>

<a href="https://firmwareflashfile.in/">Firmware Flash Files Download</a>

April 25, 2017 9:58 AM
 

neet result 2017 said:

May 3, 2017 12:28 PM
 

dooiitt said:

Exclusive Collection of Salwar Suit and Many More…

We Have Some For You In Your Budget For more…

Plz visit:- <a href= "www.dooiitt.com">Designer Sarees</a>

May 12, 2017 8:36 AM
 

debugging said:

May 13, 2017 9:17 AM
 

debugging said:

May 16, 2017 9:09 PM
 

cbse result 2017 said:

<a href="http://www.memorialdayquotesx.com/2017/05/memorial-day-quotes-by-ronald-reagan-best-collections.html">memorial quotes</a>

<a href="http://www.memorialdayquotesx.com/2017/04/memorial-day-thank-you-quotes-and-sayings.html">memorial day quotes</a>

<a href="http://www.memorialdayquotesx.com/2017/04/memorial-day-thank-you-quotes-and-sayings.html">Happy Memorial Day Quotes 2017</a>

<a href="http://www.memorialdayquotesx.com/2017/04/memorial-day-thank-you-quotes-and-sayings.html">memorial day images</a>

<a href="http://www.memorialdayquotesx.com/2017/04/best-memorial-day-poems-quotes-2017.html">memorial day poems</a>

<a href="http://www.memorialdayquotesx.com/2017/05/memorial-day-quotes-by-ronald-reagan-best-collections.html">memorial day parade</a>

<a href="http://www.memorialdayquotesx.com/2017/04/memorial-day-thank-you-quotes-and-sayings.html">memorial day pictures</a>

<a href="http://www.cbseboardresultnic.in/2017/03/cbse-10th-results-2017-with-marks.html">cbse 10th result 2017</a>

<a href="http://www.cbseboardresultnic.in/2017/03/cbse-10th-results-2017-with-marks.html">cbse result</a>

<a href="http://www.cbseboardresultnic.in/2017/03/cbse-10th-results-2017-with-Marks.html">10th result 2017</a>

<a href="http://www.cbseboardresultnic.in/2017/03/cbse-results-2017-cbse-examination.html">cbse 12th result 2017</a>

<a href="http://www.cbseboardresultnic.in/2017/03/cbse-board-10-and-12-class-exam-result-2017.html">cbseresults nic in 2017</a>

<a href="http://www.cbseboardresultnic.in/2017/03/cbse-results-2017-cbse-examination.html">cbse result 2017</a>

May 21, 2017 9:40 AM
 

aaaa said:

June 21, 2017 10:09 PM
 

varsha khatri said:

nice post....<a title="xat 2018 hall ticket" href="http://gogyani.com/xat-admit-card/">xat 2018 hall ticket</a>

<a title="exam name admit card" href="http://gogyani.com/xat-admit-card/">exam name admit card</a>

<a title="xat 2018 result" href="http://gogyani.com/xat-result/">xat 2018 result</a>

<a title="exam name result" href="http://gogyani.com/xat-result/">exam name result</a>

July 22, 2017 6:44 AM
 

Classy Bajuband said:

Excellent blog.thank you.

<a href="https://www.jewellerkaka.com/">Online Bajuband</a>

July 24, 2017 7:25 AM
 

KUSERA said:

What is the purpose of staging data into tables, when you can directly Extract, data cleansing, Transformations and Load to Dims / Facts in a single Data Flow Task ?IF YOU LIKE TO SHARE FRASES HERE IT IS FOR YOU: frases-de-canserbero.blogspot.com

July 28, 2017 2:26 PM

Leave a Comment

(required) 
(required) 
Submit

About jorg

Jorg Klein, Microsoft Data Platform MVP from the Netherlands.
Privacy Statement