THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus

Alternative approach for SCD dimensions in SSIS?

Tonight I've studied how to improve performance of SCD transformation task of SSIS. I started from the consideration that in an actual SSIS package, the dimension processing is at least one magnitude slower than fact table processing; most of the time is consumed by SCD transformation.

A simple SQL Profiler session showed me how SCD works processing data: for each processed row, a sp_executesql is called with a query that have business key fields as parameter. While the query is executed very fast on the server, the throughput is not very good because we have the latency of communication between different processes (and may be different servers). Client CPU is consumed more than server CPU, probably due directly or indirectly to the sql query string construction (it's a SELECT field, field, field, ... FROM table WHERE businesskey = @P1".

Now, when you have a SCD you could also have a timestamp column on original data that helps you to extract and process only good changing candidates for dimension processing. But when this is not the case, the whole dimension is reprocessed against the last processed dimension, just to find new or changed attribute. This scenario is very common when data are provided by legacy system that are out of our control (otherwise we would have added the timestamp column months before...). It could be wonderful if SCD transformation would have an optional input flow to provide "lookup" sorted data, working in a way similar to the merge task. The wizard could use the business keys as lookup/join condition and SCD would not have to use sql query anymore just to solve the SCD condition to process. Unfortunately this is not the case, so by now the only option is to make your own SCD processing by hand using basic components. But this topic deserves attention by SSIS development team, because this kind of operations are very very common in a data warehouse environment.

Published Wednesday, February 08, 2006 6:33 PM by Marco Russo (SQLBI)

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

No Comments

Leave a Comment

(required) 
(required) 
Submit

About Marco Russo (SQLBI)

Marco Russo is a consultant, writer and trainer specialized in Business Intelligence with Microsoft technologies. He runs the SQLBI.COM website, which is dedicated to distribute resources useful for BI developers, like Integration Services components, Analysis Services models, tools, technical information and so on. Marco is certified as MCT, MCDBA, MCSD.NET, MCSA, MCSE+I.

This Blog

Syndication

Archives

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement