THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Michael Coles: Sergeant SQL

SQL Server development, news and information from the front lines

High Performance Dimensional Data Loads With SSIS Presentation

Just finished giving the SSIS High-Performance Dimensional Data Load presentation at SQLSaturday #59 NYC.  Here are the slides in PDF format.  I'll upload the Try-N-Save code and sample data later for attendees to play with.

Thanks to everyone who attended my session and thanks to Melissa D. and NJSQL for putting this together.  For those who are interested in Alejandro Mesa's composable DML solution to the problem of Type 2 dimension updates, here's the complete statement from the demo:

INSERT INTO Dim.Geography_Hash

 CityName, CountyFIPS, CountyName, StateFIPS,
 StateUSPS, StateName, ZIP, TimeOffset,
 DaylightSavingTime, StartDateID, CurrentFlag, BatchID,
 LineageID, CubeInd, SortOrder, Hash
)
SELECT CityName, CountyFIPS, CountyName, StateFIPS,
 StateUSPS, StateName, ZIP, TimeOffset,
 DaylightSavingTime, StartDateID, CurrentFlag, BatchID,
 LineageID, CubeInd, SortOrder, Hash
FROM
(
 MERGE INTO Dim.Geography_Hash AS Target
 USING Staging.Geography_Hash AS Source
 ON Target.ZIP = Source.ZIP
  AND Target.CurrentFlag = Source.CurrentFlag
 WHEN MATCHED AND Target.Hash <> Source.Hash
  THEN UPDATE SET CurrentFlag = 'N'
 WHEN NOT MATCHED
  THEN INSERT
  (
   CityName, CountyFIPS, CountyName, StateFIPS,
   StateUSPS, StateName, ZIP, TimeOffset,
   DaylightSavingTime, StartDateID, CurrentFlag, BatchID,
   LineageID, CubeInd, SortOrder, Hash
  )
  VALUES
  (
   Source.CityName, Source.CountyFIPS, Source.CountyName, Source.StateFIPS,
   Source.StateUSPS, Source.StateName, Source.ZIP, Source.TimeOffset,
   Source.DaylightSavingTime, Source.StartDateID, Source.CurrentFlag, Source.BatchID,
   Source.LineageID, Source.CubeInd, Source.SortOrder, Source.Hash
  )
 OUTPUT $action, inserted.CityName, inserted.CountyFIPS, inserted.CountyName, inserted.StateFIPS,
   inserted.StateUSPS, inserted.StateName, inserted.ZIP, inserted.TimeOffset,
   inserted.DaylightSavingTime, inserted.StartDateID, inserted.CurrentFlag, inserted.BatchID,
   inserted.LineageID, inserted.CubeInd, inserted.SortOrder, inserted.Hash
)
AS T
(
 action, CityName, CountyFIPS, CountyName, StateFIPS,
 StateUSPS, StateName, ZIP, TimeOffset,
 DaylightSavingTime, StartDateID, CurrentFlag, BatchID,
 LineageID, CubeInd, SortOrder, Hash
)
WHERE action = 'UPDATE';

As mentioned, the OUTPUT clause on the inner MERGE statement feeds the outer INSERT clause.  Next stop is SQLSaturday #61 in DC at the beginning of December.

Published Saturday, November 20, 2010 2:25 PM by Mike C

Attachment(s): High-Speed-Dimensional-SSIS-One-Hour.zip

Comments

 

Gary Mazzone said:

Michael

A good presentation on SQL Staturday NYC.  I enjoyed and learned some new things.

Any change you could post the SSIS package you were using as the demo?

Thanks for the book also.

Gary

November 22, 2010 11:43 AM
 

Michael said:

Glad you found it useful.  The packages and some sample data are available at http://cid-8f7e6c950afc6b3e.office.live.com/self.aspx/.Public/Presentation/TryNSave.zip

The Try-N-Save database I used is a little more complicated to upload.  I'll have to play around with scripting it out when I have time (in addition to DDL there are some tables that need to be prepopulated).

November 28, 2010 11:51 PM
New Comments to this post are disabled

This Blog

Syndication

News

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement