<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Search results matching tags 'ETL' and 'T-SQL Tuesday'</title><link>http://sqlblog.com/search/SearchResults.aspx?o=DateDescending&amp;tag=ETL,T-SQL+Tuesday&amp;orTags=0</link><description>Search results matching tags 'ETL' and 'T-SQL Tuesday'</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>T-SQL Tuesday: Aggregations in SSIS</title><link>http://sqlblog.com/blogs/andy_leonard/archive/2011/03/08/t-sql-tuesday-aggregations-in-ssis.aspx</link><pubDate>Tue, 08 Mar 2011 19:00:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:33966</guid><dc:creator>andyleonard</dc:creator><description>&lt;A href="http://blogs.lessthandot.com/index.php/DataMgmt/DBProgramming/come-one-come-all-to" target=_blank&gt;&lt;IMG border=0 src="http://blogs.lessthandot.com/media/blogs/DataMgmt/olap_1.gif"&gt;&lt;/A&gt;&amp;nbsp; 
&lt;P&gt;&lt;STRONG&gt;Introduction&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Jes Borland (&lt;A href="http://blogs.lessthandot.com/index.php?disp=authdir&amp;amp;author=420" target=_blank&gt;Blog&lt;/A&gt; | &lt;A href="http://www.twitter.com/grrl_geek" target=_blank&gt;@grrl_geek&lt;/A&gt;) is hosting this month's &lt;A href="http://sqlblog.com/blogs/adam_machanic/archive/2009/11/30/invitation-to-participate-in-t-sql-tuesday-001-date-time-tricks.aspx" target=_blank&gt;T-SQL Tuesday&lt;/A&gt; - started by SQLBlog's own Adam Machanic (&lt;A href="http://sqlblog.com/blogs/adam_machanic" target=_blank&gt;Blog&lt;/A&gt; | &lt;A href="http://www.twitter.com/AdamMachanic" target=_blank&gt;@AdamMachanic&lt;/A&gt;) - and it is about aggregation.&lt;/P&gt;
&lt;P&gt;I thought I'd show a couple ways to do aggregation using SSIS. &lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;The Aggregate&amp;nbsp;Transformation&amp;nbsp;in SSIS&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The Aggregate transform in SSIS is &lt;EM&gt;fast&lt;/EM&gt;. I built an SSIS package (AggregateScripts.dtsx)&amp;nbsp;with two Data Flow Tasks (Using the Aggregate Transform and Using a Script Component).&amp;nbsp;&amp;nbsp;Using the Aggregate Transform looks like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;IMG style="WIDTH:151px;HEIGHT:281px;" src="http://vsteamsystemcentral.com/images/ext/AggregateSSIS_1.jpg" width=151 height=281&gt;&lt;/P&gt;
&lt;P&gt;The OLE DB Source is pulling data from the AdventureWorks (2005) Sales.SalesOrderDetail table. The Aggregate transform is configured to Sum the LineTotal amounts:&lt;/P&gt;
&lt;P&gt;&lt;IMG style="WIDTH:309px;HEIGHT:493px;" src="http://vsteamsystemcentral.com/images/ext/AggregateSSIS_2.jpg" width=309 height=493&gt;&lt;/P&gt;
&lt;P&gt;When the package executes, the Sum is the only output from the Aggregate Transformation:&lt;/P&gt;
&lt;P&gt;&lt;IMG style="WIDTH:152px;HEIGHT:276px;" src="http://vsteamsystemcentral.com/images/ext/AggregateSSIS_3.jpg" width=152 height=276&gt;&lt;/P&gt;
&lt;P&gt;And the package executes in 0.421 seconds in BIDS. This rocks if all I'm after is the Sum but let's face it - I can get the Sum of the LineTotal column pretty quickly in T-SQL: SELECT Sum(LineTotal) FROM Sales.SalesOrderDetail.&lt;/P&gt;
&lt;P&gt;Also, what if I want a running total? One way to achieve this is...&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Using a Script Component&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;For the Using a Script Component data flow task, I select from the same source (AdventureWorks.Sales.SalesOrderDetail). I use a data flow path to pipe the rows into a Script Component. Accepting the default (synchronous) relationship between the input and output buffers, I select the LineTotal on the Inputs page:&lt;/P&gt;
&lt;P&gt;&lt;IMG style="WIDTH:228px;HEIGHT:262px;" src="http://vsteamsystemcentral.com/images/ext/AggregateSSIS_4.jpg" width=228 height=262&gt;&lt;/P&gt;
&lt;P&gt;On the Inputs and Outputs page, I create a new output column named RunningTotal (Decimal, 38, 6):&lt;/P&gt;
&lt;P&gt;&lt;IMG style="WIDTH:169px;HEIGHT:86px;" src="http://vsteamsystemcentral.com/images/ext/AggregateSSIS_5.jpg" width=169 height=86&gt;&lt;/P&gt;
&lt;P&gt;On to the script! I see folks struggle with scripting in SSIS. Variable scope can be tricky. Let's see what happens when we put all the code in the ProcessInputRow method:&lt;/P&gt;&lt;FONT face="Courier New"&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;FONT color=blue&gt;Public Overrides Sub&lt;/FONT&gt; Input0_ProcessInputRow(&lt;FONT color=blue&gt;ByVal&lt;/FONT&gt; Row &lt;FONT color=blue&gt;As&lt;/FONT&gt; Input0Buffer)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;FONT color=blue&gt;Dim&lt;/FONT&gt; dTotal &lt;FONT color=blue&gt;As Decimal&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dTotal = dTotal + Row.LineTotal&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Row.RunningTotal = dTotal&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;FONT color=blue&gt;End Sub&lt;/FONT&gt;&lt;/P&gt;&lt;/FONT&gt;
&lt;P&gt;Let's see what we get when we execute in BIDS:&lt;/P&gt;
&lt;P&gt;&lt;IMG style="WIDTH:381px;HEIGHT:389px;" src="http://vsteamsystemcentral.com/images/ext/AggregateSSIS_6.jpg" width=381 height=389&gt;&lt;/P&gt;
&lt;P&gt;The LineTotal value is merely copied into the RunningTotal column each time. That's not a running total. What's the problem? Variable scope. The script variable dTotal is being created and populated for &lt;EM&gt;each row&lt;/EM&gt; flowing through the Script Component. &lt;/P&gt;
&lt;P&gt;We need for it to be created &lt;EM&gt;once&lt;/EM&gt; and aggregated each time a row flows through the Script Component. To accomplish this, we need to move the Dim statement from inside the buffer's ProcessInputRows method - we need to change the scope from ProcessInputRows to the ScriptMain class:&lt;/P&gt;&lt;FONT face="Courier New"&gt;
&lt;P&gt;&lt;FONT color=blue&gt;Public Class&lt;/FONT&gt; ScriptMain&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;FONT color=blue&gt;Inherits&lt;/FONT&gt; UserComponent&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;FONT color=blue&gt;Dim&lt;/FONT&gt; dTotal &lt;FONT color=blue&gt;As Decimal&lt;/FONT&gt;&lt;BR&gt;&lt;/P&gt;&lt;/FONT&gt;
&lt;P&gt;...&lt;/P&gt;&lt;FONT face="Courier New"&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;FONT color=blue&gt;Public Overrides Sub&lt;/FONT&gt; Input0_ProcessInputRow(&lt;FONT color=blue&gt;ByVal&lt;/FONT&gt; Row &lt;FONT color=blue&gt;As&lt;/FONT&gt; Input0Buffer)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dTotal = dTotal + Row.LineTotal&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Row.RunningTotal = dTotal&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;FONT color=blue&gt;End Sub&lt;/FONT&gt;&lt;/P&gt;&lt;/FONT&gt;
&lt;P&gt;Now when we execute it, the script variable dTotal is created once at the class-scope and poopulated (repeatedly) for each row that flows through the buffer's&amp;nbsp;ProcessInputRow method. The results:&lt;/P&gt;
&lt;P&gt;&lt;IMG style="WIDTH:381px;HEIGHT:389px;" src="http://vsteamsystemcentral.com/images/ext/AggregateSSIS_7.jpg" width=381 height=389&gt;&lt;/P&gt;
&lt;P&gt;A running total!&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Conclusion&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;As always, there are lots of cool ways to get at the data you seek. This isn't the only way and I'm sure there are better ways to achieve the same results.&lt;/P&gt;
&lt;P&gt;Happy T-SQL Tuesday!&lt;/P&gt;
&lt;P&gt;:{&amp;gt;&lt;/P&gt;</description></item></channel></rss>