<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>The Trouble with SSIS Sorting</title><link>http://sqlblog.com/blogs/eric_johnson/archive/2010/02/03/the-trouble-with-ssis-sorting.aspx</link><description>Many SSIS transformations, such as the Pivot and the Data Profiling Tasks, require that the data being fed into them be sorted. Without first sorting, some transformations will throw an error and not run, while others will run but the results will not</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>re: The Trouble with SSIS Sorting</title><link>http://sqlblog.com/blogs/eric_johnson/archive/2010/02/03/the-trouble-with-ssis-sorting.aspx#21770</link><pubDate>Wed, 03 Feb 2010 14:15:29 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:21770</guid><dc:creator>Andrew Ingram</dc:creator><description>&lt;p&gt;One point to note on sorting using an order by in the original select is that if you need to sort later in the transform and join the results, the way the source database sorts must be the same as the way SSIS sorts - letters before numbers and vice versa being the prime example.&lt;/p&gt;
</description></item><item><title>re: The Trouble with SSIS Sorting</title><link>http://sqlblog.com/blogs/eric_johnson/archive/2010/02/03/the-trouble-with-ssis-sorting.aspx#25626</link><pubDate>Thu, 27 May 2010 14:01:02 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:25626</guid><dc:creator>Josh Grant</dc:creator><description>&lt;p&gt;Expanding on Andrew's point, I've found that it is UNSAFE to rely entirely upon database sorting, and I am now avoiding one particular data flow component which requires sorted inputs.&lt;/p&gt;
&lt;p&gt;Here's an example you can build yourself...&lt;/p&gt;
&lt;p&gt;- Locate (or create) a SQL Server database with the SQL_Latin1_General_CP1_CI_AS (case insensitive) collation&lt;/p&gt;
&lt;p&gt;- Create a data flow with two OLEDB sources, pointing to these two queries. &amp;nbsp;Note that query #2 is identical to query #1, with one extra row.&lt;/p&gt;
&lt;p&gt;-- Query #1&lt;/p&gt;
&lt;p&gt;SELECT &amp;nbsp;'ABb' AS Col1,&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1 AS Query1RowNumber&lt;/p&gt;
&lt;p&gt;UNION ALL&lt;/p&gt;
&lt;p&gt;SELECT &amp;nbsp;'AbB',&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;2&lt;/p&gt;
&lt;p&gt;ORDER BY Col1&lt;/p&gt;
&lt;p&gt;-- Query #2&lt;/p&gt;
&lt;p&gt;SELECT &amp;nbsp;'ABb' AS Col1,&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1 AS Query2RowNumber&lt;/p&gt;
&lt;p&gt;UNION ALL&lt;/p&gt;
&lt;p&gt;SELECT &amp;nbsp;'AbB',&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;2&lt;/p&gt;
&lt;p&gt;UNION ALL&lt;/p&gt;
&lt;p&gt;SELECT &amp;nbsp;'Ab',&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;3&lt;/p&gt;
&lt;p&gt;ORDER BY Col1&lt;/p&gt;
&lt;p&gt;- Using the advanced editor, set the &amp;quot;IsSorted&amp;quot; property on each data source, and set the &amp;quot;SortOrder&amp;quot; to 1 for the Col1 column&lt;/p&gt;
&lt;p&gt;- Add a merge join component, join type &amp;quot;full outer join&amp;quot;. &amp;nbsp;Make &amp;quot;Col1&amp;quot; the join key, and allow Query1RowNumber and Query2RowNumber to pass through.&lt;/p&gt;
&lt;p&gt;- Add a destination of some kind, and put a data viewer on the output of the merge join&lt;/p&gt;
&lt;p&gt;The expected result is this, right?&lt;/p&gt;
&lt;p&gt;Query1RowNumber Query2RowNumber&lt;/p&gt;
&lt;p&gt;--------------- ---------------&lt;/p&gt;
&lt;p&gt;1 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1&lt;/p&gt;
&lt;p&gt;2 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 2&lt;/p&gt;
&lt;p&gt;NULL &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;3&lt;/p&gt;
&lt;p&gt;What you actually get is:&lt;/p&gt;
&lt;p&gt;Query1RowNumber Query2RowNumber&lt;/p&gt;
&lt;p&gt;--------------- ---------------&lt;/p&gt;
&lt;p&gt;NULL &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;3&lt;/p&gt;
&lt;p&gt;NULL &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;2&lt;/p&gt;
&lt;p&gt;1 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1&lt;/p&gt;
&lt;p&gt;2 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; NULL&lt;/p&gt;
&lt;p&gt;Row number 2 (Col1 = 'AbB') isn't matched in the full outer join, and it is returned as two &amp;quot;unique&amp;quot; rows!&lt;/p&gt;
&lt;p&gt;My conclusion from this is that SSIS components which require sorted inputs actually require &amp;quot;SSIS-sorted&amp;quot; inputs, or they can return unexpected results!&lt;/p&gt;
</description></item><item><title>re: The Trouble with SSIS Sorting</title><link>http://sqlblog.com/blogs/eric_johnson/archive/2010/02/03/the-trouble-with-ssis-sorting.aspx#29283</link><pubDate>Sun, 10 Oct 2010 15:39:13 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:29283</guid><dc:creator>Mike</dc:creator><description>&lt;p&gt;Microsoft is promoting the idea that doing everything the memory is great. I does definitely speed up the process plus it help large companies to sell hardware and memory. Sorting algorithms are very well documented and it is difficult to invent something new.&lt;/p&gt;
&lt;p&gt;When working with large datasets there are ways of sorting data without buying new hardware&lt;/p&gt;
&lt;p&gt;More information&lt;/p&gt;
&lt;p&gt;&lt;a rel="nofollow" target="_new" href="http://www.etl-tools.com/etl-tools/processing-data/sorting.html"&gt;http://www.etl-tools.com/etl-tools/processing-data/sorting.html&lt;/a&gt;&lt;/p&gt;
</description></item><item><title>re: The Trouble with SSIS Sorting</title><link>http://sqlblog.com/blogs/eric_johnson/archive/2010/02/03/the-trouble-with-ssis-sorting.aspx#29912</link><pubDate>Wed, 27 Oct 2010 23:22:05 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:29912</guid><dc:creator>justins707</dc:creator><description>&lt;p&gt;I think the diff is that you are using a UNION ALL. &amp;nbsp;If you change the Source SQL to use UNION (without ALL). &amp;nbsp;Then the results look correct,&lt;/p&gt;
&lt;p&gt;null &amp;nbsp; 3&lt;/p&gt;
&lt;p&gt;1 &amp;nbsp; &amp;nbsp; &amp;nbsp;1&lt;/p&gt;
&lt;p&gt;2 &amp;nbsp; &amp;nbsp; &amp;nbsp;2&lt;/p&gt;
&lt;p&gt;Sort Order is null, Ab, ABb, AbB.&lt;/p&gt;
</description></item><item><title>re: The Trouble with SSIS Sorting</title><link>http://sqlblog.com/blogs/eric_johnson/archive/2010/02/03/the-trouble-with-ssis-sorting.aspx#33562</link><pubDate>Thu, 17 Feb 2011 05:49:21 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:33562</guid><dc:creator>Chris Nyberg</dc:creator><description>&lt;p&gt;With Ordinal Technology's Nsort for SSIS sort component, you can quickly sort data sets even if they do not fit in memory. The Nsort component uses a temporary file to avoid the page faulting coma the standard SSIS component can fall into. &amp;nbsp;See &lt;a rel="nofollow" target="_new" href="http://www.ordinal.com/ssis.html"&gt;http://www.ordinal.com/ssis.html&lt;/a&gt;&lt;/p&gt;
</description></item><item><title>re: The Trouble with SSIS Sorting</title><link>http://sqlblog.com/blogs/eric_johnson/archive/2010/02/03/the-trouble-with-ssis-sorting.aspx#42358</link><pubDate>Thu, 15 Mar 2012 21:11:18 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:42358</guid><dc:creator>Jemsy Villanera</dc:creator><description>&lt;p&gt;Good Post, thanks It help me a lot :)&lt;/p&gt;
</description></item><item><title>re: The Trouble with SSIS Sorting</title><link>http://sqlblog.com/blogs/eric_johnson/archive/2010/02/03/the-trouble-with-ssis-sorting.aspx#46282</link><pubDate>Mon, 19 Nov 2012 11:14:45 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46282</guid><dc:creator>Emil Glownia</dc:creator><description>&lt;p&gt;Interesting post and comments.... I was looking to do some research before I create some more videos for our SSIS Transformations Tutorial and got more ideas than I expected.&lt;/p&gt;
&lt;p&gt;I got also interested in why removing ALL would make a difference... and that is only because Case Insensitive collation was used so it is &amp;quot;random&amp;quot; sorting or more precisely UNION and UNION ALL uses different sort for &amp;quot;the same&amp;quot; values.&lt;/p&gt;
&lt;p&gt;with Case Insensitive this query:&lt;/p&gt;
&lt;p&gt;SELECT &amp;nbsp;'ABb'AS Col1, &amp;nbsp; &amp;nbsp; &amp;nbsp; 1 AS Query2RowNumber&lt;/p&gt;
&lt;p&gt;UNION ALL&lt;/p&gt;
&lt;p&gt;SELECT &amp;nbsp;'AbB',&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; 2&lt;/p&gt;
&lt;p&gt;UNION all&lt;/p&gt;
&lt;p&gt;SELECT &amp;nbsp;'Ab',&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; 3&lt;/p&gt;
&lt;p&gt;ORDER BY Col1&lt;/p&gt;
&lt;p&gt;Returns 3,2,1 but with DESC at the end&lt;/p&gt;
&lt;p&gt;SELECT &amp;nbsp;'ABb'AS Col1, &amp;nbsp; &amp;nbsp; &amp;nbsp; 1 AS Query2RowNumber&lt;/p&gt;
&lt;p&gt;UNION ALL&lt;/p&gt;
&lt;p&gt;SELECT &amp;nbsp;'AbB',&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; 2&lt;/p&gt;
&lt;p&gt;UNION all&lt;/p&gt;
&lt;p&gt;SELECT &amp;nbsp;'Ab',&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; 3&lt;/p&gt;
&lt;p&gt;ORDER BY Col1 desc&lt;/p&gt;
&lt;p&gt;Result is not reverse 1,2,3 but 2, 1, 3... if I take out ALL than result is 1, 2, 3 because 1 and 2 is the same value (with case insensitive collation), obviously Case Sensitive Collation should be used instead but this is a good starting point for me to make some SSIS test ;)&lt;/p&gt;
&lt;p&gt;Regards&lt;/p&gt;
&lt;p&gt;Emil&lt;/p&gt;
</description></item></channel></rss>