<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Search results matching tags 'SSIS', 'Elegant Design', and 'Operations'</title><link>http://sqlblog.com/search/SearchResults.aspx?o=DateDescending&amp;tag=SSIS,Elegant+Design,Operations&amp;orTags=0</link><description>Search results matching tags 'SSIS', 'Elegant Design', and 'Operations'</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>How Many SSIS Packages Should I Create?</title><link>http://sqlblog.com/blogs/andy_leonard/archive/2009/12/28/how-many-ssis-packages-should-i-create.aspx</link><pubDate>Mon, 28 Dec 2009 12:00:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:20109</guid><dc:creator>andyleonard</dc:creator><description>&lt;DIV&gt;
&lt;DIV&gt;&lt;STRONG&gt;Introduction&lt;/STRONG&gt;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;I received the following questions (paraphrased) from a friend:&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;EM&gt;I'm building an&amp;nbsp;ETL process with&amp;nbsp;SSIS in which&amp;nbsp;I pull from about 40 tables in DB2.&amp;nbsp;I put the data into staging tables in a&amp;nbsp;SQL Server&amp;nbsp;database. &lt;/EM&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;EM&gt;Is it better to have one package with 40 data flows, 40 packages with one&amp;nbsp;data flow each or something in between? Or should there be one data flow with a lot of source-&amp;gt;destination modules? What the advantages and disadvantages of each approach?&lt;/EM&gt;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;STRONG&gt;Excellent Questions!&lt;/STRONG&gt;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&amp;nbsp; The answer&amp;nbsp;is: "It depends." That's one of the reasons you're having trouble finding a definitive answer online. It's not a bad question, so let me walk through the factors that would drive my design decisions:&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&amp;nbsp; First, are there dependencies? Are there foreign key relationships in the source database? Is referential integrity enforced (keep in mind there are non-database ways to enforce RI)? If there are dependencies or referential integrity exists (whether it's enforced or not), I load the parents first and then the children.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&amp;nbsp; Second, I&amp;nbsp;think about Operations: How much visibility do I want into this process? If it's going to take three hours to load, I probably want to break that down into several packages so I can at least see steps in the process completing. This will also assist in troubleshooting: "Package1.dtsx failed" isn't a pleasant message to troubleshoot if it contains 40 Data Flow Tasks. "Load Orders and OrderDetails.dtsx failed" is a much better starting place.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&amp;nbsp; You've inspired another blog post - thank very much! Please keep the questions coming. And let me know what you decide.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;:{&amp;gt; Andy&lt;/DIV&gt;</description></item></channel></rss>