<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Performance impact: Cartesian product comes to the rescue</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2011/12/15/performance-impact-cartesian-product-comes-to-the-rescue.aspx</link><description>Although Cartesian product as a concept is essential to the relational database theory, it is often a dirty phrase – something that is associated with bad performance and therefore should be avoided. But there are cases where a Cartesian product is highly</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>re: Performance impact: Cartesian product comes to the rescue</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2011/12/15/performance-impact-cartesian-product-comes-to-the-rescue.aspx#40426</link><pubDate>Thu, 15 Dec 2011 13:11:10 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:40426</guid><dc:creator>GrumpyOldDBA</dc:creator><description>&lt;p&gt;Very interesting, when tuning I often I use temp tables to replace parts of joins to achieve similar performance improvements - this is something I'd not considered and I will certainly look into it. Just to explain although I'm not working on a DW, many queries I attempt to tune contain over 16 joins, many of these are &amp;quot;business rules&amp;quot; , small tables joining against much larger tables - your approach may well work wonders - cool!&lt;/p&gt;
</description></item><item><title>re: Performance impact: Cartesian product comes to the rescue</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2011/12/15/performance-impact-cartesian-product-comes-to-the-rescue.aspx#40427</link><pubDate>Thu, 15 Dec 2011 13:51:42 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:40427</guid><dc:creator>Whipper Snapper</dc:creator><description>&lt;p&gt;Minor complaint. &amp;nbsp;Your chart doesn't compare the total elapsed times, which should be 740 versus 249 (248+1) to include populating the temp table.&lt;/p&gt;
&lt;p&gt;Cool concept though and brings into question when a star-schema is appropriate or not. &amp;nbsp;It could be interesting to see where the star-schema makes more sense with lots of WHERE criteria on dimension fields, which should be appropriately indexed.&lt;/p&gt;
</description></item><item><title>re: Performance impact: Cartesian product comes to the rescue</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2011/12/15/performance-impact-cartesian-product-comes-to-the-rescue.aspx#40431</link><pubDate>Thu, 15 Dec 2011 16:35:32 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:40431</guid><dc:creator>Wiseman82</dc:creator><description>&lt;p&gt;Interesting idea. :-) &amp;nbsp;The point of the article was to demonstrate the use of a cartesian product, but my instinct from a performance point of view would be to aggregate the data in the fact table first then join to the dimension tables (assuming I'm not allowed to make any schema modifications).&lt;/p&gt;
&lt;p&gt;This approach seems to work better for me on a small scale test (I didn't have the patience to load the full data set and I was running it on a slow client PC):&lt;/p&gt;
&lt;p&gt;select server_config, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; thread_name, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; tran_type, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; tran_status, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; users, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; cpus, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; COUNT(*) as cnt &lt;/p&gt;
&lt;p&gt;INTO #T&lt;/p&gt;
&lt;p&gt;from tran_stats t1 &lt;/p&gt;
&lt;p&gt;group by server_config, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; thread_name, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; tran_type, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; tran_status, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; users, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; cpus&lt;/p&gt;
&lt;p&gt; SELECT id_server_config, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_thread_name, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_tran_type, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_tran_status, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_users, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_cpus,&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;SUM(cnt) as cnt&lt;/p&gt;
&lt;p&gt; FROM #T T1 &amp;nbsp; &amp;nbsp; &lt;/p&gt;
&lt;p&gt; join Dim_server_config t2 on t1.server_config = t2.server_config &lt;/p&gt;
&lt;p&gt; join Dim_thread_name t3 on t1.thread_name = t3.thread_name &lt;/p&gt;
&lt;p&gt; join Dim_tran_type t4 on t1.tran_type = t4.tran_type &lt;/p&gt;
&lt;p&gt; join Dim_tran_status t5 on t1.tran_status = t5.tran_status &lt;/p&gt;
&lt;p&gt; join Dim_users t6 on t1.users = t6.users &lt;/p&gt;
&lt;p&gt; join Dim_cpus t7 on t1.cpus = t7.cpus &lt;/p&gt;
&lt;p&gt; GROUP BY id_server_config, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_thread_name, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_tran_type, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_tran_status, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_users, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_cpus&lt;/p&gt;
&lt;p&gt;drop table #t&lt;/p&gt;
&lt;p&gt;Cartesian products can be useful. &amp;nbsp;They are usually a problem when they are unintentional...&lt;/p&gt;
</description></item><item><title>re: Performance impact: Cartesian product comes to the rescue</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2011/12/15/performance-impact-cartesian-product-comes-to-the-rescue.aspx#40432</link><pubDate>Thu, 15 Dec 2011 16:37:40 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:40432</guid><dc:creator>Wiseman82</dc:creator><description>&lt;p&gt;Interesting idea. :-) &amp;nbsp;The point of the article was to demonstrate the use of a cartesian product, but my instinct from a performance point of view would be to aggregate the data in the fact table first then join to the dimension tables (assuming I'm not allowed to make any schema modifications).&lt;/p&gt;
&lt;p&gt;This approach seems to work better for me on a small scale test (I didn't have the patience to load the full data set and I was running it on a slow client PC):&lt;/p&gt;
&lt;p&gt;select server_config, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; thread_name, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; tran_type, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; tran_status, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; users, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; cpus, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; COUNT(*) as cnt &lt;/p&gt;
&lt;p&gt;INTO #T&lt;/p&gt;
&lt;p&gt;from tran_stats t1 &lt;/p&gt;
&lt;p&gt;group by server_config, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; thread_name, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; tran_type, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; tran_status, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; users, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; cpus&lt;/p&gt;
&lt;p&gt; SELECT id_server_config, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_thread_name, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_tran_type, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_tran_status, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_users, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_cpus,&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;SUM(cnt) as cnt&lt;/p&gt;
&lt;p&gt; FROM #T T1 &amp;nbsp; &amp;nbsp; &lt;/p&gt;
&lt;p&gt; join Dim_server_config t2 on t1.server_config = t2.server_config &lt;/p&gt;
&lt;p&gt; join Dim_thread_name t3 on t1.thread_name = t3.thread_name &lt;/p&gt;
&lt;p&gt; join Dim_tran_type t4 on t1.tran_type = t4.tran_type &lt;/p&gt;
&lt;p&gt; join Dim_tran_status t5 on t1.tran_status = t5.tran_status &lt;/p&gt;
&lt;p&gt; join Dim_users t6 on t1.users = t6.users &lt;/p&gt;
&lt;p&gt; join Dim_cpus t7 on t1.cpus = t7.cpus &lt;/p&gt;
&lt;p&gt; GROUP BY id_server_config, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_thread_name, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_tran_type, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_tran_status, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_users, &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;id_cpus&lt;/p&gt;
&lt;p&gt;drop table #t&lt;/p&gt;
&lt;p&gt;Cartesian products can be useful. &amp;nbsp;They are usually a problem when they are unintentional...&lt;/p&gt;
</description></item><item><title>re: Performance impact: Cartesian product comes to the rescue</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2011/12/15/performance-impact-cartesian-product-comes-to-the-rescue.aspx#40435</link><pubDate>Thu, 15 Dec 2011 19:50:53 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:40435</guid><dc:creator>Jason Kohlhoff</dc:creator><description>&lt;p&gt;Linchi,&lt;/p&gt;
&lt;p&gt;I've looked at your create table script. &amp;nbsp;None of your tables have Clustered Indexes (PKs), and your tran_stats table does not have any Nonclustered Indexes on its foreign key columns.&lt;/p&gt;
&lt;p&gt;I agree that a Cartesian product is actually useful (i.e. the &amp;quot;Report Writer's Magic Formula&amp;quot;), but all that you've successfully proven here is that heap (non-indexed) tables are a lot slower than indexed tables because of all the resulting tables scans and hash matches.&lt;/p&gt;
</description></item><item><title>re: Performance impact: Cartesian product comes to the rescue</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2011/12/15/performance-impact-cartesian-product-comes-to-the-rescue.aspx#40436</link><pubDate>Thu, 15 Dec 2011 20:40:29 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:40436</guid><dc:creator>Linchi Shea</dc:creator><description>&lt;p&gt;Hi Jason Kohlhoff;&lt;/p&gt;
&lt;p&gt;The large table has a clustered index. The following statement is in one of the scripts:&lt;/p&gt;
&lt;p&gt;create clustered index cix_tran_stats on tran_stats(tran_begin)&lt;/p&gt;
&lt;p&gt;Correct, none of the dimension columns has an index. Not that the selectivities on these columns are extremely low for any index to be useful. The real world scenario that triggered this sample query is a very similar query that replaces the dimension values into their id's in the large table. &lt;/p&gt;
&lt;p&gt;BTW, I actually tested with the large table being a heap, and that did not change the observation.&lt;/p&gt;
</description></item><item><title>Performance impact: parallelism comes to the rescue</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2011/12/15/performance-impact-cartesian-product-comes-to-the-rescue.aspx#40441</link><pubDate>Thu, 15 Dec 2011 22:14:39 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:40441</guid><dc:creator>Linchi Shea</dc:creator><description>&lt;p&gt;For the specific scenario discussed in my previous post ,&amp;amp;#160; we observed that using an intermediate&lt;/p&gt;
</description></item><item><title>Performance impact: diminishing marginal return on the degree of parallelism</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2011/12/15/performance-impact-cartesian-product-comes-to-the-rescue.aspx#40460</link><pubDate>Fri, 16 Dec 2011 17:24:40 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:40460</guid><dc:creator>Linchi Shea</dc:creator><description>&lt;p&gt;In commenting on my previous post , Greg Linwood and GrumpyOldDBA raised questions about various implications&lt;/p&gt;
</description></item><item><title>re: Performance impact: Cartesian product comes to the rescue</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2011/12/15/performance-impact-cartesian-product-comes-to-the-rescue.aspx#40565</link><pubDate>Wed, 21 Dec 2011 08:02:45 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:40565</guid><dc:creator>Alain Krikilion</dc:creator><description>&lt;p&gt;Really interesting post.&lt;/p&gt;
&lt;p&gt;But I have some questions.&lt;/p&gt;
&lt;p&gt;1) Can creating a clustered index on the temptable speed the query up even more? I can't test it because I don't have the necessary hardware for it.&lt;/p&gt;
&lt;p&gt;2) Can using a table variable instead of a temptable speed up the query. (I guess that, as always, &amp;quot;it depends&amp;quot; on the size of the resulting table).&lt;/p&gt;
&lt;p&gt;3) Can using a CTE instead of a temptable speed up the query.&lt;/p&gt;
</description></item><item><title>Performance impact: hyperthreading for reporting queries</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2011/12/15/performance-impact-cartesian-product-comes-to-the-rescue.aspx#40818</link><pubDate>Thu, 05 Jan 2012 05:36:51 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:40818</guid><dc:creator>Linchi Shea</dc:creator><description>&lt;p&gt;There are a lot of questions on hyperthreading, but not a lot of answers.&amp;amp;#160; There is no shortage&lt;/p&gt;
</description></item></channel></rss>