THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Linchi Shea

Checking out SQL Server via empirical data points

Performance impact: parallelism comes to the rescue

For the specific scenario discussed in my previous post,  we observed that using an intermediate Cartesian product was about three times faster than doing joins with multiple lookup tables. To keep the discussions manageable, we limited the max degree of parallelism to 1. In other words, we did not allow query parallelism. For any query that touches a large amount of data, that was silly, especially when it’s on a server with 80 logical processors.

To be more realistic, I then looked at the following four scenarios:

  1. No parallelism, using the Cartesian product materialized into a temp table
  2. No parallelism, using joins with the six lookup tables
  3. Maxdop set to 20, using the Cartesian product materialized into a temp table
  4. Maxdop set to 20, using joins with the six lookup tables

The following chart summarizes the test results:

cartesian_parallelism

It’s clear that the queries can benefit greatly from parallelism. In fact, in the two cases where a Cartesian product was used, setting maxdop to 20 sped the query up ~13 times. The query duration dropped from 248 seconds to 19 seconds!

It’s intuitive that enabling parallelism would give us a boost.

It’s also interesting to note that the query with multi-table joins did not receive the same kind of performance boost from parallelism as did the query with the single table join using the Cartesian product temp table. The query duration dropped from 740 seconds to 80 seconds when parallelism was enabled on the query joining directly with the six lookup tables, about ~9 times speed up (compared with ~13 times speed up in case of the Cartesian product).

Even when maxdop was set to 20, going from multi-table joins to a single table join using the Cartesian product temp table still gave us a 4 times performance boost. The duration went from 80 seconds to 19 seconds.

At least for this specific scenario, parallelism plus the Cartesian product temp table gave us the best performance. In the next post, I’ll look at the impact of parallelism more closely, in particular, how different degrees of parallelism affect performance.

Published Thursday, December 15, 2011 6:14 PM by Linchi Shea

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Greg Linwood said:

It's useful to know that individual queries can run faster with parallelism but it's generally more important to consider the impact of parallelism in processing concurrent workloads, as this is where SQL Server really struggles with query parallelism both on OLTP & DSS workloads

I'd be interested to see your tests re run with multiple concurrent instances of the queries fighting for time on limited CPUs & see if these results hold up?

December 16, 2011 4:04 AM
 

GrumpyOldDBA said:

out of interest did you try maxdop 0 and see what the optimiser chose? I'm led to believe, maybe wrongly, that setting a maxdop = 20 ( as a query hint ) will force the query to use 20 streams regardless of if this is actually what would be best, say 8 being more optimal.

December 16, 2011 6:05 AM
 

Linchi Shea said:

In commenting on my previous post , Greg Linwood and GrumpyOldDBA raised questions about various implications

December 16, 2011 12:24 PM
 

Scott R. said:

Linchi,

Great post and series - very useful take aways on many-way joins and parallelism impacts.

I am a bit confused in the description above where you review the gains (40x versus 3x).  Based on my read of your graph, the cartesian product model improved by using MAXDOP 20 from 248 to 19 (13x), while the traditional join model improved by using MAXDOP 20 from 740 to 80 (9.25x).  Still a better improvement - just not the 40x versus 3x as was first stated.

Please set me straight if I am misreading the results.

Thanks again for the great info.

Scott R.

December 16, 2011 12:51 PM
 

Linchi Shea said:

Wow, not sure what I was drinking! But thanks Scott! I completely messed up the math. Well, I guess math was right, I just plugged the numbers in the wrong places. I have corrected the post.

December 16, 2011 1:35 PM

Leave a Comment

(required) 
(required) 
Submit

About Linchi Shea

Checking out SQL Server via empirical data points

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement