A good while ago I was asked if it was possible to consistently produce a parallel plan by Tom LaRock(blog/twitter), and I recalled from reading Grant Fritchey’s(blog/twitter) book on Execution Plans that it was possible to do this by adjusting the ‘cost threshold for parallelism’ sp_configure option from the default of five to a value of one which will make even simple queries run in parallel if you have multiple logical processors.
This got me think about this option and wondering if the default is still applicable in today’s servers and database environments. If you think about when the default values for SQL Server were actually created, it was a long time ago and in today’s large commodity hardware servers some of them aren’t the best for SQL any longer. For example, most people migrating from an x86 SQL Server to a x64 server learn really quick that you have to set Max Server Memory on a 64 bit server, something that you could get away with not setting on older x86 hardware even though it was still a recommended practice.
Max Degree of Parallelism is another hot one these days with newer Xeon Quad Core and Nahelem six core processors and reduced server costs, it is easy to slam sixteen or twenty-four cores into a relatively cheap server that has 64-128GB of RAM, a configuration that only a few years ago was specialty hardware that was cost prohibitive for most shops. These kinds of servers have quickly become the answer to performance problems that aren’t necessarily hardware related but instead poor design. What you get is a person with no idea how SQL Server works, with a really powerful server that is full of performance problems because it is trying to run with the default settings which aren’t appropriate for this size of hardware.
In Bob Ward’s session on wait stats at PASS, one thing he said a couple of times is that the answer to CXPACKET waits, isn’t to reduce the ‘max degree of parallelism’ sp_configure option for the hole server. I have to admit that I am guilty of recommending this solution hundreds of times before PASS and in most cases I continue to recommend this solution to people who have ho idea what they are doing because:
A. Most seem to be to lazy to read the referenced whitepapers provided to them for troubleshooting.
B. They have no idea what they are doing.
C. They want a quick fix that gets their manager off their back.
One thing that Tom’s question lead me to thinking about was whether or not adjusting the ‘cost threshold for parallelism’ up from five to a number that still allows large queries that don’t execute frequently to work with parallelism, while minimizing the number of smaller queries that execute more frequently from running in parallel. To help with determining where this option might be set, I wrote a query to search the plan cache for existing parallel plans and see the cost associations to current plans that executed parallel.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
query_plan AS CompleteQueryPlan,
n.value('(@StatementText)', 'VARCHAR(4000)') AS StatementText,
n.value('(@StatementOptmLevel)', 'VARCHAR(25)') AS StatementOptimizationLevel,
n.value('(@StatementSubTreeCost)', 'VARCHAR(128)') AS StatementSubTreeCost,
n.query('.') AS ParallelSubTreeXML,
FROM sys.dm_exec_cached_plans AS ecp
CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS eqp
CROSS APPLY query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS qn(n)
WHERE n.query('.').exist('//RelOp[@PhysicalOp="Parallelism"]') = 1
I look at the high use count plans, and see if there is a missing index associated with those queries that is driving the cost up. If I can tune the high execution queries to reduce their cost, I have a win either way. However, if you run this query, you will note that there are some really high cost queries that you may not get below the five value. If you can fix the high use plans to reduce their cost, and then increase the ‘cost threshold for parallelism’ based on the cost of your larger queries that may benefit from parallelism, having a couple of low use count plans that use parallelism doesn't have as much of an impact to the server overall, at least based on my own personal experiences.
What are your thoughts?