
Linchi Shea

Checking out SQL Server via empirical data points

A Rose By Any Other Name

Regardless of the DBMS make or model, the transaction throughput curve of a database system is often shaped like a trapezoid. As the load level goes up initially, so does the transaction throughput. As the load level continues to go up, the transaction throughput will reach a plateau. After the load level is cranked up still further, the transaction throughput will inevitably start to drop and may eventually come to a halt.

With respect to the throughput drop under severe load, two questions are interesting: (1) what is the cause of the drop? and (2) what can we do to delay the drop until an even higher load level?

The answers are of course dependent on the nature of the workload, the system configurations, and a host of other factors.

So rather than trying to address the questions in general, let's look at one specific type of test I often perform, which is to stress a server's processors, memory, and their interconnect with a SQL Server read-only workload that effectively caches all the data pages in memory. In other words, this workload takes disk I/Os out of the equation.

As expected and without exception, I would see the transaction throughput start to drop--sometimes rather dramatically--as I increased the load level beyond saturation. The point at which this happens often correlates nicely with a dramatic increase in the Thread-safe memory objects waits counter under the SQLServer:Wait Statistics performance object. Unfortunately, this wait counter is documented with a rather useless tautology, in that the explanation more or less repeats what the name already suggests without adding useful information. (BTW, the explanation is "Statistics for processes waiting on thread-safe memory allocators".) So, from looking at the wait stats, I can't really tell what wait or waits are killing the transaction throughput.
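For reference, the counter can be sampled with a query along these lines (a sketch only; it assumes SQL Server 2005 or later and a default instance, where the performance object is named SQLServer:Wait Statistics):

-- Sketch: sample the "Thread-safe memory objects waits" counter.
-- The instance_name column distinguishes the flavors of the counter,
-- e.g. "Waits started per second" and "Average wait time (ms)".
SELECT object_name,
       counter_name,
       instance_name,
       cntr_value
FROM   sys.dm_os_performance_counters
WHERE  object_name  LIKE '%Wait Statistics%'
  AND  counter_name LIKE 'Thread-safe memory objects%'
ORDER  BY instance_name;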

The point I want to illustrate, however, is the dramatic impact of a change to the workload--a change that merely gives the stored procedures different names.

Original workload: The database calls of the workload are evenly distributed between two stored procedures, spOrderStatus and spStockLevel.

Modified workload: In the same database, I created 100 stored procedures, spOrderStatus_1 through spOrderStatus_100, that are identical to spOrderStatus except for their names. Similarly, I created another 100 stored procedures, spStockLevel_1 through spStockLevel_100, that are identical to spStockLevel except for their names. I then modified the original workload to distribute the database calls evenly among spOrderStatus_1 through spOrderStatus_100 and spStockLevel_1 through spStockLevel_100.
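For illustration, here is a minimal sketch of how such clones could be generated (it assumes the original procedure lives in the dbo schema and that the procedure name appears in its definition only where it is declared):

-- Sketch: clone dbo.spOrderStatus as spOrderStatus_1 .. spOrderStatus_100.
DECLARE @body nvarchar(max), @sql nvarchar(max), @i int;

SELECT @body = m.definition
FROM   sys.sql_modules m
WHERE  m.object_id = OBJECT_ID(N'dbo.spOrderStatus');

SET @i = 1;
WHILE @i <= 100
BEGIN
    -- Swap the numbered name into the definition, then create the clone.
    SET @sql = REPLACE(@body, N'spOrderStatus',
                       N'spOrderStatus_' + CAST(@i AS nvarchar(10)));
    EXEC (@sql);
    SET @i = @i + 1;
END;

The same loop, with the name swapped, produces the spStockLevel clones.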

With the modified workload, the same database was able to sustain the peak throughput at a significantly higher load level (i.e., more users). The following chart (the attached proc_name.gif) summarizes the throughput behavior of the original workload and the modified workload on the test server:

I didn't specifically test how many identical but differently named stored procedures I had to create to see the impact on the transaction throughput; 100 for each stored procedure was picked arbitrarily.
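The test driver picks a procedure name on the client side, but the same idea can be sketched in T-SQL with a dynamic EXEC (the @CustomerID parameter here is a made-up placeholder, not the procedure's real signature):

-- Sketch: spread calls over the clones by session id, so each
-- connection consistently hits "its own" copy of the procedure.
DECLARE @n int, @proc sysname;
SET @n = (@@SPID % 100) + 1;
SET @proc = N'dbo.spOrderStatus_' + CAST(@n AS nvarchar(10));
EXEC @proc @CustomerID = 42;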

Note that this is a known technique, used in some of the published SQL Server TPC-C tests. For instance, in the best SQL Server TPC-C result obtained on an HP Superdome, 32 New_Order stored procedures are created with 32 different names but identical code inside. Stored procedures with different names but identical code are also created for several other transactions.

So, a rose by a different name may just smell better.

Published Tuesday, November 20, 2007 4:04 PM by Linchi Shea

Attachment(s): proc_name.gif


Comments

Drew Mayo said:

Great research, but I'm really interested in the why. This smells so much like something that could be done transparently by the SQL Server "compiler" that it bothers me a little that it's not done already; i.e., there must be a very good reason why it is not. Any links for the why?

November 20, 2007 6:22 PM

Tony Rogerson said:

Sounds like plan recompilation could be a factor; at certain points in your test, the data changes the stats enough for the plan to be recompiled, thus blocking all other connections using it while that happens. Look for COMPILE waits (see this article: http://support.microsoft.com/kb/263889).

The other benefit of having multiple procs is that you get more suitable plans for the workload - once a plan is compiled, that plan is reused regardless of the shift in parameters, so one good plan could be a bad plan for something else (see parameter sniffing topics).
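To illustrate with a contrived sketch (the table and procedure here are made up): the plan compiled for the first caller's parameter value is reused for every later value, good fit or not.

-- Contrived example: a skewed CustomerID column makes one plan good
-- for some parameter values and bad for others.
CREATE PROCEDURE dbo.spOrdersByCustomer
    @CustomerID int
AS
    SELECT OrderID, OrderDate
    FROM   dbo.Orders
    WHERE  CustomerID = @CustomerID;
GO

EXEC dbo.spOrdersByCustomer @CustomerID = 7;        -- plan compiled (sniffed) here
EXEC dbo.spOrdersByCustomer @CustomerID = 1000000;  -- same plan reused, fit or not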

Tony.

November 21, 2007 3:09 AM

Linchi Shea said:

Tony;

There shouldn't be any plan recompilation in this particular case because the procedures are read only.

November 21, 2007 10:14 PM

Tony Rogerson said:

Hi Linchi;

Are you seeing a waittype of COMPILE?

There are a lot of reasons for a recompile to be done; it doesn't just need data to change. It would be interesting to see what the waittypes are - do you have scripts to build the test database for us to take a look?
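Something like this sketch would show whether compile-related waits figure at all (on SQL Server 2005, waits for compile memory tend to surface as RESOURCE_SEMAPHORE_QUERY_COMPILE):

-- Sketch: top server-wide waits since the last stats clear.
SELECT TOP (20)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM   sys.dm_os_wait_stats
ORDER  BY wait_time_ms DESC;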

If it's not compile, I'd lay money on it being a parameter sniffing thing; it would be interesting to compare the actual execution plans between the two tests.

Tony.

November 22, 2007 3:26 AM

Linchi Shea said:

So Tony, how much money would you like to bet on parameter sniffing? Beers would be fine when we bump into each other at a conference.

The waits are not on compile or recompile. The waits are on Thread-safe memory objects waits. But as mentioned in the blog, this counter is not usefully documented, so I'm not sure exactly what it means. But I'm pretty sure that the problem lies with severe contention on some SQL Server internal structures. I haven't had a chance to go through the spinlock stats, which should shed some light.
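For the record, the spinlock stats can be pulled like this (a sketch; the DBCC command is undocumented but well known, and later versions of SQL Server expose the same data through the sys.dm_os_spinlock_stats DMV):

-- On SQL Server 2005:
DBCC SQLPERF('spinlockstats');

-- On later versions:
-- SELECT name, collisions, spins, spins_per_collision, backoffs
-- FROM   sys.dm_os_spinlock_stats
-- ORDER  BY spins DESC;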

November 22, 2007 4:05 PM

Tony Rogerson said:

Do you have the database and test scripts? If you can pass them over, I'll have a look - I can arrange an FTP if they are large.

Did you get a chance to see if the execution plans on the multiple instances are all the same? If not, which is my feeling, then you are looking at a parameter-sniffing type of performance benefit - or rather, avoiding the problem by using many versions of the same proc.
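A sketch like this would pull the cached plan for each clone so they can be compared side by side (it assumes the procedures are still in cache and that it runs in the test database):

-- Sketch: one row per cached spOrderStatus_* plan, with the plan XML.
SELECT OBJECT_NAME(st.objectid) AS proc_name,
       cp.usecounts,
       qp.query_plan
FROM   sys.dm_exec_cached_plans cp
CROSS  APPLY sys.dm_exec_sql_text(cp.plan_handle) st
CROSS  APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
WHERE  cp.objtype = 'Proc'
  AND  OBJECT_NAME(st.objectid) LIKE 'spOrderStatus[_]%'
ORDER  BY proc_name;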

Beers are fine - no doubt I'll be buying a few anyway at the MVP summit next year...

Tony.

November 23, 2007 3:24 AM

Linchi Shea said:

I'll pass the repro to you when I get a chance. No database to carry around. All can be generated through code.

Execution plans are constant, by the way.

November 27, 2007 11:05 AM

Andrew Kelly said:

I know this is an old post, but I just came across it. Did you schema-qualify those stored procedure calls? If not, I can see this easily happening.
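For context: an unqualified call makes SQL Server first probe the plan cache for a procedure under the caller's default schema before falling back to dbo, and that extra lookup is serialized, so under heavy load it can become a contention point (the KB article Tony linked above touches on this). A sketch of the difference, with parameters omitted:

-- Unqualified: probes <caller's schema>.spOrderStatus first,
-- then falls back to dbo.spOrderStatus.
EXEC spOrderStatus;

-- Schema-qualified: resolves directly, no extra probe.
EXEC dbo.spOrderStatus;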

July 21, 2008 8:02 AM
