THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Linchi Shea

Checking out SQL Server via empirical data points

Performance impact: Speeding up SMO script generation

Generating scripts through SMO can be as simple as walking down the database object tree and applying the Script() method to each scriptable object. Well, that is until you start to try it on a database that has a large number of objects (say a few thousands), and the long wait the becomes rather frustrating. I’m not talking about such silly performance problems as the horrific roundtrips caused by the IsSystemObject check (which has a simple workaround). Rather, it’s just sloooow to script out thousands (or even a few hundreds) of database objects.

 

But there is hope, and one could even argue that it’s a matter of how you use SMO.

 

One way to significantly speed up scripting the objects in a database is to do away with single-threaded programming and resort to multiple threads.

 

I have run some simple tests with a various number of threads in a C# program against a database that has 632 top-level objects, including tables, stored procedures, user-defined functions, and views, but not counting triggers, indexes, and constraints that hang off each table.

 

The chart below shows that on my test workstation, using 6~8 threads in the C# program produced the optimal performance in terms of the scripting elapsed time. When done in a single thread, the performance was the worst. With 6~8 threads, the elapsed time was almost four times shorter than with a single thread, a ~400% improvement. Not bad with a simple change!

 

 

 

Unfortunately, in 99% of cases I see people scripting the objects in a database via SMO in a single thread. If you drive it with Powershell, I'm not sure if there is a simple way to do multithreading. If you know how to do multithreading in Powershell, please share. 

 

The reason that I could dramatically speed up the script generation via SMO using multiple threads is that the resources on the workstation and on the server were greatly underutilized with a single thread. Note that when I pushed it hard with 50 threads, overhead on the workstation became a significant drag. But even in that case, it performed much better than a single-thread program.

 

It is worth reiterating that the specific numbers charted above were obtained on my workstation. In a different setting, the number of threads for the most optimal performance may vary. But I do not expect the overall message of multiple threads leading to shorter script generation time with SMO to change.

  

 

Published Sunday, December 13, 2009 6:18 PM by Linchi Shea

Attachment(s): MultiThreadSMO.GIF

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Linchi Shea said:

The attached is the C# code that I used for generating the test results charted in my previous post .

December 15, 2009 2:12 PM
 

Ben Miller said:

I don't see an attachment with the C# code.

December 15, 2009 6:08 PM
 

Linchi Shea said:

It's in the next post, Ben.

December 15, 2009 8:57 PM
 

Peter said:

One of the slowness reason for SMO scripting is that it sends tons of adhoc queries to SQL Server again and again. And lots of adhoc plans are cached in SQL server. Please check this connect item and vote: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=520260

December 21, 2009 10:36 AM
 

David BAFFALEUF said:

Hi Linchi,

In my situation generating the DDL with 2 threads is 33% faster than the single-threaded model, but I don't use the same algoritm, more types of objects are handled, I have to filter encrypted functions and sp because the script() call results in an exception, etc...

But the real gain looks to be in Peter's statement. If you try to run the scripting with an empty plan cache for your database and with PARAMETERIZATION to FORCED, it will be orders of magnitude faster, because a lot more plans will be parameterized and reused appropriately. The performance kiss of death seems to be more related to this huge amount of ad hoc compilations rather than the single vs multi-threaded app. However in general, I agree, people should more consider multithreading in their apps.

Something else is puzzling me. From your results, I see that you achieve best performance with 6-8 threads. I assume your workstation does not have 6 or 8 cores, so their is an overbooking situation and at some point, context switches should throttle the script generation. From your own experience, how can you explain running with 6-8 threads on a (assuming) 2 cores workstation could be faster than 2 threads ?

Thanks,

David B.

July 30, 2010 3:55 AM
 

Ben Thul said:

Hi David,

If I were a betting man, I'd say that the reasoning for a speedup in the situation where you run more than two threads is that each thread issues a query (or many) to the SQL server and then has to wait on the results before it can use them.  In other words, you're not bound by the CPU of the machine running the SMO, but rather by SQL returning query results.  Run enough threads though and it does shift the pressure back to some resource on the SMO-running machine.

HTH,

Ben

July 30, 2010 7:59 AM
 

David BAFFALEUF said:

Thanks Ben. You must be right. Maybe this has more to do with SQL's IO completion routines.

David B.

August 1, 2010 5:53 AM

Leave a Comment

(required) 
(required) 
Submit

About Linchi Shea

Checking out SQL Server via empirical data points

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement