THE SQL Server Blog Spot on the Web

Welcome to - The SQL Server blog spot on the web Sign in | |
in Search

Linchi Shea

Checking out SQL Server via empirical data points

Why is it good to be comfortable with multi-threading?

If you are an application developer, you should find a new job if you can't do multu-threaded programming. What if you are a DBA? Probably not, not to the same extent anyway.

Some in our community clearly realize the importance of being able to do things in parallel. Adam Machanic, for instance, has put in a lot of efforts in this area and is trying hard to spread the message on parallelism.

Unfortunately, the community in general does not seem to be as convinced. Perhaps when you've become accustomed to finding workarounds to drive nails with a wrench, you may not realize that hammers are a much better tool for that task.


Although doing parallelism does not necessarily mean that every DBA should become conversant in multi-threaded programming, I’d argue that it’s a good skill to have, and once you are comfortable with it, you’ll find plenty of opportunity to fruitfully apply it.


Here is a little anecdotal evidence from my recent experience.


So I needed to automate the removal and creation of a lot of replication setups, and for that I fully automated the generation of the replication delete and create scripts. I also automated the execution of these scripts. However, one potential issue with executing these scripts is that since these are kind of DDL scripts, and as you may know, any DDL change can easily get blocked. Thus, I had to automate the monitoring of blocking. Furthermore, I needed to automatically remove any blocking, if the execution ofthe a replication script got blocked.


Now, this handling of blocking needed to be granular and precise in that I needed to clear a spid only if it’s blocking the execution of a replication delete or create script. The solution I ended up with is to have the code that controlled the replication script execution spawn a monitoring thread for each of the servers (i.e. the publishers and subscribers), and as the main control code cycling through these servers to apply the replication scripts, it goes through the following logic:


  • The main control code connects to the server on which a replication script is to be executed, and retrieves the spid of its own connection. It passes this spid to the monitoring thread so that the monitoring thread can focus on that connection only, and will not be bothered by any other activities that may be going on the same server.
  • The monitoring thread connects to the same server and waits for the go-ahead.
  • The main control code signals the monitoring thread to start monitoring.
  • The main control code executes the replication script.
  • The main control code signals the monitoring thread to stop monitoring and wait for a new replication script to be executed.

The actual logic is slightly more complex because when we execute the replication create/delete script on a publisher, the publisher will make a connection to the subscriber and/or the distributor, we don't want those connections be blocked either. The monitoring thread must take care of that.


Although most or many of the DBA work appear to be conducive to single-threaded automation, tasks like this are not. If you think using multi-threading for this task is an overkill or there is a better way, l’d appreciate it if you eave a comment.


Published Friday, July 08, 2011 1:19 PM by Linchi Shea



Mark Broadbent said:

Very good post Linchi, totally agree that a DBA should understand in at least some capacity the effects of parallelism and application threading. I personally believe that a DBA should have a good understanding since in order to become a truly great DBA it really helps knowing about everything that can effect your Database Platform. And this is one area that can kill concurrency and scalability if done incorrectly.

Thanks again.

July 27, 2011 7:12 AM
New Comments to this post are disabled

About Linchi Shea

Checking out SQL Server via empirical data points

This Blog


Privacy Statement