THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Adam Machanic

Adam Machanic, Boston-based SQL Server developer, shares his experiences with programming, monitoring, and performance tuning SQL Server. And the occasional battle with the query optimizer.

A Brief History of Monitoring (A Month of Activity Monitoring, Part 1 of 30)

This post is part 1 of a 30-part series about the Who is Active stored procedure. A new post will run each day during the month of April, 2011. After April all of these posts will be edited and combined into a single document to become the basis of the Who is Active documentation.


"Hey DBA! Why is the application so slow?"

"Hey DBA! Why is my query taking, like, forever to return the results?"

"Hey DBA! Something is broken! Fix it, quick!"

If you've been a DBA for 10 minutes or more, you've no doubt heard all of the above statements and every possible transformation of the above statements. A big part of every DBA's job is to understand what's going on when things misbehave. And real-time activity monitoring plays a big part in the quest for insight: if something is broken right now, we need to understand exactly what components are malfunctioning and use that data to quickly make decisions about which course of action to take. The worst thing a DBA can do is to base a decision on an uninformed guess.

In order to avoid guesswork, a monitoring solution must provide plenty of data. Data about the component that's malfunctioning, data about what else is running on the system, and if possible contextual data to help the DBA understand how the system got into the state it's in. That's a lot of data, and over the course of the history of SQL Server the level of availability for this kind of information has ranged from totally unexposed (SQL Server 2000 and earlier) to exposed but difficult to access (SQL Server 2005 and beyond).

 

SQL Server 7.0 and SQL Server 2000: Squinting Through the Fog

Back in the bad old days of Enterprise Manager, if you wanted to know what was running on your instance you could right-click and select "Current Activity." If Enterprise Manager didn't lock up or take some other completely unpredictable route, the user was rewarded with a list of server processes identifiers, a terse column called "command" that gave a very general indication of what each of those processes was up to, and some basic metrics--CPU, reads, writes, etc--that were known to be woefully inaccurate in most cases.

More advanced users quickly learned to avoid Enterprise Manager altogether. The same information could be gleaned within Query Analyzer by using the sp_who or sp_who2 procedures, or by querying the sysprocesses view. While using Query Analyzer made data collection faster than it was from the Enterprise Manager user interface, the information was still of the same level of quality--or lack thereof. The screen shot below illustrates the state of the art information shown by these tools. Session 54 seems to be kind of busy, I guess? I wonder what it’s up to? Well, since I have no clue and the server is slow, I should probably just kill it.

F1_01_sp_who2

Users who were geeky enough to read internals books knew how to get just a bit more information about what the offending session was doing—sort of. The DBCC INPUTBUFFER command would return information about the most recent SQL batch that had been submitted to the server on behalf of the request. This is much better than simply seeing that session 54 is doing some kind of select, but it’s also quite limiting. The SQL submitted might have been a non-parameterized ad hoc batch, in which case seeing what was happening was easy. But as more and more developers learned to use stored procedures, DBCC INPUTBUFFER often returned something as simple as “EXEC SomeStoredProcedure” – and if the procedure had been called via RPC, it wasn’t even possible to see the parameters that were passed in. (And, I should add, it’s still not possible now. A lot has changed in 13 years, but not enough.)

In order to gain more visibility into what was going on on the server, many DBAs forgot about these commands altogether and employed another tool that shipped with SQL Server: Profiler. Most of the DBAs I worked with in the late '90s and early '00s kept Profiler open and attached to the various production instances, all day long. Various information would constantly scroll by and, if there was a problem, the stream could be stopped and the DBA could scroll up and down and try to figure out exactly what the situation was. This technique had its plusses and minuses, to be certain: Profiler showed a lot more information—enough to actually figure out the problem in many cases. But it also showed a lot more information—enough to be overwhelming in many cases. And then there was the fact that Profiler could slow down your entire instance of SQL Server. Which was especially problematic when a whole team of DBAs were all working on the same server, and all had Profiler attached and streaming information.

 

SQL Server 2005: Information Overload

Clearly, the monitoring situation in SQL Server 2000 was pretty bad. And luckily, Microsoft got the memo. SQL Server 2005 shipped with a set of new monitoring objects called Dynamic Management Views (DMVs). These objects returned a huge amount of information that had never been available in the SQL Server 2000 system views. Many situations that had previously been possible to debug only by getting information from Profiler or a server-side trace were suddenly possible to deal with by running a few SQL queries. The only problem was that for a long time, no one seemed to know exactly how to write the correct queries. With scores of DMVs, each with scores of columns, things were overwhelming, and busy DBAs simply didn’t have the time to properly adapt.

Even Microsoft didn’t seem to be able to leverage these powerful new views. The screen shot below is from SQL Server 2008’s Activity Monitor. The 2008 UI is a lot sharper than the SQL Server 2000 Current Activity UI, and more data is returned by Activity Monitor, but the situation is basically the same as it ever was. Session 54 is still chugging away, doing, well... something.

F1_02_Activity_Monitor

To be fair, I can now right-click on any of these rows and find out what SQL is being run by this session. But the user experience is still not even close to user-friendly, and Activity Monitor has a large number of bugs and strange behaviors. Why does it auto-refresh every five seconds? What if I was looking at something? And why do I see a million rows for session 54? There was only one request, wasn’t there?

Even today, in 2011, many DBAs I talk to are still using sp_who and sp_who2. A lot of them are still relying on streaming information from SQL Server Profiler. And while people have finally learned to leverage the DMVs, they often use small, standalone ad hoc scripts and query one or two DMVs to find a very specific bit of information. We’ve progressed from a situation where there is not enough information, to a situation where there are too many places to go for the information that really matters. A lateral move at best.

 

A Personal Journey

In 2007 I decided to get serious about the DMVs, and I began working on a script to help with monitoring. The first version was posted to my blog on December 31, 2007.

Now, four years later, I have taken that script through countless iterations and I’ve learned a tremendous amount about the various DMVs along the way. The Who is Active stored procedure correlates a large amount of data from 15 of the DMVs, to allow DBAs to get a complete picture when doing real-time activity monitoring. Although my stored procedure has been well-received and is designed to make it easy to get information from the DMVs, it has a large number of options and a few quirks. So it's no surprise that I have received numerous requests for in-depth documentation. That’s the point of this blog series: Over the course of the month I will take you through every corner of Who is Active. I will explain how I use it to do troubleshooting on a daily basis, and I'll give you some insight into how it works and why.

Tomorrow I will continue the series with an overview of the design philosophy and goals of Who is Active.

 

Homework

If you still use sp_who2, tell me in the comments below what the LastBatch column represents. I’ll reveal the answer in tomorrow’s post.

Published Friday, April 01, 2011 11:00 AM by Adam Machanic

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Sean Shanny said:

As always informative and entertaining.  I use your script a gazillion times a day and cannot imagine life without it.

April 1, 2011 12:30 PM
 

DaniSQL said:

Thanks for your work. I use your script all the time but i know it can do a lot more and I cant wait to read all the series...I even asked you on twitter for help months ago regarding documentation.

April 1, 2011 1:22 PM
 

Ken Trock said:

I have the script and am looking forward to this series.

April 1, 2011 2:44 PM
 

Adam Machanic said:

This post is part 2 of a 30-part series about the Who is Active stored procedure. A new post will run

April 3, 2011 12:58 PM
 

aspiringgeek said:

Beautiful!  Can't wait for more!

April 3, 2011 3:38 PM
 

fase hf said:

well, you are trying

April 16, 2011 2:03 PM
 

monitor said:

try some more

April 16, 2011 2:05 PM
 

Adam Machanic said:

This post is part 8 of a 30-part series about the Who is Active stored procedure. A new post will run

April 20, 2011 10:40 AM
 

Adam Machanic said:

This post is part 30 of a 30-part series about the Who is Active stored procedure. A new post will run

April 30, 2011 11:44 AM
 

Richard said:

Was this 30-part series ever consolidated into documentation?

July 27, 2013 6:03 AM
 

Adam Machanic said:

@Richard: Unfortunately, no. It's still on my potential to do list but I decided to focus my energy on something else in the meantime.

July 27, 2013 4:07 PM
 

Eva Arriaga said:

Hi Adam,

New to the SQL world.. learning SQL 2012.. I want to read all of the A Brief History of Monitoring, and to use your Script once I understand it..

I read part 1 here, but cannot find the rest.

Where is it?

thanks,

Eva

January 22, 2014 2:00 PM
 

Eva Arriaga said:

I found it.. thanks!  Does your script works for SQL 2012?

January 22, 2014 2:11 PM
 

Adam Machanic said:

@Eva

Yes, it works with all current versions of SQL Server 2005 and up.

--Adam

January 22, 2014 8:56 PM
 

Kumar Duvvuri said:

Does the Script works with SQL Server 2014

Thanks

May 7, 2014 1:05 PM
 

Mike said:

This single best free tool on the market for SQL Server DBAs. Thanks Adam. I use it every single day.

August 26, 2014 1:49 PM

Leave a Comment

(required) 
(required) 
Submit

About Adam Machanic

Adam Machanic is a Boston-based SQL Server developer, writer, and speaker. He focuses on large-scale data warehouse performance and development, and is author of the award-winning SQL Server monitoring stored procedure, sp_WhoIsActive. Adam has written for numerous web sites and magazines, including SQLblog, Simple Talk, Search SQL Server, SQL Server Professional, CoDe, and VSJ. He has also contributed to several books on SQL Server, including "SQL Server 2008 Internals" (Microsoft Press, 2009) and "Expert SQL Server 2005 Development" (Apress, 2007). Adam regularly speaks at conferences and training events on a variety of SQL Server topics. He is a Microsoft Most Valuable Professional (MVP) for SQL Server, a Microsoft Certified IT Professional (MCITP), and an alumnus of the INETA North American Speakers Bureau.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement