THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Rob Farley

- Owner/Principal with LobsterPot Solutions (a MS Gold Partner consulting firm), Microsoft Certified Master, Microsoft MVP (SQL Server), APS/PDW trainer and leader of the SQL User Group in Adelaide, Australia. Rob is a former director of PASS, and runs training courses around the world in SQL Server and BI topics.

Waiting, waiting…

“It just runs slow these days”

I’m sure you’ve heard this, or even said it, about a computer that’s a few years old. We remember the days when the computer was new, and it seemed to just fly – but that was then, and this is now. Change happens, things erode, and become slower. Cars, people, computers. I can accept that cars get slower. They lose horsepower over time as the precision components wear and become less precise. I also know that my youth is a thing of the past. But electronics? What happens there?

Well, in my experience, computers don’t get slower. They just feel slower. I see two main reasons, and neither of them are because of ageing hardware.

Your computer might be slower than it was yesterday even. In the world of databases we might even be investigating why the computer is slower than it was five minutes ago. Again, it’s probably not because of ageing hardware.

One possible reason is that we’re simply asking systems to do more. If we’re comparing our laptops to when we bought them, we’re probably refreshing webpages more frequently (often in the background) and have installed too many utilities (hopefully not in the background, but you never know), and the system has more to get done in a given minutes compared to when it was new. With a database server, the amount of data has probably grown, there may be more VLFs in the log file to deal with, more users pushing more transactions. These are not things you want to uninstall like that annoying browser search bar on your aunt’s ageing computer, but they can be a very valid reason for things to be slower. Hopefully you are tuning your system to make sure that scalability is possible, and you’re very happy with the amount of extra work that’s being done, even if it does mean that some processes take a little longer than they once did.

This problem can be summarised by the fact that the system is having to wait for resources to be free so that it can get its stuff done.

Another reason for slowness is that the system is having to wait more for other reasons, things that you don’t want it having to wait for. An increase in busyness will cause slowness because of waiting, but you can easily make the argument that this is ‘acceptable’. It’s much more of a problem if the system is being slower without actually achieving any more than it was before.

TSQL2sDay150x150Waits are the topic of this month’s T-SQL Tuesday, hosted by Robert Davis (@sqlsoldier). Go and have a look at his post to see what other people have written about on this topic.

In the SQL Server world, this kind of problem is identified by looking at wait stats. The system records what processes are waiting for, and you can see these by querying sys.dm_os_wait_stats. It’s very useful, but querying it in isolation isn’t as useful as taking snapshots of it. If you want to store copies of it over time, you may prefer to do something along the lines of:

--A schema for monitoring data can be useful
create schema monitoring;

--Create a table that has the structure of sys.dm_os_wait_stats
select top (0) *
into monitoring.waits
from sys.dm_os_wait_stats;

--Add a column to know the the stats are collected
alter table monitoring.waits
add snapshot_time datetime default sysdatetime();

--Run this section regularly
insert monitoring.waits (wait_type, waiting_tasks_count, wait_time_ms, max_wait_time_ms, signal_wait_time_ms)
select * from sys.dm_os_wait_stats;

Regularly collecting snapshots of wait_stats like this can give you a picture of what has occurred over time. You can easily pull this data into a report, or into Excel, or even get a picture of a recent version quite easily, using a query such as:

with numbered as (
select *,
    wait_time_ms - lead(wait_time_ms) over (partition by wait_type order by snapshot_time desc) as diff_wait_time,
    waiting_tasks_count - lead(waiting_tasks_count) over (partition by wait_type order by snapshot_time desc) as diff_wait_count,
    1000 * datediff(second,lead(snapshot_time) over (partition by wait_type order by snapshot_time desc),snapshot_time) as diff_ms,
    row_number() over (partition by wait_type order by snapshot_time desc) as rownum
from monitoring.waits
)
select wait_type, snapshot_time, diff_wait_count, diff_wait_time, diff_ms
from numbered
where rownum = 1
order by diff_wait_time desc, wait_type;

This query compares the amount of wait time for each type (which is frustratingly stored as a string) since the previous one, using the LEAD function that was introduced in SQL Server 2012 (LEAD rather than LAG because we’re looking at snapshot_time desc, not ASC). Using ROW_NUMBER(), we can easily pick out the latest snapshot by filtering to rownum = 1, but if you’re just wanting to chart them, the contents of the CTE will be enough.

Make sure you keep an eye on the amount of data you’re storing, of course, and be careful of the impact of someone inadvertently clearing the stats (though as the query picks up deltas, you should be able to consider a filter that will ignore the deltas that might have spanned a period during which the stats were cleared).

This post is not going to go into all the different wait types to tell you which ones are worth worrying about and which ones are worth ignoring. But what I would suggest to you is that you track what’s going on with your environment and keep an eye out for things that seem unusual. When troubleshooting, you will find any history invaluable.

Published Tuesday, December 10, 2013 11:17 AM by Rob Farley
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Robert L Davis said:

Thanks for participating in T-SQL Tuesday #49!

December 13, 2013 4:14 PM
 

murali said:

Hey Rob,

Thanks for the code, do you mind helping the same with SQL SERVER 2008 or 2008 R2. As We donot have LEAD or LAG functions.

Thanks in advance.

Regards

Murali

December 23, 2013 10:26 AM
 

Rob Farley said:

Sorry - forgot to get back to this after Christmas.

Use this code to avoid using 2012 features:

with numbered as (

   select *, row_number() over (partition by wait_type order by snapshot_time desc) as rownum

   from monitoring.waits

)

select  w1.wait_type, w2.snapshot_time,

       w1.waiting_tasks_count - w2.waiting_tasks_count as diff_wait_count,

       w1.wait_time_ms - w2.wait_time_ms as diff_wait_time,

       1000 * datediff(second, w2.snapshot_time, w1.snapshot_time) as diff_ms

from numbered w1

join numbered w2 on w2.wait_type = w1.wait_type

where w1.rownum = 1 and w2.rownum = 2

order by diff_wait_time desc, wait_type;

January 27, 2014 4:49 AM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement