THE SQL Server Blog Spot on the Web


Louis Davidson

One more reason to understand query plans, not directly performance related

One of the things that separates a good programmer from a great one is a firm understanding of what is going on inside the computer. For some programming languages this is obvious, because you are working at a very low level. For example, if you are a C/C++ programmer writing an OS, you will know a lot about the hardware because you interact with it directly. As a .NET programmer you are more encapsulated from the hardware, making use of the .NET Framework.

None of the aforementioned languages comes anywhere close to the level of encapsulation at which we SQL programmers work. When you execute a statement like:

SELECT *
FROM    Tablename

A firestorm of code is executed to optimize your query, find the data on disk, fetch that data, format it for presentation, and then send it to the client. And that is the super dumbed-down version. SQL is a declarative language: we basically formulate a question or task for the system to execute without telling it how. It is my favorite type of language, because all of the pushing of bits around gets tedious. However, what is important for the professional SQL programmer is to have some understanding of what is going on under the covers: query plans, disk I/O, CPU, etc. Not necessarily to the depth that Glenn Alan Berry (http://sqlserverperformance.wordpress.com/) does, but certainly a working knowledge.

Performance is the obvious reason, since it is clearly valuable to be able to optimize a query, but sometimes it can also come in handy for debugging an issue you are having with a query. Today, I ran across an optimizer condition that, while perfectly understandable in functional terms, would have driven me close to nuts if I hadn’t been able to read a query plan. The problem came in based on the number of rows returned: either the query worked perfectly or it failed with an overflow condition. Each query seemingly touched the exact same rows in the table where the overflow data exists…or did it?

The setup: the real query in which the problem was discovered ran in our data warehouse, a star schema with 20+ joins. In the reproduction, I will use a simple table of numbers to serve as the primary table of the query.

create table ExampleTable  -- It really doesn’t matter what this table has. The dateValue column
                           -- will be used to join to the date table, which I will load from the
(                          -- values I put in this table to make sure all the data exists
    i int constraint PKExampleTable primary key,
    dateValue date
)

;with digits (i) as( --(The code for this comes from my snippet page: http://www.drsql.org/Pages/Snippets.aspx).
                    select 1 as i union all select 2 as i union all select 3 union all 
                    select 4 union all select 5 union all select 6 union all select 7 union all
                    select 8 union all select 9 union all select 0)
,sequence (i) as ( 
                    select D1.i + (10*D2.i) + (100*D3.i) + (1000*D4.i) + (10000*D5.i) + (100000*D6.i) 
                    from digits as D1, digits AS D2, digits AS D3 ,digits AS D4, digits as D5, digits As D6
)
insert into ExampleTable(i, dateValue)
select i, dateadd(day, i % 10,getdate()) -- Puts in 10 different date values
from sequence
where i > 0 and i < 1000
order by i

Next I will load the date table with all of the distinct dateValue values that we loaded into ExampleTable, plus one more row containing the maximum value for the date datatype. In the “real” world case, this is one of the surrogate “null” values we use to indicate an end date. (Yes, we are ignoring the Y10K problem.)

create table date
(
    datevalue date constraint PKDate primary key
)

insert into date
select distinct dateValue
from   ExampleTable
union all
select '99991231'
go

In typical usage, the number of rows is quite small. In our queries, we add 1 day to the dateValue to establish a range of a day (in the real query it was actually a month). Executing the following query, which returns 99 rows, is successful:

select *, dateadd(day,1,date.dateValue)
from   ExampleTable
         join date
            on date.dateValue = ExampleTable.dateValue
where  i < 100

However, remove the where clause (causing the query to return 999 rows):

select *, dateadd(day,1,date.dateValue)
from   ExampleTable
          join date
             on date.dateValue = ExampleTable.dateValue

And you will see that this results in an overflow condition...

Msg 517, Level 16, State 3, Line 2
Adding a value to a 'date' column caused an overflow.

Hmmm, this could be one of those days where I don’t get a lot of sleep :).  Next up, I check the max date value that can be returned.

--show that the top value that could be returned is < maxdate
select max(date.dateValue)
from   ExampleTable
         join date
            on date.dateValue = ExampleTable.dateValue

At this point, I start feeling like I am going nuts. The value returned is 2013-01-30, so no data that should be too large for our date column is actually returned… So then I think: well, let's add one to that value and take the max:

select max(date.dateValue), max(dateadd(day,1,date.dateValue))
from   ExampleTable
         join date
            on date.dateValue = ExampleTable.dateValue

This returns, mockingly:

Msg 517, Level 16, State 3, Line 2
Adding a value to a 'date' column caused an overflow.

So, since it worked with fewer rows earlier, I decide to try lowering the number of rows again, this time using a derived table, and it DOESN’T error out, even though it is obvious (because I stacked the deck…er, data) that the same data is just repeated for the dateValue, particularly since we get the same max dateValue as we did earlier.

select max(date.dateValue), max(dateadd(day,1,date.dateValue))
from   (select top 100 * from ExampleTable order by i) as ExampleTable
         join date
            on date.dateValue = ExampleTable.dateValue
 
       
Well, you are possibly thinking, this just doesn't make sense. That is how I felt too after trying to do the logic in my head. I will admit that if I hadn’t known about query plans I would have been completely lost. But alas, the answer was fairly easily located in the plan. Taking a look at the plan for the version of the query that returns 99 rows:

select *, dateadd(day,1,date.dateValue)
from   ExampleTable
         join date
            on date.dateValue = ExampleTable.dateValue
where  i < 100

We get the following estimated plan:

[image: estimated query plan showing a Nested Loops join]

In this plan, it uses a Nested Loops operator, which will basically do ~100 seeks driven by the top input (ExampleTable), fetching the matching date value for each row, and then calculating the scalar value (dateadd(day, 1, date.dateValue)) only on the values that match. Since the 9999-12-31 date is never matched, there is no overflow.
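To make the row-at-a-time behavior concrete, here is a minimal Python sketch of the Nested Loops strategy (a hypothetical simplification for illustration, not SQL Server's actual implementation), using Python's date.max as a stand-in for the 9999-12-31 surrogate:

```python
from datetime import date, timedelta

def nested_loops_join(outer_rows, date_index):
    # For each row from the outer (ExampleTable) input, seek into the
    # date table's "index"; the scalar -- the analog of
    # dateadd(day, 1, dateValue) -- is computed only for rows that match.
    results = []
    for d in outer_rows:
        if d in date_index:                             # index seek
            results.append((d, d + timedelta(days=1)))  # scalar after match
    return results

# The surrogate max date sits in the index but is never matched,
# so its scalar is never computed and nothing overflows.
date_index = {date(2013, 1, 30), date.max}
rows = [date(2013, 1, 30)] * 99
print(len(nested_loops_join(rows, date_index)))  # 99
```

Because the scalar runs after the match, the surrogate row never reaches it, which is exactly why the 99-row query succeeds.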

However, when the size of the output reaches a certain threshold (in this case 999 rows instead of 99), as in the following query:

select *, dateadd(day,1,date.dateValue)
from   ExampleTable
         join date
            on date.dateValue = ExampleTable.dateValue

We get a different plan, one that is causing us issues:

[image: estimated query plan showing a Hash Match join]

Instead of Nested Loops, it uses a Hash Match join, which takes the entirety of the smaller input and builds an internal hash index (basically setting up buckets that can be scanned much faster than an entire table…in our case, probably just a single bucket), and then scans the other input, checking whether each row exists in the hash index.

It is in building the hash index that our query runs into trouble. Since the date table is so much smaller, the plan builds the hash index on that table and pre-computes the scalar values during that scan; with only 11 rows in the date table, that is cheaper than calculating the value 999 times after the join. But when it adds a day to the 9999-12-31 date, it fails.
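The failure mode can be sketched the same way in Python (again a hypothetical simplification, with date.max standing in for 9999-12-31): the build phase of a hash match computes the scalar for every row of the build input up front, including rows the probe side will never match:

```python
from datetime import date, timedelta

def hash_match_join(build_rows, probe_rows):
    # Build phase: hash the smaller (date table) input, pre-computing
    # the scalar for every build row -- including rows that will never
    # match anything on the probe side.
    hash_index = {}
    for d in build_rows:
        hash_index[d] = d + timedelta(days=1)  # the analog of Msg 517 fires here
    # Probe phase: stream the larger (ExampleTable) input past the hash index.
    return [(d, hash_index[d]) for d in probe_rows if d in hash_index]

# Even though no probe row ever matches date.max, the build phase
# still tries to add a day to it, and Python raises OverflowError --
# just as SQL Server raises its overflow while building the hash.
build = [date(2013, 1, 30), date.max]
probe = [date(2013, 1, 30)] * 999
try:
    hash_match_join(build, probe)
except OverflowError:
    print("overflow during build phase")
```

Same data, same matches, but the point at which the scalar is evaluated moves ahead of the join, and that is the whole bug.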

I know, the question of just how practical this scenario is will come up. I won’t lie to you and suggest that it is likely to happen to you as it did to me. However, the point of this blog isn’t that this one scenario is bound to happen to you, but rather that understanding how SQL Server executes queries will give you insight to fix problems with your system: mostly performance problems, but sometimes even esoteric issues that won't just leap out as being caused by the query plan that was chosen. (For more reading on query plans, check out Grant Fritchey’s Simple-Talk book on execution plans: http://www.amazon.com/Server-Execution-Plans-Grant-Fritchey/dp/1906434026.)

In the end, the fix to my problem was simple: make sure that the value that has meaning in the table, but not in the query, is filtered out:

select *, dateadd(day,1,date.dateValue)
from   ExampleTable
         join date
            on date.dateValue = ExampleTable.dateValue
               and date.dateValue < '9999-12-31'

Note: a commenter noted that in some cases, excluding the offending data using the ON criteria/WHERE clause may not solve the issue. This is very true, and it too will be made evident in the plan. I would expect the row to be more reliably excluded in the JOIN criteria, but you really can't guarantee what the optimizer might do without changing the source data (or representing the source data with a derived table, as follows):

select *, dateadd(day,1,date.dateValue)
from ExampleTable
       join (select * from date where date.dateValue < '9999-12-31') as date
            on date.dateValue = ExampleTable.dateValue      
             

Looking at the different variations of the plan, you should be able to diagnose a "hidden" problem such as I have described by following the flow of data and making sure that the filtering operation happens before the calculation of the scalar that causes the overflow error. This filtering may harm performance in the more "ideal" cases where indexes could have been used, so you may yet have more work to do…but this is what makes data programming fun, now isn't it?

Published Tuesday, January 22, 2013 11:28 PM by drsql

Comments

 

SQLChap said:

I've seen that issue with dateadd happening even with the WHERE clause in place, for the same reason: it was completing the dateadd function before computing the WHERE criteria.

January 23, 2013 7:49 AM
 

drsql said:

SQLChap: Good point that I will add tonight when I am at my writing computer :)  One thing to note is that putting the criteria in the JOIN criteria is probably a good bit safer than in the WHERE clause. In the JOIN criteria, it is defining the set and is definitely going to be applied as you expect.  In the WHERE clause, other JOINs (particularly any OUTER JOINS) could be dictating that the criteria not be applied until later in the plan.

Thanks!

January 23, 2013 11:30 AM
 

Russell Tye said:

Hi Louis,

Thanks for sharing. That was a great demo. I started reading Grant Fritchey's book about two weeks ago. I strongly recommend it also. It should be required reading for all SQL developers and DBAs.

January 26, 2013 4:13 PM
 

Tom said:

Hi Russell,

Which of Grant's books?

January 27, 2013 7:13 AM
 

Richard said:

Couldn't you use a CASE clause instead?

select *,
       case when date.dateValue < '9999-12-31' then dateadd(day, 1, date.dateValue) end
from   ExampleTable
         join date
            on date.dateValue = ExampleTable.dateValue

January 28, 2013 1:15 PM
 

drsql said:

Richard:

Good point; that too would solve the overflow issue and would produce the same results, though with the same query plan. I would probably try the JOIN ON criteria first if I were dealing with an issue, as it gets at the root of the problem, but in any case, the plan can help guide you to the "best" solution for your case.

January 28, 2013 1:58 PM
 

Don said:

A scalar function operates on 1 row at a time. It is much better to operate on sets of data instead of 1 row at time. This can be done by generating a temp or table variable with a computed column containing the data in the form you want. Then you can select off that table. By removing the row processing you increase performance.

February 13, 2013 2:02 PM
