THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Andrew Kelly

When GETDATE() is not a constant

A short while ago I was collecting wait stat information at a client and ran across a very peculiar situation that I would like to share. Let me start by saying that for years I have coded with the understanding that when you include a system function in the SELECT list of a TSQL statement the function was evaluated once at the beginning and that same value was used for each row returned. I am talking about a statement such as this:

                    SELECT GETDATE(), CompanyName FROM Customers

 

The output expected looks like this:

2008-02-27 10:22:34.270    Alfreds Futterkiste
2008-02-27 10:22:34.270    Ana Trujillo Emparedados y helados
2008-02-27 10:22:34.270    Antonio Moreno Taquería
2008-02-27 10:22:34.270    Around the Horn
2008-02-27 10:22:34.270    Berglunds snabbköp
2008-02-27 10:22:34.270    Blauer See Delikatessen
2008-02-27 10:22:34.270    Blondesddsl père et fils
2008-02-27 10:22:34.270    Bólido Comidas preparadas

...

Please note that I am not talking about a User Defined Function or once that takes a column as an input to determine the result. In this case I am specifically referring to GETDATE().  As you can see all the datetime values are exactly the same as expected.

     But what I experienced the other day was not as expected and quite concerning. What I got was for a single SELECT I received several different values for the GETDATE() column in the result set. This did not happen every time but happened enough times over a few days that I certainly took note of it. Now let me give a little more background because it was not just a SELECT. It was actually an INSERT INTO with the SELECT from a DMV. Not that any of this should matter anyway but for consistency sake let me give you the actual code (with a slight enhancement for demo purposes).  I added an extra column called R_ID that is used to store the unique value of each loop and I placed the Insert in a WHILE loop so it can be exercised.  In real life the Insert was only executed several times each day.  The code below can be used to see if your system is experiencing this behavior or not. Depending on the version and service pack you may have a different number of Waits but in my case with Microsoft SQL Server 2005 - 9.00.3159.00 (Intel X86) I get 201 rows for each pass. I suspect the version has everything to do with this behavior. The system at the time was running an older version of SQL Server 2005 which was:  9.00.2047.00.  If anyone finds that their server returns different values of GETDATE() for any iteration of the select I would really be interested in what version of SQL Server you are running.  There is a LOT of code out there that relies on the value acting like a constant and having the same value in each row of a single SELECT statement. I suspect this is a bug in that particular version but who knows...  

Please note that the sole purpose of the WHILE loop is just to give you a better chance of seeing the issue if it appears. We are looking for a difference in the datetime values for each instance of the SELECT only and not from loop to loop.

SET NOCOUNT ON

IF OBJECT_ID(N'[dbo].[wait_stats]',N'U') IS NULL
    CREATE TABLE [dbo].[wait_stats]
        ([R_ID] INT not null,
        [wait_type] nvarchar(60) not null,
        [waiting_tasks_count] bigint not null,
        [wait_time_ms] bigint not null,
        [max_wait_time_ms] bigint not null,
        [signal_wait_time_ms] bigint not null,
        [capture_time] datetime not null default getdate())

DECLARE @x INT
SET @x = 1

WHILE @x < 100
BEGIN

    INSERT INTO [dbo].[wait_stats] ([R_ID], [wait_type], [waiting_tasks_count], [wait_time_ms], [max_wait_time_ms], [signal_wait_time_ms], [capture_time])   
        SELECT @x, [wait_type], [waiting_tasks_count], [wait_time_ms], [max_wait_time_ms], [signal_wait_time_ms], GETDATE()
            FROM sys.dm_os_wait_stats

    SET @x = @x + 1
END

--  Find the ones that have odd counts. If this returns any rows you had a difference in time for a single itteration.

SELECT [R_ID], COUNT(*) AS [Totals], [capture_time]
    FROM [dbo].[wait_stats]
GROUP BY [R_ID], [capture_time] HAVING COUNT(*) <> 201

 **Updates**

I have some new and very important information about this subject and chose to put it in a new blog post that can be found here:

http://sqlblog.com/blogs/andrew_kelly/archive/2008/03/01/when-a-function-is-indeed-a-constant.aspx

 

 

Published Wednesday, February 27, 2008 11:53 AM by Andrew Kelly
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Linchi Shea said:

How far apart are these different values?

February 27, 2008 11:14 AM
 

Andrew Kelly said:

I don't have the results anymore as I had to change their values to all be the same to get my report to work but I believe they were several milliseconds apart.  They should not be even microseconds apart.

February 27, 2008 11:25 AM
 

Denis Gobo said:

Don't know if this is of any help to you

Microsoft SQL Server 2005 - 9.00.3042.00 (Intel X86)  

output first 10 rows from first query

72 201 2008-02-27 11:27:58.113

82 201 2008-02-27 11:27:58.160

16 201 2008-02-27 11:27:57.833

1 201 2008-02-27 11:27:57.037

61 201 2008-02-27 11:27:58.067

24 201 2008-02-27 11:29:29.957

92 201 2008-02-27 11:27:58.207

91 201 2008-02-27 11:29:30.330

5 201 2008-02-27 11:29:29.880

71 201 2008-02-27 11:29:30.223

second query returns nothing

sql server 2008 ctp 6 second query returns 99 rows. here are the first 10 (this is from a virtual machine BTW)

66 475 2008-02-27 08:37:38.577

27 475 2008-02-27 08:37:36.977

35 475 2008-02-27 08:37:37.327

41 475 2008-02-27 08:37:37.557

19 475 2008-02-27 08:37:36.557

92 475 2008-02-27 08:37:39.920

29 475 2008-02-27 08:37:37.057

10 475 2008-02-27 08:37:36.157

81 475 2008-02-27 08:37:39.450

25 475 2008-02-27 08:37:36.877

February 27, 2008 11:43 AM
 

Barry Hughes said:

getdate() will return the time to nearest millisecond as each row is returned from select.  If query takes multiple milliseconds to run then getdate() will faithfully report that.  Try setting variable to getdate() immediately before query then use variable in query - time will then be the same on each row and will be the time the query started (it takes no time to set the variable in milliseconds)

February 27, 2008 12:17 PM
 

Andrew Kelly said:

Dennis,

Yes the times are expected to be different from Loop to Loop so what you show is normal. I didn't realize we were up to 475 wait types though:).  It's only an issue if the second query with the HAVING clause returns anything.  Thanks

February 27, 2008 12:47 PM
 

Andrew Kelly said:

Barry,

No it should not, that is the whole point of this. First off just for completeness it wouldn't change every ms since the lowest level of accuracy is 3.33ms.  But in any case here is an example by Louis Davidson that shows the differences between GETDATE() and a UDF that returns datetime info in the same query.  

use tempdb

go

create function dbo.test$wait

() returns datetime

as

   begin

       declare @i int set @i = 1

       while @i < 10000

         begin

           set @i = @i + 1

         end

       return (getdate())

   end

go

select getdate() AS [Getdate],dbo.test$wait() AS [UDF]

from   master.sys.sysobjects

February 27, 2008 12:56 PM
 

Adam Machanic said:

Andy,

Why would you expect the same value for every row?  Is that documented in BOL?  I personally expected different values for every row, like Barry mentioned, if the GETDATE() is in the SELECT list.  OTOH, I expect that the same value would be used for every row if GETDATE() is in the WHERE clause.  I vaguely recall verifying the SELECT list behavior in SQL Server 7.0 for a big ETL process I was working on at the time; I don't think this is a bug at all.

February 27, 2008 1:29 PM
 

Andrew Kelly said:

I expect it that way because a long time ago the MVP's were told (can't remember who it was unforntunately) that this was an optimization technique in which GEDATE() was only evaluated once at the beginning of the query instead of for each row.  At this point I don't care if I am right or wrong in that regard but what I do want is for it to be consistant one way or the other. The example that Louis provided shows that it does not get evaluated for each row otherwise the times would be similar to the UDF's output since it does get evaluated for each row.  Again I am OK either way but it needs to be clear which is the proper and expected behavior. In my experience over the last x many years this is the only time I have ever seen GETDATE() not return the same value.  I created this blog entry for the sole purpose of brining this to light as it is very clear now that there is code out there written two ways, each expecting a different behavior.

February 27, 2008 1:43 PM
 

Denis Gobo said:

>> I personally expected different values for every row

I remember having different values for every n rows, when importing 30k of rows from a nightly job a while back. That is why I used a variable back then because it was required that the rows had the same value

February 27, 2008 2:00 PM
 

Linchi Shea said:

I believe the value should be constant, but need to find doc to back myself up on this one. The reason I remember it is that I have had long discussions with our DB2 guru on a similar topic and in DB2 you can actually specify which behavior you want (constant or change as the rows are evaluated).

BTW, I tried >insert junk select id, ..., getdate() from _3_million_row_table<, and there was only one dt value in the target table after all 3 million rows were inserted.

February 27, 2008 2:40 PM
 

Denis Gobo said:

I am getting the same values also

Now I am not sure if what I said before is true maybe I was getting the same value and I needed the inserted value

select s1.id,getdate() as TheDate

into #sysobjects

from sysobjects s1 cross join sysobjects s2

cross join sysobjects s3

select min(thedate),max(thedate),count(*) from #sysobjects

drop table #sysobjects

SQL 2000

2008-02-27 14:45:41.677 2008-02-27 14:45:41.677 14706125

sql 2005

2008-02-27 14:47:46.477 2008-02-27 14:47:46.477 1191016

sql 2005 with an additional cross join (cross join sysobjects s4)

2008-02-27 14:48:31.040 2008-02-27 14:48:31.040 126247696

you can always do something like this if you want different values

create function fnThedate()

returns datetime

as

  begin

      return (getdate())

end

go

select dbo.fnThedate(),* from master..sysobjects s1

February 27, 2008 3:02 PM
 

AaronBertrand said:

I agree with Andy that it *should* be evaluated only once, but I have long been using a variable as a "constant" since there was no official statement that the suddenly elusive optimization will always take place.

Kind of reminds me of SELECT * without the ORDER BY... it was undocumented but almost guaranteed behavior in older versions of SQL Server, but not anymore...

February 27, 2008 4:34 PM
 

AaronBertrand said:

I also agree that it should either always work one way or always work the other.  The repros in this thread show that the behavior is not consistent.

February 27, 2008 4:43 PM
 

Ted said:

We always store GetDate() in a local varaible and then use that local variable in selects or stored procedures so that all of the date values are the same for that execution.

Our concern with GetDate() and dates in general is that there is no easy way to build a date apart from either constructing a text string of a date or using dateadd().  Is there any TSQL function to build a date similar to CREATEDATE(year, month, day, hour, minute, second, millisecond, local)

where each of the parameters are integers?  ANSI SQL missed that one.

February 28, 2008 11:25 AM
 

Nick said:

What you were told was correct, you are just looking at it wrong.

The GETDATE() is evaluated once for each statement. It is NOT evaluated once for each batch. You effectively have 100 insert statements in your batch, not one insert statement.

To make this more obvious, put a waitfor delay '00:00:01' before your SET @x = @x + 1  statement

If it behaved like you imply there would be no way to determine start and end time of a stored procedure from within it, since GETDATE() would evaluate to the same value at the beginning and end of the sp no matter how many milliseconds of work it had done with that particular execuation.

February 28, 2008 11:20 PM
 

Andrew Kelly said:

Nick,

The Batch or WHILE loop was simply used to make it easier for people to run this example multiple times in order to have a better chance of seeing the anomoly. It did not happen to me at the client site each time and I didn't expect it to happen for anyone testing each time either.  Yes each iteration of the loop will potentially produce a different time, especially if you add a WAITFOR in there.  But the times for the individual SELECTs (not Inserts since the GETDATE() is actually part of the SELECT) should remain constant.  If you run the example I posted after the fact you will see this more cleary. Look for the foloow on post that has this statement in it:

select getdate() AS [Getdate],dbo.test$wait() AS [UDF]

from   master.sys.sysobjects

February 29, 2008 8:31 AM
 

Andrew Kelly said:

March 1, 2008 11:19 AM
 

Jeff Moden said:

I can't prove it anymore, but I seem to remember that SQL Server 7 would produce different dates.  That certainly doesn't appear to be the case in these later revs.  In fact, table defaults also react in the very same manner as you describe... one date per insert query.  Thanks for bringing this to my attention... it doesn't really change things but it certainly seems important to know (A Developer must not guess... A Developer must KNOW  --Segiy circa 2007).

Here's the code I used to test in both SQL Server 2000 sp3A (no hotfixes) and SQL Server 2005 sp2 (no hotfixes)... Since each Grouped SELECT returns only a single row, it's proof that GETDATE() was only calculated once...

--==============================================================================

--Proof that GETDATE() is only calculated once per query.

--==============================================================================

--===== Generate a table on the fly with 2 million rows of GETDATE() inserts

    -- and measure/display the time it took.

DECLARE @StartTime DATETIME

   SET @StartTime = GETDATE()

SELECT TOP 2000000

       GETDATE() AS TheDate,

       sc1.ID AS SomeColumn

  INTO #MyHead

  FROM Master.dbo.SysColumns sc1,

       Master.dbo.SysColumns sc2

 PRINT STR(DATEDIFF(ms,@StartTime,GETDATE())) +' Duration in MS'

--===== If this returns more than 1 row, there were different times.

SELECT TheDate, COUNT(*) AS TheCount

  FROM #MyHead

 GROUP BY TheDate

DROP TABLE #MyHead

GO

--===== This time, create a table with a default on TheDate column.

    -- Do 2 million inserts letting the default populate the column and

    -- measure/display the time it took.

CREATE TABLE #MyHead (TheDate DATETIME DEFAULT GETDATE(), SomeColumn INT)

DECLARE @StartTime DATETIME

   SET @StartTime = GETDATE()

INSERT INTO #MyHead

       (SomeColumn)

SELECT TOP 2000000

       sc1.ID AS SomeColumn

  FROM Master.dbo.SysColumns sc1,

       Master.dbo.SysColumns sc2

 PRINT STR(DATEDIFF(ms,@StartTime,GETDATE())) +' Duration in MS'

--===== If this returns more than 1 row, there were different times.

SELECT TheDate, COUNT(*) AS TheCount

  FROM #MyHead

 GROUP BY TheDate

DROP TABLE #MyHead

March 1, 2008 1:52 PM
 

Jeff Moden said:

Formatting got my post a bit... here's what I really said just before the code...

I can't prove it anymore, but I seem to remember that SQL Server 7 would produce different dates.  That certainly doesn't appear to be the case in these later revs.  In fact, table defaults also react in the very same manner as you describe... one date per insert query.  Thanks for bringing this to my attention... it doesn't really change things but it certainly seems important to know (A Developer must not guess... A Developer must KNOW  --Segiy circa 2007).

Here's the code I used to test in both SQL Server 2000 sp3A (no hotfixes) and SQL Server 2005 sp2 (no hotfixes)... Since each Grouped SELECT returns only a single row, it's proof that GETDATE() was only calculated once...

March 1, 2008 1:55 PM
 

Jeff Moden said:

Dang formatting!!!!  Words are being cutoff from the right!!! For example, I actually say "That certainly doesn't appear to be the case"... in the above, but the "n't" got cut off.  If you copy and paste to Word, the missing words show up...

March 1, 2008 1:58 PM
 

David Fine said:

These items from BOL show that GETDATE is designed to get and return a new value every time it is called.

From Books On Line (SQL-Server 2005)

GETDATE is a nondeterministic function. Views and expressions that reference this column cannot be indexed.

SQL Server built-in functions are either deterministic or nondeterministic. Functions are deterministic when they always return the same result any time they are called by using a specific set of input values. Functions are nondeterministic when they could return different results every time they are called, even with the same specific set of input values. For more information, see Deterministic and Nondeterministic Functions

October 10, 2008 11:36 AM
 

Andrew Kelly said:

David,

The question here is not if it is deterministic or not. Getdate() will indeed return a different value each time it is called. The question was does the function get evaluated with every row or once per set. The proper and intended behaviour is indeed once per set and not every row. This has been confirmed to be a bug and should not behaive in the manner listed above.

October 10, 2008 2:20 PM

Leave a Comment

(required) 
(required) 
Submit
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement