
Page Free Space

See also my SQL Server technical articles on SQLperformance.com

Temporary Tables in Stored Procedures

Published Wednesday, August 15, 2012 5:22 AM by Paul White


Comments

 

AlexK said:

Paul,

This is a very interesting post. I asked my team to read it. Thank you for taking the time to write it up.

If you create temporary tables implicitly, via SELECT INTO, do you observe the same behavior?

AK

August 14, 2012 10:40 PM
 

SomewhereSomehow said:

Good post Paul!

Very interesting note about colmodctr and OPTION (RECOMPILE) - as I understand it, this is empirical and not documented anywhere? If so, you should definitely file a Connect item; I'll vote.

It seems logical that OPTION (RECOMPILE) doesn't affect statistics (or we would see a lot of statistics updates when using OPTION (RECOMPILE) for other purposes), but it is not logical that it starts accumulating modifications and suddenly leads to a statistics update. That may produce a kind of "mystery behaviour" when debugging, so thank you for pointing it out!

Also, I think an explicit UPDATE STATISTICS should cause a recompilation in any case (I even tried explicitly creating statistics - no effect). The only way I found to manage without RECOMPILE and UPDATE STATISTICS is to issue an "ALTER TABLE #temp REBUILD;" command - that has an effect, but it is really nasty, especially for big tables, though hopefully not so harmful for small ones...

August 15, 2012 11:18 AM
 

Paul White said:

Hi Alex,

Thanks for that, I appreciate it. Yes, the same behaviour applies to temporary tables created with SELECT...INTO, as the modified version of the test procedure below demonstrates.

ALTER PROCEDURE dbo.Demo
   @StartsWith nvarchar(50)
AS
BEGIN
   SET NOCOUNT ON;

   SELECT
       p.ProductID,
       p.Name
   INTO #Temp
   FROM Production.Product AS p
   WHERE
       p.Name LIKE @StartsWith + N'%';

   SELECT
       t.Name,
       OrderCount = COUNT_BIG(DISTINCT th.ReferenceOrderID)
   FROM #Temp AS t
   JOIN Production.TransactionHistory AS th ON
       th.ProductID = t.ProductID
   GROUP BY
       t.Name;

   DBCC SHOW_STATISTICS (N'tempdb..#Temp', Name) WITH STAT_HEADER, HISTOGRAM;

   DROP TABLE #Temp;

END;
GO
DBCC FREEPROCCACHE;
EXECUTE dbo.Demo @StartsWith = N'E';
EXECUTE dbo.Demo @StartsWith = N'T';

That said, tables created this way tend to have at least one index created explicitly after the SELECT...INTO, which will prevent temporary table caching (see the White Paper for details).  That action will prevent statistics being cached across executions, but it does not solve the other issues shown in the main post.
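For illustration, the sort of post-creation DDL I mean looks like this (a sketch only - the index name and key here are illustrative, not from the main post):

CREATE CLUSTERED INDEX cx_Temp_ProductID   -- DDL issued after the table is created:
ON #Temp (ProductID);                      -- this alone prevents the temp table (and its statistics) being cached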

Cheers,

Paul

August 15, 2012 12:36 PM
 

Paul White said:

Hi Dima (SomewhereSomehow),

Thanks.  I will try to put a Connect item together soon, and update the post above with the details.

An explicit CREATE STATISTICS counts as DDL on the temporary table, which prevents temporary table caching (and the caching of statistics that goes with it), but as I mentioned to Alex Kuznetsov above, that does not solve the other issues (and it means we don't get the benefit of temporary object caching either, of course).  Rebuilding the table works, naturally, but it is, as you say, not a happy solution!
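For reference, the kind of statement I mean (the statistics name and column are illustrative):

CREATE STATISTICS s_Name   -- explicit statistics DDL on the temp table...
ON #Temp (Name);           -- ...counts as DDL, so temporary table caching is prevented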

I'm glad you enjoyed this post; there will be more details on the caching mechanism in the next one.

Paul

August 15, 2012 12:44 PM
 

Alejandro Mesa said:

Paul,

Great post, as always!

I knew about SQL Server caching temporary tables, but I would never have imagined that statistics created by AutoStats on those tables were cached too. More intriguing to me is the fact that there is no recompilation of the cached plan referencing the cached temp table after manually updating the cached statistics that were found interesting during compilation. I am also surprised by the way OPTION (RECOMPILE) carried column modifications between procedure calls.

I already added my vote.

Thanks for sharing your findings with us.

--

AMB

August 16, 2012 11:50 AM
 

Paul White said:

Hi AMB,

Do you realize you just summarized in 30 words what it took me 6,600 words to say?  LOL!  Seriously, thanks for that - it's good to hear that the points I was trying to make were received as intended, and that you found it as surprising as I did.  Thanks also for the vote.

Paul

August 16, 2012 12:14 PM
 

WIDBA said:

Wow - I think this really explains some stuff that I used to fight with in "search" procedures using temp tables.

Thanks for another fantastic read - one that I will need to read one more time to digest.

August 16, 2012 2:07 PM
 

Paul White: Page Free Space said:

SQL Server 2005 onward caches temporary tables and table variables referenced in stored procedures for

August 16, 2012 5:51 PM
 

Martin Smith said:

Regarding Recompile Thresholds, I've encountered a case where increasing table cardinality from 10 to 1,000,000 still doesn't seem to cause an optimality-based recompile.

https://connect.microsoft.com/SQLServer/feedback/details/725697/missing-optimality-based-recompile-on-cached-temporary-table-with-clustered-pk

August 17, 2012 8:54 AM
 

Sat Pal said:

Great Post Paul!

I usually get distracted part-way through a long post, and it takes me longer than usual to read the whole thing, but this one hooked me: I read it in one go, and couldn't believe what I used to believe about temp tables.

Would the behaviour of the temp table be the same if many executions of the same stored procedure happen simultaneously on a very busy OLTP server (around 20,000 executions/min of one stored procedure)? Would it still use the same cached temp table, or multiple copies of the temp table? And how would colmodctr behave if multiple temp tables are created?

Sat Pal

August 17, 2012 1:57 PM
 

Paul White said:

Hi Martin,

Yes, on the face of it, I would expect that to recompile due to the change in cardinality of #T from 10 when the plan was cached to 990000 for the second execution.

I notice that it does behave as I would expect so long as there are no leading-column statistics associated with an index on column 'n' of #T.  There are two things I tried: (1) commenting out the PRIMARY KEY, so #T becomes a heap and *auto-statistics* are created for column 'n'; (2) changing the PRIMARY KEY to (s, n) - a clustered table without leading-column index statistics for 'n'.  (The odd behaviour still reproduces if the PRIMARY KEY is NONCLUSTERED on 'n'.)
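Sketched out, the two variations look something like this (a guess at the shape of Martin's repro - the actual #T definition is in the Connect item):

-- Variant 1: heap, no PRIMARY KEY; auto-statistics are created for column 'n'
CREATE TABLE #T (n integer NOT NULL, s char(100) NOT NULL);

-- Variant 2: clustered, but without leading-column index statistics for 'n'
CREATE TABLE #T (n integer NOT NULL, s char(100) NOT NULL, PRIMARY KEY (s, n));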

It seems statistics associated with a constraint on a cached temp table might behave differently from auto-statistics.  I'm about to leave for a weekend away, so I won't have a chance to dig into this further for a few days.

There is no shortage of odd behaviour in this area, and I am sure I didn't cover it all even in this very long post.  One thing to try: run your repro again, but after EXEC t2 10; display an estimated plan for EXEC T2 1000000; just before running EXEC T2 1000000; 'for real'.

The table cardinality of #T in the estimated plan is zero, and the cached temporary table is not reused.  A *new* temporary table briefly exists, *new* statistics are created for it, and it is then dropped.  The 'real' execution of EXEC T2 1000000; that follows recompiles with a reported reason of 'Option (recompile) requested' - even though we didn't ask for it.
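In script form, the sequence is (T2 being the procedure from Martin's Connect repro, not reproduced here):

EXECUTE dbo.T2 10;       -- plan cached with #T at 10 rows
-- Request an estimated plan for the statement below (Ctrl+L in SSMS), then...
EXECUTE dbo.T2 1000000;  -- ...run it for real: it recompiles with 'Option (recompile) requested'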

Honestly one can go gradually mad trying to make sense of all this.

Paul

August 17, 2012 4:39 PM
 

Paul White said:

Hi Sat Pal,

I was a bit worried about the length of this one, so thanks for the feedback on that.

I think I answer your questions about simultaneous executions in my follow-up post (at http://bit.ly/TempCaching).

Paul

August 17, 2012 4:50 PM
 

Martin Smith said:

Strange behaviour indeed that requesting an estimated plan should affect things like that!

I've also found that the modification counter doesn't always seem to get updated.

Starting with an empty procedure cache, whether or not the rcmodified column in sys.sysrscols is updated after executing the procedure seems to depend on the value passed to the stored procedure, and hence the temp table size.

For me, anything below "EXEC T2 277843;" (temp table size = 275065 rows) leaves rcmodified at 0, whereas "EXEC T2 277843;" and upwards (temp table size = 275066 rows) updates this value.

August 18, 2012 9:20 AM
 

MarkD said:

Great post Paul, thank you very much for taking the time!

August 18, 2012 12:26 PM
 

Nakul Vachhrajani said:

A very, very interesting post! Thank you for taking the time to write it, Paul! I will read it again tomorrow, ask the team to read it, and then spend a couple of hours over the weekend trying to digest everything. Thank you very much!

August 19, 2012 2:07 AM
 

Jose Varghese said:

Hi Paul,

Excellent post!

Some time back I had my own encounter with parameter sniffing. All that I could find online was just a few posts providing only high-level theoretical information.

Compared to those, your post is entirely different in content - good job.

Thanks to my friend Shyam Viking for bringing my attention to this post.

August 19, 2012 10:50 AM
 

John Shahan said:

I have a simple question, given the complexity of the subject matter.

Would it make sense, in some cases, to dynamically (programmatically) change the name of the temporary tables every time the stored procedure is run, in order to be sure you have accurate statistics?

August 20, 2012 1:17 PM
 

Paul White said:

Martin,

Connecting via the DAC to see sys.sysrscols (or sys.sysrowsetcolumns in 2005), I have noticed that issuing a CHECKPOINT seems to cause the in-memory counters to be flushed to persistent storage, so you might want to try that.  With larger values, you might be seeing flushing based on the number of changes (this is just speculation on my part) without needing a CHECKPOINT.

That said, in my simplified tests I found it more convenient to query rowmodctr from sysindexes on a regular connection.  So long as one bears in mind the differences between colmodctr and rowmodctr, this can still be an effective technique.
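For anyone wanting to follow along, sketches of both queries (the temp table name is illustrative):

-- Regular connection: rowmodctr from the sysindexes compatibility view
SELECT si.name, si.rowmodctr
FROM tempdb.sys.sysindexes AS si
WHERE si.id = OBJECT_ID(N'tempdb..#Temp');

-- DAC only: per-column modification counters from the hidden system table
SELECT rsc.rscolid, rsc.rcmodified
FROM tempdb.sys.sysrscols AS rsc
WHERE rsc.rsid IN
(
    SELECT p.partition_id
    FROM tempdb.sys.partitions AS p
    WHERE p.object_id = OBJECT_ID(N'tempdb..#Temp')
);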

Paul

August 20, 2012 9:39 PM
 

Paul White said:

Hi John,

Good question.  You certainly could use dynamic SQL to create the temporary tables each time, and this would prevent caching, as I describe in the follow-up post to this one at http://bit.ly/TempCaching

I can't think of any cases where this sort of scheme would be worth the effort, though.  All it achieves is to prevent temporary table caching, so you lose the benefit of that, and statistics creation and deferred compiles will happen on each execution of the new stored procedure anyway.  It seems to me that there are better ways to achieve the same effect (at least, ways that would be easier to maintain in future).

My own preference at this point in time is to use UPDATE STATISTICS and OPTION (RECOMPILE) wherever I think temporary table and statistics caching might cause me a problem.
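As a sketch, using the #Temp table from the example earlier in this thread, that pattern looks like this:

-- After (re)populating #Temp inside the procedure:
UPDATE STATISTICS #Temp;    -- refresh the (possibly cached) statistics explicitly

SELECT t.Name
FROM #Temp AS t
OPTION (RECOMPILE);         -- recompile this statement so it sees the refreshed statistics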

Paul

August 20, 2012 10:00 PM
 

KKline said:

Just wanted to say that this is a fantastic post, Paul. Well researched and well written!

Are those Plan Explorer screen shots, btw?  Just curious...

Thanks again,

-Kevin

August 21, 2012 12:02 PM
 

Paul White said:

Hi Kevin,

Thanks!  Yes, the screenshots are from SQL Sentry's Plan Explorer; I find its display style convenient for blog posts.

Paul

August 21, 2012 6:49 PM
 

Paul Randal said:

Nice post Paul - will be linking to it in my class deck on tempdb and temp table abuse.

August 22, 2012 4:55 PM
 

Paul White said:

Awesome!  Thanks Paul.

August 22, 2012 6:07 PM
 

Satyender M said:

Thanks a ton Paul...

Wonderful article..

October 31, 2012 3:41 AM
 

Girijesh said:

Thanks a billion Paul.

November 27, 2012 2:22 AM
 

Karna said:

Wonderful and helpful article. Thanks @Paul White

January 5, 2013 6:23 AM
 

Revanth said:

Wonderful article ! Thanks Paul

February 12, 2013 11:25 PM
 

Francois said:

Very nice post. I can think of several stored procedures I have to look at now and optimize. Thank you!

February 13, 2013 11:44 PM
 

Pedro Ponce said:

Very nice post.

April 5, 2013 6:32 PM
 

Neelam said:

Awesome!!!

April 28, 2013 5:06 AM
 

Gary said:

Excellent post, really useful.  But I just checked your Connect bug submission - Microsoft have just closed it in the past week as "Won't Fix".  Apparently it's too obscure to invest any effort in.

May 2, 2013 7:13 AM
 

Paul White said:

Hi Gary,

Thanks - and yes, I noticed *sigh* :)

Paul

May 2, 2013 11:06 AM
 

akshay said:

Excellent post, it's very useful.

I have one question:

I want to add a new row to a temporary table, like we add a new row to a DataTable in C#. Is that possible?

Please reply.

May 21, 2013 2:50 AM
 

JRStern said:

Paul, this is fantastic stuff, I read this a couple of months ago and have been applying it to some of our SPs, and I believe it is helping.

What we had are SPs that, a few times a day, would just seem to lose their minds and run 100x slower; the same parameters rerun would run much faster.  I'm pretty sure this is "The Paul White Bug" - that's how I'm labelling it in my code!  Applying these statistics updates to newly created #temps seems to stabilize the behavior.  Sometimes I get ambitious and wait until I repopulate the SPs before doing the update stats - a tradeoff between cost and accuracy.

As for Microsoft closing it in Connect - I've had very poor experience with Connect for a long, long time, not sure what else to say about it.

June 17, 2013 1:39 PM
 

Paul White said:

JRStern,

I think we interacted a couple of times on the MSDN forums, so hello again :)

I'm so glad to hear you have found this entry useful, thanks very much for letting me know of your experiences.

Paul

June 17, 2013 4:02 PM
 

JRStern said:

... I meant repopulate the #temps of course.

Sure, I often think out loud on the forums; just posting it solves the problem half the time, you know how it is! :)

Appreciate all your work, wherever it goes!

J.

June 17, 2013 5:59 PM
 

Neeraj prasad sharma said:

Hi Paul,

I am a regular reader of your blog (your fan!) and a first-time commenter.

Your findings and dedication to making the community stronger are phenomenal.

Could you please give me some links or examples to back this statement:

"In addition, ordinary tables do not benefit from engine optimizations that only apply to genuine temporary tables, so they tend to be less efficient, even if created in tempdb."

That would be helpful.

Thanks

October 31, 2013 3:17 AM
 

Paul White said:

Many of the differences can be found in the tempdb White Paper at http://technet.microsoft.com/library/Cc966545. You might also like to read the Storage Engine blog entries on tempdb at http://blogs.msdn.com/b/sqlserverstorageengine/archive/tags/tempdb. Finally, there is the three-hour Bob Ward video from PASS Summit at http://www.youtube.com/watch?v=SvseGMobe2w

October 31, 2013 3:36 AM
 

Kenny said:

Thanks for the fantastic post!

Is the "DROP TABLE" necessary to produce this behavior? Or will the implicit drop (when the connection on which the procedure is executed closes) also result in caching of the temp table and its auto-created statistics?

January 8, 2014 10:49 AM
 

Paul White said:

Hi Kenny,

No, the DROP TABLE isn't required - the implicit drop has the same effect.

Paul

January 8, 2014 9:04 PM
 

Alex Friedman said:

Great post, thank you!

How about using "WITH RECOMPILE" on the SP level? I noticed in my tests that it did force the creation of a correct plan and statistics every time.

January 14, 2014 3:59 AM
 

Paul White said:

Hi Alex,

Yes, though as I mentioned in the main text:

"Executing the procedure with the WITH RECOMPILE option will produce better plans, at the cost of recompiling the whole procedure, and the resulting plans will not be cached for reuse. In general, recompiling the whole procedure every time could be bad since many procedures have many more statements to recompile than just a single INSERT and SELECT."

Paul

January 14, 2014 9:31 AM
 

Alex Friedman said:

Whoops, missed that, thanks.

But what's the explanation for the SP-level RECOMPILE working differently than the statement-level RECOMPILE? I mean, even if we add OPTION(RECOMPILE) to every single statement in the SP, it would still behave differently.

My guess is that the SP-level RECOMPILE disables the temp table caching.

January 15, 2014 1:54 AM
 

Paul White said:

When RECOMPILE is specified on the procedure, there is no top-level plan cached for the statistics to be associated with.

January 15, 2014 7:17 AM
 

Alex Friedman said:

Well that's better. Interesting!

January 16, 2014 1:47 AM
 

Derek Slone-Zhen said:

Just wanted to say thanks for the write up, helped me tonnes.

Really disappointed to see Microsoft ignoring your Connect item simply because the issue has been there for so long.  Sometimes it takes "us" (the public) this long to realise the crazy s**t that their software is doing!

Thanks again,

Derek

February 12, 2014 4:52 AM
 

Lara Rasner said:

Hey Paul:

Thanks so much for this. I've been banging my head against a wall trying to figure out why estimated and actual rows are so vastly different in the environment I inherited, where the use of #temptables inside stored procedures is insanely abused.

Have you submitted this as a question of the day to SQL Server Central? If you haven't, would you give me permission to do so? I will give you credit and link to your post.

Thanks.

March 21, 2014 6:49 PM
 

Paul White said:

Hi Lara,

Glad you found it helpful. I haven't written an SSC QotD for this - and you don't need my permission to do so yourself :) Have fun.

March 21, 2014 8:13 PM
 

Lara Rasner said:

Thanks for the permission Paul. The QotD is up today and getting lots of good feedback.

April 11, 2014 12:32 PM
 

Paul White said:

Happy to report I answered correctly and earned 2 points :)

April 11, 2014 1:01 PM
 

Bill Kline said:

An informative and incredible write up.  Great job!

April 18, 2014 11:41 AM
 

tina said:

Hi Paul,

Is there any impact if I create many #temp tables in one stored procedure?

August 19, 2014 3:38 AM
 

yaroslav said:

Thanks Paul - I still need to digest all the information in this blog post and the following one about temporary tables and caching, but so far it has been amazing. I'm pretty sure the perf problems we are having (search SPs with lots of temporary tables) have a lot to do with everything you explain here. Thanks a lot, for real...

August 22, 2014 9:55 AM
 

Derek D said:

Great post Paul!  I couldn't agree more about breaking complex queries out into intermediate temp tables.  Some people worry about the tempdb usage, but I've yet to find a complex query that didn't create a worktable anyway - better to do it explicitly, I think.  You also have to know where to break the query up.

It's also nice knowing simple plans will survive multiple upgrades of the SQL Engine.

September 22, 2014 4:42 PM
 

Paul White said:

Hi Derek,

Thanks for the comment, you make some great points.

Indeed, the skill is in knowing *when* and *where* to simplify a query using a temporary table. I occasionally come across systems where someone has just gone nuts with the idea, leading to tempdb contention or other issues, but for the most part I find even experienced SQLers underuse this idea.

Paul

September 22, 2014 5:38 PM
 

Nida said:

I really enjoy this post. Thanks so much Paul.

October 15, 2014 1:14 AM
 

sorcha said:

Very well written post Paul, thanks for going into such detail. You have just helped me reduce the execution time of my stored procedure from 44 secs to 1 sec by using a temporary table - an amazing difference.  My SQL reports are now actually usable!  Many thanks.

January 29, 2015 10:33 AM
 

Ranga said:

Paul,

Great article... Would creating a clustered index on the temp table help in this situation?

January 29, 2015 3:42 PM
 

Pankaj said:

Great article....

March 19, 2015 7:13 AM
 

SQlBarbarian said:

So would a temporary table that stores exactly the same number of rows each time (from an app parameter) potentially be impacted, or only if the population count goes up and down each time?

I have an app that passes through a table parameter with a fixed 500 rows. Later, this result is joined to by other tables. If the row count in the temp table is the same each time, but the matching rows in the other tables could vary between high and low counts, would the recompile impact this significantly? I don't think I'd need to update statistics in this case, would I?

July 20, 2015 1:04 PM
 

Paul White said:

If the distribution of values within the similar/same number of rows changes, and that change means the execution plan ought to be different, then yes you could see an adverse impact. For such a small number of rows, I would probably play it safe and use the techniques in the post to force a statistics rebuild and recompilation. The cost of doing so will be pretty small, so unless you're doing something extreme like running it thousands of times per second, I think I'd prefer the safer option.

August 22, 2015 8:17 AM
 

Parameswaran said:

You did a good job, Paul. Thank you!

October 7, 2015 2:32 AM
 

Yaroslav said:

How come I didn't see this blog post before?! Really interesting reading - plenty of samples and details to play with to reproduce what you describe. We have a couple of SPs we could maybe try this on, as they follow a similar pattern related to parameter sniffing; so far we have improved a lot, but we are still having some random issues.

December 22, 2015 10:33 PM
 

Greg Low said:

Hi Paul,

Awesome post.

We had the 2012 version of the plan caching paper here: https://msdn.microsoft.com/en-us/library/dn148262.aspx with you as the reviewer :-)

January 3, 2016 9:53 PM
 

Simon Birch said:

Thanks very much, Paul, for this and all your other great work that's out there in the public domain!

I've successfully used this approach (update statistics and recompile) where necessary on quite large data warehouse workloads, but today a side-effect caused a performance issue in unit testing, which just illustrates the importance of understanding the internals rather than blindly applying patterns. The recompile option allowed the optimizer to use the value of a local variable in that query, triggering the ascending keys issue.

I could add OPTION (OPTIMIZE FOR UNKNOWN), but it's starting to get a little cumbersome. A more general-purpose alternative (compared to update stats and recompile) seems to be - at the end of the proc - to truncate the temporary table (a heap) and then add an index to it (the truncation just to speed up the index add). We can then keep the workaround away from the core business logic, and not have to worry about explicitly recompiling every affected statement, or variable sniffing in those recompilations.

This seems to create behaviour closer to what developers might expect from temp tables in stored procs (including cached query plans and more comprehensive execution stats), although you would lose the benefits of temp table caching in high-transaction workloads.
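A sketch of that end-of-procedure workaround (the table and index names are illustrative):

-- At the end of the stored procedure:
TRUNCATE TABLE #Work;                          -- empty the heap so the index build is cheap
CREATE CLUSTERED INDEX cx ON #Work (SomeKey);  -- post-creation DDL prevents temp table caching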

January 14, 2017 3:00 AM
 

lukas said:

hi Paul,

I did not fully understand this sentence: "whereas adding RECOMPILE meant the plan would occasionally recompile (when accumulated changes passed RT)." Adding RECOMPILE will make the statement recompile on every execution, won't it? Would you please clarify this part for me? Thanks in advance.

April 25, 2017 12:09 PM
 

Martin said:

Hi Paul,

Great post.

I was playing around with temp table statistics, using a modified version of a stored procedure based on some techniques I found in this article, but oddly enough, when I try to pull the statistics for a temp table I get a message saying:

Could not locate statistics '[whatever_column_name]' in the system catalogs.

It is like the statistics are not created in this case, which is odd, as my SP uses the same DBCC SHOW_STATISTICS idea to gather statistics inside an SP (just like yours does in this article). Here is a simplified example.

ALTER PROCEDURE dbo.Demo
AS
BEGIN
    SET NOCOUNT ON;

    -- Create the temporary table
    CREATE TABLE #temp_table (ID integer);

    DECLARE @I integer = 0;

    -- Insert 100 rows
    WHILE @I < 100
    BEGIN
        INSERT INTO #temp_table VALUES (@I);
        SET @I = @I + 1;
    END;

    -- Display all rows and output the execution plan (the EstimateRows is just fine!)
    SELECT * FROM #temp_table;

    -- Is the object there?
    SELECT OBJECT_ID('tempdb..#temp_table');

    -- How about statistics?
    DBCC SHOW_STATISTICS ('tempdb..#temp_table', 'id');
END;

What is wrong with this? Why does it not work?

May 18, 2017 12:48 PM
