THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Adam Machanic

Adam Machanic, Boston-based SQL Server developer, shares his experiences with programming, monitoring, and performance tuning SQL Server. And the occasional battle with the query optimizer.

The Hidden Costs of INSERT EXEC

INSERT and EXEC: relational brothers in arms, helping you get your data and then put it somewhere. But like all brothers, their relationship has its ups and downs and sometimes you need to look just below the surface to see the real issues. In this post I will cover hidden penalties that a decision--seemingly purely aesthetic--can bring to the fore.

To illustrate the first of these examples I will be using the following table:

USE tempdb
GO


CREATE TABLE dbo.MyTable
(
    [SalesOrderID] [int] NOT NULL,
    [SalesOrderDetailID] [int] NOT NULL,
    [CarrierTrackingNumber] [nvarchar](25) NULL,
    [OrderQty] [smallint] NOT NULL,
    [ProductID] [int] NOT NULL,
    [SpecialOfferID] [int] NOT NULL,
    [UnitPrice] [money] NOT NULL,
    [UnitPriceDiscount] [money] NOT NULL,
    [LineTotal] [money] NOT NULL,
    [rowguid] [uniqueidentifier]  NOT NULL,
    [ModifiedDate] [datetime] NOT NULL
)
GO

Consider the following queries, disregarding the fact that this isn't a good dynamic SQL example (I'm trying to keep it simple), and take a moment to think about the differences between them from a behavioral point of view:

EXEC
('
    INSERT dbo.MyTable
    SELECT *
    FROM AdventureWorks.Sales.SalesOrderDetail
')
GO

INSERT dbo.MyTable
EXEC
('
    SELECT *
    FROM AdventureWorks.Sales.SalesOrderDetail
')
GO

It's quite common to insert some data into a table from dynamic SQL, and for as long as I can remember, I've believed the choice between these two forms of the query to be primarily an issue of style. My preference has been for the latter style, as I feel that it's a bit more readable. It also has the benefit of better supporting ownership chains, but that's something that, in most cases, we don't have to worry about any longer in SQL Server 2005 or 2008. But let's move beyond style and potential security issues and get to the substance. What actually happens when we run these two queries?

In the first case--insert done inside of the EXEC--the first step is that the outer statement--the EXEC itself--is parsed and validated. Now we jump down one level of context, into the string that was passed to EXEC, and that string is parsed and validated. The referenced objects are resolved, and the plan cache is checked. If there is no plan, the query is compiled. And then it's executed. Data is streamed from SalesOrderDetail into MyTable and when the query inside of the EXEC is done control returns to the calling context--EXEC--and assuming there were no errors, the EXEC is now completed.

In the second case, something quite a bit different occurs. The insert is parsed and validated, and an INSERT EXEC plan is generated. This plan does not involve SalesOrderDetail, as the query optimizer doesn't know at this point in the process where the data will be inserted from, since that step is dynamic. So the plan references something called the "Parameter Table" (plan truncated for simplicity):

|--Table Insert(OBJECT:([tempdb].[dbo].[MyTable]))
   |--Top(ROWCOUNT est 0)
        |--Parameter Table Scan


Once this plan has been generated, we again jump down one level of context, and the inner string is parsed and validated, the referenced object resolved, the cache checked, and the compiled plan executed. But what is this Parameter Table thing?

Here's where things start really diverging. Data is not, at this point, streamed from SalesOrderDetail directly into MyTable. Rather, it is streamed from SalesOrderDetail into the Parameter Table. And the Parameter Table, as it turns out, is in actuality a hidden temporary table. And not until all of the data has streamed into that hidden temporary table is control resumed by the EXEC context, and only then does the data start moving into its final home.

The natural question you might be asking yourself at this point is, just how much overhead does this Parameter Table introduce into the equation? The answer might just send you racing to check your stored procedures: the additional cost for the Parameter Table is well over 100% as compared with doing the insert inside of the EXEC. The fact that all of the data must be spooled to the Parameter Table before the insert can even begin tells us that this must be true, and it can be verified using a simple check against the sys.dm_exec_requests DMV, as in the following example:

EXEC
('
    INSERT dbo.MyTable
    SELECT *
    FROM AdventureWorks.Sales.SalesOrderDetail
')

SELECT
    r.writes AS [INSERT inside EXEC writes]
FROM sys.dm_exec_requests r
WHERE
    r.session_id = @@SPID
GO

INSERT dbo.MyTable
EXEC
('
    SELECT *
    FROM AdventureWorks.Sales.SalesOrderDetail
')

SELECT
    r.writes AS [INSERT EXEC writes]
FROM sys.dm_exec_requests r
WHERE
    r.session_id = @@SPID
GO


So that's that. We should avoid INSERT EXEC and try to do our inserts in the same context in which the SELECT is running--right?

Well, yes and no. There is another element at play here which I haven't yet mentioned. What if we were only inserting a few rows, and the table we were inserting into looked something like the following:

CREATE TABLE #MyTable
(
    [SalesOrderID] [int] NOT NULL,
    [SalesOrderDetailID] [int] NOT NULL,
    [CarrierTrackingNumber] [nvarchar](25) NULL,
    [OrderQty] [smallint] NOT NULL,
    [ProductID] [int] NOT NULL,
    [SpecialOfferID] [int] NOT NULL,
    [UnitPrice] [money] NOT NULL,
    [UnitPriceDiscount] [money] NOT NULL,
    [LineTotal] [money] NOT NULL,
    [rowguid] [uniqueidentifier]  NOT NULL,
    [ModifiedDate] [datetime] NOT NULL
)
GO


The only difference between this table and the previous one is that this is a temporary table and the other is not. But temporary tables have their own interesting little twists, especially when it comes down to one of the key enemies in a highly-transactional system: recompilation. As it turns out, doing the insert inside the EXEC will cause the internal statement to recompile every time a new temp table is encountered. This means that if you have a stored procedure that creates a temp table, puts together a bit of dynamic SQL, and does an insert inside of that dynamic SQL, you'll now have yourself a recompilation problem.

To see this illustrated, try the following script. Here only a single row is inserted as a result of the dynamic query, but it's complex enough that the compile time more than overshadows the overhead of the Parameter Table:

USE tempdb
GO

DBCC FREEPROCCACHE
GO

CREATE TABLE #AvgTimes
(
    CPU_time DECIMAL(19,4) NOT NULL,
    insert_type VARCHAR(25) NOT NULL
)
GO

CREATE TABLE #MyTable
(
    [SalesOrderID] [int] NOT NULL,
    [SalesOrderDetailID] [int] NOT NULL,
    [CarrierTrackingNumber] [nvarchar](25) NULL,
    [OrderQty] [smallint] NOT NULL,
    [ProductID] [int] NOT NULL,
    [SpecialOfferID] [int] NOT NULL,
    [UnitPrice] [money] NOT NULL,
    [UnitPriceDiscount] [money] NOT NULL,
    [LineTotal] [money] NOT NULL,
    [rowguid] [uniqueidentifier]  NOT NULL,
    [ModifiedDate] [datetime] NOT NULL
)

EXEC
('
    INSERT #MyTable
    SELECT TOP(1)
        sod.*
    FROM AdventureWorks.Sales.SalesOrderDetail sod
    WHERE
        sod.UnitPrice > 10
        AND sod.LineTotal > 100
        AND EXISTS
        (
            SELECT *
            FROM AdventureWorks.Sales.SalesOrderHeader soh
            JOIN AdventureWorks.Person.Contact c ON
                c.ContactID = soh.CustomerID
            WHERE
                soh.SalesOrderID = sod.SalesOrderID
                AND c.LastName LIKE ''Ab%''
        )
        AND EXISTS
        (
            SELECT *
            FROM AdventureWorks.Production.Product p
            WHERE
                p.ProductID = sod.ProductID
                AND p.Color IS NULL
        )
        AND NOT EXISTS
        (
            SELECT *
            FROM AdventureWorks.Sales.SalesOrderHeader soh
            JOIN AdventureWorks.Sales.SalesPerson sp ON
                soh.SalesPersonID = sp.SalesPersonID
            WHERE
                soh.SalesOrderID = sod.SalesOrderID
                AND sp.CommissionPct > 50
        )
')

INSERT #AvgTimes
SELECT
    r.cpu_time,
    'INSERT inside EXEC'
FROM sys.dm_exec_requests r
WHERE
    r.session_id = @@SPID

DROP TABLE #MyTable
GO 5

CREATE TABLE #MyTable
(
    [SalesOrderID] [int] NOT NULL,
    [SalesOrderDetailID] [int] NOT NULL,
    [CarrierTrackingNumber] [nvarchar](25) NULL,
    [OrderQty] [smallint] NOT NULL,
    [ProductID] [int] NOT NULL,
    [SpecialOfferID] [int] NOT NULL,
    [UnitPrice] [money] NOT NULL,
    [UnitPriceDiscount] [money] NOT NULL,
    [LineTotal] [money] NOT NULL,
    [rowguid] [uniqueidentifier]  NOT NULL,
    [ModifiedDate] [datetime] NOT NULL
)

INSERT #MyTable
EXEC
('
    SELECT TOP(1)
        sod.*
    FROM AdventureWorks.Sales.SalesOrderDetail sod
    WHERE
        sod.UnitPrice > 10
        AND sod.LineTotal > 100
        AND EXISTS
        (
            SELECT *
            FROM AdventureWorks.Sales.SalesOrderHeader soh
            JOIN AdventureWorks.Person.Contact c ON
                c.ContactID = soh.CustomerID
            WHERE
                soh.SalesOrderID = sod.SalesOrderID
                AND c.LastName LIKE ''Ab%''
        )
        AND EXISTS
        (
            SELECT *
            FROM AdventureWorks.Production.Product p
            WHERE
                p.ProductID = sod.ProductID
                AND p.Color IS NULL
        )
        AND NOT EXISTS
        (
            SELECT *
            FROM AdventureWorks.Sales.SalesOrderHeader soh
            JOIN AdventureWorks.Sales.SalesPerson sp ON
                soh.SalesPersonID = sp.SalesPersonID
            WHERE
                soh.SalesOrderID = sod.SalesOrderID
                AND sp.CommissionPct > 50
        )
')

INSERT #AvgTimes
SELECT
    r.cpu_time,
    'INSERT EXEC'
FROM sys.dm_exec_requests r
WHERE
    r.session_id = @@SPID

DROP TABLE #MyTable
GO 5

SELECT
    AVG
    (
        CASE a.insert_type
            WHEN 'INSERT inside EXEC' THEN a.CPU_time
            ELSE NULL
        END
    ) AS [CPU time - INSERT inside EXEC],
    AVG
    (
        CASE a.insert_type
            WHEN 'INSERT EXEC' THEN a.CPU_time
            ELSE NULL
        END
    ) AS [CPU time - INSERT EXEC]
FROM #AvgTimes a
GO

DROP TABLE #AvgTimes
GO


So what have we learned today? The choice between INSERT EXEC and INSERT inside of EXEC is not purely stylistic and has definite performance implications. Here are the rules I'll be following from now on:

  • When working with permanent tables, always avoid INSERT EXEC if possible. There are some cases where it won't be possible to avoid. I only showed dynamic SQL in this post, but INSERT EXEC also applies to stored procedures. Can you safely rip apart all of the stored procedures in your system in order to avoid this issue? Maybe not quite as easily as you can rip apart the dynamic SQL within stored procedures.
  • When working with temporary tables, evaluate the complexity of the operations, the amount of data to be inserted, and most importantly, test every solution! The more rows that are inserted as a result of the INSERT EXEC, the more the overhead you'll get from the Parameter Table. On the flip side, the more complex the dynamic statement, the more overhead you'll get from recompilations. Every scenario is bound to be different and you may just learn something about your processes by doing this extra bit of analysis.
Enjoy!
Published Thursday, June 25, 2009 5:19 PM by Adam Machanic
Filed under: , , ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Florian Reischl said:

Very nice Article, Adam!

Thanks

Flo

June 25, 2009 4:47 PM
 

LeoPasta said:

Great article, Adam!

Well done.

June 26, 2009 6:44 AM
 

jerryhung said:

1st try

CPU time - INSERT inside EXEC CPU time - INSERT EXEC

396.800000 43.800000

2nd & 3rd aren't as bad

87.600000 34.400000

96.800000 31.400000

But something new learned today indeed

June 26, 2009 4:10 PM
 

Bruce W Cassidy said:

Sometimes it's the things you take for granted that can turn around and bite you.  I've always tended to veer to the INSERT...  EXEC form myself, proabably for stylistic reasons.  Now I will need to challenge that.

Cheers, good little article!  Thought provoking as usual.

June 28, 2009 5:59 PM
 

ValeriyNY said:

Great post, thought provoking! I usually prefer direct INSERT SELECT, but recently realized the overhead of recompilation if temp table is used. Thus, switching to INSERT EXEC made perfect sense in my case. Of course, as usual, YMMV and the answer is "it depends, test, test, test".

July 1, 2009 9:38 AM
 

Nima said:

Hi Adam!

Great post :)

Why do I get different results from writes column in sys.dm_exec_requests?

July 3, 2009 7:47 AM
 

Nima said:

Sorry I mean: different result each time that I query sys.dm_exec_requests!

July 3, 2009 7:51 AM
 

Adam Machanic said:

Hi Nima,

I don't understand your question. Can you post a script that shows what you're seeing?

July 3, 2009 9:53 AM
 

Nima said:

Sorry for my poor explanation!

I mean when I query sys.dm_exec_requests, it returns 24, next time 26, next time 20 and ....

July 4, 2009 1:32 PM
 

Adam Machanic said:

Hi Nima,

Great question! What seems to be causing this is the partially-filled pages that remain from previous inserts. Sometimes more pages have to be allocated to make the insert work and other times some pages that were allocated on the previous run can be reused. This may translate into fewer writes for those attempts since not as many allocations need to be logged.

July 5, 2009 1:21 PM
 

Nima said:

Thanks Adam!

What about 'Reads' column? It is always 0 in my batch:

SELECT * FROM northwind..Orders

SELECT    reads

FROM sys.dm_exec_requests

WHERE     session_id = @@SPID

July 6, 2009 7:25 AM
 

Adam Machanic said:

Hi Nima,

The "reads" column in sys.dm_exec_requests appears to refer to physical reads (it's not documented in exactly that way, but that's the best I've been able to determine). Try the logical_reads column instead.

July 7, 2009 11:57 AM
 

Nima said:

Thanks indeed! :-)

July 8, 2009 3:13 AM
 

Roy Harvey said:

>>The answer might just send you racing to check your stored procedures: the additional cost for the Parameter Table is well over 100% as compared with doing the insert inside of the EXEC.<<

I suspect the percentage would fall if the table has indexes, though the absolute effect would be as described.

The same idea as INSERT/EXEC is now being written widely with an INSERT/SELECT against a Table Valued Function.  An in-line TVF would be fine, since it is just seen as a parameterized view and compiled into the base query as a view would be.  However a complex TVF has to be staged into a temporary table and probably has the same overhead as you describe for INSERT/EXEC.

July 20, 2009 12:07 PM
 

Arif said:

Great article, Adam!

i have one question

Does SQL Server cache the plan for dynamic queries ?

if yes then what is the disadvantage of a simple dynamic query

July 22, 2009 6:06 AM
 

Adam Machanic said:

Hi Arif,

Yes, plans are cached for dynamic queries. I'm not sure what you mean when you ask about the "disadvantage of a simple dynamic query." There are advantages and disadvantages, depending on what it is that you're doing. You might want to read Erland Sommarskog's article, "The Curse and Blessings of Dynamic SQL":

http://sommarskog.se/dynamic_sql.html

July 27, 2009 11:06 AM
 

Ranu Mandan said:

Do it solves nested EXEC restriction ?

October 18, 2011 4:40 AM

Leave a Comment

(required) 
(required) 
Submit

About Adam Machanic

Adam Machanic is a Boston-based SQL Server developer, writer, and speaker. He focuses on large-scale data warehouse performance and development, and is author of the award-winning SQL Server monitoring stored procedure, sp_WhoIsActive. Adam has written for numerous web sites and magazines, including SQLblog, Simple Talk, Search SQL Server, SQL Server Professional, CoDe, and VSJ. He has also contributed to several books on SQL Server, including "SQL Server 2008 Internals" (Microsoft Press, 2009) and "Expert SQL Server 2005 Development" (Apress, 2007). Adam regularly speaks at conferences and training events on a variety of SQL Server topics. He is a Microsoft Most Valuable Professional (MVP) for SQL Server, a Microsoft Certified IT Professional (MCITP), and an alumnus of the INETA North American Speakers Bureau.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement