THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Rob Farley

- Owner/Principal with LobsterPot Solutions (a MS Gold Partner consulting firm), Microsoft Certified Master, Microsoft MVP (SQL Server), APS/PDW trainer and leader of the SQL User Group in Adelaide, Australia. Rob is a former director of PASS, and runs training courses around the world in SQL Server and BI topics.

MERGE gives better OUTPUT options

MERGE is very cool. There are a ton of useful things about it – mostly around the fact that you can implement a ton of change against a table all at once. This is great for data warehousing, handling changes made to relational databases by applications, all kinds of things.

One of the more subtle things about MERGE is the power of the OUTPUT clause. Useful for logging.

 TSQL2sDay150x150

If you’re not familiar with the OUTPUT clause, you really should be – it basically makes your DML (INSERT/DELETE/UPDATE/MERGE) statement return data back to you. This is a great way of returning identity values from INSERT commands (so much better than SCOPE_IDENTITY() or the older (and worse) @@IDENTITY, because you can get lots of rows back). You can even use it to grab default values that are set using non-deterministic functions like NEWID() – things you couldn’t normally get back without running another query (or with a trigger, I guess, but that’s not pretty).

That inserted table I referenced – that’s part of the ‘behind-the-scenes’ work that goes on with all DML changes. When you insert data, this internal table called inserted gets populated with rows, and then used to inflict the appropriate inserts on the various structures that store data (HoBTs – the Heaps or B-Trees used to store data as tables and indexes). When deleting, the deleted table gets populated. Updates get a matching row in both tables (although this doesn’t mean that an update is a delete followed by an inserted, it’s just the way it’s handled with these tables). These tables can be referenced by the OUTPUT clause, which can show you the before and after for any DML statement. Useful stuff.

MERGE is slightly different though.

With MERGE, you get a mix of entries. Your MERGE statement might be doing some INSERTs, some UPDATEs and some DELETEs. One of the most common examples of MERGE is to perform an UPSERT command, where data is updated if it already exists, or inserted if it’s new. And in a single operation too. Here, you can see the usefulness of the deleted and inserted tables, which clearly reflect the type of operation (but then again, MERGE lets you use an extra column called $action to show this).

image

(Don’t worry about the fact that I turned on IDENTITY_INSERT, that’s just so that I could insert the values)

One of the things I love about MERGE is that it feels almost cursor-like – the UPDATE bit feels like “WHERE CURRENT OF …”, and the INSERT bit feels like a single-row insert. And it is – but into the inserted and deleted tables. The operations to maintain the HoBTs are still done using the whole set of changes, which is very cool.

And $action – very convenient.

But as cool as $action is, that’s not the point of my post. If it were, I hope you’d all be disappointed, as you can’t really go near the MERGE statement without learning about it.

The subtle thing that I love about MERGE with OUTPUT is that you can hook into more than just inserted and deleted.

Did you notice in my earlier query that my source table had a ‘src’ field, that wasn’t used in the insert? Normally, this would be somewhat pointless to include in my source query. But with MERGE, I can put that in the OUTPUT clause.

image

This is useful stuff, particularly when you’re needing to audit the changes. Suppose your query involved consolidating data from a number of sources, but you didn’t need to insert that into the actual table, just into a table for audit. This is now very doable, either using the INTO clause of OUTPUT, or surrounding the whole MERGE statement in brackets (parentheses if you’re American) and using a regular INSERT statement.

This is also doable if you’re using MERGE to just do INSERTs.

In case you hadn’t realised, you can use MERGE in place of an INSERT statement. It’s just like the UPSERT-style statement we’ve just seen, except that we want nothing to match. That’s easy to do, we just use ON 1=2.

This is obviously more convoluted than a straight INSERT. And it’s slightly more effort for the database engine too. But, if you want the extra audit capabilities, the ability to hook into the other source columns is definitely useful.

Oh, and before people ask if you can also hook into the target table’s columns... Yes, of course. That’s what deleted and inserted give you.

Published Tuesday, June 12, 2012 12:40 PM by Rob Farley
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Greg Lucas said:

Rob,

Great post, I use the OUTPUT clause quite regularly - especially to populate rollback tables on production data changes but I never knew about $action or the ability to reference other columns from the WITH block.

Thanks

June 12, 2012 1:56 AM
 

Rob Farley said:

Thanks Greg. And it's worth remembering that this is only available with MERGE - the other commands only let you hook into inserted and deleted.

June 12, 2012 2:15 AM
 

SomewhereSomehow said:

Hi Rob. Very interesting scenario of using merge for audit. Goes to my KB.

Maybe, it is worth saying, the case of using merge for insert is very convenient to get rid of cursors when copying linked entities. I blogged about it sometime ago. For example, if we have classical scheme: Customer<- Order <- OrderDetail and a need to copy orders from one client to another. With help of merge we may get a table of links: (OrderID_old, OrderID_new). And after that, use it in join, to copy all the order details at once, in one single query.

June 13, 2012 3:01 AM
 

Ilmar said:

Very nice article!  I came accross it while searching for an elegant solution to a problem very similar to the one SomewhereSomewhat describes.  Definately added to my toolkit.

June 14, 2012 7:32 PM
 

Rob Farley said:

Yes, definitely.

June 14, 2012 7:38 PM
 

Rob Farley said:

Ilmar - you're more than welcome. Good you found it helpful.

June 14, 2012 7:39 PM
 

Deege said:

Hi Rob,

Is multiple 'OUTPUT INTO' clauses allowed

UPDATE ...

OUTPUT Deleted.*

INTO @oldValues

OUTPUT Inserted.*

INTO @newValues

WHERE id = @id

Currently using

UPDATE ...

OUTPUT Deleted.*

INTO @oldValues

OUTPUT Inserted.*

INTO @newValues

WHERE id = @id;

INSERT @newValues...

SELECT *

FROM   ..

WHERE  id = @id

I am able to do a OUTPUT INTO and OUTPUT but gives an error when I use multiple OUTPUT INTO

August 15, 2012 7:02 PM
 

Rob Farley said:

Hi Deege,

This is correct. You can use only one OUTPUT .. INTO and only one OUTPUT clause. But you can use one of each in a single query.

Rob

August 16, 2012 1:40 AM
 

Lally said:

Rob, you are a god.

Well done on being the only person on the net to point this out. Saved me hours.

February 5, 2013 8:39 AM
 

Rob Farley said:

Not a god. Just happy to help.

February 5, 2013 3:21 PM
 

Shane said:

Rob thanks for the reminder about $action, I often forget that is available. However I wanted to make one correction. The only DML statement that you can't use source data in the output clause is the INSERT statement. You can capture data form the source for UPDATE and DELETE statements as well as MERGE. It is also important to point out that you can add variables, functions, and constants in the output clause caolumns as well. Here are examples:

DECLARE @Colors TABLE (

ID INT IDENTITY PRIMARY KEY,

Color VARCHAR(50)

)

;

DECLARE @DataSource TABLE (

ID INT IDENTITY PRIMARY KEY,

Color VARCHAR(50),

DateModified DATE,

[Action] VARCHAR(50)

)

;

DECLARE @Audit TABLE (

ID INT,

PreviousColor VARCHAR(50),

NewColor VARCHAR(50),

OrignalDateModified DATE,

DateModified DATE,

[Action] VARCHAR(50)

)

;

INSERT INTO

@Colors (Color)

VALUES

('Red'),

('Blue'),

('Yellow')

;

INSERT INTO

@DataSource (Color, DateModified, [Action])

VALUES

('Yellow', '20130207', 'Update'),

('Red', '20130210', 'Update'),

('', '20130212', 'Delete')

;

UPDATE

@Colors

SET

Color = DS.Color

OUTPUT

INSERTED.ID,

DELETED.Color,

INSERTED.Color,

DS.DateModified,

SYSDATETIME(),

'Updated'

INTO

@Audit

FROM

@Colors C

JOIN @DataSource DS

ON DS.ID = C.ID

AND DS.[Action] = 'Update'

DELETE

C

OUTPUT

DELETED.ID,

DELETED.Color,

DS.Color,

DS.DateModified,

SYSDATETIME(),

'Deleted'

INTO

@Audit

FROM

@Colors C

JOIN @DataSource DS

ON DS.ID = C.ID

AND DS.[Action] = 'Delete'

SELECT

*

FROM

@Audit

April 15, 2013 11:39 AM
 

K Aditya Vardhan said:

Hi Rob,,

I am using MERGE Statement in which there are three tablse source target and history table

Source --It acts as an intermediate  stage table

Target --It would be the final table

History-- It would be used as an audit purpose which captures the updated records which also consists of two not null columns

When there is no data in target table and History table the output port is sending the updated records with NULL value for which it is throwing error but actually it is not updating any of records how can i avoid such kind of situation..Its only happening when data in target and History are not available when there is data in target the functionality is working fine

Can you suggest me a solution how to fix this

June 25, 2014 6:07 AM
 

Rob Farley said:

Hi Aditya - Can you send me your code? My email address is on the right...

June 25, 2014 7:32 AM
 

Mark Kennedy said:

You can access data other than the inserted/deleted in an output clause, but it has to be joined.

There's examples on the OUTPUT clause MSDN page, example E is where I got the idea to try self join.

http://msdn.microsoft.com/en-gb/library/ms177564(v=sql.105).aspx

but try this:

UPDATE #tmp

SET someval += 1

OUTPUT deleted.*, inserted.someval, t.id

FROM #tmp

INNER JOIN #tmp as t ON #tmp.id = t.id

I don't know why exactly, but its probably something to do with not being able to use values on the table directly being changed, but joining to its self seems to work.

I did this on SQL server 2008R2

Saves having to delve into using merge statements if you just want to get other data in an OUTPUT from a simple update.

September 18, 2014 5:59 AM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement