THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Kalen Delaney

Geek City: Accessing auxiliary tables during Data Modification

 

After reading Hugo’s post about when snapshot isolation doesn’t really live up to its promise, I decided to do some research of my own. I met with Stefano Stefani, one of the engineers at Microsoft on the storage engine team and we talked about what kinds of locks are taken during the data modification operations. We know that when a row is actually updated, SQL Server will take an Exclusive (X) lock, but there’s a lot more to the story than that.

 

First of all, you need to be aware of the fact that almost every data modification operation has to do some read operations in order to find the rows to be modified. So what kinds of locks are taken on the data while we’re reading it and checking to see if it is the data that needs to be updated?

 

In addition, reads may need to be performed on other tables, other than the table being modified. These ‘auxiliary’ tables can need to be referenced for two different reasons. First, if your query explicitly mentions additional tables in the FROM clause of your data modification statement, these tables are referred to as explicit auxiliary tables. There may be other tables that need to be referenced because of a need to validate foreign key constraints. The tables that need to be accessed to validate referential integrity as called implicit auxiliary tables. What kinds of locks need to be taken on these auxiliary tables?

 

There are a couple of additional terms I need to mention that relate to how an operator in a query plan is processed in an execution plan. An operator in a SQL Server execution plan (e.g. scan, seek, join, etc) consumes input rows and produces output rows. For example, a seek consumes rows from an index on a particular column, and produces rows that that can meet certain conditions in the indexed columns. A join consumes rows from two different input sources and produces output rows that match either other on the join condition. Every operator can be classified as either blocking or non-blocking. An operator that consumes the input rows and produces output rows at the same time is referred to as non-blocking, or flow-through. Other operators need to consume all input rows before producing any output rows and are referred to as blocking operators, or stop-and-go operators. A common example of a blocking operator is sort, which has to ‘stop’ to sort all the input rows before it produces any output. A loop join operator on the other hand is non blocking, because for every pair of row it consumes from the two input, it determines whether there is an output row to be returned, or whether it should just go on and get the next row. You can get more details and examples about blocking and non-blocking operators in Craig Freedman’s excellent blog here:

http://blogs.msdn.com/craigfr/archive/2006/06/19/637048.aspx

 

All data modification query plans can be thought of as broken into two parts: what comes before the actual modification of the target table and what comes after. In a graphical plan, the before part is to the right of the table or clustered index modification. And the after part is to the left.

 

Read operations performed during the part of the plan before the actual target table modification will lock any auxiliary tables under the user requested isolation level of the session or query, and the target table will take U locks. User requested tables will be those that occur in FROM clause of the operation, i.e. explicitly referenced by the user. Here is an example from the pubs database, which you can download from here.

 

First create copies of the publishers and titles tables:

 

USE pubs

GO

SELECT * INTO newpublishers

FROM publishers

GO

SELECT * INTO newtitles

FROM titles

GO

 

ALTER TABLE newpublishers ADD  CONSTRAINT UPKCL_newpubind PRIMARY KEY CLUSTERED

(pub_id ASC)

GO

ALTER TABLE newtitles ADD  CONSTRAINT UPKCL_newtitleidind PRIMARY KEY CLUSTERED

(title_id)

GO

ALTER TABLE newtitles  WITH CHECK ADD FOREIGN KEY(pub_id)

    REFERENCES newpublishers([pub_id])

GO

 

Here is a SELECT that uses an explicitly referenced auxiliary table, followed by the graphical query plan. The target table is newtitles and the auxiliary table is newpublishers.

 

-- explicit

UPDATE newtitles

SET price = price * 1.1

FROM newtitles nt JOIN newpublishers np

   ON nt.pub_id = np.pub_id

WHERE state = 'CA'

GO

 

 

 

 

 

I have circled the update to the newtitles table itself. The lookup into the newpublishers table is performed before (to the right of) the update to the newtitles table. I realize the details in the plan might be too small to read, but the point is not the details here, but rather to show you the ‘shape’ of the plan, with auxiliary table referenced before the target table. If you want the details, you can run the query yourself, and observe the graphical showplan on your own system.

 

The part of the plan after the actual target table modification can involve additional tables not mentioned explicitly by the user. These are primarily tables used for PK/FK constraint validation. For these implicit auxiliary tables, the isolation level set in the session or by hints in the query is ignored; each implicit auxiliary table is accessed under read committed isolation and will acquire either U or S locks, and with different granularity (usually either row/key locks or range locks) based on the available indexes. The goal of the query processor is to prevent data corruption.

 

Here is a SELECT that uses an implicitly referenced auxiliary table, followed by the graphical query plan. The target table is again newtitles and the auxiliary table is again newpublishers.

 

-- implicit

UPDATE newtitles

SET title_id = title_id + '_$'

WHERE price > 20

 

 

 

For this query plan, note that the seek into the auxiliary table happens after (to the left of) the update of the target table. (Also note that you can’t actually execute this query, as it will fail due to string truncation.)

 

The internal locks for the extra scans required for these implicit tables are kept for the duration of the scan if the operator is a stop-and-go operator, like sort or hash where all the rows must be read before processing can begin. For non stop-and-go (i.e. leaf level operators or flow-through), the lifetime of the lock is like a normal read committed shared lock, with which the lock is released as soon as the data has been read.

 

If the isolation level is SNAPSHOT, normally these locks on the implicit tables will not cause blocking or be blocked. But SQL Server can detect if relevant rows have changed since the transaction started, and will then block. This gets back to the subject of Hugo’s post, referenced at the beginning of my post.

 

You can probably guess that you can have both implicit and explicit auxiliary tables in the same plan. Here is a query and a graphical plan that illustrates this.

 

So this final SELECT uses both an implicitly referenced auxiliary table and an explicitly referenced auxiliary table, followed by the graphical query plan. The target table is again newtitles and the auxiliary tables are newpublishers.

 

-- explicit and implicit

UPDATE newtitles

SET title_id = title_id + 'CA'

FROM newtitles nt JOIN newpublishers np

   ON nt.pub_id = np.pub_id

WHERE state = 'CA'

 

 

 

In my next post, I’ll give you some query examples to illustrate SQL Server detecting changed data under SNAPSHOT isolation.

 

 

-- Kalen

 

 

 

Published Tuesday, May 22, 2007 10:59 AM by Kalen Delaney

Attachment(s): update with explicit auxiliary table.jpg

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Alex Kuznetsov said:

Hi Kalen,

Because SNAPSHOT and READ COMMITTED SNAPSHOT detect changes for different scopes (transaction vs. statement), could you please elaborate on differences in implementation between SNAPSHOT and READ COMMITTED SNAPSHOT.

May 23, 2007 3:14 PM
 

James Luetkehoelter said:

Great post - the SQL Server world needs more on Snapshot Isolation - I've already seen it used inappropriately and to the detriment of the client. Look forward to more. I was going to write a bit on them myself, but if you are...I think mine would look silly in comparison :)

May 23, 2007 4:42 PM
 

Adam Machanic said:

James,

Please don't hold back!  Tell us about your client situation.  It would be an interesting post I'm sure, and helpful to many readers.  There is no reason to not post simply because someone else may be discussing a similar issue at some point!

And Kalen, great post as always :)

May 24, 2007 9:57 AM
 

Kalen Delaney said:

Yes, James, please do. This post is actually not about Snapshot Isolation. It's about locking auxiliary tables during data modification. You need to be aware of this locking when using SI, but this post is not ABOUT SI.

I'm leading a BOF discussion at TechEd on the costs of SI, and I would love to have more stories about problems with it.

Thanks

May 24, 2007 11:07 AM
 

James Luetkehoelter said:

Alrighty, I'll write regardless :)

May 29, 2007 9:08 AM
 

AlexLopez said:

Hi Kalen,

I have a question regarding the query plan of this SELECT:

-- implicit

UPDATE newtitles

SET title_id = title_id + '_$'

WHERE price > 20

Why does SQL Server have to query the table "newpublishers"? The referential intregrity is on the column pub_id and not on title_id, so how a change to the title_id affects the other table?

ALTER TABLE newtitles  WITH CHECK ADD FOREIGN KEY(pub_id)

   REFERENCES newpublishers([pub_id])

July 30, 2007 8:51 PM
 

AlexLopez said:

Hi Kalen,

I found out the problem in the example. I queried this statement and it gives me the same query plan you presented:

UPDATE newtitles

SET pub_id = pub_id + '_$'

WHERE price > 20

It makes more sense now.

July 30, 2007 11:10 PM
 

mjswart said:

A message to James, I know this post is approaching two years old, but did you ever write about your unfortunate client situation.

March 30, 2010 1:49 PM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Favorite Non-technical Sites or Blogs

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement