THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Alexander Kuznetsov

Running OLTPish system without deadlocks, and loving it.

Our OLTPish mixed load system has not had a single deadlock since last April, and we just love it.

I would not make any blanket statements, but I think in our case being deadlock-free just makes a lot of practical sense. Of course, in other cases in might not be worth it.

To accomplish deadlock-free state, we have taken the following steps:

  • Snapshot isolation for all readers
  • If concurrent updates are not frequent, we just acquire some exclusive lock before updating. As long as collisions are not likely, this does not slow us down much. We stress test how it works.
  • If concurrent updates become frequent, we solve the problem outside the database. Our application servers accumulate incoming changes and save them in bulk. Of course, these bulk saves still have to acquire some exclusive lock before updating. We stress test how it works.

I still have to see how it works out long term, but short term we save considerable effort on not troubleshooting deadlocks, and not having to implement retry logic. So far I think that being deadlock-free is worth it, but only if we can avoid high concurrency, when lots of modifications touch a single row each. If we cannot avoid high concurrency, then it might be easier to just let it deadlock from time to time.

What do you think? Are you doing anything similar?

 

Published Wednesday, January 04, 2012 5:31 PM by Alexander Kuznetsov

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Gokhan Varol said:

If no more deadlocks why not take that route. I have not used Snapshot isolation myself, I would guess that it would have some penalty, does it?

We do get deadlocks (within a week for sure, usually lot more often), but unfortunately we bring data into staging tables, 47 tables for a single batch load and when the staging load is done indexes are created, foreign keys etc are created we open a transaction and switch those 47 staging tables into 47 main tables (first switch the existing partitions out and then switch new ones in), this usually takes less than 30 seconds but the wideness of the transaction and it needs to require schemalocks against tables that are constantly being read from processes that make 6-20 table joins is making it unavoidable from deadlocks, since snapshot isolation does not save you from schema locks (not in alwayson)

Other than this issue never had a problem making deadlocks disappear for very long time (sometimes during multiple simultaneous database backups they appear) by tweaking the code and indexes (covering some columns etc). But also we did not use entity framework in the app yet and isolation between database and app was through stored procs, this way we were able to tune as much as needed.

I would say if deadlocks were at an extend to be a problem just solve it instead of a bandit.

January 4, 2012 11:31 PM
 

Brent Ozar said:

Have you measured the impact of these changes on your TempDB?  Did you beef up the TempDB drives before implementing it?

January 5, 2012 8:31 AM
 

Alexander Kuznetsov said:

Gokhan,

Yes, Snapshot isolation does put more pressure on TempDb.

Also if I had a large transaction involving 47 tables, I might just let it deadlock and retry.

We are not using Enity Framework, and not going to. I hear what you are saying about EF preventing you from tuning.

January 5, 2012 9:25 AM
 

Alexander Kuznetsov said:

Brent,

Surely we had to make sure TempDb can handle it, and it cost some money.

On the other had, the time spent on troubleshooting also is not free, so the investment in better TempDb paid off soon.

Besides, there are important things that are difficult to measure, such as customers' confidence. Obviously repeated problems such as deadlock erode customers' confidence. Being small, we don't want to lose their business.

One more thing: being Agile, we are always just a few days from the next release. Losing a day on troubleshooting is a very big deal if our deadline is tomorrow. Cost of missing deadlines because we had to troubleshoot also need to be considered.

January 5, 2012 9:32 AM
 

Gokhan Varol said:

Does anybody have experience to share with tempdb on ssd or fusionio?

January 5, 2012 2:27 PM
 

AlexK said:

Gokhan,

If you ask your question on stackoverflow or dba.stackexchange, you will surely get a good answer.

January 5, 2012 5:07 PM
 

Sergey Smelov said:

Alex,

Have you ever experienced page split deadlocks?

March 6, 2012 2:22 AM
 

AlexK said:

Sergey,

I don't recall ever encountering such a thing.  

March 6, 2012 10:09 AM
 

Sergey Smelov said:

That's good. Because when i added a clustered index on uniqueidentifier column and did a concurrent INSERTS/SELECTS on one table i think i had a page split deadlock... INSERT statement asked for two page locks and SELECT asked for two page locks in different order. After that i understood on practice that having a uniqueidentifier as a clustered index is a very bad thing. I think there are lack of information about why deadlocks can happen, how to avoid it and how to handle it without using magic hints and changing default isolation level... It's like an area nobody thinks about until deadlocks start to happen.

March 11, 2012 10:47 AM

Leave a Comment

(required) 
(required) 
Submit

About Alexander Kuznetsov

Alex Kuznetsov has been working with object oriented languages, mostly C# and C++, as well as with databases for more than a decade. He has worked with Sybase, SQL Server, Oracle and DB2. He regularly blogs on sqlblog.com, mostly about database unit testing, defensive programming, and query optimization. Alex has written a book entitled "Defensive Database Programming with Transact-SQL" and several articles on simple-talk.com and devx.com. Currently he works at DRW Trading Group in Chicago, where he leads a team of developers, practicing agile development, defensive programming, TDD, and database unit testing.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement