THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Alexander Kuznetsov

Learning PostgreSql: Fun with REPEATABLE READ

In this post we shall run some examples under REPEATABLE READ, and see how they behave differently.

Setting up test data

On SQL Server, run the following:

CREATE TABLE Tickets(
  
ID INT NOT NULL,
  
Problem VARCHAR(100) NOT NULL,
  
SpaceFiller CHAR(200) NOT NULL
);

INSERT INTO Tickets(
  
ID ,
  
Problem,
  
SpaceFiller)
SELECT Number*10, 'Problem '+CAST([Number] AS VARCHAR(10))+'0', 'Space Filler'
FROM data.Numbers;

ALTER TABLE Tickets ADD PRIMARY KEY(ID);

UPDATE Tickets SET Problem = 'Add 16Gb Ram to Kenny''s workstation'
WHERE ID = 90090;

On PostgreSql, we do not need a Numbers table - there is a built in function generate_series, which is very useful:

CREATE TABLE Tickets(
  
ID INT NOT NULL,
  
Problem VARCHAR NOT NULL
);

TRUNCATE TABLE Tickets;

INSERT INTO Tickets(
  
ID ,
  
Problem)
SELECT generate_series*10, 'Problem ' || CAST(generate_series AS VARCHAR)
FROM generate_series(1, 16000);
  
ALTER TABLE Tickets ADD PRIMARY KEY(ID);

UPDATE Tickets SET Problem = 'Add 16Gb Ram to Kenny''s workstation'
WHERE ID = 90090;

Identical selects running under REPEATABLE READ on Sql Server may return different results 

The following scripts demonstrate that reads are not completely repeatable, despite the isolation level's name being REPEATABLE READ and as such claiming otherwise.

In one tab, run this:

BEGIN TRANSACTION;
UPDATE Tickets SET Problem = 'Replace printer in NW corner'
WHERE ID = 49000;

In another tab, run this:

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT COUNT(*) AS TotalProblems,
SUM(CASE WHEN Problem LIKE '%Kenny%' THEN 1 END) AS [Kenny's Problems],
SUM(CASE WHEN Problem LIKE '%printer%' THEN 1 END) AS [Printer Problems]
FROM Tickets WHERE ID BETWEEN 0 AND 100000;

This select is blocked by the uncommitted update in the first tab and as such it does not complete. Let us get back to the first tab and run the following:

UPDATE Tickets SET ID=101
WHERE ID = 90090;

COMMIT;

This update physically moves Kenny's ticket from a page that has not been read by the select yet to a page which has already been read. Once the transaction in the first tab commits, the select in the second tab completes with the following results:

TotalProblems Kenny's Problems Printer Problems
------------- ---------------- ----------------
10000         NULL             1

Clearly Kenny's ticket has not been read at all. If we rerun this select, in the same transaction, we shall get different results, as follows:

TotalProblems Kenny's Problems Printer Problems
------------- ---------------- ----------------
10001         1            1

Note that Kenny's problem is included in the second result set, and the total in TotalProblems column is now correct as well.

As we have seen,

  • when we rerun queries in the same transaction, we may get different results 
  • selects may return rows inserted after the select started - such as the only ticket for "Printer problem"
  • selects may fail to return some rows that were committed before the transaction started - note that the ticket for "Kenny's problem" is completely missing from the first result set. This effect may also cause incorrect totals. Note that TotalProblems column is different in the first and second result sets.

If we rerun this example on PostgreSql, the first select is not blocked by the modification, and consequent selects return the same results as the first one.

REPEATABLE READ on Sql Server does not prevent lost updates

The following scenario demonstrates how an update can be lost. To begin the scenario, run the following script in one tab:

BEGIN TRANSACTION;

UPDATE Tickets
SET Problem = 'Water cooler making funny noises'
WHERE ID = 4010;

In another tab, run this:

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;

UPDATE Tickets
SET Problem = 'Printer problem ' + CAST(ID AS VARCHAR(5))
WHERE ID BETWEEN 4000 AND 4100;

The update is waiting on the uncommitted change. As soon as we commit the transaction in the first tab, the update completes. Note that it overwrites the change from the first tab:

SELECT Problem FROM Tickets
WHERE ID BETWEEN 4000 AND 4100;

Problem
----------------------------------------------------------------------------------------------------
(snip)
Printer problem 4010
(snip)
Similarly, a modification running under REPEATABLE READ can update or delete rows that were inserted after the transaction started. 
To reproduce, let us begin archiving tickets in one tab: 

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;

SELECT *
INTO ArchivedTickers
FROM Tickets
WHERE ID BETWEEN 4000 AND 4100;
In another tab, let us add a new ticket: 
INSERT Tickets(ID, Problem, SpaceFiller)
VALUES(4005, 'Water cooler making funny noises', 'Test value');

Let us get back to the first tab and complete the archiving:

DELETE FROM Tickets
WHERE ID BETWEEN 4000 AND 4100;
COMMIT;

SELECT COUNT(*) FROM Tickets
WHERE ID BETWEEN 4000 AND 4100;

-----------
0

Clearly the new ticket number 4005 was deleted along with the archived ones. This is a textbook example of a lost update.

REPEATABLE READ on PostgreSql prevents lost updates

Let us rerun the previous scenario - we shall observe a very different behavior. In one session, let us start archiving a range of tickets:

BEGIN ISOLATION LEVEL REPEATABLE READ;;

CREATE TABLE archived_tickets
AS
SELECT
*
FROM Tickets
WHERE ID BETWEEN 4000 AND 4100;

In another session, let us add a new ticket:

INSERT INTO Tickets(ID, Problem)
VALUES(4005, 'Water cooler making funny noises');

In the first session, let us complete the archiving. The unarchived ticket number 4005 has not been deleted:

DELETE FROM Tickets
WHERE ID BETWEEN 4000 AND 4100;

COMMIT;

SELECT Problem FROM Tickets
WHERE ID BETWEEN 4000 AND 4100;

"Water cooler making funny noises"

Let us rerun the scenario that updates an existing ticket. In one session, run this:

BEGIN TRANSACTION;

UPDATE Tickets
SET Problem = 'Add index on Shipments.Height'
WHERE ID = 4010;

In another session run this:

BEGIN ISOLATION LEVEL REPEATABLE READ;;

UPDATE Tickets
SET Problem = 'Printer problem ' || CAST(ID AS VARCHAR(5))
WHERE ID BETWEEN 4000 AND 4100;

Commit the transaction in the first session, and the second transaction detects a lost update and blows up, as follows: "ERROR:  could not serialize access due to concurrent update".

As we have seen, PostgreSql has detected and prevented both lost updates. As such, we do not need to troubleshoot two subtle bugs that are rather difficult to reproduce.

Conclusion

As we have seen REPEATABLE READ is implemented very differently on PostgreSql, so we need to be very careful when we port T-SQL that uses REPEATABLE READ. As we develop against PostgreSql, we can and should take advantage of its optimistic implementation of isolation levels, rather than trying to exactly replicate Sql Server's behavior.

Published Thursday, December 19, 2013 10:11 AM by Alexander Kuznetsov

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

mahesh sharma said:

another option you can learn sql by online free tutorial at Javatpoint.

www.javatpoint.com/sql-tutorial

April 9, 2014 2:42 AM

Leave a Comment

(required) 
(required) 
Submit

About Alexander Kuznetsov

Alex Kuznetsov has been working with object oriented languages, mostly C# and C++, as well as with databases for more than a decade. He has worked with Sybase, SQL Server, Oracle and DB2. He regularly blogs on sqlblog.com, mostly about database unit testing, defensive programming, and query optimization. Alex has written a book entitled "Defensive Database Programming with Transact-SQL" and several articles on simple-talk.com and devx.com. Currently he works at DRW Trading Group in Chicago, where he leads a team of developers, practicing agile development, defensive programming, TDD, and database unit testing.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement