THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Alexander Kuznetsov

Avoiding infinite loops. Part One.

Although there are many discussions about which kind of cursor or loop performs the best, there is no doubt which loops perform the worst – the infinite ones of course. Whenever you write a loop, you need to make sure that it never runs infinitely. I will provide two examples, two rather common scenarios that may potentially cause loops to run infinitely. In this post I will describe a common yet error-prone method of storing and traversing hierarchies. In the next one I will demonstrate how processing rows one by one can sometimes run infinitely.

 

Prerequisites

 

The following table allows to store hierarchies but does not prevent cycles.

CREATE TABLE data.Employee(

     EmployeeID int NOT NULL,

     ManagerID int NULL,

     FirstName varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

     LastName varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

 CONSTRAINT PK_Employee_EmployeeID PRIMARY KEY CLUSTERED

(

     EmployeeID ASC

)WITH (IGNORE_DUP_KEY = OFF)

)

 

GO

SET ANSI_PADDING OFF

GO

USE Test

GO

ALTER TABLE data.Employee  WITH CHECK ADD  CONSTRAINT FK_Employee_Employee_ManagerID FOREIGN KEY(ManagerID)

REFERENCES data.Employee (EmployeeID)

 

The following short employee chart is perfectly correct:

 

INSERT INTO data.Employee

           (EmployeeID

           ,ManagerID

           ,FirstName

           ,LastName)

SELECT 1, NULL, 'Dan', 'Chang' UNION ALL

SELECT 2, 1, 'June', 'Yang'


The following employee chart makes no sense at all - it contains a cycle. Yet is can be stored all right:

 

INSERT INTO data.Employee

           (EmployeeID

           ,ManagerID

           ,FirstName

           ,LastName)

SELECT 3, 4, 'Sydney', 'Hobart' UNION ALL

SELECT 4, 3, 'Hobart', 'Sydney'

 

Retrieving such hierarchies may end up in an endless loop

 

Consider the following function:

CREATE FUNCTION Readers.GetManagerAndTeam (@EmployeeID INT)

RETURNS @ret TABLE

(

    EmployeeID INT NOT NULL,

    ManagerID INT NULL,

     FirstName varchar(50),

     LastName varchar(50),

    Level INT NOT NULL

)

AS

BEGIN

DECLARE @level INT, @rc INT;

INSERT INTO @ret(EmployeeID,

     ManagerID,

     FirstName,

     LastName,

    Level)

  SELECT EmployeeID,

     ManagerID,

     FirstName,

     LastName,

    1

  FROM data.Employee

  WHERE EmployeeID=@EmployeeID;

SELECT @level=1, @rc=@@ROWCOUNT;

WHILE @rc>0 BEGIN

  INSERT INTO @ret(EmployeeID,

           ManagerID,

           FirstName,

           LastName,

           Level)

       SELECT e.EmployeeID,

           e.ManagerID,

           e.FirstName,

           e.LastName,

           @level+1

       FROM data.Employee e JOIN @ret r ON e.ManagerID=r.EmployeeID

       WHERE r.Level=@level;

  SELECT @level=@level+1, @rc=@@ROWCOUNT;

END

RETURN;

END

 

This function correctly returns a valid employee chart:

 

 SELECT EmployeeID, ManagerID, FirstName, LastName, Level

  FROM Readers.GetManagerAndTeam(1)

 

But sometimes it runs infinitely. See for yourself:

 

SELECT EmployeeID, ManagerID, FirstName, LastName, Level

  FROM Readers.GetManagerAndTeam(3)

 

Recursive function as a workaround

 

You can implement the same functionality with recursion, as follows

CREATE FUNCTION Readers.GetManagerAndTeam(@ManagerID INT)

RETURNS TABLE

AS

RETURN(

WITH ManagerAndTeam AS(

  SELECT EmployeeID, ManagerID, FirstName, LastName, CAST(1 AS INT) AS Level

    FROM data.Employee

    WHERE EmployeeID = @ManagerID

  UNION ALL

  SELECT e.EmployeeID, e.ManagerID, e.FirstName, e.LastName, m.Level + 1

    FROM data.Employee e JOIN ManagerAndTeam m

      ON m.EmployeeID = e.ManagerID)

SELECT EmployeeID, ManagerID, FirstName, LastName, Level

  FROM ManagerAndTeam);

Under default settings this function will not run forever, it will stop after 100 iterations:

 

SELECT EmployeeID, ManagerID, FirstName, LastName, Level

  FROM Readers.GetManagerAndTeam(3)

Msg 530, Level 16, State 1, Line 1

The statement terminated. The maximum recursion 100 has been exhausted before statement completion.

 

However, it still can run forever. See for yourself.

 

SELECT EmployeeID, ManagerID, FirstName, LastName, Level

  FROM Readers.GetManagerAndTeam(3) OPTION (MAXRECURSION 0)

 

 

Fix the data!

 

You definitely can write a better UDF that will robustly handle dirty data. However, it is preferable to fix the data instead. The original UDF worked correctly under the assumption that there are no cycles in the hierarchy. Let us have the database enforce that. The following table does just that – you cannot store invalid employee charts in it:

 

CREATE TABLE data.Employee(

     EmployeeID int NOT NULL,

     ManagerID int NULL,

     EmployeeLevel INT NOT NULL,

     ManagerLevel INT NULL,

     FirstName varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

     LastName varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

 CONSTRAINT PK_Employee_EmployeeID PRIMARY KEY CLUSTERED

(

     EmployeeID ASC

)WITH (IGNORE_DUP_KEY = OFF),

 CONSTRAINT UNQ_Employee_EmployeeID_EmployeeLevel UNIQUE

(

     EmployeeID, EmployeeLevel

),

CONSTRAINT CHK_Employee_EmployeeLevel_ManagerLevel

  CHECK(EmployeeLevel=ManagerLevel+1),

CONSTRAINT CHK_Employee_ManagerID_ManagerLevel

  CHECK((ManagerID IS NULL AND ManagerLevel IS NULL)

     OR (ManagerID IS NOT NULL AND ManagerLevel IS NOT NULL))

)

ALTER TABLE data.Employee  WITH CHECK ADD  CONSTRAINT FK_Employee_Employee_ManagerID FOREIGN KEY(ManagerID,ManagerLevel)

REFERENCES data.Employee (EmployeeID,EmployeeLevel)

GO

 

 

Only valid data can be inserted into this table

 

INSERT INTO data.Employee

           (EmployeeID

           ,ManagerID

                ,EmployeeLevel

                ,ManagerLevel

           ,FirstName

           ,LastName)

SELECT 1, NULL, 1, NULL, 'Dan', 'Chang' UNION ALL

SELECT 2, 1, 2, 1, 'June', 'Yang'

 

 

There are several other ways to store hierarchies. Discussing them is beyond the scope of this post.

 

Here is the second part: When you process your rows one by one, avoid infinite loops
Published Monday, April 20, 2009 11:11 PM by Alexander Kuznetsov

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

AaronBertrand said:

Good post Alex.  Though I think a much more common cause for infinite loops is:

DECLARE @i INT = 1;

WHILE @i < 20

BEGIN

 -- do something

 -- oops, forgot SET @i += 1;

END

I've done that at least a dozen times in my career, in various languages.  :-)

April 21, 2009 12:08 AM
 

Alex Kuznetsov said:

Hi Aaron,

In the next post I will describe just that. Besides forgetting to increment @i, there are some other potential problems with such loops.

April 21, 2009 9:10 AM
 

jayant said:

Hi Aaron

According to AaronBertrand is good but before the code u can put given following

set nocount on

DECLARE @i INT = 1;

WHILE @i < 20

BEGIN

-- insert ,update or delete query here

-- do something

-- oops, forgot SET @i += 1;

END

SET NOCOUNT OFF

Regards

Jayant

09313406257

09650336531

April 22, 2009 5:11 AM
 

Neal Pointer said:

So really you want a database where you can exclude, cycles in data, like a person A been both the boss of, and underlining of person B. But to do so you had to add a extra depth count field, level to the database. Right?

April 22, 2009 8:37 AM
 

Alex Kuznetsoc said:

Neal,

Yes, I have to add another column to prevent cycles - that would work without a major redesign. Should I do it from scratch, I might choose another approach to store hierarchies. For instance, on 2005 I would use materialized path:

http://sqlblog.com/blogs/alexander_kuznetsov/archive/2008/08/24/store-your-configuration-settings-as-a-hierarchy-in-a-database.aspx

April 22, 2009 9:15 AM
 

Alexander Kuznetsov : When you process your rows one by one, avoid infinite loops said:

April 24, 2009 1:16 PM
 

Michael said:

Solution that uses Check Constraint which uses a Function.

No extra Columns required.

This is probably not very performant, but that doesn't matter in that (my) case.

What do you think of it? (michikutsam  @  yahoo   .de)

USE MyDatabaseName;

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[OrganisationUnit]') AND type in (N'U'))

DROP TABLE [dbo].[OrganisationUnit]

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[GetOrganisationUnitTopmostParent]') AND type in (N'FN', N'IF', N'TF', N'FS', N'FT'))

DROP FUNCTION [dbo].[GetOrganisationUnitTopmostParent]

CREATE TABLE OrganisationUnit (

 Id           INT NOT NULL CONSTRAINT Pk_OrganisationUnit_Id PRIMARY KEY,

 Name         VARCHAR(100) NOT NULL,

 ParentId   INT    NOT NULL CONSTRAINT Fk_OrganisationUnit_ParentId  REFERENCES OrganisationUnit(Id),  

);

Go

CREATE FUNCTION GetOrganisationUnitTopmostParent

              (@Id     INT,

               @Parent INT) -- if @Parent is invalid it might not get checked before reference Check -> leads to loop -> leads to time out error (is OK)

RETURNS INTEGER

AS

 BEGIN

   DECLARE  @return INT

   DECLARE  @newParent INT = @Parent

   -- if it has no parent

   IF @Id = @Parent

     SET @return = @Id

   ELSE

     WHILE @return IS NULL

       BEGIN

         SET @newParent = (SELECT ParentId

                           FROM   OrganisationUnit

                           WHERE  ID = @newParent)

         IF @newParent = @Id

           RETURN NULL -- Error, when loop

         ELSE

           IF (SELECT ParentId

               FROM   OrganisationUnit

               WHERE  ID = @newParent) = @newParent

             SET @return = @newParent

       END

   RETURN @return

 END

Go

ALTER TABLE OrganisationUnit

ADD CONSTRAINT Ck_OrganisationUnit_NoHierarchyLoop CHECK ( dbo.GetOrganisationUnitTopmostParent(Id,ParentId) is not null );

March 19, 2010 1:04 PM
 

Alex K said:

Michael,

UDFs wrapped in CHECK constraints have the following problems when we modify more than one row:

1. They may have false positives, i.e. allow incorrect modifications.

2. They may have false negatives, i.e. disallow correct modifications.

Also they may fail under snbpshot isolation.

Also they are terribly slow.

I would not use them.

May 23, 2010 11:15 PM

Leave a Comment

(required) 
(required) 
Submit

About Alexander Kuznetsov

Alex Kuznetsov has been working with object oriented languages, mostly C# and C++, as well as with databases for more than a decade. He has worked with Sybase, SQL Server, Oracle and DB2. He regularly blogs on sqlblog.com, mostly about database unit testing, defensive programming, and query optimization. Alex has written a book entitled "Defensive Database Programming with Transact-SQL" and several articles on simple-talk.com and devx.com. Currently he works as an agile developer.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement