Although there are many discussions about which kind of cursor or
loop performs the best, there is no doubt which loops perform the worst – the infinite
ones of course. Whenever you write a loop, you need to make sure that it never
runs infinitely. I will provide two examples, two rather common scenarios that
may potentially cause loops to run infinitely. In this post I will describe a
common yet error-prone method of storing and traversing hierarchies. In the
next one I will demonstrate how processing rows one by one can sometimes run
infinitely.
Prerequisites
The following table allows to store hierarchies but does not
prevent cycles.
CREATE TABLE
data.Employee(
EmployeeID int NOT NULL,
ManagerID int NULL,
FirstName varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
LastName varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
CONSTRAINT
PK_Employee_EmployeeID PRIMARY KEY CLUSTERED
(
EmployeeID ASC
)WITH (IGNORE_DUP_KEY
= OFF)
)
GO
SET ANSI_PADDING
OFF
GO
USE Test
GO
ALTER TABLE
data.Employee
WITH CHECK
ADD CONSTRAINT FK_Employee_Employee_ManagerID FOREIGN KEY(ManagerID)
REFERENCES data.Employee (EmployeeID)
The following short employee chart is perfectly correct:
INSERT INTO data.Employee
(EmployeeID
,ManagerID
,FirstName
,LastName)
SELECT 1, NULL, 'Dan', 'Chang' UNION ALL
SELECT 2, 1, 'June', 'Yang'
The following employee chart makes no sense at all - it contains a
cycle. Yet is can be stored all right:
INSERT INTO
data.Employee
(EmployeeID
,ManagerID
,FirstName
,LastName)
SELECT 3,
4, 'Sydney', 'Hobart' UNION ALL
SELECT 4,
3, 'Hobart', 'Sydney'
Retrieving such hierarchies may end up in an endless loop
Consider the following function:
CREATE FUNCTION
Readers.GetManagerAndTeam (@EmployeeID INT)
RETURNS @ret TABLE
(
EmployeeID INT
NOT NULL,
ManagerID INT
NULL,
FirstName varchar(50),
LastName varchar(50),
Level INT NOT NULL
)
AS
BEGIN
DECLARE @level INT, @rc INT;
INSERT INTO
@ret(EmployeeID,
ManagerID,
FirstName,
LastName,
Level)
SELECT
EmployeeID,
ManagerID,
FirstName,
LastName,
1
FROM data.Employee
WHERE
EmployeeID=@EmployeeID;
SELECT @level=1, @rc=@@ROWCOUNT;
WHILE @rc>0
BEGIN
INSERT INTO @ret(EmployeeID,
ManagerID,
FirstName,
LastName,
Level)
SELECT
e.EmployeeID,
e.ManagerID,
e.FirstName,
e.LastName,
@level+1
FROM
data.Employee e JOIN
@ret r ON e.ManagerID=r.EmployeeID
WHERE
r.Level=@level;
SELECT @level=@level+1, @rc=@@ROWCOUNT;
END
RETURN;
END
This function correctly returns a valid employee chart:
SELECT EmployeeID, ManagerID, FirstName, LastName, Level
FROM Readers.GetManagerAndTeam(1)
But sometimes it runs infinitely. See for yourself:
SELECT EmployeeID, ManagerID,
FirstName, LastName,
Level
FROM Readers.GetManagerAndTeam(3)
Recursive function as a workaround
You can implement the same functionality with recursion, as
follows
CREATE FUNCTION Readers.GetManagerAndTeam(@ManagerID INT)
RETURNS TABLE
AS
RETURN(
WITH
ManagerAndTeam AS(
SELECT
EmployeeID, ManagerID,
FirstName, LastName,
CAST(1 AS INT) AS Level
FROM data.Employee
WHERE
EmployeeID = @ManagerID
UNION ALL
SELECT e.EmployeeID, e.ManagerID, e.FirstName, e.LastName, m.Level + 1
FROM data.Employee e JOIN
ManagerAndTeam m
ON m.EmployeeID = e.ManagerID)
SELECT EmployeeID, ManagerID,
FirstName, LastName,
Level
FROM ManagerAndTeam);
Under default settings this function will not run forever,
it will stop after 100 iterations:
SELECT EmployeeID, ManagerID,
FirstName, LastName,
Level
FROM Readers.GetManagerAndTeam(3)
Msg 530, Level 16, State 1,
Line 1
The statement terminated. The
maximum recursion 100 has been exhausted before statement completion.
However, it still can run forever. See for yourself.
SELECT EmployeeID, ManagerID,
FirstName, LastName,
Level
FROM Readers.GetManagerAndTeam(3) OPTION (MAXRECURSION 0)
Fix the data!
You definitely can write a better UDF that will robustly
handle dirty data. However, it is preferable to fix the data instead. The
original UDF worked correctly under the assumption that there are no cycles in
the hierarchy. Let us have the database enforce that. The following table does
just that – you cannot store invalid employee charts in it:
CREATE TABLE
data.Employee(
EmployeeID int NOT NULL,
ManagerID int NULL,
EmployeeLevel INT NOT NULL,
ManagerLevel INT NULL,
FirstName varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
LastName varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
CONSTRAINT
PK_Employee_EmployeeID PRIMARY KEY CLUSTERED
(
EmployeeID ASC
)WITH (IGNORE_DUP_KEY
= OFF),
CONSTRAINT
UNQ_Employee_EmployeeID_EmployeeLevel UNIQUE
(
EmployeeID, EmployeeLevel
),
CONSTRAINT
CHK_Employee_EmployeeLevel_ManagerLevel
CHECK(EmployeeLevel=ManagerLevel+1),
CONSTRAINT
CHK_Employee_ManagerID_ManagerLevel
CHECK((ManagerID IS NULL AND ManagerLevel
IS NULL)
OR (ManagerID
IS NOT NULL AND ManagerLevel
IS NOT NULL))
)
ALTER TABLE
data.Employee
WITH CHECK
ADD CONSTRAINT FK_Employee_Employee_ManagerID FOREIGN KEY(ManagerID,ManagerLevel)
REFERENCES data.Employee (EmployeeID,EmployeeLevel)
GO
Only valid data can be
inserted into this table
INSERT INTO
data.Employee
(EmployeeID
,ManagerID
,EmployeeLevel
,ManagerLevel
,FirstName
,LastName)
SELECT 1,
NULL, 1, NULL, 'Dan', 'Chang' UNION ALL
SELECT 2,
1, 2, 1, 'June', 'Yang'
There are several other
ways to store hierarchies. Discussing them is beyond the scope of this post.
Here is the second part: When you process your rows one by one, avoid infinite loops