THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Jamie Thomson

This is the blog of Jamie Thomson, a freelance data mangler in London

Deriving a list of tables in dependency order

Currently at work I’m on a data warehousing project and one of the things we typically need to do during development is clear down our data warehouse in preparation for simulating a first-time load. I needed to write a script that would loop over all the tables and empty them one-by-one (that bit is not particularly hard) but which would also respect referential integrity constraints (i.e. foreign keys) – or what I would call the dependency hierarchy. This blog post documents what I came up with.

N.B. No doubt many reading this will already have their own method of doing this which may also be very different (and easier) from mine but nonetheless I thought I’d throw this out there anyway because (a) it demonstrates some nice features of T-SQL such as common table expressions, (b) it was a fun little challenge to try and solve, (c) I’ll probably need this again some day and (d) why the hell not!

I’ll demonstrate using the [AdventureWorks] sample database. First thing I needed to do was to get a list of all the tables in all schemas that had at least one foreign key attached to it; for this I need to use the sys.foreign_keys view:

select    s1.name as from_schema
,        o1.Name as from_table
,        s2.name as to_schema
,        o2.Name as to_table
from    sys.foreign_keys fk
inner    join sys.objects o1
on        fk.parent_object_id = o1.object_id
inner    join sys.schemas s1
on        o1.schema_id = s1.schema_id
inner    join sys.objects o2
on        fk.referenced_object_id = o2.object_id
inner    join sys.schemas s2
on        o2.schema_id = s2.schema_id
/*For the purposes of finding dependency hierarchy
   we're not worried about self-referencing tables*/
where    not    (    s1.name = s2.name
            and    o1.name = o2.name)

 
Notice that I am disregarding self-referencing tables which are not significant for determining the dependency hierarchy.
 
That SQL provides the following result:

tables participating in foreign key relationships

I thought about using the OBJECT_NAME() and SCHEMA_NAME() functions but decided against it because they won’t work across databases which I eventually intend this to do. Update: Thanks to Ben in the comments who pointed out that OBJECT_NAME() and SCHEMA_NAME() can take a database_id argument so I could have used them after all!

All good so far. Now we need to work out which of those tables do not have any dependencies (i.e. do not have any foreign keys pointing to them). We do a LEFT OUTER JOIN between the list of all tables and our previous list of tables with foreign keys which enables to discover which are not in that previous list:

with fk_tables as (
    select    s1.name as from_schema    
    ,        o1.Name as from_table    
    ,        s2.name as to_schema    
    ,        o2.Name as to_table    
    from    sys.foreign_keys fk    
    inner    join sys.objects o1    
    on        fk.parent_object_id = o1.object_id    
    inner    join sys.schemas s1    
    on        o1.schema_id = s1.schema_id    
    inner    join sys.objects o2    
    on        fk.referenced_object_id = o2.object_id    
    inner    join sys.schemas s2    
    on        o2.schema_id = s2.schema_id    
    /*For the purposes of finding dependency hierarchy       
        we're not worried about self-referencing tables*/
    where    not    (    s1.name = s2.name                 
            and        o1.name = o2.name)
)
SELECT    s.name as schemaName
,        t.name as tableName
,        0 AS Level
            FROM    (    select    *
            from    sys.tables             
            where    name <> 'sysdiagrams') t
INNER    JOIN sys.schemas s
on        t.schema_id = s.schema_id
LEFT    OUTER JOIN fk_tables fk
ON        s.name = fk.from_schema
AND        t.name = fk.from_table
WHERE    fk.from_schema IS NULL;


In AdventureWorks that gives us 25 such tables out of 71 in total. Next is to find a list of tables that have a dependency on one of those 25 tables and then in turn which tables have a dependency on the tables in that second list and so on. Hence its a recursive operation and hence we need a recursive common table expression (CTE). We use our previous result set as the anchor and then it join back to our original list of tables that contain foreign keys. That gives us many tables appearing twice and hence we need to join the eventual result set back onto itself to discover how many max levels “deep” a table lies in the dependency hierarchy:
 

with fk_tables as (
    select    s1.name as from_schema    
    ,        o1.Name as from_table    
    ,        s2.name as to_schema    
    ,        o2.Name as to_table    
    from    sys.foreign_keys fk    
    inner    join sys.objects o1    
    on        fk.parent_object_id = o1.object_id    
    inner    join sys.schemas s1    
    on        o1.schema_id = s1.schema_id    
    inner    join sys.objects o2    
    on        fk.referenced_object_id = o2.object_id    
    inner    join sys.schemas s2    
    on        o2.schema_id = s2.schema_id    
    /*For the purposes of finding dependency hierarchy       
        we're not worried about self-referencing tables*/
    where    not    (    s1.name = s2.name                 
            and        o1.name = o2.name)
)
,ordered_tables AS
(        SELECT    s.name as schemaName
        ,        t.name as tableName
        ,        0 AS Level    
        FROM    (    select    *                
                    from    sys.tables                 
                    where    name <> 'sysdiagrams') t    
        INNER    JOIN sys.schemas s    
        on        t.schema_id = s.schema_id    
        LEFT    OUTER JOIN fk_tables fk    
        ON        s.name = fk.from_schema    
        AND        t.name = fk.from_table    
        WHERE    fk.from_schema IS NULL
        UNION    ALL
        SELECT    fk.from_schema
        ,        fk.from_table
        ,        ot.Level + 1    
        FROM    fk_tables fk    
        INNER    JOIN ordered_tables ot    
        ON        fk.to_schema = ot.schemaName    
        AND        fk.to_table = ot.tableName
)select    distinct    ot.schemaName
,        ot.tableName
,        ot.Level
from    ordered_tables ot
inner    join (
        select    schemaName
        ,        tableName
        ,        MAX(Level) maxLevel        
        from    ordered_tables        
        group    by schemaName,tableName
        ) mx
on        ot.schemaName = mx.schemaName
and        ot.tableName = mx.tableName
and        mx.maxLevel = ot.Level;


That gives us the full list of tables along with their level in the dependency hierarchy:

AdventureWorks tables dependency hierarchy

Essentially then the hard work is done and we simply need to plug that result set into a cursor (in descending order of [Level]), emptying each one as we go. I won’t include that code here but you can download the full script from my Skydrive:

That script has been modified slightly in order to take a database name as a parameter and hence must be run in using SQLCMD (or use SQLCMD mode in SSMS).

Comments are welcome!

@Jamiet

UPDATE: Based on the code herein I have built sp_CascadingDataViewer. From that blog post:

sp_CascadingDataViewer has one basic premise: it tells you how many rows in your database are dependant on a row (or set of rows) in a table. Or to put it another way, how many rows would you have to delete from your database in order to be able to delete a given row (or set of rows). All you need to do is specify.

  • A database
    • A schema in that database
      • A table in that schema
        • A set of columns in that table
        • A set of values, one for each specified column respectively

and sp_CascadingDataViewer will tell you all the rows dependant on the specified rows and optionally show you all of those dependant rows as well.  

 

 

P.S. Code formatting was done using Paste As Visual Studio Code plugin for Windows Live Writer

Published Tuesday, September 08, 2009 9:51 PM by jamiet
Filed under:

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Ben Thul said:

<blockquote>I thought about using the OBJECT_NAME() and SCHEMA_NAME() functions but decided against it because they won’t work across databases which I eventually intend this to do.</blockquote>

OBJECT_NAME() takes an optional second argument of database_id which allows it to work cross-database.  Similarly for OBJECT_SCHEMA_NAME().  Also, by using OBJECT_SCHEMA_NAME, you can avoid two joins (as you don't have to explicitly pull in sys.schemas).  Otherwise...good stuff!

September 8, 2009 4:39 PM
 

Jamie Thomson said:

Aha, yes, thanks Ben. Post now updated!

-Jamie

September 8, 2009 4:55 PM
 

Tom groszko said:

This is the way I do it.

SET NOCOUNT ON;

DECLARE @RITable TABLE

( object_id INT PRIMARY KEY

,SchemaName SYSNAME NOT NULL

,TableName SYSNAME NOT NULL

,RILevel TINYINT DEFAULT 0

,IsSelfReferencing TINYINT DEFAULT 0

,HasExcludedRelationship TINYINT DEFAULT 0

,UpdateCount TINYINT DEFAULT 0

);

INSERT @RITable

( object_id

,SchemaName

,TableName

,RILevel

,IsSelfReferencing

,HasExcludedRelationship

,UpdateCount

)

SELECT tables.object_id

,schemas.name

,tables.name

,0

,SUM (CASE WHEN foreign_keys.parent_object_id IS NULL THEN 0 ELSE 1 END)

,SUM (CASE WHEN foreign_keys02.referenced_object_id IS NULL THEN 0 ELSE 1 END)

,0

FROM sys.tables tables

JOIN sys.schemas schemas ON tables.schema_id = schemas.schema_id

LEFT JOIN sys.foreign_keys foreign_keys ON tables.object_id = foreign_keys.parent_object_id

AND tables.object_id = foreign_keys.referenced_object_id

LEFT JOIN sys.foreign_keys foreign_keys01 ON tables.object_id = foreign_keys01.parent_object_id

LEFT JOIN sys.foreign_keys foreign_keys02 ON foreign_keys01.parent_object_id = foreign_keys02.referenced_object_id

AND foreign_keys01.referenced_object_id = foreign_keys02.parent_object_id

AND foreign_keys01.parent_object_id <> foreign_keys01.referenced_object_id

WHERE tables.name NOT IN ('sysdiagrams', 'dtproperties')

GROUP BY  tables.object_id

,schemas.name

,tables.name;

DECLARE @LookLevel INT;

DECLARE @MyRowcount INT;

SELECT @LookLevel = 0

,@MyRowcount = -1;  

WHILE (@MyRowcount <> 0)

BEGIN UPDATE ChildTable

SET  RILevel = @LookLevel + 1

,UpdateCount = ChildTable.UpdateCount + 1

FROM @RITable ChildTable

JOIN sys.foreign_keys foreign_keys ON ChildTable.object_id = foreign_keys.parent_object_id  

   JOIN @RITable ParentTable ON foreign_keys.referenced_object_id = ParentTable.object_id

AND ParentTable.RILevel = @LookLevel

LEFT JOIN sys.foreign_keys foreign_keysEX ON foreign_keys.parent_object_id = foreign_keysEX.referenced_object_id

AND foreign_keys.referenced_object_id = foreign_keysEX.parent_object_id

AND foreign_keys.parent_object_id <> foreign_keys.referenced_object_id

WHERE ChildTable.object_id <> ParentTable.object_id

 AND foreign_keysEX.referenced_object_id IS NULL;

SELECT @MyRowcount = @@ROWCOUNT;

SELECT @LookLevel = @LookLevel + 1;

END;  

SELECT RITable.SchemaName SchemaName

,RITable.TableName TableName

,RITable.RILevel RILevel

,CASE WHEN RITable.IsSelfReferencing > 0

THEN CAST(1 AS BIT)

ELSE CAST(0 AS BIT)

END IsSelfReferencing

,CASE WHEN RITable.HasExcludedRelationship > 0

THEN CAST(1 AS BIT)

ELSE CAST(0 AS BIT)

END HasExcludedRelationship

FROM @RITable RITable

ORDER BY RITable.RILevel DESC

,RITable.TableName ASC;

-- Excluded relationships

SELECT foreign_keys01.name ForeignKeyName  

,ParentSchema.name ParentSchema

,ParentObject.name ParentTable

,ChildSchema.name ChildSchema

,ChildObject.name ChildTable

FROM sys.foreign_keys foreign_keys01

JOIN sys.foreign_keys foreign_keys02 ON foreign_keys01.parent_object_id = foreign_keys02.referenced_object_id

AND foreign_keys01.referenced_object_id = foreign_keys02.parent_object_id

AND foreign_keys01.parent_object_id <> foreign_keys01.referenced_object_id

JOIN sys.objects ParentObject ON foreign_keys01.parent_object_id = ParentObject.object_id

JOIN sys.schemas ParentSchema ON ParentObject.schema_id = ParentSchema.schema_id

JOIN sys.objects ChildObject ON foreign_keys01.referenced_object_id = ChildObject.object_id

JOIN sys.schemas ChildSchema ON ChildObject.schema_id = ChildSchema.schema_id;

Excluded relationships are two tables that reference each other.

tom.groszko@charter.net

September 8, 2009 10:34 PM
 

Rob Farley said:

See, I just put the tables in there without the FKs and then put the FKs in a separate script.

September 9, 2009 5:13 AM
 

Daniel said:

Nice script, but what about this:

exec sp_MSforeachtable @command1=N'ALTER TABLE ? NOCHECK CONSTRAINT ALL; ALTER TABLE ? DISABLE TRIGGER ALL;'

,@command2=N'SET QUOTED_IDENTIFIER ON; DELETE FROM ?;'

,@command3=N'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL; ALTER TABLE ? ENABLE TRIGGER ALL'

exec sp_MSforeachtable @command1=N'DBCC CHECKIDENT(''?'', RESEED, 1);',

@whereand=N'AND OBJECTPROPERTYEX(OBJECT_ID(N''?'', N''U''), N''TableHasIdentity'') = 1'

The idea is:

1 disable checks/FK's/triggers,

2 delete data

3 enable checks/FK's/triggers

4 reseed identity columns

September 9, 2009 8:40 AM
 

Daniel said:

Sorry, that didn't work, it should have been:

exec sp_MSforeachtable N'ALTER TABLE ? NOCHECK CONSTRAINT ALL; ALTER TABLE ? DISABLE TRIGGER ALL;';

exec sp_MSforeachtable N'SET QUOTED_IDENTIFIER ON; DELETE FROM ?;';

exec sp_MSforeachtable N'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL; ALTER TABLE ? ENABLE TRIGGER ALL';

exec sp_MSforeachtable @command1=N'DBCC CHECKIDENT(''?'', RESEED, 1);',

@whereand=N'AND OBJECTPROPERTYEX(OBJECT_ID(N''?'', N''U''), N''TableHasIdentity'') = 1';

September 9, 2009 8:47 AM
 

Jamie Thomson said:

Daniel,

yeah, I thought about that. But I figured it would be a useful exercise to be able to derive the dependancy hierarchy. Thus I refer you to my earlier comment: " (d) why the hell not" :)

-Jamie

September 9, 2009 9:18 AM
 

Luciano Evaristo Guerche (Gorše) said:

Jamie,

On old days when SQL CTE feature was not born yet, I created "Stored Procedure ForeignkeysAnalyze" (http://weblogs.asp.net/guerchele/archive/2004/02/10/Stored-Procedure-ForeignkeysAnalyze.aspx). Although 5 years have past since then, I feel that post is still relevant and have some similarities to yours. Take a look and let me know what do you think about.

Regards.

September 9, 2009 9:20 AM
 

telcogod said:

can this be used to create a wipeout script?

on a much bigger scale, but similar to delete all orderdetails and orders where customerid = 1

September 10, 2009 1:30 PM
 

jamiet said:

telcogod,

Yes, it definitely can. And it gives me a great idea. Watch this space!

-Jamie

September 10, 2009 4:29 PM
 

R Glen Cooper said:

Nice use of CTE.

See also:

http://www.sqlservercentral.com/NewsletterArchive/2009/03/20/2496466

for another approach.

September 13, 2009 5:06 PM
 

Greg said:

Any suggestions for going past the maximum 100 recursion limit?

September 14, 2009 11:14 AM
 

jamiet said:

greg,

Put:

OPTION (MAXRECUSION 0)

on the end of it!

So:

with fk_tables as (

   select    s1.name as from_schema    

   ,        o1.Name as from_table    

   ,        s2.name as to_schema    

   ,        o2.Name as to_table    

   from    sys.foreign_keys fk    

   inner    join sys.objects o1    

   on        fk.parent_object_id = o1.object_id    

   inner    join sys.schemas s1    

   on        o1.schema_id = s1.schema_id    

   inner    join sys.objects o2    

   on        fk.referenced_object_id = o2.object_id    

   inner    join sys.schemas s2    

   on        o2.schema_id = s2.schema_id    

   /*For the purposes of finding dependency hierarchy      

       we're not worried about self-referencing tables*/

   where    not    (    s1.name = s2.name                

           and        o1.name = o2.name)

)

,ordered_tables AS

(        SELECT    s.name as schemaName

       ,        t.name as tableName

       ,        0 AS Level    

       FROM    (    select    *                

                   from    sys.tables                

                   where    name <> 'sysdiagrams') t    

       INNER    JOIN sys.schemas s    

       on        t.schema_id = s.schema_id    

       LEFT    OUTER JOIN fk_tables fk    

       ON        s.name = fk.from_schema    

       AND        t.name = fk.from_table    

       WHERE    fk.from_schema IS NULL

       UNION    ALL

       SELECT    fk.from_schema

       ,        fk.from_table

       ,        ot.Level + 1    

       FROM    fk_tables fk    

       INNER    JOIN ordered_tables ot    

       ON        fk.to_schema = ot.schemaName    

       AND        fk.to_table = ot.tableName

)select    distinct    ot.schemaName

,        ot.tableName

,        ot.Level

from    ordered_tables ot

inner    join (

       select    schemaName

       ,        tableName

       ,        MAX(Level) maxLevel        

       from    ordered_tables        

       group    by schemaName,tableName

       ) mx

on        ot.schemaName = mx.schemaName

and        ot.tableName = mx.tableName

and        mx.maxLevel = ot.Level

OPTION (MAXRECUSION 0);

-Jamie

September 14, 2009 11:29 AM
 

SSIS Junkie said:

In September 2009 I wrote one of my first blog posts on SQLBlog entitled Deriving a list of tables in

September 19, 2010 4:22 PM
 

SSIS Junkie said:

Sometimes I use this blog to share scripts that I know I'm going to need at some point in the future,

September 19, 2011 6:07 AM
 

Gino said:

Checkout this post

http://www.cachelog.net/database-tables-tree-structure/

where I build an example on AdventureWorks to get table relationships

based on foreign keys.

November 1, 2011 3:12 PM
 

Naomi said:

Hi Jamie,

How can I get a list of all dependencies for a particular table?

E.g. I want to see Parent Table Child Table list (with the keys) and down to hierarchy.

Your query seems to solve the different problem.

November 28, 2011 7:37 PM
 

jamiet said:

Hi Naomi,

You would need to change the anchor part of the ordered_tables CTE above. Currently it is this:

       SELECT    s.name as schemaName

       ,        t.name as tableName

       ,        0 AS Level    

       FROM    (    select    *                

                   from    sys.tables                

                   where    name <> 'sysdiagrams') t    

       INNER    JOIN sys.schemas s    

       on        t.schema_id = s.schema_id    

       LEFT    OUTER JOIN fk_tables fk    

       ON        s.name = fk.from_schema    

       AND        t.name = fk.from_table    

       WHERE    fk.from_schema IS NULL

you would need to change it to something like this:

       SELECT    <your_table_schema_name> as schemaName

       ,        <your_table_name> as tableName

       ,        0 AS Level    

Hope that works for you.

November 29, 2011 3:56 AM
 

Naomi said:

Thanks.

November 29, 2011 9:26 PM
 

Evan said:

Thanks a lot for this post.  I was struggling to do something similar and this saved me a ton of time!

April 26, 2012 12:50 PM
 

Sarkunan said:

Hi this is looking gud... thanks for this post...

October 23, 2012 2:47 AM
 

Craig said:

Great stuff thanks, much appreciated!

November 26, 2013 4:30 AM
 

Rajesh Kumar said:

Thanks very much!

June 12, 2014 2:43 AM
 

pmbAustin said:

No matter what I do, I get an error:

The statement terminated. The maximum recursion 2000 has been exhausted before statement completion.

It doesn't matter what I set MAXRECURSION to.  Even 32767.  It just takes longer to die.

And I'm trying  to limit it to just one table (to get all its references).  Still dies.

Yes, there are 800 tables in the database, but only a couple dozen reference this one table.  Ideas?  Anyone?

July 14, 2014 3:56 PM
 

Unit testing – Purging all tables in a SQL database prior to each test | Saurabh's musings on SharePoint, .NET, C#, HTML5/JavaScript, Powershell said:

August 19, 2014 12:32 PM
 

jamiet said:

August 20, 2014 4:50 AM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement