sysdepends, sp_depends and sys.sql_dependencies
Since my very first experience with SQL Server (version
6.5), I have used the system table sysdepends and the
system stored procedure sp_depends to track dependency
information in my databases. The purpose
of tracking dependency information is to allow me to determine which objects
are required; this can prevent me from deploying only parts of a solution, and not
realizing it until runtime. It is also a
quick way to get a visual picture of a database's organization.
A new catalog view was introduced with SQL Server 2005,
called sys.sql_dependencies. Other than the fact that this uses the new
sys schema and has a different column set, I have yet to stumble upon any case
where dependency information is tracked differently in this new view, when
compared to sysdepends.
As you can imagine, based on the volume of activity in newsgroups and other forums involving sysdepends, this has not always been a smooth exercise. Sadly, these system objects are not perfect, due to several
complications; I will first discuss some of these limitations, and then I will
provide a better solution that will take care of a couple of them, at least.
Dropping / Renaming
Dependent Objects
A problem that has been around since the 6.5 days (see KB #115333) is that when you create a table, then create a stored procedure that references
the table, and then drop the table; the dependency information remains in sysdepends, but points to an object_id
that no longer exists. The table may
later get re-created, but it will almost certainly have a different associated object_id. Furthermore,
in older versions of SQL Server, the old reference could remain orphaned in sysdepends permanently; thankfully, this has been fixed in modern
versions. But even in SQL Server 2008, because
dependency information is not refreshed during DDL events such as CREATE TABLE,
a new reference is not created unless you force it to happen – by dropping and
re-creating the procedure, or calling sys.sp_refreshsqlmodule.
This problem can also affect the ability of the system to
track dependency on columns. Consider
the case where you write a stored procedure that references columns 'a' and 'b'
of table 't', then you drop column 'a' and rename column 'b' to 'c'. Some of the mess is left in sysdepends and sys.sql_dependencies. Thankfully, the system will not let you create
a procedure that references a column that does not exist (unless the table also
does not exist), but once the stored procedure is created, there is nothing
stopping you from changing the column (such as renaming it) or dropping it from
the table altogether. In this case there
is nothing in sysdepends or sys.sql_dependencies
that indicates the stored procedure is referencing a column that no longer
exists… the user is left to find that out at runtime.
Cross-Database and
Cross-Server Dependencies
The system table sysdepends does
not maintain dependency information for any objects located outside of the
local database. This means that if you
want to check the validity of objects with three- or four-part names in SQL
Server 2005 or earlier, you are going to have to manually parse the definitions
of all of your objects. Native tools are
never going to report the validity of cross-database or cross-server
dependencies, but in SQL Server 2008, you will be able to get a report of the
dependencies that are expected – without any nasty parsing.
Deferred Name
Resolution
In SQL Server 7.0, Microsoft started a practice that many of
us regret today: deferred name resolution.
This allows you to create an object (such as a stored procedure) that
references another object (such as a view or table) that does not yet
exist. The assumption here is that you
will create the procedure now, but won't actually call it until later, after
the view or table has been created. The
problem is that, by default, the dependency information is not updated when the
view or table is created; so, even though your structure is intact, the
dependency checks cannot be made (so, the table could later be dropped, and
nothing would be checked). Deferred name
resolution also allows you to create objects with circular references; for
example, procedure A calls procedure B, B calls C, and C might conditionally call
A.
Deferred name resolution does not allow you to reference an
object at a linked server, or reference an invalid column for an object that
exists on the current instance. But it
does allow you to reference an object in any database on the current instance
that does not yet exist. (To understand
better why it works this way, you can read more about deferred name resolution
here: http://msdn.microsoft.com/en-us/library/ms190686(SQL.90).aspx
(SQL 2005) and here: http://msdn.microsoft.com/en-us/library/ms190686(SQL.100).aspx
(SQL 2008).)
One of the greatest adversaries of deferred name resolution
is fellow SQL Server MVP, Erland Sommarskog. He calls for a much more strict set of
checks, similar to OPTION EXPLICIT in Visual Basic, which would prevent
deferred name resolution all. You can
read about this in his article at the following URL: http://www.sommarskog.se/strict_checks.html
and a Connect item at the following URL:
http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=260762
To correct dependencies for a table that has been created
after objects that reference it, you can manually call the system stored
procedure sys.sp_refreshsqlmodule for each
referencing object. The trick here still
is to determine which objects might reference it; this part, at least, is
solved to some extent in SQL Server 2008, and I will talk about this further on.
Dynamic SQL
Another problem is when you have a stored procedure that
refers to an object in dynamic SQL. The
system does not parse the contents of dynamic SQL that is executed via EXEC()
or sp_executeSQL, even if it is passed or created in
its entirety in a variable; so you can have a stored procedure like this, and
as of SQL Server 2008, it will never be caught by system dependency checking:
CREATE PROCEDURE dbo.GetFooBar AS BEGIN SET NOCOUNT
ON;
EXEC('SELECT bar FROM dbo.foo'); END GO |
In this case, you are pretty much on your own to ensure that
you reference valid objects. Nothing
native in SQL Server will inspect this code for you, and surely there are ways
you can write dynamic SQL that will prevent your own parsing routines or third
party applications from getting the information correctly and reliably (for
example, in many cases, dynamic SQL is used when you are passing in table or
database names, so they could only be parsed at runtime anyway).
What do people do now
to find dependencies not reported by sysdepends?
Manual Parsing
In addition to the information that they get from sysdepends, parsing is also often the route of choice for
local dependency tracking, however this can be unreliable as well. For example, let's assume I are looking for references to '%dbo.table_name%',
and I search the object_definition of all of your
objects for this text. A few of cases
where I will get incorrect results:
1)
dynamic SQL that interrupts the name, e.g.
EXEC('SELECT * FROM dbo.table' + '_name');
2)
dynamic SQL that uses passed object names, e.g.
EXEC('SELECT * FROM ' + @table);
3)
poor but accepted
format, e.g. SELECT * FROM dbo . table_name;
and,
4)
false positives, e.g.
SELECT * FROM dbo.table_name_2.
So, parsing the definitions of all of your SQL modules will
get you a little bit closer, but is still only a small part of the picture. If you want to see a very thorough T-SQL
parsing routine, that not only finds many missing dependencies, but also
generates a script for the objects in correct dependency order, see this article
in SQL Server Magazine by Omri Bahat:
A
Dependency Checker You Can Depend On
http://www.sqlmag.com/Article/ArticleID/95428/sql_server_95428.html
Third Party Tools
If you are not a do-it-yourself kind of DBA, there are
several tools available that will do a slightly better job than SQL Server will
do natively. Ironically, though, some of
the 3rd party tools designed to compensate for SQL Server's
weaknesses get much of their information from these same tables. The two that seem to receive the best praise
are made by ApexSQL and RedGate.
ApexSQL Doc
http://www.apexsql.com/sql_tools_doc.asp
Red-Gate
SQL Dependency Tracker
http://www.red-gate.com/products/SQL_Dependency_Tracker/index.htm
New capabilities in
SQL Server 2008
In SQL Server 2008, there is a new catalog view, called sys.sql_expression_dependencies. Before I begin, I will tell you first that,
while this view is better at solving a few of the problems outlined earlier, it
is not the answer to dependency problems.
For example, it will store the name of a table referenced in a stored
procedure, even if that table does not exist yet. And when the table is created, this view will
be updated with the correct referenced_object_id. However, sysdepends
is still out of date; for this, you still need to manually run sp_refreshsqlmodule against the stored procedure(s) that
reference the table.
Note: when using SQL Server 2008's new feature in SQL Server Management Studio,
IntelliSense, to write stored procedures against a SQL Server 2008 database,
unintentional deferred name resolution is going to be less of a problem. This is because invalid objects will be
highlighted with a red squiggly line:

Another great feature of sys.sql_expression_dependencies
that is an improvement over sysdepends and sys.sql_dependencies is that it will also store the names
of references to three- or four-part names, so you can now get a list of
dependency objects to check, without having to parse the contents of each
module. And while deferred name
resolution will allow a procedure to be created even though it references a
non-existent object in a different database on the same instance, it will not
allow this for an object on a linked server; the only problems you will find
with objects found on a linked server are that the server is down or otherwise
unreachable, or the referenced object has been dropped, moved or renamed since
the dependency information was recorded.
There are also two new dynamic management functions, called sys.dm_sql_referenced_entities and sys.dm_sql_referencing_entities,
which will parse referenced or referencing objects for dependencies on demand,
instead of looking them up statically in sysdepends. There are problems with this too, though; in
the case of deferred name resolution, the views generate errors if you
reference a column explicitly in a table that does not yet exist. Here is an example:
USE tempdb; GO CREATE TABLE dbo.foo ( bar INT ); GO CREATE PROCEDURE dbo.GetFooBar AS BEGIN SET NOCOUNT
ON;
SELECT bar FROM dbo.foo; END GO SELECT * FROM sys.dm_sql_referenced_entities ( 'dbo.GetFooBar', 'OBJECT' ); GO EXEC sp_rename 'dbo.foo.bar', 'blat', 'COLUMN'; GO SELECT * FROM sys.dm_sql_referenced_entities ( 'dbo.GetFooBar', 'OBJECT' ); GO DROP PROCEDURE dbo.GetFooBar; GO DROP TABLE dbo.foo; GO
|
The result of the second call to the function:
Msg 207, Level 16, State 1, Procedure GetFooBar, Line 6
Invalid column name 'bar'.
Msg 2020, Level 16, State 1, Line 1
The dependencies reported for entity "dbo.GetFooBar" do not include references to columns. This is either because the entity references an object that does not exist or because of an error in one or more statements in the entity. Before rerunning the query, ensure that there are no errors in the entity and that all objects referenced by the entity exist.
Fellow SQL Server MVPs Simon Sabin and Louis Davidson filed
bugs about this on Connect:
http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=331830
http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=350776
The Next Step
While SQL Server 2008 definitely gives us a better
dependency story, we still do not have a foolproof method. My aim here is to give you a more automated
way to compensate for some of these deficiencies. With that said, I am going to restrict the
discussion to tracking object dependencies (not column dependencies) within a
single SQL Server instance. With a
little bit of code, I will help reduce the amount of time you spend tracking
down this dependency information.
The main problem with the missing dependency information is
that, in order to restore the information in sysdepends
(which various tools rely on), you need to manually call sys.sp_refreshsqlmodule. But why does that need to be the case? As I was researching the new dependency
objects earlier in the SQL Server 2008 beta cycle, I thought to myself,
"There has to be a better way."
It did not take long for me to come up with one; in fact, it took longer
to write the first prototype.
My idea here is to create a DDL trigger that refreshes any
SQL modules for which rows are found in sql_expression_dependencies
and are without a match in sysdepends. The goal is to
do as much as we can to help the system repair itself when a DDL event leaves
an orphaned reference, and when it cannot fix itself, it can at least tell us
it uncovered a problem ... either through a custom log, raising a SQL Server
error, sending an e-mail, etc. With a
custom logging solution you can prevent the same error from notifying you too
often.
This will catch pairs of actions that have been completed,
such as:
-
drop a table and re-create it;
-
alter a procedure to point to a new name or a
new schema, and then change the object; or,
-
create a stored
procedure that references an invalid object, and then create the object.
However, if you have something that is still incomplete, for
example if you change a module to point to an invalid object and have not yet
created that object, or if a referenced object in another database has since
been dropped, the only thing the DDL trigger can do is write a log entry and/or
alert you, since the dependency information cannot be corrected without further
action.
Why search for all objects?
You could write the trigger in such a way that it limits the search to
the object that caused the DDL trigger to fire, but if you perform a global
search, you can also be sure to catch changes that happened while the DDL
trigger was disabled (which is often performed during deployments, either
manually or by schema comparison / synchronization tools).
You could do something similar in SQL Server 2005, using a
DDL trigger, and blindly execute sp_refreshsqlmodule
for every object in the entire database.
But, this would not be very efficient, and would miss something that DDL
triggers can capture in SQL Server 2008: sp_rename. So, in SQL Server 2008, we can find the
missing dependencies using sys.sql_expression_dependencies,
and we can also capture renames, making the solution better targeted and a
little more complete.
The following DDL trigger and stored procedure will refresh any sql module that is found with a valid reference in sys.sql_dependencies, but without a match in sysdepends, whenever any of the defined DDL events occurred. It will
also log or alert you if it finds any references that do not exist, whether
they are in the current database or in another database on the server. I have moved much of the logic to a stored
procedure, so you can call it on demand as well; you should also change the logic to log to a table or send an e-mail in relevant cases, because this demonstration assumes that the one who is performing the DDL actions (you) is also the one who wants to know about any dependency problems those DDL actions might be leaving behind.
CREATE PROCEDURE dbo.RefreshSysDepends AS BEGIN SET NOCOUNT
ON;
DECLARE @referencing_id INT, @ref_db_name NVARCHAR(255), @object_name NVARCHAR(771), @referenced_id INT, @sql NVARCHAR(600);
DECLARE c CURSOR LOCAL FORWARD_ONLY FOR SELECT ed.referencing_id, ed.referenced_database_name, COALESCE(ed.referenced_database_name
+ '.', '') + COALESCE(ed.referenced_schema_name +
'.', '') + ed.referenced_entity_name, ed.referenced_id
FROM sys.sql_expression_dependencies
ed LEFT OUTER JOIN sys.sysdepends sd ON ed.referencing_id = sd.id AND ed.referenced_id = sd.depid WHERE sd.id IS NULL AND ed.referenced_server_name IS
NULL;
OPEN c;
FETCH NEXT
FROM c INTO
@referencing_id, @ref_db_name, @object_name,
@referenced_id;
WHILE (@@FETCH_STATUS = 0) BEGIN IF @referenced_id IS NULL OR @ref_db_name IS NOT NULL BEGIN IF @referenced_id IS NULL AND @ref_db_name IS NULL AND OBJECT_ID(@object_name) IS NULL SELECT OBJECT_SCHEMA_NAME(@referencing_id) + '.' + OBJECT_NAME(@referencing_id) +
' references ' +
@object_name + ' ('
+ RTRIM(COALESCE(@referenced_id, 0)) + '), which
doesn''t exist.';
IF @ref_db_name IS NOT NULL AND OBJECT_ID(@object_name) IS NULL SELECT OBJECT_SCHEMA_NAME(@referencing_id) + '.' + OBJECT_NAME(@referencing_id) +
' might have a cross db dependency problem with '
+ @object_name +
'.'; END ELSE BEGIN SET @sql = 'EXEC
sys.sp_refreshsqlmodule N''' + QUOTENAME(OBJECT_SCHEMA_NAME(@referencing_id)) + '.' + QUOTENAME(OBJECT_NAME(@referencing_id)) + '''';
EXEC sp_executesql @sql;
-- you can also log the
calls you make here -- by storing @sql in a
table along with -- event time, user
information etc.
SELECT 'Executed SQL = ' +
@sql; END FETCH NEXT FROM c INTO @referencing_id,
@ref_db_name, @object_name, @referenced_id; END DEALLOCATE c; END GO
CREATE TRIGGER DDL_Catcher ON
DATABASE FOR CREATE_TABLE, ALTER_TABLE, DROP_TABLE, CREATE_PROCEDURE,
ALTER_PROCEDURE, DROP_PROCEDURE, CREATE_FUNCTION, ALTER_FUNCTION, DROP_FUNCTION, CREATE_TRIGGER, ALTER_TRIGGER, DROP_TRIGGER, CREATE_VIEW, ALTER_VIEW, DROP_VIEW, ALTER_SCHEMA, RENAME /* RENAME is new in SQL Server
2008. If you try this in SQL Server 2005, you
will get:
Msg 1084, Level 15, State 1,
Procedure DDL_Catcher, Line 8 'RENAME' is an invalid event type. */ AS BEGIN SET NOCOUNT
ON; -- you can log this somewhere for
later review, or you -- can just comment it out:
SELECT N'Fired
for ', EVENTDATA().value ( '(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'NVARCHAR(MAX)' );
EXEC dbo.RefreshSysDepends; END GO
|
You can set up a different monitor to validate the entries
in sys.sql_expression_dependencies that reference
another server, and in addition, you can check which objects in another
database are depended upon but which may have been dropped since the last time
the local DDL trigger fired. If the
local database is relatively stable, changes in the other database might have
occurred which break your local dependencies, but there is not an easy way to
do this, unless you want to create a server-wide trigger, or have certain
databases use DDL triggers that check on each others' dependency views.
Having said that, I guess it could just as easily be a
scheduled job, but again, I guess it depends on how often you want it to run,
compared to how often you expect changes to be made to the local database, and
compared still to how likely you expect it to find anything that needs to be
fixed. If you make DDL changes once a
month, it doesn't make much sense to run this every five minutes.
Limitations
As I said before, the solution is far from perfect, and is only meant to get you a step closer to the "holy grail" of dependency tracking.
Note that we still cannot do anything about dynamic SQL,
since sys.sql_expression_dependencies does not
capture information that it doesn't parse correctly. This is why you may still need to rely on
parsing and proper coding practices. If
you can't rely on these, you may be down to grunt work.
Another case that is not easy to handle just yet is the case
of the dropped column. Fellow SQL Server
MVP Alex Kuznetsov filed a bug about this in the
following feedback item, but the issue was incorrectly closed as fixed:
http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=352577
And finally, a case that I am not treating here is #temp
table dependencies, which can cause problems for dependency tracking as
well. Erland
has some advice in this thread
about how to deal with this issue, but you are probably not going to be too
eager to try to implement it…