THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Jamie Thomson

This is the blog of Jamie Thomson, a freelance data mangler in London

Introducing sp_CascadingDataViewer

In September 2009 I wrote one of my first blog posts on SQLBlog entitled Deriving a list of tables in dependency order. In it I provided a SQL statement that would return a list of all tables in a database in what I called dependency order i.e. by adhering to declarative referential integrity (foreign keys to you and me). Here is a portion of one such result after executing it against [AdventureWorks] sample database:

AdventureWorks tables dependency hierarchy

Very useful methinks and it got a fairly good reception in the comments. One particular comment (from telcogod) really caught my attention though, he/she said:

can this be used to create a wipeout script?
on a much bigger scale, but similar to delete all orderdetails and orders where customerid = 1

That really got me thinking about what could be achieved from this simple premise of deriving a table dependency order and what could then be done with it. I realised that what would be really useful, rather than discovering all the tables dependent on a particular table, would be discovering all the data in the database dependent on a particular row in that table. The idea germinated and sp_CascadingDataViewer was born!

Eleven months later (yes, really) and I have finally gotten round to putting something out in the wild that does what I just described. In the rest of this blog post I will go into detail about what sp_CascadingDataViewer is, what it does and how it does it but before that you may want to take a look at this short video that explains exactly what sp_CascadingDataViewer is for (if the video is not showing up for whatever reason then go ahead and watch it at http://vimeo.com/14072554).

This first (beta) release of sp_CascadingDataViewer is available at http://cascadingdataviewer.codeplex.com/releases/view/50441. Go download it now!

Version 1.1(beta) of sp_CascadingDataViewer is now available at http://cascadingdataviewer.codeplex.com/releases/view/52586. This release fixes a bug concerning columns of type 'timestamp', read more at sp_CascadingDataViewer version 1.1 (beta) released!


All of the following notes exist at http://cascadingdataviewer.codeplex.com/. I will actively maintain that resource so please refer there for updates.

Introduction

sp_CascadingDataViewer has one basic premise: it tells you how many rows in your database are dependant on a row (or set of rows) in a table. Or to put it another way, how many rows would you have to delete from your database in order to be able to delete a given row (or set of rows). All you need to do is specify.
  • A database
    • A schema in that database
      • A table in that schema
        • A set of columns in that table
        • A set of values, one for each specified column respectively
and sp_CascadingDataViewer will tell you all the rows dependant on the specified rows and optionally show you all of those dependant rows as well.

A simple example

Take the following database diagram taken from the AdventureWorksDW2008 sample database. It shows [AdventureWorksDW2008].[dbo].[DimProductSubCategory] and all the tables that are dependant on that table as determined by declared referential integrity (i.e. foreign key constraints) :
AdventureWorksDW2008.dbo.DimProductSubCategory_DependantTables.PNG

Here is some sample code to tell us how many rows are dependant on the Sub-Category called "Mountain Bikes":
--The most basic use of *sp_CascadingDataViewer*, return the number of rows dependant on a single row in a given table!
DECLARE	@vNumberOfRows	INT;
EXEC	[master].[dbo].[sp_CascadingDataViewer]
			@pDatabaseName	=	'AdventureWorksDW2008'
		,	@pSchemaName	=	'dbo'
		,	@pTableName	=	'DimProductSubCategory'
		,	@pFilterCols	=	'EnglishProductSubCategoryName'
		,	@pFilterValues	=	'Mountain Bikes'	/*    <-In other words, how many rows in our database are dependant on the row 
								         WHERE [dbo].[DimProductSubCategory].[EnglishProductSubCategoryName]='Mountain Bikes'*/
		,	@pNumberOfRows	=	@vNumberOfRows	OUTPUT;
SELECT	@vNumberOfRows AS [NumberOfRows];

and this screenshot shows us the answer:

AdventureWorksDW2008.dbo.DimProductSubCategory_execoutput.png

Show me the data (@pShowData=1)

sp_CascadingDataViewer offers the user the chance to see all of the dependant rows. It has a parameter called @pShowData which must be set to '1' in order to show all that data. If we take the example from above and add that new parameter we get the following SQL statement:
DECLARE	@vNumberOfRows	INT;
EXEC	[master].[dbo].[sp_CascadingDataViewer]
			@pDatabaseName	=	'AdventureWorksDW2008'
		,	@pSchemaName	=	'dbo'
		,	@pTableName	=	'DimProductSubCategory'
		,	@pFilterCols	=	'EnglishProductSubCategoryName'
		,	@pFilterValues	=	'Mountain Bikes'
		,	@pNumberOfRows	=	@vNumberOfRows	OUTPUT
		,	@pShowData	=	1;	--Show me the data!!!
SELECT	@vNumberOfRows AS [NumberOfRows];

The screenshot below shows the results following execution of that code. Notice that we have a resultset from each of the tables shown in the database diagram near the top of this page. Also notice that, for clarity, the resultset contains the name of the table:

AdventureWorksDW2008.dbo.DimProductSubCategory_ShowData.PNG
Also, just to prove that the returned NumberOfRows is correct, the total number of rows returned in all the resultsets to SSMS is 17358 (one more than NumberOfRows=17357 because that final returned resultset that contains NumberOfRows is counted also).

What if I don't want to see everything? (@pDependantLevels)

Often you may not want to know about all rows in the database that are dependant on the specified rows in @pFilterCols/@pFilterValues and with this in mind sp_CascadingDataViewer provides a parameter @pDependantLevels that limits the algorithm to only considering tables that are a given number of levels away from the specified table.
This may be better explained by looking at an example. Consider the diagram that I showed in the first example at the top of this page:
AdventureWorksDW2008.dbo.DimProductSubCategory_DependantTables.PNG

If we did not want to consider rows in [FactInternetSalesReason] then we can set @pDependantLevels=2. This means that [DimProduct] will be considered (because it is one level away from the starting table [DimProductSubCategory]) as will [FactInternetSales] & [FactResellerSales] (because they are two levels away) but [FactInternetSalesReason] will not (because it is three levels away).
Here's the code:
DECLARE	@vNumberOfRows	INT;
EXEC	[master].[dbo].[sp_CascadingDataViewer]
			@pDatabaseName		=	'AdventureWorksDW2008'
		,	@pSchemaName		=	'dbo'
		,	@pTableName		=	'DimProductSubCategory'
		,	@pFilterCols		=	'EnglishProductSubCategoryName'
		,	@pFilterValues		=	'Mountain Bikes'
		,	@pNumberOfRows		=	@vNumberOfRows	OUTPUT
		,	@pShowData		=	1
		,	@pDependantLevels	=	2;      --Note the new parameter @pDependantLevels
SELECT	@vNumberOfRows AS [NumberOfRows];  

and the result:

AdventureWorksDW2008.dbo.DimProductSubCategory_DependantLevels.png
As we would expect, reducing the number of tables we consider also reduces the actual result (i.e. the number of dependant rows).

Show distribution of all values in the specified column(s)

Knowing how many rows are dependant on a single filter condition in a single table is useful but it becomes much more useful still when compared to all other values in the specified columns. For example, I know from the basic example above that there are 17537 rows dependant on 'Mountain Bikes' in [AdventureworksDW2008].[dbo].[DimProductSubCategory].[EnglishProductSubCategoryName] but I don't know whether that is high or low compared to all other values in that field.
This realisation gave rise to another stored procedure sp_CascadingDataViewerTableWrapper. This second stored procedure calls sp_CascadingDataViewer on every distinct value defined by @vFilterCols.
Observe. We tell sp_CascadingDataViewerTableWrapper which column to work on. It calls sp_CascadingDataViewer on every distinct value for the columns listed in @pFilterCols and returns the results in an ordered resultset.
Here's the code:
exec	[dbo].[sp_CascadingDataViewerTableWrapper]
		@pDatabaseName	= 'AdventureWorksDW2008'
,		@pTableName	= 'DimProductSubCategory'
,		@pSchemaName	= 'dbo'
,		@pFilterCols	= 'EnglishProductSubCategoryName';

We're calling sp_CascadingDataViewerTableWrapper on the same table &column that we used in previous examples, the difference in this case is that we don't need to specify @pFilterValues because sp_CascadingDataViewerTableWrapper will call sp_CascadingDataViewer for every possible value. Here are the results:

AdventureWorksDW2008.dbo.DimProductSubCategory_TableWrapper.png

This is much much more useful than calling sp_CascadingDataViewer in isolation. We can now discover exactly which of our SubCategories in [DimProductSubCategory] has the most dependant rows and that is very useful information to know. We have at-a-glance information about which are our most popular categories and, of even more value, the difference in popularity too! This is BI over our raw data!
Note that sp_CascadingDataViewerTableWrapper supports @pDependantLevels as you would expect and also has an additional parameter, @pAscending, that defines whether or not the results are returned in ascending or descending order.

Putting it all together

Executing SQL statements and such like is all very useful but this information really starts to become value when it gets visualised and for that I provide a SQL Server Reporting Services report (a .rdl file). The report provides parameters allowing the user to specify:
  • Database
  • schema.table
  • column(s)
The report passes those values to sp_CascadingDataViewerTableWrapper and visualises the returned resultset in a bar graph. The screenshot below shows the chart displaying the same values that we saw in our call to sp_CascadingDataViewerTableWrapper:
AdventureWorksDW2008.dbo.DimProductSubCategory_Chart.png
Instantly we can see which subcategories have the most data against them and how much in comparison to all of the others.  

Behaviour, capabilities and limitations

It is important to understand exactly what sp_CascadingDataViewer is doing all of the covers and that is why I provide this section that provides that information. I must draw your attention to the Behaviour to be aware of section!

Capabilities

sp_CascadingDataViewer can
  • Return the number of rows that are dependant on a given row (or set of rows) as defined by a filter condition (i.e. Show me all dependants WHERE colA = <some-value>)
  • Display all of the dependant data as well as returning the total number of dependant rows. (simply set @pShowData=1)
  • Allow the user to specify multiple columns in the filter column (i.e. Show me all dependants WHERE colA = <some-value> AND colB = <another-value>)
  • handle self-referencing tables
  • handle a row having more than one reason to be a dependant row (read more)
  • display the dynamic SQL that is being generated under the covers (simply set @pPrintSQL=1)
  • Display the contents of temporary tables & table variables that get created during execution. This is mainly for the benefit of us developer(s) but if you want to see what sp_CascadingDataViewer is doing under the covers then this may prove useful. (simply set @pDebug=1)

Limitations

sp_CascadingDataViewer can/will not
  • (currently) handle specification of columns in the @pFilterCols parameter that are not of type int, tinyint, smallint, bigint, varchar, nvarchar, char or nchar. Any dependant tables that have columns of any other types are fine - they will not be affected.
  • work on any version of SQL Server prior to SQL Server 2008
  • cope with any values in @pFilterValues that contain commas. This is due to the ability to select multiple columns in @pFilterCols and hence multiple values in @pFilterValues- the values in @pFilterValues are comma-delimited and hence any commas that exist within the values themselves will result in an error: There must be the same count of values in @pFilterCols as there are columns in @pFilterValues. Note that sp_CascadingDataViewerTableWrapper handles this by not calling sp_CascadingDataViewer on any such values and instead returning NULLs in its resultset.

Behaviour to be aware of

sp_CascadingDataViewer
  • The final number of dependant rows includes any rows in the specified table that match the filter criteria specified by @pFilterCols & @pFilterValues. In other words, sp_CascadingDataViewer will never return 0 unless the filter criteria specifies some non-existent rows.
  • constructs tables in the database that you specify then removes them at the end of processing Please be aware of this one sp_CascadingDataViewer will not attempt to create any tables if it does not have permission to and in this case it will fail gracefully, however just be aware that ordinarily it will be creating some tables.
  • I stated in Capabilities that sp_CascadingDataViewer handles the situation where a row in a dependant table has more than one reason to be a dependant row. There is a caveat to this, if the dependant table does not have a PRIMARY KEY or UNIQUE KEY defined then sp_CascadingDataViewer will not be able to determine if a row is already a dependant row or not. This makes sense if you think about it but nonetheless, be aware.
  • sp_CascadingDataViewer does not pass a return value. Any information (including the number of dependant rows) is passed out either by OUTPUT parameters or resultset.

Input Validation

sp_CascadingDataViewer carries out the following checks on the input values prior to execution:
  • Check input values are non-zero length
  • Check specified database exists
  • Check the execution security context has the ability to create tables
  • Check specified table exists
  • Check number of specified columns equals number of specified values
  • Check specified columns exist in the specified table
  • Check datatype(s) of column(s) are of type int, tinyint, smallint, bigint, varchar, nvarchar, char or nchar
  • Check the same column has not been specified twice

Wrap Up!

If you made it this far, well done. I would love to know what people think about this and whether or not it is useful to you. This is of course only version 1 and I have lots of ideas about where I’d like to take this in the future including:

  • Giving you the ability to delete all the data that it finds (which is what telcogod originally suggested)
  • Optionally storing the result from each execution of sp_CascadingDataViewer so that the growth of data can be trended over time

Let me know what you think of those ideas too!

Published Wednesday, August 11, 2010 11:59 PM by jamiet

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

SSIS Junkie said:

In yesterday's blog post Introducing sp_CascadingDataViewer I mentioned that one of the artifacts up

August 12, 2010 10:09 AM
 

Matt Farley said:

At first glance you had me confused with:

http://spservices.codeplex.com/wikipage?title=%24%28%29.SPServices.SPCascadeDropdowns

Which I use all the time :)

August 14, 2010 11:34 AM
 

Calvin Jones said:

I get the following error if a table contains a timestamp column.

Cannot insert an explicit value into a timestamp column. Use INSERT with a column list to exclude the timestamp column, or insert a DEFAULT into the timestamp column.

August 26, 2010 8:37 AM
 

jamiet said:

calvin,

ouch! thanks for letting me know. i'll get into it ASAP.

August 26, 2010 8:42 AM
 

RadaurDude said:

Hi Jamie,

This is a very useful sql code. I would appreciate if there are little more comments in the code.

But amazing work!!!

Thanks!

August 31, 2010 9:45 AM
 

jamiet said:

RadaurDude,

Thanks for the kind words, really appreciated.

What sort of comments would you like? What isn't clear?

cheers

Jamie

August 31, 2010 11:53 AM
 

Darren Gosbell said:

Cool proc Jamie. I would suggest that you see if SSRS chart would work as a bar chart. I would expect that this would make the values easier to read and would potentially let it scale over multiple pages.

September 6, 2010 8:21 AM
 

jamiet said:

Darren,

Nice idea dude. I think I'll do that (as and when I get around to it).

September 14, 2010 5:35 PM
 

SSIS Junkie said:

Its now been a month since I introduced sp_CascadingDataViewer and I wanted to take the opportunity to

September 18, 2010 12:41 PM
 

SSIS Junkie said:

Just a short note here to inform anyone that is interested that I have just made available v1.1 of sp_CascadingDataViewer

September 19, 2010 4:15 PM
 

SSIS Junkie said:

Sometimes I use this blog to share scripts that I know I'm going to need at some point in the future,

September 19, 2011 6:07 AM
 

Jamie Thomson : Deriving a list of tables in dependency order said:

August 20, 2014 4:44 AM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement