
Allen White

Use the Best Tool for the Task

Last February I wrote an article for Simple-Talk called Let PowerShell do an Inventory of your Servers. This script was useful to me then, and it continues to be so, but it recently ran into a glitch.

Here is the code in the script to gather data on each file used by a database:

$fgs = $db.FileGroups
# Touching each filegroup's Files collection is what triggers SMO's queries
$dbfiles = foreach ($fg in $fgs) { $fg.Files }
$outnm = ".\" + $svr + "\" + $instnm + $dbtype + "_" + $dbname + "_DataFiles.csv"
# A calculated property carries the database name into the output; export once so the CSV isn't overwritten per filegroup
$dbfiles | select @{n='Database';e={$db.Name}}, Name, FileName, Size, UsedSpace | export-csv -path $outnm -noType

It's nice and succinct, and as "set-based" as it can be, given that it has to iterate through the filegroups. The problem is that under the hood, SMO sends a set of queries to SQL Server that creates a temp table, runs DBCC SHOWFILESTATS, and then joins that temp table with the sys.filegroups and sys.master_files catalog views to return the results in the Files collection.

Worse, the underlying query includes a WHERE clause that filters the results down to a single file, so that entire batch is run FOR EACH FILE.
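You don't need Profiler to see this, by the way: SMO can capture the T-SQL it generates as it runs. Here's a minimal sketch, assuming the SMO assemblies are already loaded the way the inventory script loads them; the server name MyServer and database name MyDB are placeholders:

$svr = new-object Microsoft.SqlServer.Management.Smo.Server "MyServer"  # placeholder instance name
# Capture every batch SMO generates while still executing it
$svr.ConnectionContext.SqlExecutionModes = [Microsoft.SqlServer.Management.Common.SqlExecutionModes]::ExecuteAndCaptureSql
$db = $svr.Databases["MyDB"]  # placeholder database name
foreach ($fg in $db.FileGroups) { $null = $fg.Files.Count }  # touching Files populates the collection
$svr.ConnectionContext.CapturedSql.Text  # dump the captured batches

In the captured text you should see the temp table/DBCC SHOWFILESTATS batch repeated, with its per-file WHERE clause, once for every file.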

When this command runs, the Reads counter in Profiler (on my system) averages around 78,000, CPU runs between 300 and 500 ms, and Duration between 700 and 900 ms, and this happens once for each file. At a client site, one database has over 10,000 files spread across almost 400 filegroups. I created a similarly configured database on my test system, and the above code took about 2 hours to run. I replaced it with the following code:

# Build the results DataSet and a connection to the target instance
$df = new-object "System.Data.DataSet" "dsConfigData"
$cn = new-object system.data.SqlClient.SqlConnection("Data Source=$inst;Integrated Security=SSPI;Initial Catalog=$dbname")
$q = "create table #tmpspc"
$q = $q + "    ("
$q = $q + "      Fileid int,"
$q = $q + "      FileGroup int,"
$q = $q + "      TotalExtents int,"
$q = $q + "      UsedExtents int,"
$q = $q + "      Name sysname,"
$q = $q + "      FileName nchar(520)"
$q = $q + "    ) "
$q = $q + "insert #tmpspc EXEC ('dbcc showfilestats')`n"
$q = $q + " SELECT  s.name AS [Name],"
$q = $q + "        s.file_id AS [ID],"
$q = $q + "        s.physical_name AS [FileName],"
$q = $q + "        s.size * CONVERT(float, 8) AS [Size],"
$q = $q + "        CASE when s.max_size = -1 then -1"
$q = $q + "             else s.max_size * CONVERT(float, 8)"
$q = $q + "        END AS [MaxSize],"
$q = $q + "        CAST(CASE s.is_percent_growth"
$q = $q + "               WHEN 1 THEN s.growth"
$q = $q + "               ELSE s.growth * 8"
$q = $q + "             END AS float) AS [Growth],"
$q = $q + "        s.is_media_read_only AS [IsReadOnlyMedia],"
$q = $q + "        s.is_read_only AS [IsReadOnly],"
$q = $q + "        CAST(case s.state"
$q = $q + "               when 6 then 1"
$q = $q + "               else 0"
$q = $q + "             end AS bit) AS [IsOffline],"
$q = $q + "        s.is_sparse AS [IsSparse],"
$q = $q + "        CAST(CASE when s.growth = 0 THEN 99"
$q = $q + "                  ELSE s.is_percent_growth"
$q = $q + "             END AS int) AS [GrowthType],"
$q = $q + "        CAST(CASE s.file_id"
$q = $q + "               WHEN 1 THEN 1"
$q = $q + "               ELSE 0"
$q = $q + "             END AS bit) AS [IsPrimaryFile],"
$q = $q + "        CAST(tspc.UsedExtents * convert(float, 64) AS float) AS [UsedSpace],"
$q = $q + "        ( tspc.TotalExtents - tspc.UsedExtents ) * convert(float, 64) AS [AvailableSpace]"
$q = $q + " FROM    sys.filegroups AS g"
$q = $q + "        INNER JOIN sys.master_files AS s ON ( s.type = 0"
$q = $q + "                                              and s.database_id = db_id()"
$q = $q + "                                              and ( s.drop_lsn IS NULL )"
$q = $q + "                                            )"
$q = $q + "                                            AND ( s.data_space_id = g.data_space_id )"
$q = $q + "        LEFT OUTER JOIN #tmpspc tspc ON tspc.Fileid = s.file_id`n"
$q = $q + " drop table #tmpspc"

# Run the whole batch in one round trip and pull the results back
$daf = new-object "System.Data.SqlClient.SqlDataAdapter" ($q, $cn)
[void]$daf.Fill($df)   # Fill opens and closes the connection itself; [void] discards the row count
$dtFiles = $df.Tables[0]   # the SELECT's result set
$outnm = ".\" + $svr + "\" + $instnm + $dbtype + "_" + $dbname + "_DataFiles.csv"
$dtFiles | select @{n='Database';e={$db.Name}}, Name, FileName, Size, UsedSpace | export-csv -path $outnm -noType

Once this code was in place, the process that originally took 2 hours finished in 9 seconds.
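If you want to time a comparison like this yourself, PowerShell's Measure-Command cmdlet gives you elapsed time without any Profiler setup. A quick sketch; the empty script blocks are stand-ins for the two versions of the collection code above:

$smoTime = Measure-Command { }  # put the SMO FileGroups/Files loop in this script block
$adoTime = Measure-Command { }  # put the SqlDataAdapter version in this one
"SMO: {0:n0} ms  ADO.NET: {1:n0} ms" -f $smoTime.TotalMilliseconds, $adoTime.TotalMilliseconds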

I've filed a Connect item for this here.

Just remember: there's never just one tool for the task. Understand when it's better to use T-SQL than PowerShell, or vice versa.

Allen

Published Wednesday, September 02, 2009 9:22 AM by AllenMWhite


Comments

Jack Corbett said:

Nice update Allen. I haven't delved into PS yet, but I really appreciate it when someone like you picks up a "better" way to do something and shares it.

September 2, 2009 9:26 AM

Chad Miller said:

Good tip; I voted on your Connect item. One suggestion: use here-strings for multi-line strings instead of concatenating onto the $q variable. For example, this here-string works the same as the code above:

$q = @"
create table #tmpspc
    (
      Fileid int,
      FileGroup int,
      TotalExtents int,
      UsedExtents int,
      Name sysname,
      FileName nchar(520)
    )
insert #tmpspc EXEC ('dbcc showfilestats')
...
"@

September 2, 2009 9:49 AM

AllenMWhite said:

Thanks, Chad.  I'll try that.

September 2, 2009 9:53 AM

Linchi Shea said:

SMO is a good tool, but I would warn anyone who is interested in scalability to look closely at how SMO interacts with SQL Server. What you found here is just one of many performance issues with SMO. Try using SMO to script out all the objects from a database that has a large number of objects.

September 2, 2009 10:47 AM

Phil Factor said:

Allen

What an interesting post. Actually, it illustrates so well the pitfalls of any abstraction layer. You can fix these individual problems one by one, but as Linchi points out, SMO's shortcomings grab you more painfully the more work it has to do. But it isn't a fault that is particular to SMO. It is just that the object model doesn't quite fit the reality of what is really going on under the covers.

At the same time, I'd have to admit that, in the first instance, I'd use SMO whenever I can, just be wary of the pitfalls, and be ready to program around them. It is still a nice, clean way of scripting admin tasks.

September 4, 2009 6:37 PM


About AllenMWhite

Allen White is a consultant and mentor for Upsearch Technology Services in Northeast Ohio. He has worked as a Database Administrator, Architect and Developer for over 30 years, supporting both the Sybase and Microsoft SQL Server platforms over that period.
