THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Kalen Delaney

Did you know? Cloning is Legal (and Available in SQL Server 2005)

 

I first heard about the possibility of database cloning way back in version 7.0 and thought it sounded like a great idea. Who wouldn't love to have a bunch of identical little databases running around, of smaller size and easier manageability?

Cloning a database means creating a statistics only copy of it, so that you can examine execution plans for poorly performing queries in a test environment that takes hardly any space at all.  A clone database has all the metadata for all your objects, including function and stored procedure definitions. It has all the statistics for all the objects including the histograms and density information returned by DBCC SHOW_STATISTICS. Looking at space usage information shows you only the pages used, but not the pages allocated but unused.

So how can you get a little clone of your own? The original plan was to have a toolbar option to just click and presto, there would be your clone. That never happened. Later, one of the optimizer engineers at Microsoft created a stored procedure, called something like sp_clone_db that took a database name as a parameter, and created a new database with all the metadata and statistics, but none of the data, of the original. That procedure was never included with any version of the product.

Instead, what we have in SQL Server 2005 is the ability to script a database, and include the statistics and histogram information. 

In SQL Server Management Studio's Object Explorer, right-click your database name and choose Tasks, and then Generate Scripts....

Choose the name of the database to script, and select the box to "Script all objects in the selected database".  Next you'll see a "Choose Script Options" dialog with several dozen options to choose from, mostly indicating whether your want to include certain properties or objects in your database. Most of these you can leave at the default, or deselect the ones you're not interested in. I usually select the option to create the database, so it will include the CREATE DATABASE statement and any necessary filegroups. If you don't do this, and your database has objects created on filegroups other than the primary, your script will need a lot more manual editing.

There are two options that you must change in this "Choose Script Options" dialog. In the General section is the one called "Script Statistics". When you click in the list of options on the right side, you'll see three different options. The default is "Do not script statistics". Choose the option to "Script statistics and histograms." When you make this selection, you'll get a warning like the following:

If you want a clone, you'll have to use this option, so don't let this warning scare you.

In the Table/View Options section (you'll have to scroll down) change "Script Indexes" to True.

Click Ok, and after you click Next to leave this "Choose Script Options" dialog, the "Output Options" dialog opens.   (If you forget to check the box for scripting all objects in the database in the very first dialog, you'll get about 10 different dialogs asking you to choose which objects you want to copy, before you get to the "Output Options" dialog. Don't say I didn't warn you.)

I usually just choose to copy to a New Query Window and click Next. I then get a chance to confirm my choices, and then I click Finish. The copy process begins.

WARNING: If you notice in the confirmation list that you forgot something, or want to go back and make any changes, the option to Script Indexes will be set back to false. You'll need to make sure to reset it to True before going forward again.

Once I get the script in the new window, I need to make sure I change all references to the original database name to a new name. (If you are creating the clone on a different server that doesn't have the original database, then you won't have to do this step.) Make sure you check the filenames and make sure they don't duplicate existing names.  Execute when ready.

Or, you might want to spend a few minutes examining the script. You'll see UPDATE STATISTICS statements like you've never seen before, looking something like the following. (My clone was created on the AdventureWorks database):

UPDATE STATISTICS [Sales].[CreditCard]([AK_CreditCard_CardNumber]) WITH STATS_STREAM = 0x0100000002000144 <huge big long hex value> 000000000000510D0000000000,  ROWCOUNT = 19118, PAGECOUNT = 93

So the UPDATE STATISTICS statements force a particular histogram into the statistics, as well as the page count and row count.

Once you've created the new database, you should be able run queries to look at sizing information.

EXEC sp_spaceused 'Sales.SalesOrderDetail'

returns the following:

 

name             rows   reserved data    index_size unused
---------------- ------ -------- ------- ---------- ------
SalesOrderDetail 121317 0 KB     9880 KB 0 KB       0 KB

However,

SELECT count(*) FROM Sales.SalesOrderDetail

returns 0.

You can look at execution plans for queries involving any of the cloned tables and pre-existing indexes. You can create new indexes, but be careful. Any new indexes will have statistics created based on the real data in the clone database, which is 0 rows. So the statistics on your new indexes will not be very accurate.

Of course, your clone database cannot be used for timing tests, or troubleshooting concurrency issues, since there is no data to lock, but for examining how plans change as you rewrite queries, it's a great idea.

Have fun!

~Kalen

Published Wednesday, November 21, 2007 11:50 AM by Kalen Delaney

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

david wei said:

Thanks a lot, Kalen.

That's my long waited post.

I know I can use UPDATE STATISTICS with ROWCOUNTand PAGECOUNT to make a fake statistics to let optimizer thinks it is dealing with a large table.

So we can test query plan without loading huge data.

But I could not find an easy way to change the density and histogram.

I even think to write code or application to export the statblog from live then import into dev.

Your post makes statistics copy much much easier.

Very appreciate.

November 21, 2007 4:33 PM
 

RichB said:

Cheers, thats a good thing to know!

November 26, 2007 10:46 AM
 

Daniel van der Meulen said:

From the profiler I learned that DBCC SHOW_STATISTICS is used with the undocumented option STATS_STREAM to get the statistics.

This way you can script the statistics yourself...

November 26, 2007 11:26 AM
 

Charles Evans said:

Is there a way to copy excrypted stored procedures?

January 4, 2008 7:22 PM
 

James said:

Excellent

September 30, 2009 1:54 AM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Favorite Non-technical Sites or Blogs

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement