This post is prompted by feedback I received on some comments I made recently on Twitter about the regular use of DBCC SHRINKDATABASE against SQL Server databases. I had explained multiple times at work that running SHRINKDATABASE against a production database is not a recommended practice, and that it should not be performed as a daily/weekly/monthly task if the database will eventually reuse the space reclaimed by the shrink. On my third attempt to make this point clear, out of frustration I tweeted about the fact that I was trying to explain the subject to someone. What resulted was a conversation about when SHRINKDATABASE might actually be beneficial, as well as what leads people to think they should shrink a database as a regular part of maintenance.
More than a few times I have answered forum posts about SHRINKDATABASE and why it is bad for production databases in SQL Server. However, only recently did I actually consider where the idea that shrinking a database would improve performance comes from. It would seem to me that this mentality comes from people who previously used Microsoft Access, or who are solely out to save disk space on the server wherever possible. Microsoft hasn't helped by fostering the mentality that SQL Server is essentially Microsoft Access's big brother when it comes to storing data in a database. The Upsizing Wizard makes it easy to hook an Access application to a SQL Server data store, which can easily lead someone unfamiliar with the semantics behind SQL Server to think that it is simply an extension of Access and that the same rules apply.
Microsoft KB Article 288631 – Defragment and compact database to improve performance in Microsoft Access – explains how regularly compacting a database in Microsoft Access can lead to better performance. However, a couple of key statements in this KB article detail how a compact in Access functions very differently from how SHRINKDATABASE works in SQL Server, and they provide clues as to why performance would improve in Access but not in SQL Server.
This utility makes a copy of the database file and, if it is fragmented, rearranges how the database file is stored on disk.
Compacting a database in Access actually creates a new database file, then creates the tables and inserts the data into them in logical order, producing an optimized database with logically ordered data. Microsoft Access can accommodate this because there is a cap on the size of a database. For SQL Server, which is designed for much larger databases, this kind of operation would be cost prohibitive.
If a primary key exists in the table, compacting restores table records into their primary key order.
In Access, performing a compact reorders the table data in primary key order, making read-ahead scans of the data more efficient. This is not the case for SQL Server, which is designed to work with databases that are much larger and more scalable. When a shrink occurs inside SQL Server, data pages are moved from their current position into free space at the beginning of the data file. Logical ordering of the data is not a factor in the shrink algorithms, and the result is high fragmentation of the indexes and data in the database.
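You can see this effect for yourself by measuring index fragmentation before and after a shrink. The following is a minimal sketch; the SalesArchive database name is a placeholder for one of your own, and the 10 passed to DBCC SHRINKDATABASE is just an example target of 10 percent free space left in the files.

```sql
-- Placeholder database name; substitute your own.
USE SalesArchive;
GO

-- Check logical fragmentation of every index before the shrink.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id
 AND i.index_id  = ips.index_id
WHERE ips.index_id > 0;
GO

-- Shrink the database, targeting 10 percent free space.
DBCC SHRINKDATABASE (SalesArchive, 10);
GO

-- Re-run the fragmentation query above; avg_fragmentation_in_percent
-- will typically be far higher after the shrink moves pages around.
```

Running the fragmentation query a second time after the shrink is what makes the damage visible.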
Compacting also updates the table statistics within the database that are used as Jet optimizes queries.
Access updates the statistics for a table when it compacts a database, but SQL Server does not do so during a shrink. Access can accommodate this optimization since it is already reading all of the data in each table during a compact in order to insert it into the new database file in logical order. SQL Server is strictly moving data to another location in the data file, and it doesn't update statistics during the move because it is simply relocating pages to free up space.
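If you do need statistics refreshed after data movement, that is a separate, explicit step in SQL Server. A minimal sketch, where the table name is a placeholder of my own invention:

```sql
-- Placeholder table name; substitute your own.
-- Rebuild the statistics on one table by reading every row.
UPDATE STATISTICS dbo.SalesOrderHistory WITH FULLSCAN;
GO

-- Or refresh sampled statistics database-wide in one pass.
EXEC sp_updatestats;
GO
```

Note that rebuilding an index also rebuilds the statistics on that index with the equivalent of a full scan, which is one more reason a reindex normally follows a one-time shrink.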
A logical question that follows this comparison is "Why doesn't SQL Server work like Access?" It is actually the supported database sizes that make this question moot. Access has a 2GB database size limit; in the realm of SQL Server, this is child's play. A single LOB column in SQL Server 2005/2008 has a limit of 2GB, and that column can exist in multiple rows of a SQL Server database. It is not uncommon for SQL Server databases to be multiple terabytes in size. For SQL Server to perform a shrink synonymous with a Microsoft Access compact, the server would need double the space available, since an Access compact creates a new database: one of the requirements of an Access compact operation is free space equal to the resulting size of the database, in which to create the new database file.
Does this mean that SHRINKDATABASE is always evil?
Absolutely not! There are specific scenarios under which shrinking a database in SQL Server actually makes sense. For example, consider an archive database that was created to hold Sales Order History data. Seven years later, the database is two terabytes in size and growing constantly, and the business determines that only the last three years of data are important for operations, so a weekly purge should be performed to hold the database at a consistent size. This means the initial purge process will trim four years' worth of Sales History data from the database, leaving significant free space in the database files. In this scenario, it makes sense to perform a one-time shrink of the database to release that space back for reuse by other databases on the server.
When this initial shrink occurs, the end result will be that the indexes inside the database are heavily fragmented. To correct this, the database should immediately be reindexed, which will itself leave free space in the database files, but that is not a problem: having the indexes in logical order is more important than reclaiming that space in this scenario. The amount of free space will be dictated by a number of factors, such as the FillFactor specified for the indexes in the database as well as the size of those indexes. It is safe to expect the free space to be at least equal to the size of the largest index in the database. This isn't necessarily a problem, because subsequent additions of data will reuse the space.
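Put together, the one-time shrink and the follow-up reindex might look like the sketch below. The SalesArchive database and SalesOrderHistory table names are placeholders, and the 10 percent free-space target is just an example; in a real database you would rebuild the indexes on each large table, not just one.

```sql
-- One-time shrink after the initial purge, targeting
-- 10 percent free space left in the files.
DBCC SHRINKDATABASE (SalesArchive, 10);
GO

USE SalesArchive;
GO

-- Immediately rebuild the indexes the shrink fragmented.
-- Placeholder table name; repeat for each large table.
ALTER INDEX ALL ON dbo.SalesOrderHistory REBUILD;
GO
```

The rebuild is the step that restores logical ordering, at the cost of growing the files back out by roughly the size of the largest index being rebuilt.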
While situations exist where shrinking a database in SQL Server makes sense, performing the operation in a SQL Agent job daily, weekly, or monthly simply doesn't. It is not uncommon to find Database Maintenance Plans in which a full reindex of a database is followed by a shrink of that same database. In this case, the shrink simply undoes the optimizations performed by the reindex, re-fragmenting the very indexes the rebuild just defragmented.