
Linchi Shea

Checking out SQL Server via empirical data points

SQL Server 2008 Page Compression: Performance impact on inserts

Before I begin, let me be very clear that the results I report in this post were obtained with SQL Server 2008 CTP6. Microsoft is still working on improving the performance of data compression, and in particular, the performance of inserting into a compressed table is expected to be significantly improved in SQL Server 2008 RTM.

In my previous post, I showed that data compression can result in significant savings in storage space. In this post, I want to highlight the fact that not everything comes for free. Let's take a look at inserting into a B-tree table, i.e. a table with a clustered index. What is the performance impact of page compression on inserts?
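As a quick aside on the space-savings side: if your build includes it, sp_estimate_data_compression_savings gives a rough estimate of what page compression would save on a given table before you rebuild it. A minimal sketch, using the same customer2 table as the test script below (assuming it lives in the dbo schema):

-- rough estimate of page-compression savings for dbo.customer2 (not part of the timing test)
exec sp_estimate_data_compression_savings
    @schema_name = 'dbo',
    @object_name = 'customer2',
    @index_id = NULL,            -- all indexes
    @partition_number = NULL,    -- all partitions
    @data_compression = 'PAGE';
go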

Note that you must use the TABLOCK hint for pages to receive page-level compression when a heap table is being populated with the INSERT statement. Heap tables are therefore much less interesting, and I'll focus only on B-tree tables.
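For completeness, here is a minimal sketch of what the heap case would look like; customer_heap is a hypothetical heap copy of the customer table and is not part of the test below:

-- enable page compression on the heap (customer_heap is hypothetical)
alter table customer_heap rebuild with (data_compression=page);
go
-- the TABLOCK hint is what allows newly allocated heap pages to be page-compressed;
-- without it, the inserted rows do not receive page-level compression
insert customer_heap with (tablock) select * from customer;
go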

In my tests, I ran the following script:

truncate table customer2;
-- enable page compression
alter table customer2 rebuild with (data_compression=page);
go
declare @dt datetime;
select @dt = getdate();
insert customer2 select * from customer;
select 'insert into customer2 with compression', datediff(second, @dt, getdate());
go
truncate table customer2;
-- disable compression
alter table customer2 rebuild with (data_compression=none);
go
declare @dt datetime;
select @dt = getdate();
insert customer2 select * from customer;
select 'insert into customer2 without compression', datediff(second, @dt, getdate());
go
 

The script measures the difference in elapsed time between inserting 3,000,000 rows into (1) an empty table with page compression enabled, and (2) an empty table with no data compression. The script was run multiple times to ensure that the recorded elapsed times were consistent. The following chart (attached as data_inserts.gif) shows the results:

Clearly, the impact of page compression on inserts was rather significant in these tests. Without data compression, inserting 3,000,000 rows took ~80 seconds, whereas it took 217 seconds, roughly 2.7 times as long, when page compression was enabled on the target B-tree table. No free lunch!
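For anyone who wants to line these timings up with the corresponding space numbers, a simple way to check the target table's size and its current compression setting between runs (not part of the timing script above) is:

-- size of the target table after the insert
exec sp_spaceused 'customer2';
go
-- confirm whether the rebuild left the table page-compressed or uncompressed
select index_id, partition_number, data_compression_desc, rows
from sys.partitions
where object_id = object_id('customer2');
go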

Again, these numbers were obtained with SQL Server 2008 CTP6, a work-in-progress build. I'll re-run the test once SQL Server 2008 RTM is released.

Now that I have shown that there can be a performance penalty in using data compression, next time I'll show that data compression can also help improve performance.

Published Monday, May 12, 2008 11:49 PM by Linchi Shea

Attachment(s): data_inserts.gif


Comments

 

david wei said:

Can you add the following two lines before each insert, to force a cold start?

dbcc dropcleanbuffers

dbcc freeproccache

May 13, 2008 4:28 PM
 

Linchi Shea said:

As mentioned in the blog, the script was run multiple times, which ensured that all the data pages from the customer table (the source table) were cached. DBCC DROPCLEANBUFFERS was not included on purpose. The script was intended to measure only the impact on the writes. Like any other engineering test, I wanted to isolate the specific aspect of what I was testing, though I understand that in real-world scenarios you would have many factors mixed in, and therefore the difference may not be as pronounced.

May 13, 2008 9:36 PM
 

jchang said:

The right way to run this test is with the source data in memory. By the way, what is the size of the table, normal and compressed?

Did you verify that no checkpoint occurred during the test?

Anyway, 3M rows in 80 sec is 37,500 rows/sec, which is pretty good.

Assuming this is the favorite 2.3GHz processor and one proc was pegged during the test, and excluding CPU from the lazy writer and the log writer, that's about 61K CPU cycles per row (2.3 billion cycles/sec divided by 37,500 rows/sec).

The basic cost of compression should not be that high relative to this per-row insert cost. LiteSpeed could do something like 120MB per CPU-sec, if I recall correctly.

So the additional 137 CPU-sec (if one CPU was pegged) is rather high.

I wonder if the SQL compression engine is decompressing and recompressing the entire page for each row insert, rather than taking a block of rows and compressing them.

May 14, 2008 9:15 PM
 

SQLDiva said:

Please call/email me regarding the 5/20 NJSQL meeting.

Melissa[AT]SQLDiva.com

May 15, 2008 5:00 AM
 

Linchi Shea said:

My previous post shows that data compression may not come for free, although hopefully by the RTM time

May 16, 2008 9:38 PM

