My previous post shows that data compression may not come for free, although hopefully by the RTM time the adverse performance impact on inserts will have been reduced significantly. In this post, I want to show you that data compression can enhance read performance. Intuitively, data compression may reduce the number of pages SQL Server must read into memory from disk, and the savings in terms of the number of pages can be significant, and therefore there may be a positive impact on reads.
To continue our tradition of letting data points do the talking, I ran the following simple script with the customer table when (1) it was page compressed and when (2) it was not page compressed. This was the same customer table I described in an earlier post on page compression and multiple processors.
dbcc dropcleanbuffers;
declare @dt datetime;
select @dt = getdate();
select avg(c_balance) from customer;
select datediff(second, @dt, getdate());
go
Note that there was no index on the c_balance column, and I verified that the actual execution plan was a table scan (well a clustered index scan). DBCC DROPCLEANBUFFER was included to ensure pages were read from disk, not already cached. The following chart summarizes the results.
Wow! The elapsed time difference between the compressed data and uncompressed data was huge. It took 52 seconds to scan the table when it's not compressed, but it took only 3 seconds when the table was compressed. Note that when uncompressed, the table had 2264 pages, and when compressed it had 534 pages.
Two more points to note: First, I did not try to explain why the improvement in table scan performance seemed to be disproportional to the reduction in the table size as the result of page compression. But I did run these tests many times to make sure I was not going to give you incorrect data. Second, I would caution you not to get too hung up on these specific numbers. As they often say, your mileage may vary. Instead, focus on the big picture impression that page compression may give your reporting queries a performance boost. In a real app, life is not so simple, and you would have to strike a balance between better reporting queries and potentially longer batch/data feed processing time when you consider page compression.