THE SQL Server Blog Spot on the Web


SQLBI - Marco Russo

Please visit the same blog at - comments are now disabled here but you can comment on the new blog (all blog posts available here are available there, too).
You can follow me on Twitter: @marcorus

Cost of Process Defrag in Analysis Services Tabular #ssas #tabular

I recently received a question about the memory required to run a Process Defrag on a Tabular model in Analysis Services. Process Defrag is useful when you frequently run incremental processing of a table, or when some of the values in the dictionary are no longer used in the table, for example because you processed the same partition multiple times and/or removed partitions from the table. Cathy Dumas wrote an interesting blog post about the savings you can obtain by running Process Defrag.
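As a reference, a Process Defrag is issued as an XMLA command against the table, which at this compatibility level is exposed internally as a dimension; the database and dimension IDs below are placeholders that you should replace with your own model's IDs:

```xml
<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Type>ProcessDefrag</Type>
  <Object>
    <DatabaseID>MyTabularDB</DatabaseID>
    <DimensionID>Sales</DimensionID>
  </Object>
</Process>
```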


I did some investigation and I have been assured that data is not completely uncompressed during this process, even though some decoding/encoding happens. As a rough estimate, you need free memory equivalent to the size of the table you are going to defrag (which is already in memory), plus a buffer for transient data structures; the data is handled in compressed form, without requiring larger memory buffers for uncompressed raw data. So, if all the partitions of your table and the column dictionaries require 100 MB of RAM, then you need another 100 MB of free RAM in order to execute a Process Defrag.
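The rule of thumb above is a trivial calculation; note that the 10% transient-buffer allowance in this sketch is my own assumption, since the description only mentions "some small temporary buffer space":

```python
def defrag_free_ram_mb(table_size_mb, transient_overhead=0.10):
    """Rough free-memory estimate for Process Defrag.

    table_size_mb: in-memory size of all partitions plus the column
    dictionaries of the table being defragmented.
    transient_overhead: allowance for transient data structures
    (an assumed figure, not a documented one).
    """
    return round(table_size_mb * (1.0 + transient_overhead))

print(defrag_free_ram_mb(100))  # the 100 MB table from the example above
```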

Here is a more detailed description I received from Akshai Mirchandani:

In addition to the master copies, it requires enough memory for a new dictionary, and the final compressed data + some small temporary buffer space for the transient data structures (no big buffers like normal processing).
It is essentially going to read each column value, insert it into a new dictionary, get a new DataID back from the dictionary insert, and append that DataID to the current segment. It doesn’t need to do VertiPaq (compression) again, and it doesn’t keep the uncompressed data in buffers like the data processing algorithm does.
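The rebuild loop described above can be sketched in Python; here segments are lists of DataIDs and a dictionary maps values to DataIDs (these names and structures are simplifications for illustration, not the actual VertiPaq internals):

```python
def defrag_column(segments, old_dictionary):
    """Rebuild a column dictionary, dropping values no longer referenced.

    segments: list of segments, each a list of old DataIDs.
    old_dictionary: maps old DataID -> value (may contain stale entries).
    Returns the new dictionary (value -> new DataID) and re-encoded segments.
    """
    new_dictionary = {}
    new_segments = []
    for segment in segments:
        new_segment = []
        for old_id in segment:
            value = old_dictionary[old_id]        # read each column value
            if value not in new_dictionary:       # insert into the new dictionary
                new_dictionary[value] = len(new_dictionary)
            new_segment.append(new_dictionary[value])  # append the new DataID
        new_segments.append(new_segment)
    return new_dictionary, new_segments

# A dictionary with a stale entry (DataID 2 is never referenced):
old = {0: "red", 1: "green", 2: "blue"}
new_dict, new_segs = defrag_column([[1, 1, 0]], old)
print(new_dict)  # only the values still in use survive
```

Note that the data never leaves its encoded (DataID) form, which is why no large uncompressed buffers are needed.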

The important point here is that the analysis of the segment data to come up with the best compression strategy no longer needs to be performed – and that’s typically the most expensive step of the compression (the VERTIPAQ_STATE in DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS shows whether this was done).
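You can inspect the VERTIPAQ_STATE column with a DMV query (for example from an MDX query window in SSMS); the projected columns below are part of the documented rowset:

```sql
SELECT DIMENSION_NAME, COLUMN_ID, SEGMENT_NUMBER, VERTIPAQ_STATE
FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS
```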

This is good news if you are concerned about the memory required to perform this operation.

Published Wednesday, March 13, 2013 3:17 PM by Marco Russo (SQLBI)



Stefano said:

Does it affect query performance?

July 1, 2016 6:03 AM

Marco Russo (SQLBI) said:

It could, it depends on the fragmentation you have. My experience is that the performance difference is minimal.

July 1, 2016 6:45 AM

shrikant said:

Hi Marco,

We have a 239 GB Tabular cube for a large retailer. I observed the dictionary size grow from 12 GB to 50 GB in a year's time (I think due to frequent updates to the underlying fact table and incremental processing). After a Process Defrag I gained almost 22 GB (the dictionary size was reduced to 28 GB); however, I lost all relationships between this fact table and all other dimensions. Why would Process Defrag corrupt the database? The process ran for almost 10 hours. Would you recommend dropping, recreating, and reprocessing the entire model? Basically the same concept as DROP and CREATE INDEX on SQL Server instead of ALTER INDEX to defrag.

February 15, 2017 1:28 PM

Marco Russo (SQLBI) said:

It seems like a bug; I suggest you contact Microsoft support for these problems. It shouldn't happen!

March 17, 2017 7:16 AM

About Marco Russo (SQLBI)

Marco Russo is a consultant, writer and trainer specializing in Business Intelligence with Microsoft technologies. He runs the SQLBI.COM website, which is dedicated to distributing resources useful for BI developers, such as Integration Services components, Analysis Services models, tools, and technical information. Marco is certified as MCT, MCDBA, MCSD.NET, MCSA, MCSE+I.
