THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus

PowerPivot FILTER condition optimizations

In the comments of a recent post from Alberto Ferrari there was an interesting note about different performance related to the order of conditions in a FILTER call. I investigated about that and Jeffrey Wang has been so nice to give me some info about actual implementation that I can share on a blog post.

First of all, an important disclaimer: PowerPivot is intended to make life easier, not requiring the user to think how to write the order of elements in a formula just to get better performance. To achieve this goal, following releases of PowerPivot could have a smarter engine that will not require the author to worry too much about how he writes a DAX expression!

Suppose you have a table Sales with three columns: Product, Price and Quantity.

These two FILTER conditions should be equivalent:

FILTER( Product, Product[Price] > 10 && Product[Quantity] > 5 )

FILTER( Product, Product[Quantity] > 5 && Product[Price] > 10 )

In fact, from the point of view of the final result, they are equivalent. However, in the current version of PowerPivot there is a short-circuit evaluation (in reality something like that), which might favor performance if the most selective condition is put upfront in case of an AND condition.

The reason for that relies in the columnar behavior of the Vertipaq engine. This condition:

FILTER( T, p1 && p2 )

is internally rewritten (if possible) as

FILTER( FILTER( T, p1 ), p2 )

You might wonder whether the difference can be measured, because Vertipaq is so fast. Well, as I said, only certain conditions can be optimized in this way and are those that can be solved by the Vertipaq engine without evaluating a more complex DAX expression. Moreover, a sort of “bulk evaluation mode” is used, without going to evaluate the predicate conditions row by row.

Now, consider a more complex statement:

COUNTROWS( FILTER( Products, Products[Name] = "Name"
                             && Products[Stock] > CALCULATE( AVERAGEX( Fact, Fact[Quantity] / 5 ) ) ) )

In this case, reverting the statement to:

COUNTROWS( FILTER( Products, Products[Stock] > CALCULATE( AVERAGEX( Fact, Fact[Quantity] / 5 ) ) 
                             && Products[Name] = "Name" ) )

produces a slower query, which requires much more memory to be computed and also more CPU time.

The final consideration is that in the current version of PowerPivot it is better to put the most selective condition upfront in a filter predicate. Future versions of PowerPivot might be able to automatically reorder these conditions.

Published Monday, February 07, 2011 4:10 PM by Marco Russo (SQLBI)
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

No Comments

Leave a Comment

(required) 
(required) 
Submit

About Marco Russo (SQLBI)

Marco Russo is a consultant, writer and trainer specialized in Business Intelligence with Microsoft technologies. He runs the SQLBI.COM website, which is dedicated to distribute resources useful for BI developers, like Integration Services components, Analysis Services models, tools, technical information and so on. Marco is certified as MCT, MCDBA, MCSD.NET, MCSA, MCSE+I.

This Blog

Syndication

Archives

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement