THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Microsoft OLAP by Mosha Pasumansky

Product Volatility: Optimizing MDX with MDX Studio

In this article we will show how to take typical, everyday MDX and optimize it by simply following advices of MDX Studio Analyzer.

As a starting point, I will take “Product Volatility” example as published by Thomas Ivarson in his blog. It is a perfect example, because Thomas took a real business problem, and implemented it using very clean and straightforward MDX, just like I expect many MDX practitioners do. This MDX can be optimized, but some of these optimizations might not be obvious. Running MDX Studio Analyzer and following its advices takes the guesswork out of equation.

Here is the MDX that computes total number of products, number of products with sales increased from last month and number of products with sales decreased from last month:

WITH 
  MEMBER Measures.TotNumberOfProducts AS 
    Count
    (
      Descendants
      (
        [Product].[Product Categories].CurrentMember
       ,[Product].[Product Categories].[Product]
      )
    ) 
  MEMBER Measures.NumberOfSoldProductsIncreasing AS 
    Count
    (
      Filter
      (
        NonEmpty
        (
          Descendants
          (
            [Product].[Product Categories].CurrentMember
           ,[Product].[Product Categories].[Product]
          )
         ,[Measures].[Internet Sales Amount]
        )
       ,
            (
              [Date].[Calendar].CurrentMember
             ,[Measures].[Internet Sales Amount]
            )
          - 
            (
              [Date].[Calendar].PrevMember
             ,[Measures].[Internet Sales Amount]
            )
        > 0
      )
    ) 
  MEMBER Measures.NumberOfSoldProductsDecreasing AS 
    Count
    (
      Filter
      (
        NonEmpty
        (
          Descendants
          (
            [Product].[Product Categories].CurrentMember
           ,[Product].[Product Categories].[Product]
          )
         ,[Measures].[Internet Sales Amount]
        )
       ,
            (
              [Date].[Calendar].CurrentMember
             ,[Measures].[Internet Sales Amount]
            )
          - 
            (
              [Date].[Calendar].PrevMember
             ,[Measures].[Internet Sales Amount]
            )
        < 0
      )
    ) 
SELECT
  {
    [Measures].[TotNumberOfProducts]
   ,[Measures].[NumberOfSoldProductsIncreasing]
   ,[Measures].[NumberOfSoldProductsDecreasing]
  } ON 0
 ,[Date].[Calendar].[Month] ON 1
FROM [Adventure Works]

Executing this query with hot cache in MDX Studio produces the following perfmon counter statistics:

Time             : 359 ms
Calc covers      : 85
Cells calculated : 7672
Sonar subcubes   : 79
SE queries       : 76
Cache hits       : 76
Cache misses     : 3
Cache inserts    : 3
Cache lookups    : 79

The time looks good, but only because both the cube and the query are small. The number of SE queries should raise a red flag here. With only 114 cells in the result, why there were 76 SE queries ? Obviously, on more serious cubes, this will become a major performance issue. So, let’s run MDX Studio Analyzer and see what it has to say about this query. We get back the following list:

Line Col Message Link
3 6 If you are trying to count number of members in current selection - consider introducing special measure group for this dimension More Info
7 41 MDX function 'CurrentMember' may raise an error or produce non-desired result when user applies multiselect More Info
12 6 Consider rewriting Count(Filter(set, cond)) construct as Sum(set, summator) with summator IIF(cond, 1, NULL) More Info
20 45 MDX function 'CurrentMember' may raise an error or produce non-desired result when user applies multiselect More Info
27 34 MDX function 'CurrentMember' may raise an error or produce non-desired result when user applies multiselect More Info
32 34 MDX function 'PrevMember' may raise an error or produce non-desired result when user applies multiselect More Info
39 6 Consider rewriting Count(Filter(set, cond)) construct as Sum(set, summator) with summator IIF(cond, 1, NULL) More Info
47 45 MDX function 'CurrentMember' may raise an error or produce non-desired result when user applies multiselect More Info
53 14 Same expression was used before at Line 26 Column 14. Consider eliminating common subexpressions for better performance and to take advantage of cache  

Let’s go over this list and fix issue by issue.

Line 3, Col 6: If you are trying to count number of members in current selection - consider introducing special measure group for this dimension

The link points to “Counting days in MDX” blog. While we count here Products and not Days, the same technique applies here. These types of questions are best done not in MDX, but by introducing measure group, which only contains Products dimension, and single measure [Number Of Products] of type Count.

Line 7, Col 41: MDX function 'CurrentMember' may raise an error or produce non-desired result when user applies multiselect

Now, this is no longer a problem because of how we solved the previous warning ! Indeed, if we wanted to see total number of products across Bikes and Accessories categories, we would’ve added the following WHERE clause:

WHERE {[Product].[Category].&[1],[Product].[Category].&[4]}

It fails with the original query, but works fine with Count measure [Number of Products]

Lines 12/39, Col 6: Consider rewriting Count(Filter(set, cond)) construct as Sum(set, summator) with summator IIF(cond, 1, NULL)

This is a common construct, and as shown in the link to the “Optimizing Count(Filter(…)) expressions in MDX”, it performs much better if rewritten to use Sum. But before we do this rewrite, let’s take a look at the next warning:

Line 53, Col 14: Same expression was used before at Line 26 Column 14. Consider eliminating common subexpressions for better performance and to take advantage of cache

If we click on this message, the expression in question is the following:

            (
              [Date].[Calendar].CurrentMember
             ,[Measures].[Internet Sales Amount]
            )
          - 
            (
              [Date].[Calendar].PrevMember
             ,[Measures].[Internet Sales Amount]
            )

Indeed, this expression represents Sales Growth, so we can isolate it into separate calculated member. But before we write the exact expression for the calculated member, let’s look deeper into the original expression. It features Count(Filter(NonEmpty(…))) construct, i.e. only products which had sales in current month are considered for increasing/decreasing calculation. I.e. even if product had sales in the previous month, but not in the current month, it won’t be counted in the number of products with decreasing sales. In order to repeat the same logic, the expression would be

  [Sales Growth] =
    IIF
    (
      IsEmpty([Measures].[Internet Sales Amount])
     ,NULL
     ,
        [Measures].[Internet Sales Amount]
      - 
        (
          [Date].[Calendar].PrevMember
         ,[Measures].[Internet Sales Amount]
        )
    ) 

For more details why IIF(cond, NULL, exp) is good for performance, read “Performance of IIF function in MDX” blog. Now, putting all of together so far, we will get the following query:

WITH 
  MEMBER [Sales Growth] AS 
    IIF
    (
      IsEmpty([Measures].[Internet Sales Amount])
     ,NULL
     ,
        [Measures].[Internet Sales Amount]
      - 
        (
          [Date].[Calendar].PrevMember
         ,[Measures].[Internet Sales Amount]
        )
    ) 
  MEMBER PositiveGrowth AS 
    IIF
    (
      [Measures].[Sales Growth] > 0
     ,1
     ,NULL
    ) 
  MEMBER Measures.NumberOfSoldProductsIncreasing AS 
    Sum
    (
      Descendants
      (
        [Product].[Product Categories].CurrentMember
       ,[Product].[Product Categories].[Product]
      )
     ,[Measures].[PositiveGrowth]
    ) 
  MEMBER NegativeGrowth AS 
    IIF
    (
      [Measures].[Sales Growth] < 0
     ,1
     ,NULL
    ) 
  MEMBER Measures.NumberOfSoldProductsDecreasing AS 
    Sum
    (
      Descendants
      (
        [Product].[Product Categories].CurrentMember
       ,[Product].[Product Categories].[Product]
      )
     ,[Measures].[NegativeGrowth]
    ) 
SELECT
  {
    [Measures].[Number Of Products]
   ,[Measures].[NumberOfSoldProductsIncreasing]
   ,[Measures].[NumberOfSoldProductsDecreasing]
  } ON 0
 ,[Date].[Calendar].[Month] ON 1
FROM [Adventure Works];

When executed in MDX Studio, we get the following perfmon statistics:

Time             : 171 ms
Calc covers      : 15
Cells calculated : 1993
Sonar subcubes   : 6
SE queries       : 3
Cache hits       : 7
Cache misses     : 5
Cache inserts    : 5
Cache lookups    : 12

This is now much better than before, there are only 3 SE queries. All of the performance warnings have been addressed, but we still have warnings about CurrentMember not working well with multiselect. Indeed, when we put multiselect on product categories on Bikes and Accessories, we get errors for number of increasing and decreasing products. Links to “Writing multiselect friendly MDX calculations” and “Multiselect friendly MDX for calculations looking at current coordinate” blogs are helpful to understand what the problem is. Unfortunately, there is no universal solution, and each case needs to be evaluated differently.

In this scenario, we can use apply to Sum(Descendants(…), exp) very similar technique that we applied to Count(Descendants(…)). The idea is not to use Descendants function, which forces us to use CurrentMember, which in turn fails on multiselect, but to put the expression “exp” directly on the Product attribute. In order to do that, we need to create fake measure in the measure group - “Num Products Increased”. It is bound to NULL expression, so this measure will always be NULL, and therefore won’t take any space in storage. But since this measure will have aggregation function Sum, it will properly aggregate when assigned expressions inside MDX Script.

We will use MDX Studio to emulate MDX Script too (MDX Studio has a feature, where it automatically uses cube selected in the dropdown list to resolve current cube used in script statements).

Here is the MDX:

CREATE 
  [Sales Growth] = 
    IIF
    (
      IsEmpty([Measures].[Internet Sales Amount])
     ,NULL
     ,
        [Measures].[Internet Sales Amount]
      - 
        (
          [Date].[Calendar].PrevMember
         ,[Measures].[Internet Sales Amount]
        )
    );
(
  [Product].[Product Categories].[Product]
 ,[Measures].[Num Products Increased]
) = 
  IIF
  (
    [Measures].[Sales Growth] > 0
   ,1
   ,NULL
  );
SELECT
  {
    [Measures].[Number Of Products]
   ,[Measures].[Num Products Increased]
  } ON 0
 ,[Date].[Calendar].[Month] ON 1
FROM [Adventure Works]
WHERE 
  {
    [Product].[Category].&[1]
   ,[Product].[Category].&[4]
  };

It’s performance characteristics are even slightly better than the query before:

Time             : 31 ms
Calc covers      : 6
Cells calculated : 76
Sonar subcubes   : 1
SE queries       : 1
Cache hits       : 2
Cache misses     : 0
Cache inserts    : 0
Cache lookups    : 2

Not to mention the fact that it works just fine with multiselect on products ! This is a great example, where fixing MDX to work properly in multiselect scenarios, has the side effect of improving performance.

We still have warnings about PrevMember with respect to multiselect on the Time dimension. It is more difficult problem, since the definition of “Sales Growth” when there is multiselect on the Time dimension is ambiguous.

Published Tuesday, July 29, 2008 9:38 PM by mosha
Filed under: , ,
Anonymous comments are disabled
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement