THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Microsoft OLAP by Mosha Pasumansky

Counting Days in MDX

Time dimension is special in OLAP. Many MDX functions usually only make sense when applied to Time dimension (PrevMember, Lag, ParallelPeriod, PeriodsToDate, ClosingPeriod etc); semiadditive measures work differently with Time etc. Today, however, we will talk about much simpler subject - counting number of days in the currently selected period. There are all kinds of uses for this metric, for example computing averages over time (this is usually interesting in inventory applications to get average level of inventory). For this article we will use Adventure Works cube and compute average of [Internet Sales Amount] over [Ship Date] dimension. We could say that really for computing averages over time, one should use AverageOfChildren semiadditive measure. This is true with two caveats. First, semantics of AverageOfChildren semiadditive aggregation with respect to treating NULLs is the same as with Avg function - i.e. it will not count days which had no sales, and if we wanted to count such days, then AverageOfChildren won't work. Secondly AverageOfChildren is available only in Enterprise Edition. Lastly knowing number of days (or other time periods) in the currently selected time is useful in other calculations as well, we use average because it is simple enough, yet illustrative for our goals.

The simplest and most straightforward way to calculate the number of days in current date would be

Count(Descendants([Ship Date].[Calendar], [Ship Date].[Calendar].[Date_]))

Now using this calculation we can also calculate other metrics such as Sales per Day etc. Let's see it work in the following query

WITH 
 MEMBER Measures.[Number Of Days] AS Count(Descendants([Ship Date].[Calendar], [Ship Date].[Calendar].[Date_]))
 MEMBER Measures.SalesPerDay  AS [Measures].[Internet Sales Amount] / Measures.[Number Of Days]
SELECT
{[Measures].[Internet Sales Amount], Measures.[Number Of Days], Measures.SalesPerDay} ON 0
, DrillDownMember(
   DrillDownMember(
    DrillDownMember(
     DrillDownMember([Ship Date].[Calendar].[All Periods], [Ship Date].[Calendar].[All Periods])
     ,[Ship Date].[Calendar].[Calendar Year].&[2003]) 
    ,[Ship Date].[Calendar].[Calendar Semester].&[2003]&[1])  
   ,[Ship Date].[Calendar].[Calendar Quarter].&[2003]&[2])
  ON 1
FROM [Adventure Works]

This query shows sales, number of days and sales per day for the report where the user drilled down to year 2003, first semester, second quarter to show months of April, May and June. Works great. But the problems start if the user wants to see SalesPerDay metric not for the entire Q2 and not for individual months, but for combination of April and May. The operation widely known as multiselect.

WITH 
 MEMBER Measures.[Number Of Days] AS Count(Descendants([Ship Date].[Calendar], [Ship Date].[Calendar].[Date_]))
 MEMBER Measures.SalesPerDay  AS [Measures].[Internet Sales Amount] / Measures.[Number Of Days]
SELECT
{[Measures].[Internet Sales Amount], Measures.[Number Of Days], Measures.SalesPerDay} ON 0
FROM [Adventure Works]
WHERE {[Ship Date].[Calendar].[Month].&[2003]&[4],[Ship Date].[Calendar].[Month].&[2003]&[5]}

Both number of days and sales per day show familiar error 'Query (2, 45) The MDX function DESCENDANTS failed because the coordinate for the 'Month Name' attribute contains a set'. The piece 'Query (2, 45)' tells us that the problem is in the query at line 2, column 45. That's in the definition of the [Number of Days] calculated member during the call to the Descendants function (we knew that already from the error text, just in case if there were more than one Descendats function there, we know which one failed). I have written about this particular problem before in the article "Writing multiselect friendly MDX calculations". Using the techniques described in that article we could rewrite the definition of the calculated member to replace Descendants function with EXISTING operator. The result will look like following:

WITH 
 MEMBER Measures.[Number Of Days] AS Count(EXISTING [Ship Date].[Calendar].[Date_])
 MEMBER Measures.SalesPerDay  AS [Measures].[Internet Sales Amount] / Measures.[Number Of Days]
SELECT
{[Measures].[Internet Sales Amount], Measures.[Number Of Days], Measures.SalesPerDay} ON 0
FROM [Adventure Works]
WHERE {[Ship Date].[Calendar].[Month].&[2003]&[4],[Ship Date].[Calendar].[Month].&[2003]&[5]}

Now it doesn't fail and returns correct result for both number of days and sales per day. So are we happy ? No, we are not happy, because not all of the problems are solved. And I am not even talking now about performance hit that we take by replacing Descendants with EXISTING. Given the fact that the Time dimension is usually very small (i.e. 10 years will require no more than 3660 days, which is tiny as far as Analysis Services cares), the performance difference is not going to be very significant. I am talking about how basic scenarios which require either visual totals or custom grouping are going to be broken in more fundamental way than multiselect.

Let's go back to our first query where we drilled down to 2003, Quarter 3. Now, let's say we want to hide month of June as if it didn't exist, and see how the results for Q3, and the entire year 2003 change. This is so called visual totals mode, and Excel always turned it on by default. Other tools such as OWC also allow it. There are many different ways how visual totals can be implemented - either by using 'Default Visual Mode' connection string property, or by using VisualTotals MDX function or by using subselects or CREATE SUBCUBE statement - it doesn't matter. At the end of the day all of them achieve the same thing. Let's see how our query would look like in this scenario:

WITH 
 MEMBER Measures.[Number Of Days] AS Count(Descendants([Ship Date].[Calendar], [Ship Date].[Calendar].[Date_]))
 MEMBER Measures.SalesPerDay  AS [Measures].[Internet Sales Amount] / Measures.[Number Of Days]
SELECT
{[Measures].[Internet Sales Amount], Measures.[Number Of Days], Measures.SalesPerDay} ON 0
,  VisualTotals(
    Except(
     DrillDownMember(
      DrillDownMember(
       DrillDownMember(
        DrillDownMember([Ship Date].[Calendar].[All Periods], [Ship Date].[Calendar].[All Periods])
       ,[Ship Date].[Calendar].[Calendar Year].&[2003]) 
      ,[Ship Date].[Calendar].[Calendar Semester].&[2003]&[1])
     ,[Ship Date].[Calendar].[Calendar Quarter].&[2003]&[2])
    ,[Ship Date].[Calendar].[Month].&[2003]&[6])
   )
   ON 1
FROM [Adventure Works]

Here we hide June by using Except, and then wrap everything in the VisualTotals call to get visual totals. The results are pretty bad. While the physical measure [Internet Sales Amount] works absolutely correctly in the presence of visual totals and shows only sales of April and May, the calculated measures completely ignore it. I.e. the [Number of Days] still shows 91 even though it should be 61. What's worse, the SalesPerDay metric ends up completely fubared as a result of it. After all it is division of [Internet Sales Amount] by [Number of Days]. So we divide something which took visual totals into account by something that didn't. The result is neither donkey nor orange. The number simply doesn't make any sense. At least with multiselect we got an error - so the user didn't get the result he wanted, but he also didn't get wrong result either. He got an error, so he knew something was wrong. Here user gets wrong result without any warning. How does he know it is wrong ? And changing the expression to use EXISTING isn't going help - the result will stay the same. There is a fundamental problem here. With multiselect, the current coordinate reflects it, so MDX expressions can detect presence of multiselect. But visual totals are stealth. It is impossible to detect them directly from MDX, one can only watch their side effects. Generally speaking it is a hard problem, but some people think about solution and you can help !

Luckily, in this particular case there is something that can be done. There is a solution for counting number of days in the current time period. And unlike with some other problems, this solution is actually very elegant. And it also solves some other problems. And it has best performance. And it doesn't require any MDX ! This may sound like magic, but it is true. The solution is actually quite simple, and it was discovered by many people independently. In particular, Vladimir Chtepa is known for popularizing it in Internet forums. The trick is to create new measure group, which will have only one dimension - Time, and a measure with Aggregation Type 'Count' bound to the Day attribute of Time dimension (or any other attribute if we need to count something different from days). The most important setting on this new measure group is to leave IgnoreUnrelatedDimensions=true - this will allow to use this measure across any other dimension in the cube. Let's call this measure [Number Of Days] and see how our queries perform with it.

Both

WITH 
 MEMBER Measures.SalesPerDay  AS [Measures].[Internet Sales Amount] / Measures.[Number Of Days]
SELECT
{[Measures].[Internet Sales Amount], Measures.[Number Of Days], Measures.SalesPerDay} ON 0
FROM [Adventure Works]
WHERE {[Ship Date].[Calendar].[Month].&[2003]&[4],[Ship Date].[Calendar].[Month].&[2003]&[5]}

And

WITH 
 MEMBER Measures.SalesPerDay  AS [Measures].[Internet Sales Amount] / Measures.[Number Of Days]
SELECT
{[Measures].[Internet Sales Amount], Measures.[Number Of Days], Measures.SalesPerDay} ON 0
,  VisualTotals(
    Except(
	  DrillDownMember(
		DrillDownMember(
			DrillDownMember(
			  DrillDownMember([Ship Date].[Calendar].[All Periods], [Ship Date].[Calendar].[All Periods])
			  , [Ship Date].[Calendar].[Calendar Year].&[2003]) 
			, [Ship Date].[Calendar].[Calendar Semester].&[2003]&[1])
		, [Ship Date].[Calendar].[Calendar Quarter].&[2003]&[2])
	  , [Ship Date].[Calendar].[Month].&[2003]&[6])
	)
   ON 1
FROM [Adventure Works]

now work fine and return correct results. But there are more advantages to this approach

* If we need to compute not just number of days, but something more sophisticated, like number of non-holidays - this can be easily done. Just add another measure into the measure group, with AggregationType Sum, and put the value '1' for the days which are considered to be non-holidays and NULL for the days which are holidays.

* It works great with multiple Time dimensions and Role playing dimensions. Indeed, in Adventure Works we have 3 different role playing Time dimensions. By using all of them in the measure group, we will get single measure [Number Of Days] to correctly show the number of days regardless by which Time dimension user slices. With calculated measures approach, different calculated measures need to be created per Time dimension.

* Performance is the best there can be, because now [Number Of Days] is a real measure which is precomputed during processing.

* In addition to multiselect and visual totals, other scenarios such as custom grouping and sliced local cubes - all work correctly.

 

Published Sunday, May 27, 2007 6:15 PM by mosha
Filed under:
Anonymous comments are disabled
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement