THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Microsoft OLAP by Mosha Pasumansky

Time calculations in UDM: Parallel Period

Comparing current data to the data in the same unit (day, month, week) in the previous period (year, quarter, semester) is one of the most fundamental tasks in the OLAP system. In MDX this is usually archived through the ParallelPeriod function. However, the classic approach to ParallelPeriod doesn't work well in UDM where Time dimension has many attributes and hierarchies, and where symmetric hierarchy structure is hard to achieve across Calendar, Fiscal and other hierarchies. Time Intelligence wizard doesn't address all these issues either. In this article we will discuss strategies how to build MDX Script calculations for parallel period to work properly across all attributes and hierarchies.

Most of the existing literature cover use of ParallelPeriod in single hierarchy scenarios, for example

Even Analysis Services own built-in Time Intelligence wizard only deals with single hierarchy at a time. But there is also a very well written whitepaper - A Different Approach to Implementing Time Calculations in SSAS - by David Shroyer, where (among other things) he addresses Year over Year calculation in the Time dimension with multiple hierarchies. I would like to expand this subject further, and both address additional important scenarios (namely when there is partial overall between user hierarchies) which David omits, as well as proposing different approach to writing calculations, which both simplifies the expressions and offers better performance. I want to say upfront, that David had done an excellent job in his paper, and what follows in this article is not a critique, but simply a continuation of his ideas.

First, let's review the scenario and solution proposed by David in his whitepaper. In the scenario, the time dimension has two hierarchies - [Calendar Hierarchy] and [Fiscal Hierarchy]. The special dimension (he calls it "shell dimension") is created - much like the Time Intelligence wizard creates special attribute within Time dimension. (In my opinion it is less confusing to create new attribute within dimension then create new dimension, but it is matter of taste). Then in order to compute the values for the previous year, the following formula is proposed:

CREATE MEMBER CURRENTCUBE.[Time Calculations].[Prior Year] AS
  CASE WHEN [Time].[Fiscal Hierarchy].CurrentMember.level.ordinal=0 THEN
    Aggregate(
      (ParallelPeriod([Time].[Calendar Hierarchy].[Year], 1,[Time].[Calendar Hierarchy].CurrentMember)
      * {[Time Calculations].[Current Period]})
    )
  ELSE
    Aggregate(
      (ParallelPeriod([Time].[Fiscal Hierarchy].[Fiscal Year], 1,[Time].[Fiscal Hierarchy].CurrentMember)
      * {[Time Calculations].[Current Period]})
   )
  END
,
VISIBLE = 1;

Let's first acknowledge, that the use of Aggregate here is redundant, since it goes over single tuple set. Removing Aggregate, and omitting the default parameters to ParallelPeriod, the formula can be simplified down to

CREATE [Time Calculations].[Prior Year] =
  CASE WHEN [Time].[Fiscal Hierarchy].CurrentMember.Level.Ordinal=0 THEN
    (ParallelPeriod([Time].[Calendar Hierarchy].[Year]), [Time Calculations].[Current Period])
  ELSE
    (ParallelPeriod([Time].[Fiscal Hierarchy].[Fiscal Year]),[Time Calculations].[Current Period])
  END

Now let's analyze the idea behind this approach. It tries to check whether user browsed by [Fiscal Hierarchy] or by [Calendar Hierarchy]. This is done by looking at whether the current level in [Fiscal Hierarchy] is [All]. This is done through checks on the level ordinals (see my blog about Comparing Levels in MDX for more details). This approach works in this scenario, but it has some problems:

  • It relies on the hierarchies being disjoint. If Time dimension has hierarchies which share some attributes, then it might not be possible to say by which hierarchy user browsed by checking the current level
  • It decides which formula to apply based on the hierarchy checks, which is always a bad idea in UDM - instead the coordinate checks should be done by attributes
  • It uses expensive runtime-dynamic CASE operator for checks instead of static SCOPE'ing which causes serious performance problems (see the last paragraph here)
  • Assuming that Calendar and Fiscal hierarchies share some attributes - which is almost always a case, for example, in AdventureWorks they share month and date attributes - the formula returns wrong results for the beginning of the year for one of them. The reason is, that if Time dimension starts, say from July 2001 (like in AdventureWorks), then the wrong results will happen for first six month of 2002 in Calendar hierarchy, since there is no 'parallel' month to January 2002 through June 2002. And if Time dimension starts from January - then same problem will happen in the Fiscal hierarchy.

In order to demonstrate these concepts, we need slightly more complex attribute relationship then what stock AdventureWorks provides. We need something which is informally known as "diamond shape" relationship. I.e. when the graph of relationships is a generic direct acyclic graph rather then more simple case of a tree. The picture on the right is a classic example of diamond shape. Both Week and Month are related to Date, and they both are relating to Year. Yet, there is no one to many relationship between them. So I had to create new attribute in the AdventureWorks Date dimension. In order to reproduce my results - you need to do the same. Below is the list of steps that I followed:

  1. In DSV create new calculated column in the Date table
  2. Name it CalendarWeek
  3. Use the following expression:
    'Week ' + Cast(WeekNumberOfYear as char(2)) + ' ' + CalendarYear
  4. Create new attribute in the Date dimension, name it [Calendar Week]
  5. Set the key for this attribute to a compound key: First key column is DimTime.CalendarYear (WChar), and the second key column is DimTime.WeekNumberOfYear (UnsignedTinyInt)
  6. Set the Name to column CalendarWeek created at step 2
  7. Set OrderBy property to Key.
  8. Set AttributeHierarchyDisplayFolder to Calendar
  9. The [Calendar Week] attribute has one attribute relationship defined - to [Calendar Year] attribute
  10. Create new hierarchy [Calendar by Weeks] with [Calendar Year] attribute at first level and [Calendar Week] attribute at the second level
  11. Set DisplayFolder for [Calendar by Weeks] to Calendar

Step 9 is very important here - that's what completes the diamond shape of relationship. Step 10, on the other hand, seem to be optional. However, in the current version of Analysis Services, it is extremely important. Without creating the natural hierarchy which follows the newly created attribute relationship, the engine won't always pick up that relationship, and the results would be wrong. The fact that this hierarchy needs to be created almost sounds as a bug, but that is the way it is currently implemented. Hopefully future versions will remove dependency on it.

Now we are ready to attack the ParallelPeriod calculations. For simplicity sake, we will move all the formulas from the utility dimension to the Measures dimension, so our calculations will compute previous year for Internet Sales Amount only. It should be a simple exercise to rewrite the calculations to properly work with utility dimension later (and we will pick [Ship Date] role playing dimension as a Time dimension). First thing we should realize, is that ParallelPeriod function does all of its computations within hierarchy, so in order to get correct results, we will need to feed different hierarchies to it depending on what attribute we are positioned on. However, as mentioned above, we are not going to use runtime checks with IIF, instead we will simply scope on the appropriate attributes. In order to decide which attributes we are going to scope on, we have to clearly understand the attribute relationships within the dimension. Attribute relationships play a critical role in understanding how coordinates change within calculations. I strongly recommend reading the "Attribute Relationships in Analysis Services" whitepaper by Richard Tkachuk. While it explains the coordinate overwrite rules for the calculation expressions, the very same rules apply everywhere else, including the space defined by the SCOPE statement.

Here is a brief summary of the coordinate overwrite rules:

When attribute coordinate is changed explicitly (by calculation expression, by SCOPE etc) then

  • Coordinates above it are decoded (i.e. granularity is set on such attributes)
  • Coordinates below it are set to All (i.e. granularity is removed from such attributes)
  • Coordinates in unrelated attributes are unchanged (i.e. granularity remains the same on such attributes)

(Sometimes these rules referred to as 'Strong Hierarchy' rules, because above/below directions remind us levels in the imaginary hierarchy).

So in order to understand how one attribute will affect others, let's draw the lattice of attribute relationship for the [Ship Date] dimension in AdventureWorks cube. The picture on the right shows the subset of the relationship lattice (there are more attributes in this dimension, but for the time being we are not interested in them). We are ready now to start writing the appropriate MDX Script. First, we will create new calculated measure [Prior Year Internet Sales Amount]

CREATE [Prior Year Internet Sales Amount]; 

Now we can start defining calculations for it. Let's start from the formula which will work correctly for Fiscal attributes. Obviously, the expression itself should be [Measures].[Internet Sales Amount], ParallelPeriod([Ship Date].[Fiscal].[Fiscal Year])), but where to apply it ? We want it to apply for Fiscal Year, Fiscal Semester and Fiscal Quarter attributes, so it seems logical to set it on Fiscal Quarter attribute, and make it propagate on all of the attribute above. But we also note, that since dimension table in AdventureWorks is really built to fit the Fiscal calendar, i.e. it starts from July 1st as opposed to January 1st, we can and should include Month Name and Date attributes as well to be covered by this formula. So we will start from the Date attribute. Therefore the first assignment inside MDX Script would be

([Measures].[Prior Year Internet Sales Amount], [Ship Date].[Date].MEMBERS) 
  = ([Measures].[Internet Sales Amount], ParallelPeriod([Ship Date].[Fiscal].[Fiscal Year]));

However, you may object to this by noticing, that since Date is the key of [Ship Date] dimension, ultimately all other attribute are relating to it, therefore our first assignment will in fact cover all the attributes in the dimension, including the attributes where we don't want the expression to apply. And since we scope both on All Dates as well as on [Date] attribute members - we define mixed granularity which covers the entire dimension. It's OK, since we are going to overwrite the expression for this calculated member at coordinates where we want different expression to be used. To do a quick check that the formula worked, we can issue the following query:

SELECT {[Measures].[Internet Sales Amount],[Measures].[Prior Year Internet Sales Amount]} ON 0
, Descendants([Ship Date].[Fiscal].[All Periods],[Ship Date].[Fiscal].[Month],SELF_AND_BEFORE) ON 1
FROM [Adventure Works]

Now we need to fix other attributes. Let's switch to the Calendar ones. First is seems to be straightforward - just do

([Measures].[Prior Year Internet Sales Amount], [Ship Date].[Calendar Quarter].MEMBERS) 
  = ([Measures].[Internet Sales Amount], ParallelPeriod([Ship Date].[Calendar].[Calendar Year]));

But there is a catch. ParallelPeriod won't work correctly in Calendar hierarchy ! Remember, that dates start from July 1, 2001 in the [Ship Date] dimension. ParallelPeriod really knows nothing about it, the way it works is exactly the way the Cousin function works, i.e. it finds the member of the hierarchy which is at the same relative position as the current one. Therefore, ParallelPeriod for January 2002 instead of going to January 2001 (which doesn't exist) goes to July 2001, February 2002 goes to August 2001 and so on ! Of course, in 2003 everything normalizes, since 2002 has all months, but the results for the first half of 2002 for ParallelPeriod are clearly wrong. This anomaly can be demonstrated by the following query

SELECT {[Measures].[Internet Sales Amount],[Measures].[Prior Year Internet Sales Amount]} ON 0
, Descendants([Ship Date].[Calendar].[All Periods],[Ship Date].[Calendar].[Month],SELF_AND_BEFORE) ON 1
FROM [Adventure Works]

Therefore, we cannot use ParallelPeriod function, and instead will have to compute remaining attributes manually. It is actually pretty simple, since all we need to do is to go back fixed number of periods. For quarters we go back 4 quarters, for semesters - 2 semesters, for years - 1 year. This can be done with the Lag function and it leads us to the following statements to the MDX Script:

([Measures].[Prior Year Internet Sales Amount], [Ship Date].[Calendar Quarter].[Calendar Quarter].MEMBERS) 
  = ([Measures].[Internet Sales Amount], [Ship Date].[Calendar Quarter].Lag(4));
([Measures].[Prior Year Internet Sales Amount], [Ship Date].[Calendar Semester].[Calendar Semester].MEMBERS) 
  = ([Measures].[Internet Sales Amount], [Ship Date].[Calendar Semester].Lag(2));
([Measures].[Prior Year Internet Sales Amount], [Ship Date].[Calendar Year].[Calendar Year].MEMBERS) 
  = ([Measures].[Internet Sales Amount], [Ship Date].[Calendar Year].PrevMember);

Note, that all of the scoping is done on the attribute members, excluding the All member. The All member is excluded in order to avoid mixed granularity, which will force expansion of the space above the attribute. We are OK with the space below, since we know that the granularities below will not be affected (because granularity below is removed by strong hierarchy rules). Now the query below will return correct results:

SELECT {[Measures].[Internet Sales Amount],[Measures].[Prior Year Internet Sales Amount]} ON 0
, Descendants([Ship Date].[Calendar].[All Periods],[Ship Date].[Calendar].[Month],SELF_AND_BEFORE) ON 1
FROM [Adventure Works]

The only remaining uncovered attribute in the picture is [Calendar Week]. As before, we cannot use ParallelPeriod on it, because the first week happens to be Week 27, but we again can use Lag function to look 53 weeks back, i.e.

([Measures].[Prior Year Internet Sales Amount], [Ship Date].[Calendar Week].[Calendar Week].MEMBERS) 
  = ([Measures].[Internet Sales Amount], [Ship Date].[Calendar by Weeks].Lag(53));

We can verify correctness for the Weeks by using

SELECT {[Measures].[Internet Sales Amount],[Measures].[Prior Year Internet Sales Amount]} ON 0
,[Ship Date].[Calendar By Weeks].MEMBERS ON 1
FROM [Adventure Works]

Finally, we can add proper formatting for the calculated measure -

FORMAT_STRING([Measures].[Prior Year Internet Sales Amount]) = 'Currency';

And the task is done ! (This is what Time Intelligence wizard should really generate).

But now, when it appears that we are done, let's remember that the Date dimension has more attributes, although they don't participate in the user hierarchies. Indeed, we still have all those [Calendar Quarter of Year], [Week of Year], [Calendar Semester of Year], [Day of Month], [Day of Week], [Day of Year] etc. Actually AdventureWorks doesn't have all such possible attributes, and the relationships aren't defined strict, but if it had, then the lattice would've looked something like the picture on the right where the cardinalities of attributes are added for clarity (this diagram is courtesy of Marius Dumitru).

So far we've been focusing on the natural hierarchies, i.e. the hierarchies which lie on the path of attribute relationships. But let's also investigate what happen with the unnatural hierarchies, i.e. what about attributes which don't have any relationship to the [Calendar Year] attribute. To illustrate this, let's build several such unnatural hierarchies:

Since we didn't do anything specific to those attributes, the initial reaction could be that our calculation won't work. But let's check it out using the following queries:

SELECT {[Measures].[Internet Sales Amount],[Measures].[Prior Year Internet Sales Amount]} ON 0
,[Ship Date].[Calendar By Weeks].MEMBERS ON 1
FROM [Adventure Works]

SELECT {[Measures].[Internet Sales Amount],[Measures].[Prior Year Internet Sales Amount]} ON 0
,[Ship Date].[Year By Weeks].MEMBERS ON 1
FROM [Adventure Works]

SELECT {[Measures].[Internet Sales Amount],[Measures].[Prior Year Internet Sales Amount]} ON 0
,[Ship Date].[Year By Months].MEMBERS ON 1
FROM [Adventure Works]

Amazingly, all the queries work perfectly and return absolutely correct results. The key statement that made it happen is

([Measures].[Prior Year Internet Sales Amount], [Ship Date].[Calendar Year].[Calendar Year].MEMBERS) 
  = ([Measures].[Internet Sales Amount], [Ship Date].[Calendar Year].PrevMember);

Here, we are moving the [Calendar Year] one attribute back using PrevMember function. Now remember what overwrite rules say about unrelated attributes - the unrelated attributes remain completely unchanged. Therefore, when we navigate to [2002].[August].[7] in the [Year By Months] hierarchy, the [Month of Year] remains August, and [Day of Month] remains 7 (this is my birthday, by the way). The only thing that changes is the [Calendar Year] - from [2002] to [2001]. As simple as that ! Now, I normally don't recommend to create unnatural hierarchies - there are some serious performance considerations, but I feel that in the case of Time dimension, where the cardinalities are very low, i.e. the dimension is pretty small - perhaps several thousand members, such unnatural hierarchies simplify the relative period calculations greatly. I.e. single line with straightforward PrevMember, as opposed to carefully crafted MDX Script in the first half of the article. For the reference, the entire MDX script is copied below:

CREATE [Prior Year Internet Sales Amount];

([Measures].[Prior Year Internet Sales Amount], [Ship Date].[Date].MEMBERS) = ([Measures].[Internet Sales Amount], ParallelPeriod([Ship Date].[Fiscal].[Fiscal Year]));
([Measures].[Prior Year Internet Sales Amount], [Ship Date].[Calendar Quarter].[Calendar Quarter].MEMBERS) = ([Measures].[Internet Sales Amount], [Ship Date].[Calendar Quarter].Lag(4));
([Measures].[Prior Year Internet Sales Amount], [Ship Date].[Calendar Semester].[Calendar Semester].MEMBERS) = ([Measures].[Internet Sales Amount], [Ship Date].[Calendar Semester].Lag(2));
([Measures].[Prior Year Internet Sales Amount], [Ship Date].[Calendar Year].[Calendar Year].MEMBERS) = ([Measures].[Internet Sales Amount], [Ship Date].[Calendar Year].PrevMember);
([Measures].[Prior Year Internet Sales Amount], [Ship Date].[Calendar Week].[Calendar Week].MEMBERS) = ([Measures].[Internet Sales Amount], [Ship Date].[Calendar by Weeks].Lag(53));

FORMAT_STRING([Measures].[Prior Year Internet Sales Amount]) = 'Currency';

In summary, this article discussed the techniques for developing Time calculations in rich UDM multihierarchy dimensions. What's more important, I tried to show the importance of attribute relationships on the result of calculations, and the general principals used in this article should be applicable to other problem domains. Next time we will approach highly debatable subject of Running Sum calculations. While there is plenty of material published about this problem, from some of the respected OLAP experts, I feel that the last word wasn't yet spoken, and I will demonstrate my approach to this problem, which while will seem to be from the first view too simplistic compared to alternative proposals, in fact is the most correct and offers best performance.

Published Wednesday, October 25, 2006 10:07 AM by mosha
Anonymous comments are disabled
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement