THE SQL Server Blog Spot on the Web

Welcome to - The SQL Server blog spot on the web Sign in | |
in Search

Microsoft OLAP by Mosha Pasumansky

MDX is like chess. Are you ready to play against Grandmaster ?

Warning: This post carries no or very little practical value. It is mostly obscure. However, if you are MDX junkie, you may find it interesting.

I read this post by Panorama employee Oz Shal-Bar. It is very well written - it talks about non-trivial issues with MDX interop between different providers, yet it is very entertaining. It isn't written in a dull boring technical staff, but in a captive style typical to detective stories. I wish I could write like that ! And while I don't sympathize with the subject of the post, i.e. I think it solves the artificial problems, I really enjoyed reading it. It poses a riddle, and I love riddles, even when I think they are completely made up and don't reflect real problems. Oz's solution is witty and elegant given all the constraints he had to work through. However, as I was reading this post, I felt that there are few places that I would've liked to comment on. I tried to leave a comment about some of them, but as of today, my comment didn't make it through moderation system. And I was thinking more about this riddle, and I thought I had something meaningful to add to it and that there is a certain class of readers of my blog who would like it too. Hence this post - my reply to Oz. While it may appear as a critique (or even arrogant) to some, I felt like the entertaining tone of the original post invited response in the same tone. Besides, I myself is a former Panorama employee, and I still know couple of people there (Kobi and Danny), so I trust that I won't be misunderstood. So here we go. If you are in for a ride, please start with reading the original post, otherwise you won't get much from mine.

"MDX is like chess - knowing the rules doesn't mean you know how to play". What a wonderful opening line. Well, indeed, I agree with it. But what if you were to play the Grandmaster ? He won't miss subtle inaccuracies which you can let in a casual game. And in a position that you consider to be won, he will find the stunning move that will turn the tables... So let's analyze the riddle from the end. The article ends with the following MDX expression:

Generate (Filter ( { [Measures].CurrentMember }, Condition), [True Set])
Generate (Filter ( { [Measures].CurrentMember }, NOT Condition), [False Set])

And Oz proclaims "Check Mate!!!". But is it really ? Does the above expression always really solves the original problem ? The answer is - no, it doesn't ! This isn't a Check Mate yet, although it is pretty close to one. Can you see what's wrong with this expression ? When wouldn't it work as desired ? I guess the rules of the game were not fully described... There are subtleties to how Generate function works, which make this solution a wrong one. Let's read carefully definition of the Generate function in MDX. We will use the very first version of OLEDB for OLAP spec which came out on September 9, 1997 (and you can see Panorama in the list of companies who announced support for that spec!). This very first version already contains enough description about Generate function to spot the problem. Let's read the relevant portion of the "Set Value Expression" section of the spec:

If the optional ALL flag is specified, all duplicates are retained in the result. If ALL is not specified, duplicates are removed.

What this tells us is that when the optional ALL flag is not specified, Generate will eliminate duplicates from the set. Therefore if either [True Set] or [False Set] had duplicates in them, the result of the above expression will be wrong, since it will return Distinct([True Set]) or Distinct([False Set]). To fix it we should really write

Generate (Filter ( { [Measures].CurrentMember }, Condition), [True Set], ALL)
Generate (Filter ( { [Measures].CurrentMember }, NOT Condition), [False Set], ALL)

For now let's assume that this is the final answer, but we will revisit it later. Now, let's go back to the beginning and walk through it. Let's talk a little about the very first, most intuitive and most performant solution Iif (Condition, [True Set], [False Set]). Oz dismisses this solution with the following comment: "it’s just that Instant-If shouldn’t return sets, only numbers, strings, or members." Apart from the fact that Iif stands for "Immediate IF", not for "Instant IF" what else is wrong with this statement ? It is true that according to the official spec, IIF is not supposed to return sets. In the "Conditional Expressions" section it clearly states that "The data type of <true_part> must be the same as the data type of <false_part>. That is, either both are numeric value expressions or both are string value expressions.". But why did Oz added "or members" piece ? It is clearly not part of the spec. You may say - yes, but he didn't refer to the formal spec when he said "shouldn't", but rather to AS2000 (and OLAP Services 7.0) implementation which seem to allow members in the <true_part> or <false_part>. Are you sure it allows members ? Yes, you just tried and it worked ! Excuse me, but then let's also try some other data type, like tuple. Hey - that worked too. What about hierarchy object ? Works great as well ! So should the statement say "only numbers, strings, members, tuples or hierarchies" then ? No, it shouldn't. The right answer is "only numbers or strings". The reason you can use hierarchy object is because there is an implicit conversion in MDX from hierarchy to member object by applying .CurrentMember function. That's the reason why you can say <hierarchy>.PrevMember - because behind the scenes it is translated to <hierarchy>.CurrentMember.PrevMember. And the reason you can use member object is because there is an implicit conversion in MDX from member to tuple, by simply constructing tuple with single hierarchality in it: <tuple> = ( <member> ). And the reason why you can use tuple object is because there an implicit conversion in MDX from tuple to the scalar data type (such as number or string) by applying the .Value function. I.e. <tuple>.Value. This, by the way, makes .Value the most heavily used MDX function, with people not even realizing, that when they write expression Sales - Cost, it actually get translated to something like (Sales).Value - (Cost).Value.

AS2005 greatly relaxed the limitation of IIF, as documented in the BOL for IIF function: "In Microsoft SQL Server 2000, Analysis Services supported only numeric and string return types, and the types of specified expressions had to be the same. These restrictions do not apply to SQL Server 2005 Analysis Services (SSAS)." I can only add, that scalar data types are not limited by numbers or strings. AS2005 supports wide variety of other data types, such as booleans, dates, times, currencies etc. Hey, AS2000 supported currency as well.

Let's move on, to the StrToSet solution - StrToSet ( Iif (Condition, “[True Set]”, “[False Set]”)). This is good solution for AS2000. But what's up with Oz's comment that "Besides, what if the sets are huge? Conversion might be time & memory consuming.". What we are converting here are names of the sets, i.e. very small strings "[True Set]" is only 10 characters and "[False Set]" is only 11 characters. It doesn't matter whether the set itself will be huge or not. There will be no additional memory allocated - we referred to set by name, of course AS is not going to duplicate it ! There is nothing time or memory consuming in this solution. StrToSet function has other problems, of course, mostly related to how it interferes with cache management, but these are really orthogonal and we cannot explore them in this article.

What about the next solution, the one involving Filter - Filter ([True Set], Condition). The only comment Oz makes here is that "Alas, it’s very time consuming! Applying the condition on *all* members?". I agree on the performance comment, but why there is no mention that this is simply wrong thing to do ! The Condition gets evaluated in the context of the tuples inside the [True Set], which is the different context from being outside. So it is possible, that while outside Condition will evaluate to True, inside the Filter, it may evaluate to False. Or even worse - it will be True for some tuples in [True Set], and False for others ! This is definitely not going to work...

Now we are coming back to the final solution. It seems to be OK after the addition of the ALL flag. And it is in fact a working solution in both AS2000 and AS2005 (although not the best one). But will it work with SAP BW. I know very little about SAP BW support of MDX. Definitely much less than Oz. But I did read his other blog "MDX as a cross-platform query language". And I learned there about SAP BW interpretation of some of the core concepts in MDX, such as current context. Basically, SAP BW completely ignores the current context inside Filter function. (I actually wonder whether the query language that SAP BW implemented can be called MDX at all - while the syntax does look like MDX, deviating from one of the most fundamental principles makes it something else in my opinion. But I won't go there). Anyway, apparently, in SAP BW version of MDX, while you are inside Filter function, you have to explicitly mention all the dimensions for which you want to use context. But doesn't it mean that Condition evaluated inside Filter({Measures.CurrentMember}, Condition) and just Condition might return different results ? The truth is - I don't know, because as I said, I know very little about SAP BW interpretation of MDX. But we already introduced so many artificial restrictions, so it won't hurt to put one more. Let's assume that we cannot use Filter function. What now, is there solution for the riddle ? Well, of course, there is, and actually there is more than one alternative solution !

So, do you see any other approach ? Here is a hint - with Filter falling out of favorites, how about going back to the IIF function. We know that by the rules we cannot return sets from it, but we still can return strings and numbers. Strings were the dead end, because by the rules we cannot use StrToSet, so this leaves us with numbers. We can check condition inside IIF, and return different numbers from it. See where we are heading ? Or need a little more help ? Here is another hint: MDX has plenty of functions which operate on sets. We can do unions, intersects, crossjoins, extracts, subtractions, subsets etc. Wait a second - did I say "subset" ? Of course ! We can take a subset of the original set. If the condition is true - we will take the whole set, and if it is false, then empty subset. I.e. either one of Head, Tail or SubSet functions will work here. Head(<set>, <count>) is the simplest to work with. If we call Head(<set>, 0) we will get empty set, but Head(<set>, <set>.Count) will return the entire set. So the solution will look like following:

{ Head([True Set], IIF(Condition, [True Set].Count, 0)), Head([False Set], IIF(Condition, 0, [False Set].Count)) }

Nice work ! So could I say "Check Mate" now ? What do you think ? Not quite... There is more to this problem. But like in any good chess article, I should leave some exercise to the reader. Not to be too hard on you, I will leave only two questions for you to solve. One is moderate, and one is difficult. I will give you some time - you can either email me with solution or leave it in the comments claiming the hall of fame (if your solution is correct, of course!).  When the "some time" will pass, I will announce the winner and post the answers. Not sure how soon this will happen, as I am about to leave for a family vacation, and I don't know how easy it is to find Internet Cafe in coastal Italy :) But you get at least a week to think about it. Here are the questions:

1. Find yet another solution to the riddle which doesn't require neither IIF nor Filter functions.

2. There is a flaw in the solution with the Head function which isn't there in the solution with Generate over Filter. Can you identify this flaw ?

3. In both solutions (with Head and with Generate over Filter) the Condition is evaluated twice. Is there a solution under given constraints which requires only single evaluation of solution ?

Good luck !

Published Saturday, June 09, 2007 12:01 AM by mosha
Anonymous comments are disabled
Privacy Statement