THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Microsoft OLAP by Mosha Pasumansky

Custom filters with MDX subcubes

Omniture Discover is not general purpose OLAP client tool. Rather it is specialized application that lets users to explore data from the Omniture’s Web Analytics product. But it does look and feel like an OLAP browser – which is not surprising, since “A” in OLAP stands for “Analysis”, and the product is all about analyzing web site performance. Omniture model feels like typical cube too - there are dimensions (called segments), and there are measures (called metrics), and there are drilldowns, pivots, aggregations etc. However, there is something not typical about this client tool – the way it handles slicers. Usually, OLAP client provides a UI element called slicer or filter or page field or something similar, where user can choose how the current view should be sliced. UI for slicer does differ between applications, from classic drilldown list to calendar control for Time dimension to something else, but there is always something to choose the current slice. Omniture Discover doesn’t have this. Instead, it uses different paradigm. User is expected to define “custom segments”, which really are dimension filters, but they are given names and list of these segments is stored in special folder. Then user can apply these predefined filters to any report view. The definitions of the segments are very flexible, they really can be arbitrary expressions – something from “Last 3 days” (filter on Time) to “Users referred by www.live.com through paid search campaign who bought product in specific category” (filter on Referer, Action and Product dimensions) etc. Dimensions used in the custom segment definition can be included in the report, and can be omitted from it – the filter is applied in either case. The only limitation is that only one such segment can be applied to the report at a time – if the user want to combine two custom segments, he needs to create a new one. First I was taken aback by how unusual this approach was, but then I saw that it could come in handy sometimes.

So this is an interesting approach, how could we add this capability to existing Analysis Services client tools ? Well, the only way to extend existing client UI functionality is through actions, so it is clear we will need to use actions here, but how can we implement custom segments in MDX ? Actually, MDX has perfect solution for that – and it is CREATE SUBCUBE statement – in fact – this is exactly what CREATE SUBCUBE was invented for in the first place – to be able to apply arbitrary custom filters on any MDX query. CREATE SUBCUBE sets this filter in the session state, and all MDX queries executed after it are affected  by it.

Let’s take few examples from Adventure Works 2008 cube to demonstrate this approach in action:

1. Filtering on named set.

In this example, we want to limit analysis to the top 50 customers. Fortunately, Adventure Works cube already has named set defined inside MDX Script which defines these customers. So, our CREATE SUBCUBE will  be fairly simple:

CREATE SUBCUBE [Adventure Works] AS (SELECT [Top 50 Customers] ON 0 FROM [Adventure Works])

The only tricky thing here to remember is how actions work in Analysis Services. The action expression is an MDX expression, which, in case of STATEMENT action type, should return string representing MDX statement to be executed by the client tool. Therefore, in the action expression field – we need to put a string with statement inside, not the statement itself. This will become more clear in example 3.

Here is how action definition will look like:

Name          : Top 50 Customers
Target type   : Cells
Target object : All cells
Type          : Statement
Expression    : "CREATE SUBCUBE [Adventure Works] AS (SELECT [Top 50 Customers] ON 0 FROM [Adventure Works])"
Invocation    : Interactive
Caption       : Top 50 Customers
Caption is MDX: false

This action now works fine inside cube browser – once executed, the current view needs to be refreshed (OWC doesn’t do it automatically after action execution) to reflect the currently set filter. The action will work in any client tool which properly supports statement actions. Unfortunately, it seems like Excel 2007 is not such a tool – I wasn’t able to get statement actions to work in it.

2. Filtering on multiple dimensions.

Of course the custom filters don’t have to be on single attribute, hierarchy or dimension – they can employ any combination of those. In this example, we want to filter on transactions performed by customers with low income, but on high priced products, perhaps this segment is susceptible for fraud.

We have Yearly Income attribute in the Customer dimension, and we will pick “10000-30000” bucket as “low income” indicator. The Product dimension has List Price attribute, and we will take all the members with value above $1700 as high priced products – in MDX this is best done with range set operator [Product].[List Price].&[1700.99] : NULL

CREATE SUBCUBE [Adventure Works] AS
(
SELECT 
   [Customer].[Yearly Income].&[0] * {[Product].[List Price].&[1700.99] : NULL}  ON 0
   FROM [Adventure Works]
)

The action definition would be

Name          : Potential Fraud
Target type   : Cells
Target object : All cells
Type          : Statement
Expression    : "CREATE SUBCUBE [Adventure Works] AS 
  (SELECT [Customer].[Yearly Income].&[0] * {[Product].[List Price].&[1700.99] : NULL} ON 0 
  FROM [Adventure Works])"
Invocation    : Interactive
Caption       : Potential Fraud
Caption is MDX: false

3. Dynamic (context dependant) filters

So far we only explored static filters – i.e. they always defined the same subspace, regardless of the position in the cube. This is how all Omniture custom segments work. However, MDX is, of course more powerful than that ! Action expression is called expression for a reason – it doesn’t have to return the constant string all the time. Let’s say we want to build a filter which will return last 3 months starting with the current month. In MDX we can do this easily using range set operator over Lag. The action target this time will be members of the Month attribute – because it only makes sense to count 3 months back if we are at granularity of Month (or if we are below it – so we could also define similar action over Days as well). The client tool will send the current Month coordinate, and it can be used inside MDX by referring to CurrentMember in the Month hierarchy. With some simple string concatenations (which unfortunately are always required to make dynamic action) we will get the following for action expression:

"CREATE SUBCUBE [Adventure Works] AS (SELECT " + 
[Date].[Month Name].CurrentMember.UniqueName + 
".Lag(2):" + [Date].[Month Name].CurrentMember.UniqueName + 
" ON 0 FROM [Adventure Works])"

The rest of parameters for the action are

Name          : Last 3 Months
Target type   : Attribute Members
Target object : Date.Month Name
Type          : Statement
Expression    : see above
Invocation    : Interactive
Caption       : Last 3 Months
Caption is MDX: false

4. Arbitrary complex filters

So far every filter we have discussed – was theoretically possible to define using standard slicer UI in client tools. That’s because every condition we put was either on single hierarchy or when it was on multiple hierarchies, they were combined using AND logic, i.e. using CrossJoin operator. But custom filters can be arbitrary complex – they can include any condition logic, combination of AND, OR, NOT, XOR and other logical operators. The good news are that even though it is usually impossible to set such complex filters in standard UI, it is totally possible with custom CREATE SUBCUBE statement. Let’s take the following example: We want to define a segment of customers with family. Definition of family means that either customer is married, or he has children. Customer dimension has the Marital Status attribute – so this will give us married or not information, and Customer dimension also has Number of Children attribute – everything which is not 0 – counts as having children – in MDX this can be easily coded with set unary operator "-". Now we need to figure out how to implement OR logic condition with sets – the answer is to use Union operator. We will union married customers no matter how many children they have, and customers with children no matter what their marital status is. Of course, there will be big overlap between these two sets, but CREATE SUBCUBE is going to do the right thing – it guarantees that there will be no double counting even if the set used to define it has duplicates.

CREATE SUBCUBE [Adventure Works] AS
(SELECT 
   {([Customer].[Marital Status].&[M], [Customer].[Total Children].[All Customers])
    ,([Customer].[Marital Status].[All Customers],-{[Customer].[Total Children].&[0]})}
ON 0 FROM [Adventure Works])

The resulting union set has interesting shape – it cannot be represented as pure crossjoin of attribute hierarchies. Such sets are nicknamed “arbitrary shaped sets” in Analysis Services, because they represent shapes of arbitrary complexity. Analysis Services in general doesn’t like arbitrary shaped sets, and there are often performance issues when using them, but if the business requirement calls for them – they can be used and will work correctly, although not most efficiently.

5. Clearing up

And, of course, we also need to provide a way for the user to remove whatever filter he had set. This is done with DROP SUBCUBE command, so we need to provision one more action to clear things up.

Name          : Clear Segments
Target type   : Cells
Target object : All cells
Type          : Statement
Expression    : "DROP SUBCUBE [Adventure Works]"
Invocation    : Interactive
Caption       : Clear Segments
Caption is MDX: false
Published Monday, November 03, 2008 11:30 PM by mosha
Filed under: ,
Anonymous comments are disabled
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement