THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Jamie Thomson

This is the blog of Jamie Thomson, a freelance data mangler in London

Mind that collation : SSAS

I’ve been using Analysis Services a lot lately in my current day job and recently encountered an issue that I believe people need to be aware of. The issue transpired to be a bug in our application rather than in Analysis Services itself but it was brought about by a misunderstanding of how Analysis Services handles uniqueness for dimension attributes and its worth highlighting the issue here.

I’ll demonstrate with a simple demo. Take this very simple [Product] dimension table:

product table

We only have two products in here but note the [ProductCategory], both have the same value of “fruit” but they have different case-sensitivity. As far as SQL Server is concerned those values are the same because I have set it to have a case-insensitive collation:

sql server collation

Note: The CI of Latin_General_CI_AS stands for case-insensitive

Analysis Services on the other hand has a case-sensitive collation:

analysis services collation

Onto that Analysis server I have deployed a [Product] dimension that has [Product Category] as an attribute:

product dimension

And when I process that dimension I get an error:

Errors in the OLAP storage engine: The attribute key cannot be found when processing: Table: 'dbo_Product', Column: 'ProductCategory', Value: 'fruit'. The attribute is 'Product Category'. Errors in the OLAP storage engine: An error occurred while the 'Product' attribute of the 'Product' dimension from the 'SSAS Collation Demo' database was being processed.

dimension processing error

 

OK, so what’s going on here? Well fundamentally the problem is that our SQL Server and Analysis Server are using different collations. Analysis Services asks SQL Server to give it a distinct list of [ProductCategory] and SQL Server responds by giving back a single value (‘Fruit’) and hence our [Product Category] attribute only has one member. When Analysis Services attempts to process the its key attribute (i.e. [Product]) it tries to specify the relationship between the two members and the appropriate member in the [Product Category] attribute; it fails to do that because ‘fruit’ does not exist (because Analysis Services is using a case-sensitive collation) and thus we get a processing error.

You may think that the simple solution would be to change the collation of our Analysis Server but in fact that probably is not the correct course of action. What we should do is change our [Product Category] attribute so that instead of being uniquely identified by the [Product].[Product Category] field it is instead uniquely identified by [Product].[ProductId]. In other words our dimension changes from this:

product dimension definition prior

to this:

product dimension definition post

Note the change in the <ColumnID> property.

After that, it all works because our unique list of [Product Category] is defined by [Product][ProductId] instead of [Product].[ProductCategory]!

 

I hope this has been useful to you. Its important to know about collations when SQL Server and Analysis Services are communicating with each other and hopefully this post serves to heighten the awareness.

Incidentally, this situation could probably be mitigated by normalising [ProductCategory] into a dimension table all of its own but given the lambasting I received last time I fired a shot across Kimball’s bows I think I’ll leave that one well alone :)

@Jamiet

Published Monday, February 15, 2010 8:29 PM by jamiet

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Hans Geurtsen said:

Hi Jamie, I'm not sure whether I agree to your solution, or maybe I am missing something. First of all, I believe the solution would be to solve the root cause of the problem: making sure 'Fruit' is always spelled the same way, the same case. Secondly, with the solution you suggest, you always get different product categories for each product, whether they have the same case or not. I don't think that's right, is it? Again, maybe I am missing something.

Regards, Hans.

February 16, 2010 1:34 AM
 

jamiet said:

Hi Hans,

Thanks for the comments.

"First of all, I believe the solution would be to solve the root cause of the problem: making sure 'Fruit' is always spelled the same way, the same case"

I totally absolutely 100% agree. However there may be situations where this has not been done and hence the situation I describe here could occur (as it did to me last week). In our case the value was a free text field and hence we were experiencing the problem. I should point out that in our case we weren't dealing with [Product], I merely used [Product] here for ease of demonstration.

"with the solution you suggest, you always get different product categories for each product, whether they have the same case or not"

Indeed, as far as Analysis Services is concerned they would all be different membes (even if the values were the same). This is a trade-off and something on which the developer would have to make a decision. In our case we chose to implement it as I have described here (largely because we weren't easily able to restructure the underlying database as you suggested).

What you say is correct and I still think it helps to be aware of the possibility of this problem occurring and it helps to enhance one's understanding of Analysis Services.

thanks again

Jamie

February 16, 2010 10:48 AM
 

Darren Gosbell said:

Hi Jamie,

Unfortunately I think your solution has a few problems. Consider what happens when you add a few more products

1 Apples  Fruit

2 Bananas fruit

3 Oranges fruit

4 Pears   Fruit

With Product ID as the key, when you display Product Categories you will see the following.

Fruit

fruit

fruit

Fruit

And if you query for [Product].[Product Category].[Fruit], SSAS will not show you the sum of Apples and Pears, rather it will return the first member that it finds (which may be non-deterministic). If this case sensitivity issue is widespread this would pretty much mean that you would not be able to properly analyze by Product Categories.

I think you would be better to either change the collation of the column in SQL, or to create a view to specifying the collation with the COLLATE clause, or possibly using COLLATE in a calculated column in the DSV (although I'm not sure off the top of my head if COLLATE is supported in the DSV)

February 16, 2010 3:33 PM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement