THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus

Difference between DISTINCT and VALUES in DAX

I recently got a question about differences between DISTINCT and VALUES in DAX and thanks to Jeffrey Wang I created a simple example to describe the difference. Consider the two tables below: Fact and Dim tables, having a single column with the same name of the table. A relationship exists between Fact[Fact] and Dim[Dim]. This relationship generates a referential integrity violations in table Fact for rows containing C, which doesn’t exist in table Dim. In this case, an empty row is virtually inserted into the table Dim and all the rows in Fact that don’t have a correspondent member in Dim will point to this blank row. Such a row is the only difference between DISTINCT and VALUES.

image

The measure “Count Distinct” is defined as =COUNTROWS(DISTINCT(Dim[Dim]))
The measure “Count Values” is defined as =COUNTROWS(VALUES(Dim[Dim]))

As you can see, the VALUES call returns the blank row in Dim that has been added to match unrelated rows in Fact, whereas the DISTINCT call only returns values that exists in the original Dim table. This is the only case when there is a difference in the behavior of these two functions. It is important to understand that a BLANK value existing in the original table, either in a source column or in a calculated column, will be considered by both VALUES and DISTINCT in the same way, as a regular member that will not be ignored by the DISTINCT function. You can see the extra blank value in the rows of the PivotTable in the screenshot above.

Not all the BLANK are the same in PowerPivot!

IMPORTANT UPDATE 04/14/2011

As you can read in the comments, if you want to calculate the number of non blank distinct values in a column, regardless of related values, a smart formula is:

= COUNTAX( DISTINCT( Dim[Dim] ), Dim[Dim] )

Published Tuesday, March 08, 2011 2:39 AM by Marco Russo (SQLBI)
Filed under: , ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Russell A said:

This is where Countrows(distinct( really fails. Why on earth does it count a blank row as a distinct value? Count distinct in SQL does not include blank values in their counts :(

March 9, 2011 2:53 PM
 

Marco Russo said:

I agree that the actual behavior is not intuitive...

March 9, 2011 3:02 PM
 

Russell A said:

Thanks for the reply! Love the giving data meaning book.

My question I guess is that is there a way to effectivly negate the aggregation of BLANK within a countrows(distinct() measure?

Many many thanks,

-Russell

March 9, 2011 5:13 PM
 

Marco Russo said:

Well, I understand it's not so elegant, but a possible solution is the following one:

=COUNTROWS(DISTINCT(Table1[Col1])) - CALCULATE(COUNTROWS(DISTINCT(Table1[Col1])), ISBLANK(Table1[Col1]))

March 9, 2011 5:26 PM
 

Russell A said:

It's beautiful to me.. thanks I was trying ISBLANK but not within a CALCULATE context.

This helps a lot! Cheers,

-R

March 9, 2011 5:42 PM
 

Jeffrey Wang said:

COUNTA(Table1[Col1]) does not include blanks.

March 12, 2011 2:30 AM
 

Marco Russo (SQLBI) said:

Jeffrey,

thanks, but the initially questions was about calculating the distinct values ignoring blanks. However, your note suggested me this alternative way:

=COUNTROWS(DISTINCT(Table1[Col1])) - IF(COUNTBLANK(Table1[Col1]) > 0, 1, 0 )

Is it faster than the CALCULATE version in my previous comment?

I suppose it could be, but...

Thanks again!

Marco

March 13, 2011 5:08 PM
 

Jeffrey Wang said:

COUNTAX(DISTINCT(Table1[Col1]), [Col1])

March 14, 2011 5:41 PM
 

Marco Russo (SQLBI) said:

Oh, right - thank you Jeffrey, I'm going to update the blog post!

March 14, 2011 6:00 PM
 

Luis Simoes said:

Why not use DISTINCTCOUNT??

May 13, 2014 6:14 AM
 

Marco Russo (SQLBI) said:

You're right, but DISTINCTCOUNT have been introduced after we published this blog post :) the first version of Power Pivot in Excel 2010 didn't have it.

May 13, 2014 8:36 AM

Leave a Comment

(required) 
(required) 
Submit

About Marco Russo (SQLBI)

Marco Russo is a consultant, writer and trainer specialized in Business Intelligence with Microsoft technologies. He runs the SQLBI.COM website, which is dedicated to distribute resources useful for BI developers, like Integration Services components, Analysis Services models, tools, technical information and so on. Marco is certified as MCT, MCDBA, MCSD.NET, MCSA, MCSE+I.

This Blog

Syndication

Archives

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement