THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus

Distinct Count Measure in PowerPivot using DAX

A PivotTable based on PowerPivot data doesn’t offer a native Distinct Count aggregation. To calculate it, you need to create a calculated measure in the PivotTable. For example, if we have a simple Sales table from AdventureWorks loaded in PowerPivot and projected into a PivotTable like the following picture.

Distinct-01

To create a new measure, we need to right-click on the field pane the menu “Add New Measure…”

clip_image004 

At this point we name the new measure Distinct Customers and write the DAX formula that calculates the number of rows returned by the DISTINCT function, which returns a list of the distinct values contained in a column (we choose EmailAddress in this example, but if we had a customer key we would have used it instead).

clip_image006

The DAX formula is:

COUNTROWS( DISTINCT( Table[Column] ) )

Where Table is the name of the PowerPivot Table containing the column and Column is the name of the column containing the values we want to count as distinct. Having defined this calculated measure, it is computed in the context defined by the PivotTable query. For example, we can look at the number of distinct customers that have bought each product, comparing that number with the number of sales for each product (which is always a number greater than or equal to the number of distinct customers).

Distinct-04

I’d like to add a personal note. It would have been better if a shorter syntax would have been available, and much better if PivotTable would have offered the Distinct Count as a possible option in the Summarize By options that are available for each native measure (current available options are Sum, Count, Min, Max and Average).

Distinct-05

Moreover, even a more direct DAX function like a DISTINCTCOUNT( [column] ) would have been easier and intuitive for “that guy”!

Published Saturday, December 26, 2009 2:02 PM by Marco Russo (SQLBI)
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Russell A said:

Thanks very much. I'm curious, if I have blank rows in my column data, is there a way to exclude these as a distinct value?

I've tried =countrows(distinct(allnoblankrow(tbl[clm]))) but no avail :)

March 7, 2011 5:16 PM
 

Marco Russo said:

March 8, 2011 4:33 AM
 

Serah said:

Can we use CountRows(Distinct()) for a column from aquery in Power Pivot.

Instead of selecting from table, i have columns from a query.

March 31, 2011 9:29 AM
 

Marco Russo (SQLBI) said:

Serah, what do you mean?

You can write = COUNTROWS( DISTINCT( Table[Column] ) ) but I don't understand what do you mean with "Instead of selecting from table, i have columns from a query."

Marco

March 31, 2011 9:45 AM
 

OtterMBA said:

I have a question on count(distint()) or distinctcount() - I use Denali for Excel.

I have a single table that I create a linked table into PowerPivot. On the table I have the following columns [sales staff],[customer],[year]. I have been able to do a distinctcount for 2009,2010, and 2011 just fine. ie.."=calculate(distinctcount([customer]),table1[year]=2009)"

I need to count all of the customers that we did business with in 2009 and 2010 but not 2011, 2011 only, 2010 only, etc. and every iteration of the years. Once I figure out the formula I know I can manipulate for the others.

Curretly I build a pivot table by salesperson,and customer by distinctcount by year. Then use an if formula in an empty cell like if(and(c1=1,d1=1,e1=0),1,0) to figure out customers for 2009 and 2010 only. Then I total all the 1s for each salesperson. Is there a way to create a measure in powerpivot to do this?

January 11, 2012 2:17 PM
 

Marco Russo (SQLBI) said:

Do you know the possible combination in advance, or do you want to select a list of "DO" and a list of "DO NOT" and having a dynamica calculation based on that?

In the latter, take a look at this scenario in this case, because is very similar to the requirement.

http://sqlblog.com/blogs/alberto_ferrari/archive/2011/05/19/powerpivot-basket-analysis-and-the-hidden-many-to-many.aspx

January 11, 2012 5:14 PM
 

Johannes said:

Is there a difference between countrows(distinct()) and distinctcount() ?

February 20, 2012 9:10 AM
 

Marco Russo (SQLBI) said:

Johannes, no differences - actually DISTINCTCOUNT is more clear from a semantic point of view, but internally COUNTROWS(DISTINCT()) is computed instead, so they are identical from a computational point of view.

February 20, 2012 6:01 PM
 

Johannes said:

Hey Marco, forgot to check back until now - thanks for clarifying that!

March 26, 2012 11:44 AM
 

Pathik Bhatt said:

Its been sometime since, I have been trying to get my head around this. I have imported information on close to 1.5 million customer orders in a powerpivot table. Customer orders generate different delivery packets. Sometimes for the same customer orders there are several packets generated. So the data looks like:

Order,Package

e1,p1

e1,p1

e1,p2

e1,p2

e1,p3

e2,p5

e2,p6 etc..

What I want to do is calculate in adjacent field a count of all unique packages that the customer order has generated. In the above example that would be:

Order,Package

e1,p1,3

e1,p1,3

e1,p2,3

e1,p2,3

e1,p3,3

e2,p5,x …

I just learnt about powerpivot and am not able to understand how this could be achieved. I need this field for calculation purpose. Many Thanks!

March 31, 2012 9:53 AM
 

Marco Russo (SQLBI) said:

Sorry for big delay in answering, I missed your comment!

Assuming the table you used as an example is called Orders, the solution is the following one (in PowerPivot 2012)

= CALCULATE(DISTINCTCOUNT(Orders[Package]),ALL(Orders[Package]))

In PowerPivot v1 you can obtain the same result by writing:

= CALCULATE(COUNTROWS(DISTINCT(Orders[Package])),ALL(Orders[Package]))

May 27, 2012 5:07 PM
 

HK said:

Thanks for posting this. It saved me a lot of time on a project.

January 23, 2013 11:23 AM

Leave a Comment

(required) 
(required) 
Submit

About Marco Russo (SQLBI)

Marco Russo is a consultant, writer and trainer specialized in Business Intelligence with Microsoft technologies. He runs the SQLBI.COM website, which is dedicated to distribute resources useful for BI developers, like Integration Services components, Analysis Services models, tools, technical information and so on. Marco is certified as MCT, MCDBA, MCSD.NET, MCSA, MCSE+I.

This Blog

Syndication

Archives

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement