THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus

Calculate Distinct Count in a Group By operation in Power Query #powerquery #powerbi

The current version of Power Query does not have a user interface to create a Distinct Count calculation in a Group By operation. However, you can do this in “M” with a simple edit of the code generated by the Power Query window.

Consider the following table in Excel:

DistinctPowerQuery_01

You want to obtain a table containing the number of distinct products bought by every customer. You create a query starting from a table

DistinctPowerQuery_02

You keep in the query only the columns required for the group by and the distinct count calculation, removing the others. For example, select Products and Customers and right-click the Remove Other Columns menu choice.

DistinctPowerQuery_03

Select the Customer column and click the Group By transformation. You see a dialog box that by default creates a count rows column.

DistinctPowerQuery_04

This query counts how many transactions have been made by each customer, and you don’t have a way to apply a distinct count calculation. At this point, simply change the query from this:

let
    Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
    R
emovedOtherColumns = Table.SelectColumns(Source,{"Product", "Customer"}),
    GroupedRows = Table.Group(RemovedOtherColumns, {"Customer"}, {{"Count", each Table.RowCount(_), type number}})
in
    GroupedRows

To this:

let
    Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
    RemovedOtherColumns = Table.SelectColumns(Source,{"Product", "Customer"}),
    GroupedRows = Table.Group(RemovedOtherColumns, {"Customer"}, {{"Count", each Table.RowCount(Table.Distinct(_)), type number}})
in
    GroupedRows

The Table.RowCount function counts how many rows exist in the group. By calling Table.DistinctCount here, you reduce the number of rows in the table to a list of distinct count values, returning a correct value.

DistinctPowerQuery_05

I hope Power Query team will implement a distinct count option in the user interface. In the meantime, you can apply this easy workaround.

Published Monday, March 31, 2014 7:23 PM by Marco Russo (SQLBI)
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Ben said:

Thanks for the feedback Marco! We've started a thread to talk about this feature based on your post.

- PQ Team

April 7, 2014 1:36 PM
 

Marco Russo (SQLBI) said:

You're a great team - thanks! :)

April 7, 2014 3:37 PM
 

DBExcelAccounting@Blogspot.com said:

Thanks for the post - How about going the other way - I made a function to create an enumerated list type from a comma separated list.

The function takes a list where each item is text in the format integer (for the number of times to enumerate the item) followed by the text string (name, label or number )to be enumerated after the first comma).  This way calling the function on {"1,A" "2,XY,Z" "3,3"} gives {"A", "XY,Z", "XY,Z", "3", "3", "3"}

For me this was useful in writing a procedure to do algebraic operations on long lists of number and going from compressed to decompressed data and back.  I also have a version that works the same way on a 2 column table the same way.

(x as list) =>

let

   Source = List.Transform(x,(_)=> [ Count = Text.Start( _ , Text.PositionOf( _ , "," )), Name = Text.End(_,Text.Length(_) - Text.PositionOf(_,",")-1) ]),

   InsertedCustom = List.Transform(Source, (_) => enum(_[Name],Number.From(_[Count]),{})),

   enum = (x,y,z) => if y <2 then List.Combine({z, {x}}) else iter(x,y,z),

   iter = (x,y,z) => enum(x,y-1, List.Combine({z,{x}})),

  Output =  List.Combine(InsertedCustom)

in

   Output

April 25, 2014 11:30 AM
 

Marco Russo (SQLBI) said:

Thanks for the function!

May 4, 2014 11:31 AM
 

Tim Rodman said:

Any thoughts on how to do a COUNT with criteria, like the HAVING clause in SQL?

May 14, 2014 7:14 PM

Leave a Comment

(required) 
(required) 
Submit

About Marco Russo (SQLBI)

Marco Russo is a consultant, writer and trainer specialized in Business Intelligence with Microsoft technologies. He runs the SQLBI.COM website, which is dedicated to distribute resources useful for BI developers, like Integration Services components, Analysis Services models, tools, technical information and so on. Marco is certified as MCT, MCDBA, MCSD.NET, MCSA, MCSE+I.

This Blog

Syndication

Archives

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement