THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus

ABC Analysis in PowerPivot

The calculation for ABC analysis can be made in PowerPivot using calculated columns. In this way each row can have an attribute with the appropriate ABC class. The ABC calculation has to be made considering a particular grouping and sort order.

For example, we wanted to define an ABC class of Products for internet sales transactions of AdventureWorks. We want to classify our Products in this way:

  • Products that make 70% of the sales are in Class A
  • Products that make 20% of the sales are in Class B
  • Products that make 10% of the sales are in Class C

Example with a single denormalized table

In this first example we use a completely denormalized table. A different approach would be necessary using a normalized product table.

WARNING: Please note that calculating ABC class on a denormalized table could have bad performances because it requires heavy calculation during data load. The normalized approach is far better in terms of performance, because the calculation only depends on the size of the related table (usually the dimension) and not on the size of the table containing the sale transactions.

This is the query we use to get data from AdventureWorks DW.

SELECT
    c.FirstName + ' ' + c.LastName AS Customer,
    c.EmailAddress,
    p.EnglishProductName,
    p.ModelName,
    s.OrderQuantity,
    s.SalesAmount
FROM dbo.FactInternetSales s
INNER JOIN dbo.DimProduct p
        ON p.ProductKey = s.ProductKey
INNER JOIN dbo.DimCustomer c
        ON s.CustomerKey = c.CustomerKey

There are several steps to obtain the desired ABC class, each of them is a new calculated column in the PowerPivot dataset. Before starting, consider that we will use the ProductName and the SalesAmount columns in our formulas.

  • SalesAmountProduct – this is the sales amount for each Product; the same value is duplicated for each row of the same product. The EARLIER function get the product name of the current row and the FILTER returns all the rows for the sales of the same product. This complexity is not required in the normalized version of the ABC calculation.
    = SUMX( FILTER( 'Sales',
                    'Sales'[ProductName] = EARLIER('Sales'[ProductName]) ),
            'Sales'[SalesAmount] )
  • CumulatedProduct – this value is the cumulated value of a product, considering them ordered from the top-seller down to the worst one. In this case, the EARLIER function is used to get the sales amount for the current product and the FILTER returns all the rows of the products that sold at least the amount value of the current product.
    = SUMX( FILTER( 'Sales',
                    'Sales'[SalesAmountProduct] >=
                               EARLIER('Sales'[SalesAmountProduct] ) ),
            'Sales'[SalesAmount] )
  • SortedWeightProduct – This calculation simply transforms the CumulatedProduct calculation into a percentage. This number will be used to filter all the transactions according to the percentage limit corresponding to ABC classes (usually A = 70%, B = 20% and C = 10%).
    ='Sales'[CumulatedProduct] / SUM( 'Sales'[SalesAmount] )
  • ABC Class Product – this is the final result of our calculation; depending of the value of SortedWeightProduct an A, B or C is displayed
    =IF( 'Sales'[SortedWeightProduct] < 0.7,
         "A",
         IF( 'Sales'[SortedWeightProduct] < 0.9,
             "B",
             "C" ) )

In this way, we obtain an ABC class for each product sale.

Pareto-01

At this point, we can browse the data with a PivotTable in Excel. Just to make an example, we can look at the relevance of ABC products for each customer (in this case customers are sorted by total of sales in descending order).

Pareto-02

Example with normalized tables

In this second example we use three normalized tables, importing from AdventureWorks DW the following tables:

  • DimCustomer
  • DimProduct
  • FactInternetSales

In this case, we defines these calculated columns in the DimProduct table. In this case will use the ProductKey to identify the product and SalesAmount as the measure to use for ABC classification.

  • SalesAmountProduct – this is the sales amount for each Product; in this case, each calculation is different for each row, because each row is a single product. We make use of the existing relationship between FactInternetSales and DimProduct (remember that we are defining a calculated column in the DimProduct table).
    = SUMX( RELATEDTABLE('FactInternetSales'),
            'FactInternetSales'[SalesAmount] )
  • CumulatedProduct – this value is the cumulated value of a product, considering them ordered from the top-seller down to the worst one. In this case, the EARLIER function is used to get the sales amount for the current product and the FILTER returns all the rows of the products that sold at least the amount value of the current product.
    = SUMX( FILTER( 'DimProduct',
                    'DimProduct'[SalesAmountProduct] >= 
                             EARLIER( 'DimProduct'[SalesAmountProduct] ) ),
            'DimProduct'[SalesAmountProduct] )
  • SortedWeightProduct – This calculation simply transforms the CumulatedProduct calculation into a percentage. This number will be used to filter all the transactions according to the percentage limit corresponding to ABC classes (usually A = 70%, B = 20% and C = 10%).
    = 'DimProduct'[CumulatedProduct] / SUM( 'DimProduct'[SalesAmountProduct] )
  • ABC Class Product – this is the final result of our calculation; depending of the value of SortedWeightProduct an A, B or C is displayed
    = IF( 'DimProduct'[SortedWeightProduct] < 0.7,
          "A",
          IF( 'DimProduct'[SortedWeightProduct] < 0.9,
              "B",
              "C" ) )

As we said, the calculated columns are all defined in the DimProduct table.

Pareto-03

Finally, we can do the same ABC analysis as in the case of the denormalized table used as a source of data. In this case we see the distribution of ABC class of products among models. As we might expect, there is a strong relationship, which indicates we might classify Models instead of products, producing a very similar results, but with a few notable exceptions (like the Road-650 model at row 11, which indicates there are smaller sales divided between very similar products). From a business point of view, this could be an important decision to be made (classifying ABC for models instead than for products), especially if products differs only for characteristics that doesn’t affect the production lines.

Pareto-04

Working with denormalized attributes on normalized tables

At this point, it could be interesting to look at how we can write the ABC classification based on the models of the product using the previous example with normalized tables. In this case, we will use a mix of the techniques we have seen in the two scenarios (denormalized and normalized tables).

The set of tables is the same as the previous example and we will continue to add columns to the DimProduct table. In this case, the ModelName column will be our key to identify the granularity of ABC classification.

  • SalesAmountModel – this is the sales amount for each product Model; in this case, calculation is the same for all the products of the same model. The EARLIER function get the model name of the current row (a single product) and the FILTER returns all the products of the same model.
    =SUMX( FILTER( 'DimProduct',
                   'DimProduct'[ModelName] =
                            EARLIER( 'DimProduct'[ModelName] ) ),
           'DimProduct'[SalesAmountProduct] )
  • CumulatedProduct – this value is the cumulated value of a model, considering them ordered from the top-seller down to the worst one. We have to use the CALCULATE function to sum a calculated column filtering only the rows (products) of a certain model. We cannot use the SUMX on the FILTER result, because it wouldn’t contain the SalesAmountProduct column we need to sum.
    = SUMX( FILTER( 'DimProduct' ,
                    'DimProduct'[SalesAmountModel] >=
                            EARLIER( 'DimProduct'[SalesAmountModel] )),
            'DimProduct'[SalesAmountProduct] )
  • SortedWeightProduct – This calculation simply transforms the CumulatedModel calculation into a percentage. This number will be used to filter all the transactions according to the percentage limit corresponding to ABC classes (usually A = 70%, B = 20% and C = 10%). Please look at the denominatore, which contains an aggregatable column (SalesAmountProduct) instead of the sum of the CumulatedModel column, which is not aggregatable!
    = 'DimProduct'[CumulatedModel] / SUM( 'DimProduct'[SalesAmountProduct] )
  • ABC Class Product – this is the final result of our calculation; depending of the value of SortedWeightProduct an A, B or C is displayed
    = IF( 'DimProduct'[SortedWeightModel] < 0.7,
          "A",
          IF( 'DimProduct'[SortedWeightModel] < 0.9,
              "B",
              "C" ) )

This is the new DimProduct with both the ABC Class Product and ABC Class Model (even if it is not visible in this screenshot, there are several cases where these two calculated columns don’t have the same value).

Pareto-05

Using the ABC Class Model, we can see that each model only belongs to one class.

Pareto-06

Common calculations

Coming back to DAX, we can see that the three cases are very similar and, just to simplify cut & paste of these calculations, we can customize just the first column, leaving standard DAX formula (without the name of the table) in the following ones. Usually, we will need to create just one ABC class in a PowerPivot workbook. For this reason, having these formulas ready to use could be useful. I use the example using SalesAmountProduct as a starting measure (we can simply search & replace this name with our own measure) and ‘Table’ to indicate the table in which we create these calculated columns with the data in PowerPivot.

  • SalesAmountProduct – this is the only calculation that differs for each implementation. Look at previous examples to check what DAX formula to use.
  • CumulatedProduct – in this case the denominator (highlighted in bold) shall correspond to the value of SalesAmount calculated at the granularity of Table. Thus, the following formula is good only for Product if the denormalized table Is the Product one, otherwise check the bold denominator because it must be adapted to conform to Table granularity (see the previous sections to look at some examples).
    = SUMX( FILTER( ‘Table’,
                    [SalesAmountProduct] >=
                          EARLIER( [SalesAmountProduct] ) ),
           [SalesAmountProduct] )
  • SortedWeightProduct
    = [CumulatedProduct] / SUM( [SalesAmountProduct] )
  • ABC Class Product
    = IF( [SortedWeightProduct] < 0.7,
          "A",
          IF( [SortedWeightProduct] < 0.9,
              "B",
              "C" ) )
Published Tuesday, January 19, 2010 3:24 AM by Marco Russo (SQLBI)
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

SQLBI - Marco Russo said:

I recently talked about memory usage in PowerPivot and previously I wrote about difference modeling options

February 4, 2010 5:06 AM
 

John Bradley said:

Hey Marco,

I met you at the Business Intelligence conference in New Orleans this last year and really enjoyed your presentations.  This is some awesome information for PowerPivot (which I totally enjoy, by the way).  The one question that I have though is whether we can do ABC analysis in PowerPivot but still make it context sensitive to slicers in the actual pivot table.  Looks like we have the ABC analysis only in terms of the entire data set and not any subset of the data that our slicers would automatically cut.  I am using your formulas for a pull of data that is pulled by each sale and not aggregated except on a few very specific instances, so I used your denormalized suggestions.  Any suggestions to make the data dynamic on the pivot table side?

Very Kind Regards,

John Bradley

johngabbradley@yahoo.com

April 5, 2011 11:16 AM
 

Marco Russo (SQLBI) said:

Hi John,

the bad news is this: we need to create an attribute that can be used as a slicer (rows/columns/filters) and a measure can only be used in the Values area of the PivotTable.

For this reason we require a calculated column, which cannot be dynamic.

Possible workaround: it wouldn't be as interactive as a PivotTable slicer, but what you can do is something like that:

- create a first worksheet with just parameter you want to use to run the ABC calculation; you might use a PivotTable as well, but at the end copy the resulting "filter" parameters into a set of cell that can be imported as a Linked Table in PowerPivot.

- refresh the Linked Table - this invoke the refresh of the related calculated columns

- create a second PivotTable in another worksheet of the same workbook in order to display the resulting ABC attribute. You can refresh this PivotTable after you updated the Linked Table

I know this is tricky, but it's what we have today.

Does it makes sense to you?

Marco

April 5, 2011 12:48 PM
 

John Bradley said:

Marco,

I am not following as well as I should be I think.  Can you be more explicit with the first example (Example with a single denormalized table) you gave in how you would do the whole process.  It would be very helpful.  I don't want to take too much of your time, but I think if you could be more detailed in how you would go through the process with examples with your data that might be very helpful.  

For the linked sheet that I am creating I not quite clear on what data that I would be relinking.  The sales data, the product data (in your example), a combination of both, or am I taking the calculated column that you name SalesAmountProduct.

Also would this work with a data set that gets updated each day on a sharepoint site?

Any additional help would be very helpful.  I hope that you can help me understand what how you would accomplish this.

Thanks very kindly!

John

April 5, 2011 2:00 PM
 

Marco Russo (SQLBI) said:

I don't have time in these days to prepare an example - I will work on that later if I can. It seems interesting.

However, it would not work on SharePoint because it would require the double refresh and I'm not sure how you might be able to implement it.

I put a task for that in my agenda.

Marco

April 5, 2011 2:06 PM
 

John Bradley said:

Thanks very much for you time.  I have learned a lot from your blog and wish you the best.

April 5, 2011 2:18 PM
 

MIke A said:

Hi Marco,

Is this ABC analysis available in a measure? I would like to do this for multiple categories.

Thanks,

Mike

June 22, 2012 4:30 PM
 

Marco Russo (SQLBI) said:

Mike,

yes you can implement the ABC Analysis in a measure but in this case you cannot use the result as a slicer or as a filter, because measures can be displayed only in the values area of a PivotTable.

To obtain the calculation in a measure, you have to expand all the calculated measures in a single DAX expression, using VALUES( Product[Name] ) instead of Product[Name] to get the value from a row context, wrapping everything with a check on the number of selected values (HASONEVALUE in PowerPivot 2012 or COUNTROWS(VALUES(...)) in PowerPivot 2010).

June 23, 2012 6:48 AM
 

Andre said:

Marco,

I'm trying to get this to work by creating a virtual table. unfortunately the earlier function in the sumx only sums the row and does not filter the table in the first place. see the code below. i can provide a sample data set if necessary.

evaluate

ADDCOLUMNS(

CALCULATETABLE(

              ADDCOLUMNS( SUMMARIZE(ClassMap,ClassMap[Product]), "Net Premium",

              Premiums[Premium] - Premiums[Rein Premium] + Claims[Incurred Claims] -

Claims[Rein Incurred]

              - Commissions[Commission] + Commissions[Rein Commission]

              )

),

              "CumulatedProduct", SUMX(FILTER(CALCULATETABLE(

              ADDCOLUMNS( SUMMARIZE(ClassMap,ClassMap[Product]), "Net Premium",

              Premiums[Premium] - Premiums[Rein Premium] + Claims[Incurred Claims] -

Claims[Rein Incurred]

              - Commissions[Commission] + Commissions[Rein Commission]

              )

), [Net Premium] >= EARLIER( [Net Premium])), [Net Premium]))

June 26, 2013 5:11 PM
 

Marco Russo (SQLBI) said:

Andre,

you cannot reference the "Net Premium" calculated column by using EARLIER, the problem is that you cannot iterate the table you just created...

Try this approach instead:

DEFINE MEASURE Premiums[Calc Net Premium] = Premiums[Premium] - Premiums[Rein Premium] + Claims[Incurred Claims] - Claims[Rein Incurred] - Commissions[Commission] + Commissions[Rein Commission]

EVALUATE

ADDCOLUMNS (

   ADDCOLUMNS (

       SUMMARIZE ( ClassMap, ClassMap[Product] ),

       "Net Premium", Premiums[Net Premium]

   ),

   "CumulatedProduct",

       SUMX(

           SUMMARIZE ( ClassMap, ClassMap[Product] ),

           IF (

               Premiums[Calc Net Premium] > [Net Premium],

               Premiums[Calc Net Premium],

               BLANK()

           )

       )

)

July 7, 2013 4:46 PM
 

ABC Classification With SQL Server Window Function | Ms SQL Girl said:

April 14, 2014 7:04 PM

Leave a Comment

(required) 
(required) 
Submit

About Marco Russo (SQLBI)

Marco Russo is a consultant, writer and trainer specialized in Business Intelligence with Microsoft technologies. He runs the SQLBI.COM website, which is dedicated to distribute resources useful for BI developers, like Integration Services components, Analysis Services models, tools, technical information and so on. Marco is certified as MCT, MCDBA, MCSD.NET, MCSA, MCSE+I.

This Blog

Syndication

Archives

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement