THE SQL Server Blog Spot on the Web


SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus

  • Create Excel Dashboards working on Excel for iPad #excel #ipad #dashboard

    I recently tried Excel for iPad and opened several workbooks with it. The results are pretty good, but I found that certain workbooks could not be displayed. For example, opening a workbook that contains many CUBEVALUE formulas, I should have seen this result:

    Dashboard-Ok

    However, sometimes I got an error saying that the workbook cannot be updated (the screenshot is in Italian because the test was made on an iPad configured for that language):

    Dashboard-fail

    What happened? Thanks to Dan Parish (Microsoft), I realized that an automatic refresh was triggered when opening the workbook, and the refresh stumbled on the CUBEVALUE function I used extensively. If this happens, you have two possible workarounds:

    1. Disable automatic refresh
    2. Replace CUBEVALUE with GETPIVOTDATA

    Conditional formatting goes away because it doesn’t work on error cells, and neither do charts. If the automatic recalc starts when opening the workbook, every CUBE* function fails because it can’t fetch any data (CUBE* functions are not supported in Excel for iPad, which lacks support for external data in this version). At this point I wondered why the automatic recalc was happening, because I didn’t enable it in the Excel workbook. Dan’s explanation is: “The reason it wasn’t happening for me is because Excel (all the Excel’s across all platforms) also have very complicated logic to determine when they need to recalculate.  If you saved this from a previous version of Excel (earlier than 2013), or one of several other things occurred, we’ll recalculate it on open.  In some cases where we feel it’s safe not to however, we don’t as a performance improvement.  That’s what I was seeing.”

    I am not sure that the different locale settings were the reason why the automatic refresh was happening. However, when I set the calculation mode to Manual (Formulas ribbon –> Calculation Options –> Manual) the automatic refresh problem went away, and I was able to open the workbook. If you don’t want to rely on disabling automatic refresh (after all, changing the setting to Manual affects usability when you open the workbook from Excel on Windows), another approach is relying on GETPIVOTDATA functions instead of CUBE*. The conversion can be time consuming, but in reality it’s not so different from using CUBE* functions if you design the dashboard this way from the start. This will be a topic for a future, longer article. There is also an interesting advantage in using GETPIVOTDATA if you use Power Pivot or Tabular: performance is better than with CUBE* functions (the opposite is true for OLAP cubes).

  • Create Custom Time Intelligence Calculations in #dax #powerpivot #tabular

    The recent Time Patterns article published on www.daxpatterns.com contains many DAX formulas that I hope will be useful to anyone interested in implementing time-related calculations in DAX without relying on the Time Intelligence functions. There are several reasons for doing that:

    • Custom Calendar: if you have special requirements for your calendar (such as week-based and ISO 8601 calendars), you cannot use standard DAX time intelligence functions.
    • DirectQuery: if you enable DirectQuery, time intelligence functions are not supported.

    I chose to use a standard month calendar for the complete pattern, because it’s a more complete example of the calculations required. In fact, the ISO calendar has simpler requirements for comparisons over different periods, and I already covered that case in the Week-Based Time Intelligence in DAX article published on SQLBI more than one year ago.

    As usual, feedback is welcome!

  • Calculate Distinct Count in a Group By operation in Power Query #powerquery #powerbi

    The current version of Power Query does not have a user interface to create a Distinct Count calculation in a Group By operation. However, you can do this in “M” with a simple edit of the code generated by the Power Query window.

    Consider the following table in Excel:

    DistinctPowerQuery_01

    You want to obtain a table containing the number of distinct products bought by every customer. You create a query starting from a table:

    DistinctPowerQuery_02

    You keep in the query only the columns required for the group by and the distinct count calculation, removing the others. For example, select the Product and Customer columns, right-click, and choose Remove Other Columns.

    DistinctPowerQuery_03

    Select the Customer column and click the Group By transformation. You see a dialog box that by default creates a count rows column.

    DistinctPowerQuery_04

    This query counts how many transactions have been made by each customer, and you don’t have a way to apply a distinct count calculation. At this point, simply change the query from this:

    let
        Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
        RemovedOtherColumns = Table.SelectColumns(Source,{"Product", "Customer"}),
        GroupedRows = Table.Group(RemovedOtherColumns, {"Customer"}, {{"Count", each Table.RowCount(_), type number}})
    in
        GroupedRows

    To this:

    let
        Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
        RemovedOtherColumns = Table.SelectColumns(Source,{"Product", "Customer"}),
        GroupedRows = Table.Group(RemovedOtherColumns, {"Customer"}, {{"Count", each Table.RowCount(Table.Distinct(_)), type number}})
    in
        GroupedRows

    The Table.RowCount function counts how many rows exist in the group. By calling Table.Distinct on the group first, you reduce the rows of the group to its distinct rows (here, one row per product, because Customer is constant within the group), so the row count returns the correct distinct count value.
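    If you want to see the difference between the two counts outside Power Query, here is a minimal Python sketch of the same idea. The sample rows and the function names are hypothetical and only stand in for the Sales table of the example; this is not how Power Query evaluates the query.

```python
from collections import defaultdict

# Hypothetical (Product, Customer) rows standing in for the Sales table.
sales = [
    ("Bike", "Anna"),
    ("Bike", "Anna"),
    ("Helmet", "Anna"),
    ("Bike", "Luca"),
    ("Bike", "Luca"),
]


def count_rows(rows):
    """Rows per customer: the default 'Count Rows' of Group By."""
    counts = defaultdict(int)
    for _product, customer in rows:
        counts[customer] += 1
    return dict(counts)


def count_distinct_products(rows):
    """Distinct products per customer: duplicates collapse inside each group."""
    products = defaultdict(set)
    for product, customer in rows:
        products[customer].add(product)
    return {customer: len(items) for customer, items in products.items()}


print(count_rows(sales))
print(count_distinct_products(sales))
```

    Anna has three transactions but only two distinct products, which is the difference between the two versions of the query.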

    DistinctPowerQuery_05

    I hope the Power Query team will implement a distinct count option in the user interface. In the meantime, you can apply this easy workaround.

  • Optimize DISTINCTCOUNT in #dax with SQL Server 2012 SP1 CU 9 #ssas #tabular

    If you use DISTINCTCOUNT measures in DAX, you know performance is usually great, but you might have also observed that performance slows down when the resulting number is high (depending on other conditions, it starts degrading when the result is between 1 and 2 million).

    If you have seen that, there is good news. Microsoft fixed this issue (KB2927844) in SQL Server 2012 SP1 Cumulative Update 9. The performance improvement is amazing. With this fix, queries that previously ran in 15 seconds (cold cache) now run in less than 5 seconds. So if you have databases in Tabular with a column containing more than 1 million distinct values, you should probably test this update. It’s also available for Power Pivot for Excel 2010, but not for Excel 2013 (as far as I know – Power Pivot for Excel 2013 updates are included in Excel updates). You can request the SP1 CU9 here: http://support.microsoft.com/kb/2931078.

    Please note that the build of Analysis Services that fixes this issue is 11.0.3412 (so any later build should not require this hotfix – a useful note for readers coming here in the future, when newer builds are available).

  • Common Statistical #DAX Patterns for #powerpivot and #tabular

    DAX includes several statistical functions, such as average, variance, and standard deviation. Other common algorithms require some DAX code and we published an article about common Statistical Patterns on www.daxpatterns.com, including:

    I think that the Median and Percentile implementations are the most interesting patterns, because performance might be very different depending on the implementation. I am sure that a native implementation in DAX of such algorithms would be much better, but in the meantime you can just copy and paste the formulas presented in the article!

  • How to implement classification in #DAX #powerpivot #ssas #tabular

    In the last two weeks we published two new patterns on www.daxpatterns.com: Static Segmentation and Dynamic Segmentation.

    These two patterns offer solutions to the general problem of classifying an item by the value of a measure or of a column in your Power Pivot or Tabular data model. For example, you might create groups of products based on the price or on the volume of sales. The difference between the two techniques is that the static segmentation applies the classification using calculated columns (so it is calculated in advance, at refresh time, and is not subject to changes made to filter selection in queries), whereas the dynamic segmentation performs the classification at query time (so it is slower but it considers filters applied to queries).

    In my experience, many people want to use the dynamic approach, but in reality they often realize later that the static segmentation was the right choice, not just for performance but mainly for ease of use.
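    To make the difference between the two approaches more concrete, here is a rough sketch in Python rather than DAX. The bands, the product prices and the sales rows are all hypothetical, and the code only mimics the idea: a classification stored once versus a classification recomputed for every filter.

```python
# Hypothetical bands: (name, lower bound inclusive, upper bound exclusive).
BANDS = [("Low", 0, 100), ("Medium", 100, 500), ("High", 500, float("inf"))]


def classify(value):
    """Return the name of the band that contains the value."""
    for name, low, high in BANDS:
        if low <= value < high:
            return name
    raise ValueError(f"no band defined for {value}")


# Hypothetical data: list price per product and (product, year, amount) sales.
products = {"Bike": 450.0, "Helmet": 40.0, "Frame": 900.0}
sales = [("Bike", 2013, 300.0), ("Bike", 2014, 300.0), ("Helmet", 2014, 60.0)]

# Static segmentation: computed once from a column (like a calculated column
# evaluated at refresh time), so later filters cannot change it.
static_segment = {name: classify(price) for name, price in products.items()}


def dynamic_segments(rows, year=None):
    """Dynamic segmentation: classify the sales total under the current filter."""
    totals = {}
    for product, row_year, amount in rows:
        if year is None or row_year == year:
            totals[product] = totals.get(product, 0.0) + amount
    return {product: classify(total) for product, total in totals.items()}


print(static_segment)
print(dynamic_segments(sales))
print(dynamic_segments(sales, year=2014))
```

    In this sketch the Bike moves from High to Medium as soon as you filter a single year, while its static segment never changes. That stability is a big part of why the static version is easier to use.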

  • Amsterdam PASS UG Meeting on March 18 #dax #tabular #powerpivot

    I will be in Amsterdam for the Advanced DAX Workshop on March 17-19, 2014 (hint: there are still a few seats available if you want to do a last-minute registration), and the evening of March 18 I will speak at a PASS Nederland UG meeting (between 18:30 and 21:00) that you can attend for free by registering here.

    Here are the two topics I will present at the UG meeting, in two relatively short 45-minute sessions:

    DAX Patterns

    Do you know that great feeling when you are struggling to find a formula, spend hours writing nonsense calculations until a light turns on in your brain, your fingers move rapidly on the keyboard and, after a quick debug, DAX starts to compute exactly what you wanted? This session shows some of these scenarios, spending some time looking at the pattern of each one, discussing the formula and its challenges and, at the end, writing the formula. Scenarios include custom calendars, budget patterns and related distinct count. A medium knowledge of the DAX language will let you get the best out of the session.

    DAX from the Field: Real-World Case Studies

    In this session, we will dive into lessons from the field, where real customers are using DAX to solve complex problems well beyond the Adventure Works scenarios. How do you make a database fit in memory if it doesn’t fit? How do you handle billions of rows with a complex calculation? What tools can you use to benchmark and choose the right hardware? How do you scale up performance on both small and large databases? What are the common mistakes in DAX formulas that might cause performance bottlenecks? These are just a few questions we will answer by looking at best practices that are working for real customers. In this session, you will learn efficient DAX solutions and how far you can push the limits of the system.

  • LASTDATE vs. MAX? CALCULATETABLE vs. FILTER? It depends! #dax #powerpivot #tabular

    A few days ago I published the article FILTER vs CALCULATETABLE: optimization using cardinality estimation, where I try to explain why the sentence “CALCULATETABLE is better than FILTER” is not always true. In reality, CALCULATETABLE internally might use FILTER for every logical expression you use as a filter argument. What really matters is the cardinality of the table iterated by the FILTER, regardless of the fact it’s an explicit statement or an implicit one generated automatically by CALCULATETABLE.

    In addition to the article, there is a digression related to the use of time intelligence functions, which return a table and not a scalar value. These functions (such as DATESBETWEEN and LASTDATE) might seem better than FILTER, but this is not necessarily true.

    For example, consider this statement:

    CALCULATE (
        SUM ( Movements[Quantity] ),
        FILTER (
            ALL ( 'Date'[Date] ),
            'Date'[Date] <= MAX ( 'Date'[Date] )
        )
    )

    Can we avoid the FILTER statement by using DATESBETWEEN? Yes, we can replace the filter with the following expression:

    CALCULATE (
        SUM ( Movements[Quantity] ),
        DATESBETWEEN (
            'Date'[Date],
            BLANK(),
            MAX ( 'Date'[Date] )
        )
    )

    Is this faster? No. DATESBETWEEN is executed by the formula engine, so it’s not better than FILTER. But there is more. You might wonder why I’m using MAX instead of LASTDATE. Well, in the FILTER example there was a semantic reason: I would have obtained a different result. LASTDATE returns a table, not a scalar value, even if it is a table containing only one row, which can be converted into a scalar value. More importantly, LASTDATE performs a context transition, which would transform the row context produced by the FILTER iteration into a filter context, hiding the existing filter context that I wanted to consider in my original expression. In DATESBETWEEN I don’t have this issue, so I can write it using LASTDATE and obtain the same result:

    CALCULATE (
        SUM ( Movements[Quantity] ),
        DATESBETWEEN (
            'Date'[Date],
            BLANK(),
            LASTDATE ( 'Date'[Date] )
        )
    )

    But this does not come for free. The LASTDATE function produces a more expensive execution plan in this case. Consider using LASTDATE only as a filter argument of CALCULATE/CALCULATETABLE, such as:

    CALCULATE (
        SUM ( Movements[Quantity] ),
        LASTDATE ( 'Date'[Date] )
    )

    At the end of the day, a filter argument in a CALCULATE function has to be a table (of values in one column or of rows in a table), so using a table expression in a filter argument is fine, because in this case a table is expected and there are no context transitions. But think twice before using LASTDATE where a scalar value is expected: using MAX is a smarter choice.

  • Expert Cube Development new edition now available! #ssas #multidimensional

    The new edition of the advanced OLAP book is now available, with the new title “Expert Cube Development with SSAS Multidimensional Models”. The previous edition was titled “Expert Cube Development with Microsoft SQL Server 2008 Analysis Services” and the biggest issue of the book was… the title! In fact, there haven’t been major changes in Multidimensional since that release, even though there have been three new releases of SQL Server (2008 R2, 2012 and 2014). For this reason we removed the product version from the title. In terms of content, don’t expect any particular change. We only added a small appendix about support for DAX queries, available with Analysis Services 2012 SP1 CU4 and later versions.

    I would like to highlight that, as Chris said:

    • If you already have the first edition, you are probably not interested, because you will not find new content here (just bug fixes and new screenshots).
    • The book is about SSAS Multidimensional models. If you are interested in Tabular, we have another excellent book on that topic!
    • This is an advanced book. If you are a beginner with Multidimensional, wait until you are more proficient before starting this book. The few negative reviews we received were from readers who tried to use this book to learn Multidimensional or as a step-by-step guide. We’d like to set the right expectations, so that you don’t buy a book you don’t need.

    If you want to buy it, here are a few useful links:

    Enjoy the read!

  • Implement Parameters using Slicers in #powerpivot #dax #tabular

    Apparently you cannot pass an argument to a DAX measure. In reality, you can interact with a slicer whose only purpose is to represent a parameter used by a DAX formula. You just create a table in the Power Pivot or Tabular data model, without creating any relationship with other tables in the same data model. This technique is similar to the tool dimension you can implement in Multidimensional, but is simpler and somewhat more flexible. We described the DAX technique in the Parameter Table pattern, published on www.daxpatterns.com.

    In the article we provided several examples, including how to implement cascading parameters using two slicers and presenting only valid combinations of parameters. If you miss the DateTool Dimension in Tabular, you will also see how to implement a Period table in DAX using the Parameter Table pattern, which is a good technique for injecting arguments and selecting algorithms in simulation models.

  • Connecting to #powerpivot from an external program (such as #Tableau)

    Many people have asked me how to connect to Power Pivot from an external program, without publishing the workbook on SharePoint or on Analysis Services Tabular. I always said it is not possible (for both technical and licensing reasons), but someone observed that Tableau is able to extract data from a Power Pivot data model by connecting directly to the xlsx file. I wanted to investigate how they solved the existing limitations.

    From a technical point of view, you have to install a Tableau Add-In for Power Pivot Excel (it’s available for both 32 and 64 bit). Then, you connect using the Tableau Desktop software, selecting the Microsoft Power Pivot connection. You choose a local Excel file and click Connect. The list of perspectives appears in a combo box.

    connect

    You click OK and you navigate into the Power Pivot data model. But what’s happening? Tableau runs an Excel instance (probably by using the Excel object model) and then connects through the Tableau Add-In for Power Pivot that you installed before. Probably this add-in acts as a bridge between the process running Excel and the process running Tableau. This solves the technical problem, and it would be interesting to know how to use the same add-in from other programs without having to write the same add-in again. I know many ISVs that would love to do that!

    But before starting your project in Visual Studio to do the same (after all, it shouldn’t be rocket science writing such a connector), consider the license agreement (EULA) of Office. It says that “Except for the permitted use described under "Remote Access" below, this license is for direct use of the software only through the input mechanisms of the licensed computer, such as a keyboard, mouse, or touchscreen. It does not give permission for installation of the software on a server or for use by or through other computers or devices connected to the server over an internal or external network.”. It seems we are in a gray area here. The access to Excel is not direct. But at the same time, it is not made on another computer, and technically you are using keyboard, mouse and/or touchscreen when you are using Tableau Desktop.

    This is certainly an unsupported scenario (and if the background Excel process hangs for any reason, you have to kill it in Task Manager). But if the licensing allows that, or if Microsoft tolerates this, probably many companies writing software (I have a long list of requests I received…) could be interested in doing the same.

    I would love to hear some official word on this topic…

  • How to pass a #DAX query to DAX Formatter

    In its first two months, DAX Formatter served 3,500 requests and I see the daily trend slowly rising. If you have carefully observed the first articles published on DAX Patterns, you might have seen that you can click the link “Code beautified with DAX Formatter”.

    image

    When you click that link, you open the DAX Formatter page copying the query/formula shown in the box. The good news is that you can implement the same behavior in your articles/blogs/posts by using a GET or POST call.

    The easiest way is passing the query into the URL of a GET command:

    http://www.daxformatter.com/?fx=FORMULA&r=REGION

    The fx argument is the DAX code you want to format. The r argument is optional and can be US (the default), UK or Other. With Other, the semicolon ( ; ) is the list separator and the comma ( , ) is the decimal point, instead of the , and . settings used for US and UK. Here are two examples of the same query formatted with the two settings.

    http://www.daxformatter.com/?fx=EVALUATE%20calculatetable(Customers,Customer[Occupation]="Student")&r=US

    http://www.daxformatter.com/?fx=EVALUATE%20calculatetable(Customers;Customer[Occupation]="Student")&r=EU
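    If you generate these links from code, the query has to be URL-encoded. The following Python sketch shows one possible way to do it with the standard library. The function name is my own, and I am assuming the site accepts standard percent-encoding of the whole query (the examples above only encode the space); the region value is passed through as-is.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://www.daxformatter.com/"  # address used in the examples above


def dax_formatter_url(dax, region="US"):
    """Build a GET link carrying the DAX code (fx) and the region (r)."""
    # quote_via=quote encodes spaces as %20, as in the examples, instead of '+'.
    query = urlencode({"fx": dax, "r": region}, quote_via=quote)
    return BASE_URL + "?" + query


url = dax_formatter_url(
    'EVALUATE calculatetable(Customers,Customer[Occupation]="Student")'
)
print(url)
```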

    The maximum length of a URL varies depending on the browser. We can consider 2000 characters as a practical limit. You can overcome this limitation by using a POST command. Here is an example of a simple HTML form that passes the content of a textarea as the query to format:

    <form action="http://www.daxformatter.com" method="post">
            <input type="hidden" name="r" value="US" />
            <textarea name="fx">EVALUATE calculatetable(Customers,Customer[Occupation]="Student")</textarea>
            <input type="submit" />
    </form>
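    The same POST can be prepared from a program. This is only a sketch under a few assumptions: the field names fx and r come from the form above, the 2000-character threshold is the practical limit mentioned earlier, and the function name is hypothetical. The code builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://www.daxformatter.com"  # form action shown above
PRACTICAL_URL_LIMIT = 2000  # practical URL length limit, not a hard rule


def build_request(dax, region="US"):
    """Return a GET request for short queries and a POST request otherwise."""
    fields = urlencode({"fx": dax, "r": region})
    get_url = ENDPOINT + "/?" + fields
    if len(get_url) <= PRACTICAL_URL_LIMIT:
        return Request(get_url, method="GET")
    # Same fields as the HTML form, sent in the request body.
    return Request(
        ENDPOINT,
        data=fields.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


short_request = build_request("EVALUATE Customers")
long_request = build_request("EVALUATE " + "x" * 3000)
print(short_request.get_method(), long_request.get_method())
```

    To actually submit the request you would pass it to urllib.request.urlopen, which I left out so that the sketch does not depend on the site being reachable.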

    I have also received a lot of feedback about possible improvements of DAX Formatter – we’ll work on it, you just have to wait… but thanks for the support and appreciation!

    UPDATE Feb 27, 2014 

    You can now use the URL syntax with the additional arguments:

    embed=1 : request only the HTML formatted code

    font=n : optional - set the font size

    For example:  

    http://www.daxformatter.com/?embed=1&fx=EVALUATE%20calculatetable(Customers,Customer[Occupation]="Student")&r=US 

    http://www.daxformatter.com/?embed=1&font=22&fx=EVALUATE%20calculatetable(Customers,Customer[Occupation]="Student")&r=US

  • Distinct Count calculation on dimension attribute in #dax #powerpivot #tabular

    Creating Distinct Count calculations in DAX is pretty easy when the target of the operation is a column in the “fact table” of a star schema. When you want to apply the DISTINCTCOUNT function to a dimension attribute (such as Customer Country, Product Model, Employee Department, and so on), you need to apply one of the techniques described in the Related Distinct Count pattern we published on www.daxpatterns.com.

    Technically, this is an implementation of the many-to-many relationship calculation in DAX, but you can safely ignore the complexity behind the many-to-many patterns: just copy the DAX formula corresponding to the common scenarios:

    • Distinct Count on Attribute in Star Schema
    • Distinct Count on Attribute in Snowflake Schema
    • Distinct Count on Slowly Changing Dimension (SCD) Type 2

    We will continue publishing new patterns in the coming weeks – let us know your feedback about the DAX Patterns website!

  • The Cumulative Total #dax pattern

    The first pattern published on www.daxpatterns.com is the Cumulative Total. Another common name for this calculation is Running Total, but we chose this name because the pattern covers all those scenarios in which you want to obtain a value, at a certain date, that corresponds to the result of a number of transactions executed in the past. Usually, this scenario is implemented using snapshot tables in classical Kimball modeling. With a columnar table, you can afford to avoid the snapshot table even with a relatively large dataset.

    You might want to implement the Cumulative Total pattern to reduce the volume of data stored in memory, transforming snapshot tables into dynamic calculations in DAX. The examples shown in the article represent an implementation of the Inventory Valuation at any point in time. Remember, I am not saying snapshot tables can always be avoided in Tabular, but you have to consider the alternative, especially when the size of the snapshot table is an order of magnitude (or more) larger than the original transactions table. Run some benchmarks and find the best solution for your model!
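    The idea behind the pattern is simple enough to sketch in a few lines of Python: instead of storing one snapshot row per date, you keep only the transactions and sum everything up to the date you are interested in. The movements and the function name below are hypothetical, and the real pattern is of course written in DAX.

```python
from datetime import date

# Hypothetical inventory movements: (date, quantity), negative for outgoing stock.
movements = [
    (date(2014, 1, 5), 10),
    (date(2014, 1, 20), -3),
    (date(2014, 2, 2), 5),
]


def quantity_on_hand(rows, as_of):
    """Cumulative total: sum every movement dated on or before as_of."""
    return sum(quantity for day, quantity in rows if day <= as_of)


print(quantity_on_hand(movements, date(2014, 1, 31)))
print(quantity_on_hand(movements, date(2014, 2, 28)))
```

    A snapshot table would store the quantity for every date in advance. Here the value is computed on demand from three rows, which is the trade-off between memory and query time that you should benchmark.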

  • DAX Patterns website official launch! #dax #powerpivot #tabular

    I’m very proud to announce the official launch of the DAX Patterns website!

    http://www.daxpatterns.com

    Alberto Ferrari and I worked on this idea for a really long time. Many business scenarios can be solved in Power Pivot by always using the same models, with just a few minor changes. This is especially true for the DAX formulas required. Sometimes, you adapt the data model to the requirements and to the DAX needs. But many times, you end up copying an old project, or an existing blog post or article, and try to adapt the old formula to the new data model. Sometimes, you might lose time adapting details that were specific to a certain solution and do not fit well in your model. We just tried to put some order, giving names to common templates and creating more generic versions that can be easily adapted to specific requirements.

    Finding the name for this project was the easiest part: DAX Patterns is simple, clear and direct. I don’t know if it is as intuitive for an Excel user as it is for a programmer, but we loved this name from the first moment and we never thought of anything else.

    The hard (and long) part is doing the actual work: defining a format for each pattern, creating a web site, thinking about how to make the patterns easy to find. We based the search on the notion of “use case”. This should simplify web search.

    Every pattern has this format:

    • Basic Pattern Example: a very short and simplified application of the pattern in an easy example. You can understand the idea reading this short part, so you quickly realize whether the pattern might be good for your needs or not.
    • Use Cases: a list of practical examples in which you can apply the pattern. You can also search patterns by use cases in the website.
    • Complete Pattern: a complete and more detailed description of the pattern, with considerations about possible different implementations, performance, maintainability.
    • More Pattern Examples: other implementations of the pattern, sometimes required to show possible applications in a practical way (use cases are just descriptions).

    We already prepared many patterns that we will publish in the coming weeks and months. This is still a work in progress: we don’t have all the patterns ready, but we reached a point where we have enough material to schedule for publishing while we are completing the job. Well, in reality I think we will never “complete” this web site, but we have a list of patterns we want to complete before publishing a book that will contain all of them, so that you will be able to have them also offline. In any case, we will look for new patterns based on feedback we gather from customers, students, readers. If you think there is a scenario you would like to see covered, write us a comment. I already have a list, but prioritization depends also on the feedback we get.

    You will find only a couple of patterns now, but you will see the list growing weekly, and I will mention new patterns on my blog when they are available. If you find a pattern interesting, feel free to mention/link it from your blog, newsletter, forum posts, tweets, and so on. This will help other people find the pattern when they look for a solution to a problem that the pattern can solve. And, of course, let us know if you have any feedback to improve DAX Patterns!
