THE SQL Server Blog Spot on the Web

SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus

  • Tips for adapting Date table to Power View forecasting #powerview #powerbi

    During the keynote of the PASS Business Analytics Conference, Amir Netz presented the new forecasting capabilities in Power View for Office 365. I immediately tried the new feature (which was immediately available, a welcome surprise in a Microsoft announcement for a new release) and I had several issues trying to use existing data models.

    The forecasting has a few requirements that are not compatible with the “best practices” commonly used for a calendar table until this announcement. For example, if you have a Year-Month-Day hierarchy and you want to display a line chart aggregating data at the month level, you use a column containing month and year as a string (e.g. May 2014) sorted by a numeric column (such as 201405). Such a column cannot be used in the x-axis of a line chart for forecasting, because you need a date or numeric column. There are also other requirements and I wrote the article Prepare Data for Power View Forecasting in Power BI on SQLBI, describing how to create columns that can be used with the new forecasting capabilities in Power View for Office 365.
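
    As a rough illustration of the date-or-numeric requirement, the following Power Query sketch builds a small calendar and adds a month-level column typed as a date (the first day of each month) next to the usual text label. It is only an assumption about one way to prepare such a column, not the approach described in the article mentioned above; the column names and the date range are hypothetical.

        let
            // Hypothetical range: every day of 2014
            Dates = List.Dates(#date(2014, 1, 1), 365, #duration(1, 0, 0, 0)),
            Calendar = Table.FromList(Dates, Splitter.SplitByNothing(), {"Date"}),
            Typed = Table.TransformColumnTypes(Calendar, {{"Date", type date}}),
            // Text label such as "May 2014": fine for display, not usable on the forecasting x-axis
            WithLabel = Table.AddColumn(Typed, "Month Label", each Date.ToText([Date], "MMM yyyy"), type text),
            // Date-typed month column: a candidate for the x-axis of the line chart
            WithMonthDate = Table.AddColumn(WithLabel, "Month Date", each Date.StartOfMonth([Date]), type date)
        in
            WithMonthDate

    The text label can still be used in hierarchies and slicers, while the date-typed column is the one to try on the chart axis.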

  • Power Query in Modern Corporate BI–Copenhagen, June 3, 2014–#powerquery

    I will be in Copenhagen to deliver the SSAS Tabular Workshop on June 2-4, 2014 (a few seats are still available, but hurry up!).

    In the same week I will speak at an evening community event, MsBIP møde nr. 21, delivering the Power Query in Modern Corporate BI session that I also presented at TechEd North America 2014 last week. It is not just a session about Power Query: it has a broader scope, covering Corporate BI vs. Self-Service BI, a topic open to many considerations. I think that the two worlds can (and should) collaborate instead of fighting each other, especially when there is an existing investment in Corporate BI. I hope to meet many of you there!

  • Implement Budget Allocation in DAX for Power Pivot and Tabular #powerpivot #tabular #ssas #dax

    Comparing sales and budget, or costs and budget, is a very common operation. However, it is often the case that you have different granularities for different tables containing budget and the data to compare with. There are two ways to do that: you can limit the comparison to the granularity that is common to the two tables, or you can allocate the budget where it’s not defined.

    For example, if you have a budget defined by quarter and category, you might want to allocate it by month and product. In this way, you can do the comparison as if you had a more granular definition of the budget, without actually having to do the manual job of allocating data (usually in an Excel worksheet!).

    If you want to do budget allocation in DAX, you can use the Budget Patterns we published on DAX Patterns. If you come from an MDX/OLAP background, at first you might find it hard to work without attribute hierarchies that help you propagate the budget values to lower hierarchical levels. However, I think that once you get used to DAX, you will find the behavior very predictable and easy to “debug”, even for more complex allocation formulas. You just have to be careful in writing the DAX formula, but the pattern we wrote should help you design the right data model, without creating physical relationships to the budget table!

    This pattern is also based on the Handling Different Granularities scenario I discussed a couple of weeks ago.

  • Meet me at TechEd 2014 – where and when #msteched

    If you are attending TechEd North America in Houston this week, stop me and say hello! I am always happy to meet blog readers, and of course if you have a question or a topic to discuss, try to join me at the BI booth in the expo area. I put together a list of where and when you can find me (thanks to Kasper for the idea):

    • Tuesday, May 13, 10:45am-12:30pm at Microsoft booth in expo area (Data Platform and Business Intelligence: Data Platform)
    • Tuesday, May 13, 2:15pm-4:00pm at Microsoft booth in expo area (Datacenter & Infrastructure Management: Application Solutions)
    • Tuesday, May 13, 6:30pm-8:30pm at Ask the Experts
    • Wednesday, May 14, 3:15pm-4:30pm in room 330 - DBI-B323 Power Query in Modern Corporate BI
    • Thursday, May 15, 8:30am-9:45am in room 330 - DBI-B322 Improving Power Pivot Data Models for Microsoft Power BI
    • Thursday, May 15, 12:30pm-3:15pm at Microsoft booth in expo area (Datacenter & Infrastructure Management: Application Solutions)

    If you are not attending TechEd, remember you will be able to see most of the recordings on Channel 9.

  • SSAS Tabular from the Trenches in London on June 11, 2014 #ssas #tabular

    I will be in London to teach the Advanced DAX Workshop on June 11-13, 2014 (if you are interested, there are still seats available).

    During my stay, I will also deliver an evening community session titled SSAS Tabular from the Trenches on June 11, 2014 at 6:30pm. The London Business Analytics Group organized this free event, and you can find a more complete description of the content on a dedicated page, where you can also register (for free!). In a few words, I will share several experiences of SSAS Tabular adoption in different scenarios, trying to help those who are still struggling with the decision to adopt Tabular (or not). Questions and open discussions are always welcome at these events!

  • First steps with Scheduled Data Refresh for Power Query #powerbi #powerquery

    Just a few days before my session about Power Query at TechEd 2014, Microsoft released a new update that enables the scheduled data refresh of a Power Pivot workbook containing Power Query transformations.

    This is very good news, because it enables the data refresh of a number of different data sources. Even if the number of providers supported by this release is limited (only SQL Server and Oracle), you can use a SQL Server database as a bridge to access different data sources through views using Linked Server connections.

    If you want to use this feature, first of all read carefully the Scheduled Data Refresh for Power Query blog post on the MSDN web site. It guides you through all the steps required to enable the data source connection through the Data Management Gateway. As you will see, what you actually need is to create the data source connections corresponding to the databases you use in Power Query. Thus, you might skip the data source configuration if you already have the corresponding databases enabled in the Power BI admin center. However, I suggest going through the steps described in that blog post at the beginning, because if the same database is accessed through two different drivers, it needs two different data sources. For this reason, I have a number of notes that might help you avoid certain issues.

    • Power Query uses the .NET Framework Data Provider for SQL Server and Oracle Data Provider for .NET, whereas Power Pivot by default creates a SQL Server connection using the SQL Server Native Client 11.0 (SQLNCLI11).
      • Even if you already created a data source for a SQL Server database you refresh in a Power Pivot workbook, you have to create another data source for the same SQL Server database for Power Query, because you use two different drivers.
      • You might consolidate these data sources into a single one by changing the data provider in the advanced options of a Power Pivot configuration, but I am not sure this is a good idea. I would keep the two versions of the data source, one for each provider, in case I use the same database in both connections.
    • Power Query creates one connection string in Excel for each query you create. The connection string contains the entire transformation, and when you copy it into the New Data Source page in Power BI admin, the internal query is analyzed to extract the required connections to SQL Server. If these connections are already configured as Power BI data sources, then you don’t need to do anything else. I suggest repeating this step for all your queries until you are confident about the internals and sure that the required data sources are already available.
      • Even if you create a single query in M language accessing different databases, the referenced connections will be found and each database will have a separate data source configuration in Power BI. I was worried that loading multiple tables from different databases on the same server would produce a single data source giving access to all the databases on the server, but luckily this does not happen and security is preserved!
    • I spotted an issue with certain DateTimeZone functions (DateTimeZone.FixedLocalNow, DateTimeZone.FixedUtcNow, DateTimeZone.LocalNow, and DateTimeZone.UtcNow), which do not seem to work with scheduled data refresh. You can read more about this issue in this thread on the Power Query MSDN forum. I found a workaround using the Table.Buffer function: because it stops query folding, the expression is not translated into SQL but is evaluated directly by the Power Query engine. However, I hope this will be fixed soon.
    • A Power Query transformation that contains only a script, without accessing any data source, is currently not refreshed. Refreshing it would be useful for generating a Date table. I opened this other thread about the issue on the forum, and I hope there will be news on that, too.
      • In the same thread you will find another tip: literals in the form #literal, such as #table, are mis-analyzed by scheduled refresh, but at least for this issue there are workarounds available until Microsoft fixes it.
    • You can use SQL Server views based on linked servers to overcome the limitation of providers currently supported by Data Management Gateway (which is the component used by scheduled data refresh).
    • Now that it is possible to publish SSIS packages as OData Feed Sources, you can expose a SQL Server view to Power BI, and accessing it from Power Pivot or Power Query, you can execute SSIS packages at refresh time. If the package is not too long to execute (it would timeout the connection), this is a smart way to arrange execution of some small “corporate ETL” in sync with the data refresh on Power BI, without relying on synchronized scheduling dates (which is always one more thing to maintain). This further extends the range of providers you can use with scheduled data refresh.
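
    The Table.Buffer workaround mentioned in the notes above can be sketched roughly as follows. This is not the exact query from the forum thread: the server, database, table, and column names are hypothetical, and the sketch only illustrates the idea of buffering the table so that the step calling DateTimeZone.UtcNow is evaluated by the Power Query engine instead of being folded into SQL.

        let
            // Hypothetical server, database and table names
            Source = Sql.Database("MyServer", "MyDatabase"),
            Sales = Source{[Schema = "dbo", Item = "Sales"]}[Data],
            // Buffering loads the rows in memory and stops query folding from this step on
            Buffered = Table.Buffer(Sales),
            // Evaluated by the Power Query engine, not translated into SQL
            WithLoadTime = Table.AddColumn(Buffered, "LoadedAtUtc", each DateTimeZone.UtcNow())
        in
            WithLoadTime

    Keep in mind that Table.Buffer reads the whole table into memory, so it is better to apply filters and column selections that can be folded before the buffering step.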

    I would like to get more detailed errors when something goes wrong and scheduled data refresh stops, but this is a good start.

  • LINQ to DAX project on CodePlex #dax #tabular #ssas

    Since its release, I've seen a number of scenarios where Analysis Services Tabular is the analytical engine of the reporting section in a larger system. In these conditions, at least a part of the queries sent to Analysis Services are DAX queries generated by code (as a consequence of user interaction or for other reasons).

    Since DAX knowledge is not very common among developers, a LINQ to DAX query provider is more than welcome to simplify DAX code generation. Dealogic, which is among the first companies I helped with Tabular adoption a few years ago, invested time in a first version of the LINQ to DAX provider that is now available on CodePlex and open to contributions.

    I looked at the features available and at the DAX code generated, and it already looks very interesting. You can generate good DAX queries without knowing DAX (which I suggest studying anyway!), and LINQ to DAX does a lot of work splitting conditions between CALCULATETABLE, SUMMARIZE, ADDCOLUMNS and FILTER. Kudos to György Farkas for his effort!

  • How to handle fact tables with different granularities in #dax #powerpivot #tabular

    A common question I receive from Excel users learning Power Pivot is how to handle tables that have different granularities. In reality, this is not the terminology they use: the concept of “table granularity” is used mostly by Kimball practitioners, who immediately recognize this scenario as a “two fact tables with different granularities” pattern. In Power Pivot this situation often causes many troubles for Excel users, mostly because it is not clear how to correctly apply data modeling.

    Moreover, even those who come from a Multidimensional background do not know how to handle relationships between fact tables and a dimension at different granularities. In Multidimensional you can define the dimension relationship at any (attribute) hierarchical level, but it seems that this feature is not available in Tabular. In reality, you have two options, for example when you have data at the product category level and you want to join the product dimension:

    • Conform the relationship at the dimension granularity level (product category), hiding the measures coming from the fact table when the value is not valid (product name)
    • Import the fact table without defining a relationship in the data model, and simulate the relationship (at the product category level) using a DAX expression that applies a corresponding filter at query time

    I wrote an article about handling different granularities, describing these two options in more detail and providing practical examples. I think that both techniques are useful: simulating the relationship in DAX is more flexible for many reasons, but there could be scenarios where the data volume suggests using an approach based on a physical relationship, creating a dummy value in the dimension. As always, I would use the simpler approach unless you think that performance is not good enough, and only at that point would I evaluate which pattern performs better.

    As a side comment: I don’t know which approach is “simpler”, because simulating a relationship requires a more verbose DAX formula, whereas the relationship based on a dummy item requires some work at the ETL level, and pollutes the dimension table with items that are not strictly required (with the relevant exception of a Date table, where you might use an existing day as a dummy element). But the budget pattern will be the subject of a dedicated pattern very soon…

  • Upcoming conference speeches and workshops #ssas #tabular #dax #powerpivot

    Between May and July, Alberto and I will be speakers at several conferences, and I think it could be useful to write a single blog post with a recap:

    We will also deliver several courses:

    See you around the world! 

  • The ISEMPTY function in #dax #powerpivot #tabular

    Microsoft silently added the ISEMPTY function to the DAX language in Analysis Services build 11.00.3368 (SQL Server 2012 SP1 CU4). This function is particularly important in DAXMD (when you use DAX to query a Multidimensional model), because it produces a much better execution plan in OLAP than the alternatives based on COUNTROWS.

    There is an advantage in using it in Tabular/Power Pivot models, too, even if there is an issue using it in Power Pivot. You can upgrade Power Pivot on Excel 2010 (you can download the new version of Power Pivot for Excel 2010 as part of the cumulative update released for SQL Server), but you cannot upgrade Excel 2013. This is done only through Office updates and up to now such a feature has not been added to the released versions of Excel 2013 (at least until version 15.0.4605.1003). The funny thing is that, if you have Power Pivot for SharePoint, you can have a server that would be able to use new features (such as ISEMPTY function) but you are not able to create an Excel 2013 file using them!

    Last week I wrote an article on SQLBI that describes the available syntaxes you can use to check the empty table condition in DAX. You will find a few code examples there. I hope that Microsoft will soon release an upgrade for Power Pivot for Excel 2013, too.

  • ABC Analysis in #dax: complete pattern and other links #powerpivot #tabular

    I recently published the ABC Classification article, which recaps in a more structured and organized way what I already described in this blog a few years ago (see ABC Analysis in PowerPivot). The pattern describes how to implement the classification through calculated columns, so we consider it a specialization of the Static Segmentation pattern. You can also implement it as a measure, implementing a Dynamic Segmentation, and Gerhard Brueckl already described such an implementation in his blog. I am not sure about creating a pattern for the dynamic version, because of the performance issues that could arise even with a few thousand items to classify.

    Any feedback on this is welcome. We already have other patterns in the works, but we can always change prioritization based on comments!

  • Create Excel Dashboards working on Excel for iPad #excel #ipad #dashboard

    I recently tried Excel for iPad and tried opening several workbooks. The results are pretty good, but I’ve found that it wasn’t possible to display certain workbooks. For example, opening a workbook that contains many CUBEVALUE formulas, I should have seen this result:


    However, sometimes I got an error saying that the workbook cannot be updated (the screenshot is in Italian because the test was made on an iPad configured with that language)


    What happened? Thanks to Dan Parish (Microsoft), I realized that the problem was that an automatic refresh was happening when opening the workbook, and this stumbled on the CUBEVALUE function I used extensively. If this happens, you have two possible workarounds:

    1. Disable automatic refresh

    The reason why conditional formatting is going away is because conditional formatting doesn’t work on error cells, and neither do charts. If the automatic recalc starts when opening the workbook, every CUBE* function fails because it can’t fetch any data (external data is not supported in this version of Excel for iPad). At this point I wondered why the automatic recalc was happening, because I had not enabled it in the Excel workbook. However, Dan’s explanation is: The reason it wasn’t happening for me is because Excel (all the Excel’s across all platforms) also have very complicated logic to determine when they need to recalculate. If you saved this from a previous version of Excel (earlier than 2013), or one of several other things occurred, we’ll recalculate it on open. In some cases where we feel it’s safe not to however, we don’t as a performance improvement. That’s what I was seeing.

    I am not sure that the different locale settings were the reason why automatic refresh was happening. However, when I set the calculation mode to Manual (Formulas ribbon –> Calculation Options –> Manual) the problem of automatic refresh went away, and I was able to open the workbook. If you don’t want to rely on disabling automatic refresh (after all, changing the setting to Manual affects usability when you open the workbook in Excel on Windows), another approach is relying on GETPIVOTDATA functions instead of CUBE* functions. The operation can be time consuming, but in reality it is not so different from using CUBE* functions if you start designing the dashboard in this way. This will be a topic for a future, longer article. There is also an interesting advantage in using GETPIVOTDATA if you use Power Pivot or Tabular: performance is better than with CUBE* functions (the opposite is true for OLAP cubes).

  • Create Custom Time Intelligence Calculations in #dax #powerpivot #tabular

    The recently published Time Patterns article contains many DAX formulas that I hope will be useful to anyone interested in implementing time-related calculations in DAX without relying on the Time Intelligence functions. There are several reasons for doing that:

    • Custom Calendar: if you have special requirements for your calendar (such as week-based and ISO 8601 calendars), you cannot use standard DAX time intelligence functions.
    • DirectQuery: if you enable DirectQuery, time intelligence functions are not supported.

    I chose to use a standard month calendar for the complete pattern, because it’s a more complete example of the calculation required. In fact, the ISO calendar has a simpler requirement for comparisons over different periods, and I also have another example for that in the Week-Based Time Intelligence in DAX article published on SQLBI more than one year ago.

    As usual, feedback is welcome!

  • Calculate Distinct Count in a Group By operation in Power Query #powerquery #powerbi

    The current version of Power Query does not have a user interface to create a Distinct Count calculation in a Group By operation. However, you can do this in “M” with a simple edit of the code generated by the Power Query window.

    Consider the following table in Excel:


    You want to obtain a table containing the number of distinct products bought by every customer. You create a query starting from a table.


    You keep in the query only the columns required for the group by and the distinct count calculation, removing the others. For example, select the Product and Customer columns, right-click, and choose Remove Other Columns.


    Select the Customer column and click the Group By transformation. You see a dialog box that by default creates a count rows column.


    This query counts how many transactions have been made by each customer, and you don’t have a way to apply a distinct count calculation. At this point, simply change the query from this:

        let
            Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
            RemovedOtherColumns = Table.SelectColumns(Source,{"Product", "Customer"}),
            GroupedRows = Table.Group(RemovedOtherColumns, {"Customer"}, {{"Count", each Table.RowCount(_), type number}})
        in
            GroupedRows

    To this:

        let
            Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
            RemovedOtherColumns = Table.SelectColumns(Source,{"Product", "Customer"}),
            GroupedRows = Table.Group(RemovedOtherColumns, {"Customer"}, {{"Count", each Table.RowCount(Table.Distinct(_)), type number}})
        in
            GroupedRows

    The Table.RowCount function counts how many rows exist in the group. By calling Table.Distinct on the group first, you remove the duplicate rows, so that each product appears only once for each customer and the count returns the number of distinct products.
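
    As a variation on the same idea, you can count the distinct values of a single column with List.Distinct instead of removing duplicate rows from the whole group. The following is a minimal sketch under stated assumptions: the sample data, created inline with #table, and the Quantity column are hypothetical and only stand in for the Sales table used above.

        let
            // Hypothetical sample data standing in for the Sales table
            Sales = #table(
                {"Customer", "Product", "Quantity"},
                {{"Alice", "Bike", 1}, {"Alice", "Bike", 2}, {"Alice", "Helmet", 1}, {"Bob", "Bike", 1}}
            ),
            // Count the distinct values of the Product column within each customer group
            GroupedRows = Table.Group(
                Sales,
                {"Customer"},
                {{"Distinct Products", each List.Count(List.Distinct([Product])), type number}}
            )
        in
            GroupedRows

    With this sample, Alice has 2 distinct products and Bob has 1. Because only the Product column is considered, the extra Quantity column does not affect the result, whereas Table.Distinct compares entire rows and therefore requires removing the other columns first.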


    I hope the Power Query team will implement a distinct count option in the user interface. In the meantime, you can apply this easy workaround.

  • Optimize DISTINCTCOUNT in #dax with SQL Server 2012 SP1 CU 9 #ssas #tabular

    If you use DISTINCTCOUNT measures in DAX, you know performance is usually great, but you might also have observed that performance slows down when the resulting number is high (depending on other conditions, it starts degrading when the result is between 1 and 2 million).

    If you have seen that, there is good news. Microsoft fixed this issue (KB2927844) in SQL Server 2012 SP1 Cumulative Update 9. The performance improvement is amazing: with this fix, queries that previously ran in 15 seconds (cold cache) now run in less than 5 seconds. So if you have Tabular databases with a column containing more than 1 million distinct values, it is probably worth testing this update. It is also available for Power Pivot for Excel 2010, but not for Excel 2013 (as far as I know, Power Pivot for Excel 2013 updates are included in Excel updates). You can request the SP1CU9 here:

    Please consider that the build of Analysis Services 2012 that fixes this issue is 11.0.3412 (so a following build should not require this hotfix – useful note for readers coming here in the future, when newer builds will be available).

    UPDATE 2014-07-22: for the following major release, Analysis Services 2014, the fix has been released after RTM. You need the Build 12.00.2342 (or a more updated one), which is available in Cumulative Update 1 for RTM.
