THE SQL Server Blog Spot on the Web

Welcome to - The SQL Server blog spot on the web Sign in | |
in Search

SQLBI - Marco Russo

Please visit the same blog at - comments are now disabled here but you can comment on the new blog (all blog posts available here are available there, too).
You can follow me on Twitter: @marcorus

  • Same blog, new home

    The historical content of my blog is also available on, and future posts will appear there first.

    I will continue to keep this blog (on updated when I will publish new posts, including a link to the full post published on SQLBI.

    You can register the new blog RSS feed or register by mail, too (the feed includes both blogs and articles published on SQLBI).

  • New function and the importance of variables in DAX #dax

    I recently wrote a few articles about a new function in DAX and the use of variables. For those of you using Power BI or Azure Analysis Services, you always have access to the latest version of the language. Well, in reality I see that new features appears in Power BI Desktop first, and shortly after in Azure Analysis Services. I’m also working on something that will make it easy to recognize which functions is available in each product/version, but this will still require a few months…

    So, what is new in DAX? The SELECTEDVALUE function! It’s only syntax sugar, so you can write:

    SELECTEDVALUE ( Table[column], "default value" )

    instead of

    IF ( HASONEVALUE ( Table[column] ), VALUES ( Table[column] ), "default value" )

    You will find more details and use cases in the Using the SELECTEDVALUE function in DAX article I recently wrote on SQLBI. My only concern is that this could increase the use of a bad practice, which is getting the value of the current row after a context transition. Something like:

    Company := SELECTEDVALUE ( Customer[Name] )

    AvgLength := AVERAGEX ( Customer, LEN ( [Company] ) )

    instead of:

    AvgLength := AVERAGEX ( Customer, LEN ( Customer[Name] ) )

    If you think nobody would do that… yes, you’re right, not for the LEN function. But think again to all the calculations that requires a particular granularity and that you would like to include in a measure instead of a calculated column, and then you will realize that making this mistake is more common that you might think. Yes, in more complex expressions, of course.

    The other topic that I covered in two articles is the use of variables, which you will find discussed in:

    The basic idea is that you can simplify the syntax of complex DAX expressions by using variables, and the result you obtain is something that is much more similar to an M script rather than the classic nested sequence of function calls in DAX, as I described in another article, DAX coding style using variables. You can see the benefits of this approach in complex calculations, so this is not something that is very visible in short articles, and it will be the major change of future books and patterns about DAX (which is what I’m working on these days).

  • Learn DAX, data modeling, and data visualization after MS Data Insight Summit #dax #powerbi #dataviz

    I spent three days in Seattle at Microsoft Data Insight Summit, delivering a preconference day about data modeling for business users, and two workshops about DAX. All of these were sold-out, so even if you attended the conference, you might be interested in content that cover the same topics. And if you were in one of my sessions, you probably would like to get material to study. For this reason, I think it is useful to provide a complete recap here:


    • Data modeling: the preconference day was based on the book Analyzing Data with Power BI and Power Pivot for Excel. We do not have a corresponding classroom course at the moment, but if you are interested in this preconference day, I and Alberto Ferrari will deliver an updated version of the Data Modeling with Power BI at PASS Summit 2017 in Seattle, on October 30, 2017.
    • Learning DAX: I and Alberto Ferrari delivered two workshops introducing the DAX language to beginners. The workshops were not registered, but we have a corresponding free video course that has the same content and similar examples. Attending this free online course is a suggested preparation for the more advanced Mastering DAX workshop.
    • Data Visualization: this is the last day to take advantage of the launch offer for the video course Power BI Dashboard Design, saving 20$ from its list price.

    You do not have any excuse now: you can get some lecture and then make practice!

  • User group meetings and #dax training in Minneapolis and Chicago

    In the next three weeks I will deliver many speeches and training in United States. Here is the list:

    The Microsoft Data Insight Summit is completely sold-out, so if you didn’t register more than one month ago, you will not be able to attend. Luckily, several sessions will be broadcasted live, and others will be recorded (slides and audio, no video) and made available a few weeks after the conference.

    There are still seats available for the workshops in Minneapolis and Chicago. If you want to attend to the user group session on June 19 organized by the Microsoft BI User Group Minnesota, take a look at the entire program for the afternoon (there is another presentation delivered by Dan English before mine). The community event is free but you have to register in advance (seats are limited there, too!).

  • How to design beautiful dashboards in Power BI – the cure: 15 rules for everyone #powerbi

    Is there a secret recipe to create a beautiful dashboard? Why do we need to use “beautiful”? Would “useful” be a better adjective?

    If you think for a moment, you know that we are attracted by the “beautiful”. It is important for a first impression, but after that, you want to discover more, and at that point other factors are considered. I usually work in many of these “other factors”. The data model, the accuracy of the numbers, the performances that could affect usability. You need all of this, and as “technical” people (data analysts and BI developers are both classified as “technicians”), too often we simply underestimate the importance and the efforts required to work on the “last mile” of a BI system.

    I always tried to understand the “secret recipe” of a good dashboard. I’m not talking about a Power BI dashboard now, I’m considering the broader definition of a dashboard that you might implement in a single page of a Power BI report. Is it the graphic? The colors? The numbers? The position? The alignment?

    I’ve never been a good dashboard designer, but over time I understood how to recognize dashboards that were not working well. However, fixing them is not easy. When I started working with Daniele Perilli, I’ve seen that “beautiful” was not enough. A dashboard is a communication tool, and there is a huge bibliography of what you should do and what you should not. Personally, I think that Stephen Few created the best reference for this industry, and if you spend 40 hours a week designing charts, reports, and dashboards, you should stop reading this post now, and spend your time reading his books.

    However, I don’t spend all the time designing dashboards, but that “last mile” is what communicate to end users the result of a large effort made by DBAs, developers, and data analysts. It is also the way to show the result of a one-man-show model you worked on for hours, days, or weeks. Where to start? How to apply rules that are easy to learn and to implement, in a limited amount of time?

    I struggled with these questions for years. Working with Daniele, we started to think about a learning path that would be accessible to anyone, providing the basic principles that you need to apply in a good dashboard in a very direct and practical way. This initial idea started to grow, and it was one of the reasons we invested in the custom visuals that you can find on okviz website. Sometimes we didn’t have the right tool to implement a few principles required for a good dashboard, so we (well, Daniele) created it.

    Nevertheless, custom visuals are not the solution. They could be a tool, a useful and necessary tool, but you can always obtain a good dashboard with a very limited set of features (just think about Power BI Desktop in October 2015, if you tried it at that time). How is it possible? The reason is very simple. You can have a very powerful tool, but just because you have a feature, it doesn’t mean you need it or you have to use it. You have to make choices, and you need a plan.

    Thus, Daniele created the master plan of the 15 rules that all of us can apply to any dashboard to obtain a beautiful and useful result. When I say “all of us”, I include myself. I don’t have 1% of Daniele’s design skills, but I can apply his 15 rules to my reports, and after “the cure”, my reports are always much better. Not as beautiful as those designed by Daniele, but “good enough” to be shown to a larger audience without hearing people laughing or being distracted by useless details. They see the numbers, the trends, and get the message from the dashboard. This is the goal, after all.

    Once Daniele completed this huge amount of work, we faced the issue of delivering such a content. After considering many other options, our decision has been to learn the content from Daniele, make practice, and record a video training using slides and demos prepared by him. This helped us (me and Alberto Ferrari) using a language that is probably less “technical” than what we typically use in our DAX classes, and this was another goal of our choice.

    The third of the 15 rules is “keep it simple”. Here is how a dashboard looks like before and after implementing this rule (please note there are other 12 rules to apply after that to complete “the cure”).



    Applying the rules could be harder than you think. You can watch the videos related to the “keep it simple” rule in the free preview of the Power BI Dashboard Design Video Course.

    The course also includes a reference that will help you choosing the right chart, and you can find a complete description of the course content here.

    It has been a very long journey, but I am really proud of this work. Now, I really look forward to read feedbacks and reviews!

  • The new data modeling book for Power BI and Power Pivot users

    In the last few years, I and Alberto Ferrari assisted many users of Power Pivot and Power BI who wanted to create their reports using these tools and were struggling with getting the desired numbers from their data. The first approach of an Excel user is to look for a function, or a more complex DAX expression, that can implement the calculation required. In a similar way, this is also the approach of Power BI users that don’t have a database design background.

    Several times, before looking for a correct DAX expression, the problem is defining the correct data model, putting data in the right tables, and creating the proper relationships. The tools to create queries in M (Power Query in Excel, now called Get Data, and the Query Editor in Power BI) are wonderful to help the users in doing the proper massage to the data. However, these tools cannot help users that don’t know how to properly define the right data model. For this reason, when we thought to a new version of the book about Power Pivot (we previously wrote one for Excel 2013/2016 and one for Excel 2010) we considered that the DAX chapters were no longer necessary (because we have The Definitive Guide to DAX for that), so we decided to dedicate an entire book to the data modeling, targeted to business users that usually do not have such a skill, or that learned that by trial and errors and without a more structured approach.

    Because the concepts are identical for Power Pivot and Power BI, we wrote a single book that target both products: Analyzing Data with Power BI and Power Pivot for Excel. Several examples are created in Power BI Desktop (because it is free and available to anyone), but certain reports are created using pivot tables in Excel. The goal is to teach the concepts, rather than providing formulas to copy and paste. In fact, the reader should try to apply the same ideas to its own data, recognizing the data modeling patterns described in the chapters of the book.

    We tried to minimize the use of theoretical terms, trying to introduce with very practical examples and design patterns the terminology that is commonly used in data modeling (such as fact, dimensions, normalization, denormalization, star schema, snowflake schema, and so on). Thus, is this book for you?

    • If you are an advanced Excel user that adopted Power Pivot or Power BI, definitely yes.
    • If you are a BI developer, you should already know the theory, and this book could be useful to see practical examples using a model in these tools. Thus, it could be useful (well, several models are really easy to implement as design patterns), but you should not learn new concepts.
    • If you are a data analyst that moved to Power BI from other tools, then it depends on your background. You might be in the middle of the two cases described above, but in any case the book should be useful for you, too.

    At this point, you might wonder why we did not include the words “data modeling” in the book title. The simple answer could be “marketing”, but the real answer is more complex. A business user hardly recognizes the need of data modeling skills. He/she just want to obtain a result, analyzing its own data. He/she might think that data modeling is something for DBAs, and is not related to reporting. However, when you create a report in Power Pivot and Power BI, usually you are creating a data model (unless you create a live connection to an existing Analysis Services database). For this reason, the title should help these users to find this book and to take look at its description. Yes, it is a book for data modeling targeted to readers that do not know that they need such a skill. Thus, the reason for this title is “communication”: We hope that the book description is clear enough to avoid any misunderstanding!

    This is the table of contents of the book:

    • CHAPTER 1 – Introduction to data modeling
    • CHAPTER 2 – Using header/detail tables
    • CHAPTER 3 – Using multiple fact tables
    • CHAPTER 4 – Working with date and time
    • CHAPTER 5 – Tracking historical attributes
    • CHAPTER 6 – Using snapshots
    • CHAPTER 7 – Analyzing date and time intervals
    • CHAPTER 8 – Many-to-many relationships
    • CHAPTER 9 – Working with different granularity
    • CHAPTER 10 – Segmentation data models
    • CHAPTER 11 – Working with multiple currencies
    • APPENDIX A – Data modeling 101

    Remember, when the data model is correct, the DAX code you need is simpler, shorter, and more efficient. The first rule is always “keep it simple”, and a good data model is always the first step in the right direction!

  • Small benchmark for Tabular and Power BI formula engine (FE) #ssas #tabular #powerbi

    If you run SSAS Tabular or Power BI, please help me in gathering some performance data for a study I'm doing about Formula Engine performance following these steps: 

    1) Copy the PowerShell code below in a script Benchmark.ps1, modifying serverName, ssasInstanceName and databaseName so that they point to an existing database of an instance of Analysis Services Tabular. If you use a default instance, assign an empty string to ssasInstanceName. Any database that exists is valid. If you want to connect to Power BI, retrieve the address using DAX Studio and write the complete address in the serverName variable (e.g. "localhost:1234")

    2) Run the PowerShell code redirecting the output to a Benchmark.json file (eg. Test.ps1 > Benchmark.json)

    3) Send me the JSON file in a mail (even copy/paste the content is ok) to marco (at) sqlbi [dot] com

    Thanks in advance for your help!


    ## Specify the name of the server where SSAS runs
    $serverName = "." 
    ## Specify the instance name of SSAS Tabular (use empty string to use the default instance)
    $ssasInstance = "TAB16"
    #Specify any database available on SSAS Tabular (the database must exists, no data is read from there)
    $databaseName = "Contoso"
    ## End of parameters

    if ($ssasInstance.Length -ne 0) {
        $ssasInstance = "\" + $ssasInstance
    $null = [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices")
    $server = New-Object Microsoft.AnalysisServices.Server

    $null = [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices.AdomdClient")
    $connStr = "data source=$serverName$ssasInstance;Initial catalog = $databaseName"
    [Microsoft.AnalysisServices.adomdclient.adomdconnection]$cnn = new-object Microsoft.AnalysisServices.adomdclient.adomdconnection($connStr)
    $cmd = new-Object Microsoft.AnalysisServices.AdomdClient.AdomdCommand
    $cmd.Connection = $cnn
    $cmd.CommandText = @"
    ROW (
        "x", COUNTROWS (
            CROSSJOIN (
                SELECTCOLUMNS ( CALENDAR ( 1, 10000 ), "Num1", INT([Date]) ),
                SELECTCOLUMNS ( CALENDAR ( 1, 10000 ), "Num2", INT([Date]) )
    $sw1 = New-Object Diagnostics.Stopwatch
    $sw2 = New-Object Diagnostics.Stopwatch
    $sw3 = New-Object Diagnostics.Stopwatch
    $dr = $cmd.ExecuteReader()
    $dr = $cmd.ExecuteReader()
    $dr = $cmd.ExecuteReader()
    $colItems = Get-WmiObject -class "Win32_Processor" -namespace "root/CIMV2" -computername $serverName 
    $s_version = $server.Version
    $run1 = $sw1.Elapsed
    $run2 = $sw2.Elapsed
    $run3 = $sw3.Elapsed
    foreach ($objItem in $colItems) { 
        $cpuId = $objItem.DeviceID
        $cpuModel = $objItem.Name
        $cpuCores = $objItem.NumberOfCores
        $cpuMaxSpeed = $objItem.MaxClockSpeed
        $cpuCurrentSpeed = $objItem.CurrentClockSpeed
        $cpuStatus = $objItem.Status
        ## We only consider the first CPU
    Write-output "{"
    Write-output "    ""Benchmark"": ""FE CROSSJOIN 10k*10k"","
    Write-output "    ""SSAS_Version"": ""$s_version"","
    Write-output "    ""Run_1"": ""$run1"","
    Write-output "    ""Run_2"": ""$run2"","
    Write-output "    ""Run_3"": ""$run3"","
    Write-output "    ""CPU_ID"": ""$cpuId"","
    Write-output "    ""CPU_Model"": ""$cpuModel"","
    Write-output "    ""CPU_Cores"": ""$cpuCores"","
    Write-output "    ""CPU_MaxSpeed"": ""$cpuMaxSpeed"","
    Write-output "    ""CPU_CurrentSpeed"": ""$cpuCurrentSpeed"","
    Write-output "    ""CPU_Status"": ""$cpuStatus"""
    Write-output "}"

  • Avoid measure and column with the same name in #dax

    I recently wrote an article about duplicated names in DAX to highlight a possible case where you can have a measure with the same name of a column. This is not a good idea, mainly because the syntax to reference a measure and a column is identical, and this similarity is the root of a number of issues. It would be much better if the measure reference had a syntax different from the column reference.

    The syntax table[identifier] could identify either a column or a measure, and there is no secure way to understand that by just reading the code (unless you look at the color highlighting). I thought to a list of techniques to disambiguate that syntax, and the only available one is to modify the measure reference (not the column reference) in one of these ways:

    • CALCULATE ( table[identifier] ) is a syntax that certainly reference a measure, if table[identifier] is a column, the lack of a row context in CALCULATE would generate an error
    • table[identifier]() is another way to invoke a measure, which in reality corresponds to CALCULATE ( table[identifier] ); if you use DAX Formatter, a syntax like [measure]() is transformed in CALCULATE ( [measure] ); however I might consider modifying DAX Formatter if it was necessary. However, I don’t think that a syntax such as [measure]() is more readable.

    If it was possible to refactor the DAX syntax, I would use a different syntax for a measure reference:

    • (measure)
    • {measure}
    • [other ideas?]

    It’s probably too late for a similar discussion, but you never know.

    Feedbacks are welcome.

  • New edition of the Analysis Services Tabular book #ssas #tabular

    Last year I and Alberto Ferrari wrote a new edition of the Analysis Services Tabular book, which official title is Tabular Modeling in SQL Server Analysis Services (2nd Edition). Some delay in editing and production delayed the printing, but the book is finally available, so I can present it with a dedicated blog post, explaining its goals and contents, so you can evaluate whether it could be interesting for you.

    First of all, this is the second edition of the book we wrote with Chris Webb a few years ago for Analysis Services 2012. Several chapters did not require particular changes, and even if we updated many parts of these chapters to the new version, a great thank you goes to Chris, who allowed us to base our work on his content, too. For an unfortunate series of events and crossing edits, we didn’t include Chris in the acknowledgments of the book, so this blog post is just a first attempt to publicly address that – forgive me, Chris!

    Compared with the previous edition, we removed all the chapters about DAX, keeping only a small introduction of this language, because we have an entire book dedicated to it! This choice provided us additional time to cover other topics in more detail. We covered only the compatibility level 1200 in this new book, so if you have to develop a model using the model 1100 or 1103, you probably will find a better reference in the previous edition.

    This is the list of the chapters of this book, followed by a small comment of the main changes made in this edition.

    CHAPTER 1 Introducing the tabular model
    CHAPTER 2 Getting started with the tabular model
    CHAPTER 3 Loading data inside Tabular
    CHAPTER 4 Introducing calculations in DAX
    CHAPTER 5 Building hierarchies
    CHAPTER 6 Data modeling in Tabular
    CHAPTER 7 Tabular Model Scripting Language (TMSL)
    CHAPTER 8 The tabular presentation layer
    CHAPTER 9 Using DirectQuery
    CHAPTER 10 Security
    CHAPTER 11 Processing and partitioning tabular models
    CHAPTER 12 Inside VertiPaq
    CHAPTER 13 Interfacing with Tabular
    CHAPTER 14 Monitoring and tuning a Tabular service
    CHAPTER 15 Optimizing tabular models
    CHAPTER 16 Choosing hardware and virtualization

    The introduction has the same structure of the first chapter in the previous edition, and it covers the architecture of Analysis Services Tabular and its role in the Microsoft BI platform. Power BI was not even in the plans when we wrote the first edition, so you can imagine this was not just a simple search and replace of version numbers, many things changed since 2012.
    The chapter 4 is similar to the DAX Basics of the previous edition, but we updated a few details about the syntax, providing the foundation to write very simple calculations. This is the only chapter dedicated to DAX in this new book.

    The chapter 5 about hierarchies and the chapter 6 about data modeling were present also in the previous edition. The chapter 7 is brand new chapter about the Tabular Scripting Model Language. Beside the technical information, I think that this chapter is useful to clarify the role of TMSL in the new compatibility level (1200). The chapter 8 has been updated considering Power BI as the main client tool for Tabular, instead of Power View (which was the reference in 2012).

    The chapter 9 about DirectQuery has been completely rewritten, and we also wrote a separate whitepaper about DirectQuery that complements the content of this chapter in the book.

    The chapters 10-11-12 have been updated (considering also relationships with bidirectional filters) describing security, processing, partitioning, and VertiPaq internals. In particular, the VertiPaq chapter has many more details that will help you also in troubleshooting and optimization.

    The chapter 13 is another chapter that has been written from scratch, because the previous APIs are no longer available, and a new (better but different) one is now the reference library for PowerShell and C# scripts.

    The chapter 14 is an update of the chapter present also in the previous edition, whereas chapters 15 and 16 provide more space to topics (optimization of large structure and hardware/virtualization selections) that were cited in the previous edition very shortly, without the insights that you will find in this edition

    I hope you will enjoy this book!

  • Using and optimizing DirectQuery in #powerbi and #ssas #tabular

    More than one year ago, I and Alberto Ferrari started to work on DirectQuery, exploring the new implementation appeared in Analysis Services 2016 for the Tabular models, which uses the same engine of Power BI. The previous version of DirectQuery appeared in SSAS Tabular 2012 was very slow and suffered of performance issues also using small models. We have never seen it adopted in a production environment, it was very hard to optimize, and there were too many restrictions to the data model (for example, no MDX queries = no PivotTable in Excel as a client, and no time intelligence = complex DAX code to write even for trivial calculations).

    For these reasons, when we worked on Analysis Services 2016 and Power BI, we wanted to push the new version of DirectQuery, which solved many of the limitations of the first implementation, discovering its limits and where it could be used in a real production environment. The results of our efforts is now available in a white paper published by Microsoft: DirectQuery in Analysis Services 2016. The document contains a detailed step-by-step guide to implement DirectQuery in your model, and then investigate on how DirectQuery transforms a DAX or MDX query into one or more query in SQL. If you want a quick answer to the question “should we use DirectQuery?”, then the answer is…. it depends!

    You have to set your expectations to a correct level. You can use DirectQuery successfully in certain conditions (database size, frequency of refresh, capacity and performance of your RDBMS server). You certainly have to test the particular workload you want to apply to DirectQuery. The best suggestion is: try and measure it. The whitepaper will provide you many hints about what you should try and what you should expect. Specific measures we made could be different today, because there are often new release of this technology, and we can expect more improvements going forward. But you can certainly start to spend time testing it, and if you understand how it work, you can figure out what are the scenarios where you might want to adopt it.

    The white paper can be used also for Power BI: the engine is the same, we will probably see many improvements very soon, and maybe that the version of the engine you are using is already much better than the ones we used writing the whitepaper. However, the basic concepts are the same, and you can see and measure the improvements by repeating the same tests we described in the document.

    Out of the records, and in a very informal way, I would like to highlight that you should try DirectQuery with a correct attitude. Don’t expect the magic box, and don’t assume it will be bad, it could surprise you in both cases! Most important, think about why you might want to use this technology.

    The right reasons for using DirectQuery are:

    • You need real-time queries. Processing window creates latency that you cannot accept. Good reason. Just ask to yourself if your customers are ready for real-time queries in a dashboard. You might discover they are not.
    • You have a small database to query, that changes often.
    • You have a large amount of data, too big to fit in memory, and you *don’t* need queries returning a result in less than 2 seconds.

    There are also two very wrong reasons for choosing DirectQuery:

    • Saving money. I think that the in-memory Tabular model has a lower cost considering hardware+software+maintenance in order to provide the same level of performance.
    • Improving query performance. On the same hardware, it’s very hard. On a different hardware, maybe… but see the previous point about the cost. And also consider network latency (you run the RDBMS on another server, right?).

    I don’t want to scare you. I just want to set the right level of expectations. With that in mind, you can implement successful projects using DirectQuery today.

  • Unexpected side effects of table filters in #dax

    One of the side effects that you might not expect from a table filter in a CALCULATE statement is the removal of an existing filter. In reality, this is a condition that happens when you write an explicit CALCULATE in a row context, because the table filter overrides the context transition in case you filter the same table that generated the context transition.

    In practice, the following query return the same value for every color, ignoring the context transition for Product[Color]:

        VALUES ( Product[Color] ),
        "Sales", CALCULATE(
            [Average Sales Amount],
            FILTER ( Sales, Sales[Quantity] > 3 )

    If you know the context transition and you think this is strange, Alberto wrote a good article explaining this particular situation: read Context Transition and Expanded Tables on SQLBI web site!

  • New syntax in DAX: IN operator, table and row constructors #dax #powerbi

    I recently wrote an article to described in more details the syntax of the IN operator in DAX, which is available in Power BI and in the CTP of the next release of Analysis Services.

    Jeffrey Wang wrote a blog post about the IN operator a few months ago, so I waited some time to include more details about how it internally works and how to use it when there are two or more columns involved. 

    If you just need the simpler syntax for a single column, well, now instead of an infinite list of OR condition, you can write:

    table[column] IN { 1, 3, 5, 7, 11 }

    However, if you want to understand the implications of the new syntax of the IN operator and of the related table and row constructors, then I suggest you to read this article carefully!

  • Mark as Date table in Power BI #dax #powerbi

    One year ago I wrote an article describing how the time intelligence DAX functions work in Power BI (I just updated the article including the example described below). In a recent event in Sydney I observed a strange (or at least unexpected) behavior of Power BI during a demo, and I think it is interesting to share a few considerations about it.

    As you can read in the article linked above, I wrote that when you apply a filter to the date column of a table, and this column is used in a relationship, then the table is considered a Date table even if it is not marked as a Date table (you cannot do that in Power BI). During my demo, the Date table had two relationships: one with Sales using an integer column, and one with Purchases using a date column. My expectation was that when I was using a calculation involving only Date and Sales tables, a time intelligence filter would have required an ALL ( Date ) condition in the CALCULATE statement, but this was not the case. Let me show this with an example.

    Consider this formula:

    SalesYTD1 := CALCULATE ( SUM ( Sales[Amount] ), DATESYTD ( 'Date'[Date] ) )

    If the relationship between Sales and Date is made using an integer column in Power BI, I would expect the formula to not provide any YTD calculation unless I rewrite it as:

    SalesYTD2 := CALCULATE ( SUM ( Sales[Amount] ), DATESYTD ( 'Date'[Date] ), ALL ( 'Date' ) )

    However, SalesYTD1 was working during my demo, despite I had a relationship based on a DateKey column that was an integer value (such as 20170222). After some investigation, I realized that the other relationship connecting Purchases and Date, even if not used in the two measures above, was enough to consider the Date column as a primary key of the Date table, and this produces an automatic addition of the ALL ( 'Date' ) function in the CALCULATE statement when a filter is applied to 'Date'[Date].

    Thus, the lesson is that even if you cannot have Mark as Date table in Power BI, you can obtain the same result by creating an hidden empty table with a column Date in the data model, creating a 1-to-many relationship between Date[Date] and the table you created. For example, you can use MarkAsDateTable=FILTER(CALENDARAUTO(),FALSE) to obtain such a table and then create the relationship and hide the table.

    You can download a working version of this example in the updated demo of the article Time Intelligence in Power BI Desktop

  • Using SUMMARIZECOLUMNS instead of SUMMARIZE in #dax #powerbi #ssas #tabular

    One of the latest addition to DAX has been the ability to apply an external filter context to SUMMARIZECOLUMNS, which makes this function a perfect replacement for SUMMARIZE in DAX. In The Definitive Guide to DAX I introduced such a function, but at the time of writing it was used only by Power BI and it wasn’t possible to include it in a measure, because of the limitation that now has been removed. Using SUMMARIZECOLUMNS you can write:

    'Date'[Calendar Year], 
        "Sales Amount", SUMX ( Sales, Sales[Quantity] * Sales[Unit Price] )

    instead of:

    'Date'[Calendar Year], 
        "Sales Amount", SUMX ( Sales, Sales[Quantity] * Sales[Unit Price] )

    A more complete description of this new function is available in the Introducing SUMMARIZECOLUMNS article I wrote on SQLBI.

  • Year-over-year comparison using the same number of days in #dax

    When you use the time intelligence functions in DAX, it is relatively easy to filter the same dates selection in the previous year by using the SAMEPERIODLASTYEAR or DATEADD functions. However, if you follow the best practices, it is likely that you have a full date table for the current year, which includes many days in the future. If you are in the middle of March 2017, you have sales data until March 15, 2017, so you might want to compare such a month with the same number of days in 2016. And the same when you compare the Q1, or the entire year.

    A common solution is to translate this request in a month-to-date (MTD) or quarter-to-date (QTD) comparison, but depending on how you implement this, you might not obtain a reliable result. For example, you might assume that the current date on your PC is the boundary of the dates you want to consider, but you probably have a few hours if not days of latency in the data in your database, so you should constantly fix the offset between the current day and the last day available in your data.

    Thus, why not simply relying on the data you have to make an automatic decision? This is the purpose of the technique described in the article Compare equivalent periods in DAX that I wrote on SQLBI, where I show several approaches optimized for Power BI, Excel, and SSAS Tabular, which are different depending on the version you use.

    Personally, the version I prefer is the one with the variables in DAX:

    [PY Last Day Selection] :=
    VAR LastDaySelection =
        LASTNONBLANK ( 'Date'[Date], [Sales Amount] )
    VAR CurrentRange =
        DATESBETWEEN ( 'Date'[Date], MIN ( 'Date'[Date] ), LastDaySelection )
    VAR PreviousRange =
        SAMEPERIODLASTYEAR ( CurrentRange )
        IF (
            LastDaySelection > MIN ( 'Date'[Date] ),
            CALCULATE ( [Sales Amount], PreviousRange )
    It’s longer, but much more readable. More details and examples you can download for Excel and Power BI are available in the article.

This Blog



Privacy Statement