SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus
2015 has been an amazing year for Power BI. One year ago, I was commenting on this blog about the public preview of Power BI dashboards. A few days ago, Microsoft released the December update of Power BI Desktop and several updates of the Power BI service. In one year, we moved from a situation where Microsoft seemed (and actually was) too slow in the BI market, to a weekly (and sometimes daily) wave of announcements that is becoming hard to follow even for enthusiastic early adopters (but don’t misunderstand me, and read my lips: D O N ’ T S T O P R E L E A S I N G N E W F E A T U R E S !!).
It’s clear that Microsoft is moving quickly to capture the cloud market of BI services, which is still largely unexplored (after all, most companies keep their data on premises), and it is the only player providing a credible story of integration between cloud and on premises BI services (see “The Box is Back” and “Happy Together” pictures used at the last PASS Summit). I finally see both strategy and execution working together. And execution is happening at the right speed, which has to be very fast nowadays.
There is a long list of features, improvements, technical details, and future directions that justify this analysis, but there is still a long road ahead before good comments become actual revenues and adoption in production. However, there are key changes I have seen this year that strongly support a positive outlook for the entire Microsoft BI stack of technologies:
- No more “by design” answer: first and foremost, the entire Microsoft BI team removed the “by design” answer for bug/suggestion submissions. The Power Query team started with this positive attitude of considering feedback as something to use for real. Now the entire Microsoft BI team listens to the community, considers feedback, and prioritizes features based on comments received. Don’t underestimate this point. This, and an increased transparency, is the foundation of the credibility that drives word of mouth and community support. Other Microsoft products don’t have this asset, and they should do something to get it. Microsoft is behaving as a startup here, and the results are visible. One year ago, this was a promise. Today, it is a fact.
- Constant and continuous innovation: weekly and monthly releases create a push to keep the software up-to-date. This impacts Power BI Desktop in particular, and backwards compatibility is no longer a constraint for innovation. There are pros and cons in this, but for now the pros outweigh the cons.
- DataZen acquisition: I’ve seen many comments from people who were not convinced about this move, because it was creating confusion and overlaps with other Microsoft products. Now that Microsoft has released SQL Server 2016 CTP 3.2, which includes the first version of DataZen (now named “mobile report type”) in Reporting Services, most of these concerns are over. However, I always thought that the DataZen acquisition was an excellent strategic move. Microsoft bought a fast-growing company that was not publicly listed, acquiring the technology and the development team in a single move. While most of the comments were about the effective value of the technology and the overlap between DataZen and the Power BI mobile app, my appreciation was more about buying a company to remove it from the market, so that other possible big competitors wouldn’t have a chance to buy it. Doing that when the company was mature enough to have a real customer base (and a real working product), but small enough to stay under the radar of many analysts, is an additional bonus. Great move.
- Custom visuals in Power BI: disclaimer: I might be biased for having inspired the creation of Synoptic Panel. But I think that making the visual components an open source part of Power BI and enabling a large community to contribute to a public gallery is a super smart move. If the community contributes, this is good. But even if the community did not respond, Microsoft has opened a door to release software that complements its product (Power BI) without bearing the costs of an official software release (the highest of which are internationalization and documentation). Think about the cost of releasing and supporting software in hundreds of languages (think of Office) and you should quickly realize why this move is smart. Not to mention the number of opportunities that it opens to Microsoft partners, leveraging the existing ecosystem.
- Adoption of “R”: Is the support of the “R” language really required for a tool like Power BI? Let me state an unpopular opinion: there are too many buzzwords in the BI arena, and it’s not the first time. It has always been the same, just with different buzzwords. We started with DSS (Decision Support Systems), we recently heard about big data, I’ve seen that BA (business analytics) rocks more than BI (business intelligence), you cannot live without a data scientist, and yes, you need “R” in your company. The list could continue for pages. But is this really true? I am convinced that “R” is very good in certain domains, and at the same time I see it used (or even just considered) where it is completely useless. But I am a consultant, I don’t sell licenses, I help customers build solutions that actually work, possibly reducing development cost and time. “R” is an option, but it is not necessarily important or relevant in many scenarios. But can Microsoft ignore the importance of buzzwords and trends in the market? Absolutely not. So adopting “R” for data sources and visualizations (see latest announcement for R Visuals in Power BI) is another great move that will help Microsoft sales (and this is true for all Microsoft partners, too).
I have seen in one year a multiple of what I have seen in the previous ten years. In reality, Microsoft is leveraging many assets that were already in-house (the Tabular engine, Power Pivot, Power Query, Power View, and many other Azure-based services), whose development required several years. But the weak point has always been the “last mile” of data visualization and presentation. Other players were years ahead in this area. And I think they are still ahead of Microsoft. You might have a scenario that already fits well with Power BI. Maybe your scenario will be “Power BI” friendly in a few months. But the trend is clear and the improvement is continuous. So, when I think about the considerations I will make in 12 months, I am very positive and I am also scared by the number of new features I have to learn. Since this is what I (probably we) have been asking for over many years, I cannot complain!
I always take a look at stock price of companies to check whether financial results and expected outcomes correspond to my point of view. Unfortunately, it’s not fair to compare Microsoft with other specialized companies that work only in the BI market. So we cannot draw any conclusion by observing that Microsoft (MSFT) is at its highest point since Tableau IPO (DATA), whereas Qlik (QLIK) didn’t add much value to stockholders in the last 3 years (source: http://www.nasdaq.com/symbol/msft/stock-comparison).
What can we expect in 2016? I expect Power BI to continue growing as a data visualization platform, in its ecosystem, and in adoption. However, the upcoming Microsoft SQL Server 2016 will be a huge release for the BI services. It will be the most important release since 2005, and this is mainly because of the new SQL Server Reporting Services (see Microsoft Business Intelligence reporting roadmap to read more). In reality, SQL Server Analysis Services is also a very important release, being the first major upgrade since SQL Server 2012 for Analysis Services Tabular (see latest announcements for SQL Server 2016 Analysis Services CTP 3.2).
Without going into detail, there are many new features that will make Tabular a more “enterprise ready” platform. It’s not that the current version is not good (I have helped many companies in the successful adoption of Tabular), but there are many improvements in productivity and performance that completely justify the upgrade even before considering the new features available in the new version. There will be a lot to write about that, and I really look forward to starting in a few weeks.
The magic key will be “Hybrid BI”. It’s already happening. I expect most if not all companies to move into a “hybrid” area for their BI solutions, where certain parts are implemented on premises, and others in the cloud. And the reason to keep a part on premises is not always preserving a legacy investment: it could be the need for top hardware performance for in-memory databases. Above a certain size, you need to make an accurate hardware choice, where spending more is not necessarily better. In the very long term, I expect everything to move to the cloud, but the reality is that the two technologies will live together for a very long time, and Power BI is a good candidate to play a major role in this scenario.
Last week Amir Netz presented in the Power BI blog a detailed introduction to the Synoptic Panel written by Daniele Perilli. This component won the Power BI best visual contest in October, but since then we have received a lot of feedback and Daniele has worked on it, improving the editor and the available features. The big news is that the Synoptic Panel now supports the SVG format. You can still start from a bitmap and draw the areas you want to bind to data in Power BI, but now the result of this job is an SVG file created by the Synoptic Designer. The graphical features of the designer are limited, so if you want a real arc, circle, or complex shape, you can create an SVG file with an online editor (such as http://editor.method.ac/ – but you can find many others) and then import it in Synoptic Designer. Since the SVG includes areas, the designer simply imports these entities and allows you to edit their names.
Now, besides the technical details, why is the impact of this feature huge? Well, practically any vector format can be converted to SVG, so if you have a technical drawing of a building, or any floor plan built using professional drawing software… chances are you can import them in Power BI now! But what if you don’t think this impacts you?
Well, you might need to display data on a map: by country, territory, state, city, district, whatever… Yes, we have Map and Filled Map components in Power BI, but they are good if you want to display the exact geographical details and if you are online. If you wanted to create custom areas, to remove certain details, or to use the map offline (or without Internet) in Power BI Desktop, until now you were unable to do that.
For example, if you want to display USA data by state, you might want to move Alaska and Hawaii in a different position in your map, such as in this example. And the good news is that you don’t have to create this file from scratch: we imported this example from Wikimedia, which already has a good SVG format that you can import in Synoptic Designer: https://commons.wikimedia.org/wiki/File:Blank_US_Map.svg
The time you have to spend to create a new map depends on the quality of the initial file. The USA map is a very good example, because there is already one entity for each state, and each state has the 2-letter state code associated to it.
- Go to http://synoptic.design/
- Paste the URL of the SVG file (https://upload.wikimedia.org/wikipedia/commons/3/32/Blank_US_Map.svg) and click GET (or browse from your local drive if you have it there)
- Now you will see in the editor all the states and selecting each one you will highlight the corresponding area on the right pane. You can type the complete name of each state here, or change the code to identify the state. You can store two names, one to bind data in Power BI and the other to display the name of the area in the visual (if blank, the same name used to bind data will be used by default).
- You can also clear some area if you don’t need it. If you want to change the drawing, you should edit the SVG file first in an external editor (such as http://editor.method.ac/). Please note that certain states in this specific map have an additional dash character you might want to remove.
- Click EXPORT TO POWER BI button (bottom right), and then press the DOWNLOAD SVG button
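Before importing a file in the designer, it can help to check which area identifiers the SVG already contains. The following is only a minimal sketch in Python, assuming that each area is a `path` element carrying an `id` attribute; the sample SVG, the `MI-` identifier with a trailing dash, and the `list_area_ids` helper are made up for illustration and are not taken from the Wikimedia file or from Synoptic Designer.

```python
import xml.etree.ElementTree as ET

# Namespace prefix used by ElementTree for SVG elements.
SVG_NS = "{http://www.w3.org/2000/svg}"

# Tiny hypothetical stand-in for a map file; a real map has one shape per state.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <path id="FL" d="M0 0 L10 0 L10 10 Z"/>
  <path id="WA" d="M20 0 L30 0 L30 10 Z"/>
  <path id="MI-" d="M40 0 L50 0 L50 10 Z"/>
  <path d="M60 0 L70 0 L70 10 Z"/>
</svg>"""


def list_area_ids(svg_text):
    """Return the id of every <path> element, with stray dashes trimmed."""
    root = ET.fromstring(svg_text)
    ids = []
    for path in root.iter(SVG_NS + "path"):
        area_id = path.get("id")
        if area_id:  # shapes without an id cannot be matched to a data value
            ids.append(area_id.strip("-"))
    return ids


print(list_area_ids(SAMPLE_SVG))
```

If the identifiers already match the values of the column you plan to bind in Power BI, there should be little left to edit by hand in the designer.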
If you find a good map that you think could be useful to other people, please click SUBMIT TO GALLERY, too! We already have a number of maps available, and you can quickly use them by navigating in Gallery / Country/Territory:
You can also browse the gallery directly from the Synoptic Panel component. Once you chose the map, you bind to Legend a column containing state names, to Values the measure to display (it will appear in the popup window by moving the mouse over a state), and to Saturation Values the measure that defines the saturation of the color (you can configure limits in Format properties). In the following example you see the Quantity measure used for both Values and Saturation Values, while the mouse is on Florida.
I really look forward to see new territories published in the gallery!
While I am teaching Mastering DAX and Optimizing DAX classes, I am always available to speak at local user group events. This week, I will present an updated version of the “Toolkit for DAX Optimization” in Amsterdam, on Thursday, November 26, 2015.
At this link you can register (for free) and see logistics details (the user group event is after work, starting at 17:30 with some food – presentations will start at 18:30 and I will speak at 19:00). Please note there are very few seats still available, so register only if you are sure to attend!
I already presented this session at PASS Summit 2015, but I will have more time and I’d like to show some more detail about how to use these tools with Power BI, too. And, of course, I will be open to Q&A!
This is the description of the content of this session.
Toolkit for DAX Optimization
How do you optimize a DAX expression? This session introduces you to useful tools to help measure performance, gather data to find the bottlenecks, and write new optimized versions of DAX. Starting from SQL Profiler, you learn which events are relevant for DAX and how to collect them in different environments (SQL Server Analysis Services [SSAS], Power Pivot, Power BI). Explore DAX Studio, which simplifies and speeds up the data-collection process and makes it easy to find bottlenecks in the storage engine and formula engine. The goal of this session is to provide a methodology to measure performance of your DAX measures, to find bottlenecks, and to identify the main reason of performance issues.
I will focus on tools more than on optimization techniques, but considering the very good feedback I received at PASS Summit, it seems the session works very well – I hope you will enjoy it, too!
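To give a rough idea of what “finding the bottleneck” means in numbers, here is a small Python sketch. It assumes you have already collected the total duration of a query and the time spent in the storage engine, and it treats the remaining time as formula engine time. The function name, the dictionary keys, and the sample numbers are hypothetical; this illustrates the arithmetic only and is not part of any tool mentioned in the session.

```python
def split_timings(total_ms, storage_engine_ms):
    """Split a query duration into storage engine (SE) and formula engine (FE) time.

    Assumption of this sketch: whatever is not spent in the storage engine
    is attributed to the formula engine.
    """
    if total_ms < 0 or storage_engine_ms < 0:
        raise ValueError("durations must be non-negative")
    if storage_engine_ms > total_ms:
        raise ValueError("storage engine time cannot exceed total time")

    formula_engine_ms = total_ms - storage_engine_ms
    fe_pct = 100.0 * formula_engine_ms / total_ms if total_ms else 0.0
    return {
        "fe_ms": formula_engine_ms,
        "se_ms": storage_engine_ms,
        "fe_pct": fe_pct,
        "se_pct": 100.0 - fe_pct if total_ms else 0.0,
        # The larger share is the first place to look when optimizing.
        "bottleneck": "formula engine"
        if formula_engine_ms > storage_engine_ms
        else "storage engine",
    }


# Hypothetical measurements: 1,000 ms in total, 250 ms in the storage engine.
print(split_timings(1000, 250))
```

A query that spends most of its time in the formula engine usually calls for a different rewrite than one dominated by the storage engine, which is why measuring comes before optimizing.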
I recently wrote a small article on SQLBI to describe how to create a DAX measure that searches for items that have all the items selected in a slicer, instead of any of them. For example, you might want to create a measure that displays how many customers bought at least one product for each of the categories selected in a slicer. You cannot change the behavior of the slicer itself, which applies a filter to the calculation, but using DAX you can iterate the selected items and make sure that each one exists in another set of items (such as the products bought by a customer). I think it could be an interesting technique for other applications, too (I would be happy to read your ideas in the comments!). You can read the complete article on SQLBI.
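The difference between “any” and “all” is easier to see outside of DAX. The following Python sketch is not the measure described in the article: it only reproduces the logic with sets, using made-up customers and categories and a hypothetical `customers_with_all_categories` helper.

```python
def customers_with_all_categories(purchases, selected_categories):
    """Return customers who bought at least one product in EVERY selected category.

    purchases: iterable of (customer, category) pairs.
    """
    selected = set(selected_categories)

    # Collect the set of categories bought by each customer.
    bought = {}
    for customer, category in purchases:
        bought.setdefault(customer, set()).add(category)

    # Keep a customer only if the selection is a subset of what they bought.
    return sorted(c for c, cats in bought.items() if selected <= cats)


# Hypothetical sample data.
purchases = [
    ("Ann", "Bikes"), ("Ann", "Helmets"),
    ("Bob", "Bikes"),
    ("Cid", "Helmets"), ("Cid", "Bikes"), ("Cid", "Gloves"),
]

print(customers_with_all_categories(purchases, ["Bikes", "Helmets"]))
```

With the default “any” behavior of a slicer, Bob would be counted as well, because he bought something in one of the two selected categories. Note that in this sketch an empty selection returns every customer.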
David DeWitt and Rimma Nehme delivered the day 2 keynote at PASS Summit 2015, talking about the Internet of Things. Many PASS attendees know David for providing some of the best keynotes ever: technical, and without vaporware or marketing stuff. So this is a good time to discuss the real state of the Internet of Things (IoT) these days.
It’s easy to talk about IoT, but it’s hard to enter this complex and heterogeneous world. There are many types of devices, and a simple categorization is consumer vs. industrial, which have very different requirements and features (not to mention cost, power, and standards used). The key to connecting IoT devices is the cloud, but communication happens in two directions: from device to cloud, and from cloud to device. These require different technologies and often different devices, because many of them have only a single role (sending data or receiving commands, sensors or actuators, although certain devices might do both).
As you can imagine, this is a real source of Big Data. I often see Big Data used to manage data that is generated in a structured relational database, which seems to be nonsense. But with the volume of data generated by IoT, this technology makes perfect sense. Of course, Azure has a lot of technologies that help you manage this amount of data, but since the topic of this blog is Business Intelligence, I’m more interested in what happens when you want to analyze data.
Here, a few technologies that have a certain history (if you know their former names) come into play. Data Mining (also known as Machine Learning these days) can be fundamental to make predictions based on previous behaviors. David is great at providing simple examples to explain the concept: a boiler has a pressure sensor and you have to open a valve before the boiler explodes. You can train the algorithm to predict boiler failure, or you can provide built-in intelligence in the algorithm, with a predefined pressure limit (in bar) at which to open the valve. The analysis of this data requires real-time stream analysis, and relying on the cloud for it would generate too much traffic and would also have higher latency (dangerous for this type of application).
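The “predefined limit” variant of the boiler example can be sketched in a few lines of Python. This is only an illustration of evaluating the rule locally and forwarding state changes instead of every reading; the readings, the 3.0 bar limit, and the `edge_filter` function are invented for the example and do not come from the keynote.

```python
def edge_filter(readings_bar, limit_bar):
    """Evaluate each pressure reading locally and report only valve state changes.

    Returns a list of (reading_index, command, pressure) tuples, which is
    what a device would forward to the cloud instead of every raw reading.
    """
    messages = []
    valve_open = False
    for index, pressure in enumerate(readings_bar):
        should_open = pressure >= limit_bar
        if should_open != valve_open:
            # The decision is taken on the device, with no network round trip.
            valve_open = should_open
            messages.append((index, "open" if valve_open else "close", pressure))
    return messages


# Hypothetical readings in bar, with a hypothetical limit of 3.0 bar.
readings = [2.0, 2.4, 3.1, 3.3, 2.8, 2.1]
messages = edge_filter(readings, limit_bar=3.0)
print(f"{len(readings)} readings, {len(messages)} messages sent: {messages}")
```

In this sketch only the two state changes leave the device, which is the traffic and latency argument in miniature.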
Here comes a new “definition”: fog computing, also known as edge computing. The idea is not to move the data to the computation, but to move the computation to the data. However, the IoT is a database problem, which is not managed in this way these days. And this is the main point of this keynote: proposing a Polybase for IoT that includes:
- Declarative language: today IoT is based on imperative languages, whereas the goal is to introduce a declarative language, such as IoT-SQL (imagine adding a WINDOW and ACTION suffix to a classical SQL query, so that it can act on a range of time, triggering an action when certain conditions happen)
- Complex object modeling: define a standard structure to identify IoT locations in a hierarchical structure (imagine an object model with an API to navigate hierarchies of objects, traversing paths and similar stuff – similar to many MDX statements we know well)
- Scalable metadata management: simple abstraction (Metadata, statistics, access privileges) unified to access different devices; metadata includes collection of standard and extended attributes
- Discrete & Continuous Queries: different query types, such as ExecuteOnce (like a standard SQL), ExecuteForever (continuous flow of responses from device), ExecuteAction (such as ExecuteForever plus an action to execute in defined conditions)
- Multi-purpose queries: here is the smart idea. With a definition of the process at a higher level, the decision of moving the work to the cloud or to the edge (in the fog) is made by the query optimizer, creating a real query plan that distributes the actual work to different parts of the system depending on the requests
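To make the WINDOW and ACTION idea from the list above more concrete, here is a Python sketch of a continuous query that evaluates a condition over a sliding window and triggers an action when it holds. It borrows the ExecuteAction name from the keynote only as inspiration: the function signature, the window size, and the sample stream are assumptions of this sketch, not a proposed syntax or API.

```python
from collections import deque


def execute_action(stream, window_size, condition, action):
    """Consume a stream of values, keeping a sliding window of the latest ones.

    Whenever the window is full and `condition(window)` is true,
    `action` is called with a copy of the window content.
    """
    window = deque(maxlen=window_size)
    for value in stream:
        window.append(value)  # the oldest value is dropped automatically
        if len(window) == window_size and condition(window):
            action(list(window))


# Hypothetical pressure readings in bar.
stream = [2.0, 2.5, 3.0, 3.4, 3.6, 1.0]
fired = []

execute_action(
    stream,
    window_size=3,
    condition=lambda w: sum(w) / len(w) > 3.0,  # average over the window
    action=fired.append,                        # stand-in for "open the valve"
)

print(fired)
```

An imperative loop like this is exactly what a declarative language would hide: the query would state the window, the condition, and the action, and leave to the optimizer the decision of where to run it.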
This approach is really ambitious, but an important part of it is the idea of embedding security in the system. If you think about the future of IoT, security is of paramount importance. I don’t know if this will be the future of IoT, but this speech raised points that have to be faced, sooner better than later.
You will be able to watch this keynote soon at PASStv on demand.
I’m sitting at the bloggers table at the Day 1 keynote of PASS Summit 2015 and I try to recap the important news for the BI community.
- SQL Server Analysis Services 2016 CTP 3 will be available in a few days here, but Microsoft already announced the news in this release:
- Previously announced news (most of them already described in this blog and/or available in Power BI Desktop):
- Extended Events for SSAS available in SSMS
- Parallel partition processing
- New DAX functions
- DirectQuery over SQL Server 2014
- New DAX functions and syntax (including variables)
- New SQL Server 2016 compatibility level: it’s no longer based on XMLA (which was created for Multidimensional) and we have a new Tabular Object Model (TOM). No problem for existing models if they keep the 1103 compatibility level, but once you migrate to the new model, you have to use the new API (but you also have access to the new features). The API for TOM is available in JSON, just as in Power BI APIs.
- JSON-based tabular scripting language: while there is still a single file for the entire model, the code is more readable and (most importantly) it’s easier to manage in a version control system, possibly merging changes applied by different developers to the same data model
- Bi-directional cross filtering: just as in Power BI Desktop, with an equivalent diagram view layout
- MDX support for DirectQuery: this is *BIG NEWS*. The implication is that you will be able to use Excel pivot tables (generating MDX queries) with DirectQuery. But wait, there is more about DirectQuery…
- Oracle, Teradata, and APS supported in DirectQuery: while I’m still confident that an in-memory database is very useful when you have a certain amount of data, or for certain calculations, the ability to create real-time dashboards connecting your relational database to a thin semantic layer is wonderful. Just be careful, you have to use compatibility level 1103 for this feature in CTP3 (this will certainly change before RTM).
- SSDT for Visual Studio 2015 available: this is just necessary and expected. The good news is that you don’t need to download 1GB of file if you already have VS2015. Important news here are better performance of SSDT when you design a data model (but I want to test it… again, it’s easy to improve what we have today) and finally we’ll be able to use Visual Studio Configuration manager to create test and production deployment configurations. Really really needed!
- Improved DAX formula editing: now, it’s easy to do something better than what we had in the past! I’m really demanding here and I hope there will be a way to integrate DAX Formatter (I will investigate on that this week). By the way, the new development environment has formula fixup – you rename a measure and an automatic refactor happens, just as in Excel 2016 and Power BI Desktop.
- DBCC for Analysis Services: you can check the consistency of your tabular and multidimensional models.
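The point about version control in the JSON-based scripting language can be illustrated with a short Python sketch. The two model fragments below are hypothetical and deliberately tiny: the property names (`model`, `tables`, `measures`, `expression`) are assumptions made for the example and should not be read as the exact schema of the new compatibility level. The sketch serializes both versions with a stable key order and shows that changing one measure produces a diff of just one line removed and one added.

```python
import copy
import difflib
import json

# Hypothetical fragment of a tabular model; the real schema may differ.
old_model = {
    "model": {
        "tables": [
            {
                "name": "Sales",
                "measures": [
                    {"name": "Total Quantity",
                     "expression": "SUM ( Sales[Quantity] )"}
                ],
            }
        ]
    }
}

# A second developer changes the expression of the measure.
new_model = copy.deepcopy(old_model)
new_model["model"]["tables"][0]["measures"][0]["expression"] = (
    "SUMX ( Sales, Sales[Quantity] )"
)


def model_diff(old, new):
    """Return only the added/removed lines between two JSON-serialized models."""
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    diff = difflib.unified_diff(old_lines, new_lines, lineterm="", n=0)
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


changes = model_diff(old_model, new_model)
for line in changes:
    print(line)
```

A line-oriented diff this small is what makes merging changes from different developers practical in a source control system.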
The keynote was more about SQL Server 2016 as a platform, on premises and in the cloud. SQL Server is now better than Oracle in the Gartner quadrant, in the TPC-H benchmark (for data warehouse), and for security (it is the database with the lowest number of vulnerabilities in the last 5 years). This would have been a joke 10 years ago. There are also many improvements, such as integration with R and many security, scalability, reliability, and performance improvements (in-memory technology in SQL Server is always improving).
You can watch PASS keynote on PASStv: http://www.sqlpass.org/summit/2015/PASStv/Keynote.aspx
That’s all for today’s keynote. More news later this week watching sessions, I hope!
I don’t usually blog about news, unless there is a relevant impact on future directions. Last week we saw a huge number of announcements for Power BI (Custom Visuals are now available to anyone through an online gallery, you can also import them in Power BI Desktop, and there are a number of improvements in the Power BI Service and a new version of the Personal Gateway).
I am sure we will see a number of announcements this week at PASS Summit 2015. If you look at the public schedule of the sessions, you see that we will finally know more about Analysis Services and Reporting Services, which are expected to show new features in SQL Server 2016. We will finally talk about Corporate BI again, even if we should talk about Hybrid BI these days.
The keynote will start at 8:15 AM Pacific Time on Wednesday, October 28, 2015. I will cover announcements made during the keynote on twitter (@marcorus) and with more details on this blog.
If you will be in Seattle this week, Alberto and I will present these sessions:
I hope Darren will be able to release an update to DAX Studio, which is one of the tools I will use in my session. I already have an early build to show in the session, just in case, but the goal is to make it broadly available before the session.
UPDATE 2015-10-26: DAX Studio 2.3.0 is available!
If you use Analysis Services Tabular, you should dedicate a good amount of time to hardware selection. Unfortunately, throwing money at an expensive hardware configuration could be a very bad idea, resulting in your $1,500 desktop running faster than your expensive $100,000 server. Moreover, if you use virtualization you have to be very careful with certain settings, which might affect performance in a very bad way. When I say this, I’m describing queries running up to 2-3 times slower than in optimal conditions. So, why spend time to gain 10% when you have a bigger issue to solve?
I described the main best practices in the article Optimize Hardware Settings for Analysis Services Tabular on SQLBI. This is the result of helping many companies to detect hardware bottlenecks and to plan the right hardware configuration. My experience says that the time you spend to correctly allocate the budget has a huge return on investment. Usually you cannot change the CPU or the RAM of a brand new server, so this step is critical. The next step is to check that the hardware configuration is correct. It’s incredible how many times I discovered that the BIOS settings of an expensive server were the reason for slow performance, so now it’s my first priority when I see a benchmark with suspicious numbers (compared to the tech specs of the CPU).
Now, a common objection I hear is that “we have a standardized hardware and virtualization platform”. I completely understand that, but I like to remind people that the goal is to get a better return on investment, and standardization has the ultimate goal of reducing costs. So we start to evaluate the cost of a solution that is compliant with the standards, but allocates different hardware to specific workloads. The result of this is spending less (in hardware and licenses) while getting more (performance).
I’d like to hear your stories about that – write your experience in the comments!
Last week we launched a new page on the SQLBI website: DAX Puzzle (you can also use www.daxpuzzle.com).
The idea is very simple: we describe a scenario, we ask you to solve a problem in that scenario, and we might provide some hints to help you find the solution. You can download a sample file (in Power BI Desktop, which is freely available to anyone – but we might consider Excel too, please provide your feedback) and spend some time finding the solution. When you are done, or if you are curious and don’t have enough time, you can access the page with the solution, read our description, and download the file with the solved problem.
There are no prizes. It’s just a workout for your mind. But it’s a good way to check whether you have something to learn in DAX. For every puzzle, we also provide links to particular sections of our Definitive Guide to DAX, which describes the topic related to the puzzle in more depth. Yes, I admit, it’s also a marketing initiative, but it’s fun if you like DAX!
The first puzzle we published is about USERELATIONSHIP. We already received several comments and I suggest you use the scenario page only for comments about the question, whereas the solution page is the right place to discuss alternative solutions. There are interesting conversations about the performance of different approaches, and I would like to point out that this first puzzle is not about performance. In fact, the fastest solution doesn’t use USERELATIONSHIP at all, and requires the new GROUPBY function. Now you also have another reason to read not only the solution, but also all the comments!
The next puzzle will be published in a few days… subscribe to our newsletter to be notified about new puzzles!
I am so happy to announce that The Definitive Guide to DAX is finally available!
Alberto Ferrari and I spent one year writing this book, and several years collecting the knowledge necessary to do that. The complete title is The Definitive Guide to DAX: Business intelligence with Microsoft Excel, SQL Server Analysis Services, and Power BI. You can imagine why we like to shorten it! However, the complete title gives you an important hint: this book covers the new DAX syntax of Excel 2016, Power BI Desktop, and Analysis Services 2016. For example, we covered all the table functions useful for calculated tables, which is a feature released in Power BI Desktop after we completed writing the book. This has been an additional challenge, but our goal was to publish a book dedicated to the DAX language, independent of the product and completely up-to-date.
But everything has a cost. It took us a huge amount of time to reach the depth and completeness we wanted in this book. And it will take you weeks if not months to read it cover-to-cover. Yes, I know, you no longer read technical books in this way. You open it at the right chapter and you get the content you need, you copy the pattern, you get the good hint. I do that at least once a week. But you will be able to use this book in that way only once you have a solid understanding of DAX. At the beginning, my suggestion is to start from chapter 1, even if you are an experienced DAX developer.
What if you are a DAX beginner? This book will be your guide, but you might consider a more introductory book to start with (you can find other books from us and from Rob Collie, depending on the product you use and the writing style you prefer). This is particularly important because we don’t spend a line in the book discussing the user interface. We wrote a book about the DAX language, so you have to know in advance the UI of a product that uses this language. Today, the list includes Excel (2010/2013/2016), Analysis Services (2012/2014/2016), and Power BI Desktop.
Why am I so excited about this book? After all, I have written many books (this should be the 10th in English, and I wrote three other books in Italian). Well, first of all, a few months after completing the writing, Alberto and I would not add or modify anything in this book. As you will read in the introduction, we made no compromises. We thought the size would be 450-500 pages, but the result is 530 pages of content (plus indexes, table of contents, and so on). Is it the perfect book? No, I am pretty sure we will discover some errors and something to clarify and to fix. It always happens. But we set the bar very high this time, and we are very satisfied with the final result. Only reviews will tell us if our perception is right, but we know this is the best result possible today. We had technical reviewers who helped us so much in getting the point of view of the reader, and I would like to mention the incredible job done by Gerhard Brueckl. Believe me, if you have written a technical book, your worst nightmare is the technical reviewer who reviews too much, so that you spend more time explaining why you were right than fixing the content. Well, Gerhard had the skills and the ability to highlight the right things. Thanks Gerhard, you deserve a public mention!
After this self-celebration, let me spend a few paragraphs on the content. We use this book as companion content for our courses Mastering DAX *and* Optimizing DAX. During the courses we have hands-on labs and a lot of interaction, but we constantly refer to the book to get more detailed information about specific functions and behaviors. Thus, if you attend these courses, you will find it easier to read the book. But you will not be able to skip it! Here is the table of contents, with some comments:
- Foreword: three of the authors of the DAX language and the VertiPaq engine wrote the foreword of our book: Marius Dumitru, Cristian Petculescu, and Jeffrey Wang.
- Introduction: read the introduction before buying the book. You will understand whether it is the book for you or not.
- Chapter 1: What is DAX?
- Chapter 2: Introducing DAX
- Chapter 3: Using basic table functions
- Chapter 4: Understanding evaluation contexts
- Chapter 5: Understanding CALCULATE and CALCULATETABLE
- Chapter 6: DAX examples
- Chapter 7: Time intelligence calculations
- Chapter 8: Statistical functions
- Chapter 9: Advanced table functions
- Chapter 10: Advanced evaluation context
- Chapter 11: Handling hierarchies
- Chapter 12: Advanced relationships
- Chapter 13: The VertiPaq engine
- Chapter 14: Optimizing data models
- Chapter 15: Analyzing DAX query plans
- Chapter 16: Optimizing DAX
Topics in chapters 1 to 12 are covered in our Mastering DAX workshop. We organized the content so that you can read the chapters one after the other. The content is very dense. At the beginning we use simpler examples, but we never repeat the same concepts, so if you skip one chapter you might miss some knowledge you need to fully understand the following topics. Even in chapter 6, which tries to consolidate previous content with practical examples, you will find something new in terms of ways you can use DAX.
Topics in chapters 13 to 16 are covered in our Optimizing DAX workshop. Please, don’t jump to this part if you have not read the previous chapters. We suggest that attendees of the course complete the self-assessment of the prerequisites before attending, and you can try to do the same for the book. If you are not ready, you will simply see a huge amount of numbers, without understanding how to connect the dots. You need a solid and deep knowledge of how evaluation contexts work in DAX before doing any optimization.
My personal estimate is that if you dedicate one week to every chapter, you will be able to complete the learning in 4 months. Read the book, absorb the content, practice. You might be faster at the beginning if you already know DAX. But be careful: you have never read anywhere what we describe in chapter 10 (we rewrote that chapter 3 times… but this is another story), and this is of paramount importance to really “understand” DAX. You have hardly seen anywhere else a complete description of all DAX table functions like the one in chapter 9. You will not find an extensive use of variables, but the VAR / RETURN syntax is described early in the book and you will see it used more and more with the advent of Excel 2016 / Power BI Desktop / SSAS 2016.
Finally, the goal of the book is not to give you patterns and best practices, but to teach you how DAX works, how to write good code, and how to measure the performance, find the bottlenecks, and possibly optimize it. As I always say, do not trust any best practice when it comes to DAX optimization. Don’t trust blogs, articles, books. Don’t trust my writings, either. Simply measure and evaluate case by case. And the reason is the first answer to any question that a consultant receives: it depends!
If you want to order the book on Amazon, here is a quick reference to the links for all the available versions of the site:
- North America
- South America
Enjoy your reading!
This is one of the few non-technical posts of this blog. Just skip it if you want to quickly come back to 100% BI related topics.
Last Saturday we ran SQL Saturday 454 in Turin. I was part of the organization, and actually I was one of the promoters of this event, which ran in the same city just a few months after SQL Saturday 400. The reason for that was an idea we had a few months ago: running a SQL Saturday very close to Milan, the city hosting Expo 2015 until October 31, 2015. In our plans, we should have been able to attract a large number of foreign attendees interested in combining a weekend in Italy, one day in Turin for SQL Saturday, and one day in Milan for Expo 2015. The initial target was more than double the attendees of a “regular” SQL Saturday in Italy, reaching 250 people and maybe even 300. After all, everyone was looking forward to visiting Expo 2015, right?
Unfortunately, I was wrong.
Part of my job is reading through the numbers. It took me just a few hours after opening a survey through our SQLBI newsletter and other social media to realize that Expo 2015 was not the worldwide attraction we initially assumed. Our ambitious goal was completely unreachable, and this was clear to me before anyone else accepted it. So we downsized the venue, but we wanted to run the best event we could. After all, it was still the SQL Saturday close to Expo 2015. And we kept the event in English. We asked all the speakers to deliver their sessions in English, regardless of the fact that 90% of the attendees would be Italian.
Now, if you have never visited Italy, you might not be aware of the lack of English skills of the majority of the population. You might think that people working in IT should have English skills in their CV by default. While this is true for reading technical documents, it is not entirely true for listening and speaking. From this point of view, the situation in Europe varies a lot from country to country. Smaller countries have better English skills. My guess is that this is because movies are not dubbed there and many have just subtitles, whereas the larger countries (Germany, France, Spain, and Italy) tend to distribute only the dubbed version of the movies, keeping the original version only for a limited number of cinemas in large cities. This fact alone makes a big difference in listening and speaking capabilities. I don’t have any study to demonstrate this correlation; it’s just my experience as a frequent traveler.
I wanted to write this disclaimer to describe another challenge we had for SQL Saturday 454. We were at risk of not having enough foreign attendees (a certainty for me) and not having a good number of Italian attendees, frightened by the fact that all the sessions would be in English. In the past, we had only a few sessions in English, but a complete conference in a foreign language without simultaneous translation was an unprecedented experiment. However, I was confident this would stop some, but not many, of the interested attendees.
At this point, you might be curious to know whether the event was a success or a failure. Well, in terms of numbers, we reached our predicted (downsized) target. It was an event slightly larger than the average in Italy and, ignoring our initial unreachable dreams of glory, it has been a success. But what impressed me was something unexpected.
There are a number of IT professionals in Italy who can attend an event, following all the sessions, engaging the speakers, asking questions, and keeping up the conversation without the language barrier I was used to seeing a few years ago. I was wrong again, but this time in a pleasant way.
The economic turmoil of recent years has been very tough for this country. I have a privileged position and a particular point of view, clearly seeing the issues that limit the competitiveness of companies and professionals in the global market, especially in IT. The language barrier is one of the many issues I see. Lack of self-investment in education is another one. And the list does not end here. I am an optimist by nature, but I am also realistic in any forecast. People around me know I don’t predict anything good for Italy in the short and medium term. However, even if I still don’t have data supporting that, I feel something has been changing.
I have a new hope.
There are a number of people spending a sunny Saturday in Italy to attend a conference in English, and they are able not only to listen, but also to interact in a foreign language. I am sure nobody (myself included) would have bet anything on that ten years ago. For one day, I felt at home in my city doing my job. If you attended SQL Saturday 454 in Turin, I would like to thank you. You made my day.
In the September release of Power BI Desktop, Microsoft introduced a new important feature: calculated tables.
Chris Webb wrote a quick introduction to this feature, and Jason Thomas published a longer post about when to use calculated tables.
The reason for the excitement about this feature is that it adds an important tool to the data modeling capabilities of DAX-based tools (at the moment only Power BI Desktop has this feature, but I guess that at least Analysis Services 2016 will provide the same capability). Using calculated tables you can materialize the result of a DAX table expression in the data model, and you can also add relationships to it. Without this tool, you would have to read the data from outside Analysis Services and then push the data back, and this wouldn't be possible in Power BI. I implemented similar techniques in the past by using SQL Server linked servers, materializing the result of a DAX query in a SQL Server table, and then importing that table again in the data model. Thanks to calculated tables, today I wouldn't do this roundtrip, and I would save processing time and reduce data model complexity.
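As a minimal sketch of what such a calculated table can look like, the following definition materializes one row per customer with the total sales amount. The names CustomerSales, Sales, Sales[CustomerKey], and Sales[SalesAmount] are assumptions made for this illustration and do not come from a specific model.

```dax
CustomerSales =
-- Hypothetical example: Sales, CustomerKey and SalesAmount are assumed names.
-- VALUES returns the distinct customer keys; CALCULATE triggers a context
-- transition so that the sum is computed for each customer.
ADDCOLUMNS (
    VALUES ( Sales[CustomerKey] ),
    "Total Sales", CALCULATE ( SUM ( Sales[SalesAmount] ) )
)
```

Once the table exists in the model, its CustomerKey column can take part in relationships like a column of any imported table, which is what removes the need for the roundtrip through an external database.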
Alberto Ferrari wrote an article describing a good use case for calculated tables. The article presents an implementation of a transition matrix between customer categories evaluated automatically based on other measures (for example, the revenues). I suggest you read Transition Matrix Using Calculated Tables and then try to implement the same intermediate table for the calculation with other techniques (ETL, SQL, ...). You will discover that calculated tables help you write cleaner and faster code for a transition matrix pattern.
Today (September 30, 2015) is the last day to submit an entry in the Power BI Best Visual contest. Daniele Perilli (who has the skills to design and implement UI) and I spent hours thinking about something that would be challenging and useful at the same time. Daniele published a couple of components (Bullet Chart and Card with States) that were useful for understanding the interfaces required to implement a Power BI visual component. But the “big thing” that required a huge amount of time was something else.
We wanted a component to color areas of a diagram, of a floor plan, of a flow chart, and of course of a map. From this idea, Daniele developed (and published today, what a rush!) the Synoptic Panel component for Power BI.
The easiest way to see it is to watch the video. However, an additional description can help. Let’s consider a couple of scenarios. For a brick and mortar shop, you can color the areas corresponding to categories (and subcategories) of products, using either saturation of colors or three-state logic (red-yellow-green, but you can customize these colors, too).
But what if you are in the airline industry? No problem, it’s just another bitmap.
Wait a minute, how do you map your data to the graphics? How can you start from a bitmap, and define the areas that you want to relate to airplane seats or product categories and subcategories? We don’t have coordinates like latitude and longitude, right?
Well, you can simply go to http://synoptic.design, import a bitmap, and design your areas straight in the browser: no download, no setup, no fees required. Each area has a name that you will use to connect it to the data in your data model. Yes, you read that right: you will not change your data model to use the Synoptic Panel. For example, here you draw the seat areas in an airplane:
And with some patience you locate all the areas of a shop, too:
In the right panel you have the coordinates, which you can modify manually, and the editor also has a grid to help you with alignment (a snap-to-grid feature is also available).
Once you have finished, you export the area definitions to a JSON file that you have to save at a publicly accessible URL so that the component can read it (we will also add the capability to store this information in the database, so dynamic areas will be available, too).
At this point, in Power BI you insert the component, specify the URL of the bitmap, the URL of the JSON file with the areas, the category, the measure to display, the measure to use for the color (as saturation or color state), you customize the colors, and your data are now live in a beautiful custom visualization.
Thanks Daniele for your wonderful job!
If you use SSAS Tabular, this is very important news!
Microsoft released a very important update for Analysis Services 2012 that provides performance improvements to pivot tables connected to an Analysis Services Tabular model: it is SQL Server 2012 SP2 Cumulative Update 8.
Microsoft discussed some of these improvements in this blog post: Performance problems on high cardinality column in tabular model
UPDATE 2015-09-22: I fixed this post in the following part.
In a previous version of this post, I wrongly reported that this update also fixed the unnatural hierarchies problem, which is described in the article Natural Hierarchies in Power Pivot and Tabular. In reality, only Power Pivot for Excel 2016 and SQL Server Analysis Services 2016 fixed the issue, which is still present in previous versions of Analysis Services (2012/2014) and Power Pivot for Excel (2010/2013).
Almost 3 years ago I wrote an article with the rules for DAX code formatting. If you quickly look at the article, you might think that it is all about readability of the code, and this is fundamentally true. But there are two rules that have particular importance for performance, too:
- Never use table names for measures
- Always use table names for column references
  - Even when you define a calculated column within a table
Well, it is not that writing or omitting the table name has a direct impact on performance, but without these rules you can easily miss an important bottleneck in your formula. Let me clarify with an example. If I read this:
= [A] + SUMX ( Fact, Fact[SalesAmount] )
I would say that SalesAmount is a column of the Fact table, and the SUMX iteration will not perform a context transition. But if I read this:
= [A] + SUMX ( Fact, [SalesAmount] )
I would start to be worried about the number of rows in the Fact table, because the measure SalesAmount is evaluated for each row of the Fact table, and each evaluation invokes a context transition that creates a different filter context.
This simple detail makes a huge difference in performance. A context transition is fast, but doing it millions of times takes time.
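To make the hidden cost visible, here is a sketch of how the second formula can be read. It assumes that [SalesAmount] is a measure defined as SUM ( Fact[SalesAmount] ); that definition is hypothetical and used only for illustration. A measure reference is implicitly wrapped in CALCULATE, so the two formulas below are equivalent to each other:

```dax
-- Assumption: the measure [SalesAmount] is defined as SUM ( Fact[SalesAmount] ).
-- The formula as written, with a measure reference inside the iterator:
= [A] + SUMX ( Fact, [SalesAmount] )

-- The same formula with the implicit CALCULATE written out.
-- CALCULATE runs once per row of Fact, and each run performs a context transition:
= [A] + SUMX ( Fact, CALCULATE ( SUM ( Fact[SalesAmount] ) ) )
```

Seeing the explicit CALCULATE inside the iterator is what should raise the alarm: the cost grows with the number of rows in the Fact table, whereas the column reference Fact[SalesAmount] in the first example is read directly from the row context.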
Wait a minute: are you asking yourself what a context transition is and why a measure generates it? No problem: read the article Understanding Context Transition to get a quick recap of the topic (and if you want to dig deeper, preorder The Definitive Guide to DAX, available in October 2015!).