THE SQL Server Blog Spot on the Web
Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | Join | Help
in Search

SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server

  • CREATE GLOBAL CUBE statement does not support cascading M2M relationships?

    Thanks to a comment sent by a reader of my M2M paper, I just found a possible bug (or "not supported by design feature") of the CREATE LOCAL CUBE mdx statement. In short, it seems that a simple many-to-many dimension relationship is supported, but if you have a cube with multiple many-to-many dimension relationships chained together, this statement does not work. More details are described in the bug I opened on Connect site: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=354521

    I know that CREATE LOCAL CUBE is not the best way to create local cubes. This paper explains why. However, this is the statement that Excel 2007 uses to create Offile cubes. So, this bug compromise the ability to put a cube offline with Excel 2007 if you have cascading many-to-many relationships chained together.


  • Sample chapters from Programming LINQ

    Two sample chapters of my Programming Microsoft LINQ book are finally available. Links to download pages in the chapters title.

    Chapter 6 - Tools for LINQ to SQL

    In this chapter, we took a look at the tools that are available to generate LINQ to SQL entities and DataContext classes. The .NET Framework SDK includes the command-line tool named SQLMetal. Visual Studio 2008 has a graphical editor known as the Object Relational Designer. Both allow the creation of a DBML file, the generation of source code in C# and Visual Basic, and the creation of an external XML mapping file. The Object Relational Designer also allows you to edit an existing DBML file, dynamically importing existing tables, views, stored procedures, and user-defined functions from an existing SQL Server database.

    Chapter 16 - LINQ and ASP.NET

    This chapter showed you how to leverage the new features and controls available in ASP.NET 3.5 to develop data-enabled Web applications, using LINQ to SQL and LINQ in general. Consider that what you have seen is really useful for rapidly defining Web site prototypes and simple Web solutions. On the other hand, in enterprise-level solutions you will probably need at least one intermediate layer between the ASP.NET presentation layer and the data persistence one, represented by LINQ to SQL. In real enterprise solutions, you usually also need a business layer that abstracts all business logic, security policies, and validation rules from any kind of specific persistence layer. And you will probably have a Model-View-Controller or Model-View-Presenter pattern governing the UI. In this more complex scenario, chances are that the LinqDataSource control will be tied to entities collections more often than to LINQ to SQL results.

    The following is the complete list of the chapters included in the book.

    Programming Microsoft LINQ

    • Part I LINQ FOUNDATIONS
      • 1 LINQ Introduction
      • 2 LINQ Syntax Fundamentals
      • 3 LINQ to Objects
    • Part II LINQ to Relational Data
      • 4 LINQ to SQL: Querying Data
      • 5 LINQ to SQL: Managing Data
      • 6 Tools for LINQ to SQL
      • 7 LINQ to DataSet
      • 8 LINQ to Entities
    • Part III LINQ and XML
      • 9 LINQ to XML: Managing the XML Infoset
      • 10 LINQ to XML: Querying Nodes
    • Part IV Advanced LINQ
      • 11 Inside Expression Trees
      • 12 Extending LINQ
      • 13 Parallel LINQ
      • 14 Other LINQ Implementations
    • Part V Applied LINQ
      • 15 LINQ in a Multitier Solution
      • 16 LINQ and ASP.NET
      • 17 LINQ and WPF/Silverlight
      • 18 LINQ and the Windows Communication Foundation
    • Appendixes
      • A ADO.NET Entity Framework
      • B C# 3.0: New Language Features
      • C Visual Basic 2008: New Language Features

  • To join or not to join: that is the question (in LINQ)

    A comment received by one reader of Programming LINQ suggested me to underline a concept that is not so intuitive using LINQ, especially if you come from years of SQL coding.

    The idea is very simple. Two entities in LINQ might be related in the model. Whenever this happen, usually it is better to leverage on this existing relationship and not to write the join syntax in an explicit way. If you are using LINQ to SQL, the generated SQL code might be more performant or at least correspondant to the one generated by writing an explicit join in your LINQ query. The less constraints in your query, the better.

    Let's look at an example on the Northwind database. Imagine you want to see a list of all categories with a flag set for the one which a particular product belongs to. This is a SQL query we could write:

    SELECT
        c
    .CategoryID, 
        c
    .CategoryName,
        CASE WHEN p.ProductID IS NULL 
            THEN 0
            ELSE 1
        END AS Selected
    FROM Categories c
    LEFT JOIN Products p
        ON p.CategoryID = c.CategoryID
        AND p.ProductID = 10
    ORDER BY CategoryName

    Ok, we can write the same query in many other ways, but there are several more complex situations where a LEFT JOIN is used to test the presence of an element in a related table. A correspondant LINQ query might be the following one:

    from c in dc.Categories
    orderby c.CategoryName
    join p in dc.Products.Where(p => p.ProductID == 10)
        on c.CategoryID equals p.CategoryID 
        into pj
    from x in pj.DefaultIfEmpty()
    select new {
        c.CategoryID,
        c.CategoryName,
        Selected = x != null
    };

    The LINQ query above will generate a SQL query containing a LEFT JOIN statement. However, a relationship exists between Categories and Customer, and you can leverage on this relationship in the point where you really need to traverse the relationship (in the projection statement). The following one is a better way to get the same result:

    from c in dc.Categories
    orderby c.CategoryName
    select new {
        c.CategoryID, 
        c.CategoryName,
        Selected = c.Products.Any( p => p.ProductID == 10 ) ? true : false
    };

    This new version has two advantages. First, it is shorter and express its intent more explicitly.  Second, it generates a SQL query with an EXISTS statement, similar to the following one.

    SELECT CategoryID, CategoryName,
        (CASE
            WHEN EXISTS(
                SELECT NULL AS [EMPTY]
                FROM Products AS p
                WHERE (p.ProductID = 10) AND (p.CategoryID = c.CategoryID)
                ) THEN 1
            ELSE 0
        END) AS Selected
    FROM Categories AS c
    ORDER BY CategoryName

    The execution plan used by SQL Server might be similar if not equal. However, using the implicit relationship between Categories and Products in the LINQ query is usually better, because it gives more freedom to the LINQ provider to generate a more efficient SQL code.


  • Great video resource to learn Data Mining

    I worked on Data Mining solutions years before Microsoft added it to Analysis Services. It is an interesting domain for everyone, but it becomes really hard to explain how it works when you scratch the surface. At the last TechEd Developers I watched a great presentation held by Rafal Lukawiecki - it was a level 400 session but he's able to explain the concepts very well, going deep at the right time. Now, there is a complete for part series recorded and available for free here: http://www.microsoft.com/emea/spotlight/sessionh.aspx?videoid=865

    It is a valuable resource and you cannot afford to stay away from Data Mining very long if you are into the BI deal.

     


  • What means writing a book

    I and Paolo Pialorsi have been interviewed at TechEd by Ken Rosen. We talked about our experience as book authors of Programming LINQ.

    If you are interested in writing a book, or if you simply want to see our faces and hear our italian accent, you can watch the video available in both low resolution and high resolution. Enjoy!


  • TechEd 2008 book

    I'm already in Orlando for TechEd 2008 Developers. Tomorrow I and Paolo Pialorsi will be at the TechEd bookshop for a book signing of our just released Programming Microsoft LINQ, scheduled at 4:00PM-4:30PM. I wrote a post a few days a go with the list of chapters included in the book. LINQ to SQL and LINQ to Entities are two technologies that are significative to access data, even if you don't have to use them in every possible scenario.

    If you are attending to TechEd, meet us tomorrow at the bookshop to talk about LINQ!


  • Programming Microsoft LINQ finally shipping

    Finally, the Programming Microsoft LINQ book is available. We (I and Paolo Pialorsi, the other author) updated the website that supports our books (http://programminglinq.com), where you can download the sample code of the book.

    What’s in this book? Well, we tried to cover everything that was in RTM, but we also introduced technologies that are still in beta or in early CTP stages, like LINQ to Entities and Parallel LINQ. To give you an idea of the content, at the end of this post there is a list of the chapters included in the book.

    Now, the next news is that I will be at TechEd Developers next week in Orlando. Feel free to contact me if you want to arrange a meeting talking about LINQ or SSAS. Moreover, I and Paolo will be at the bookshop for book signing on June 3rd from 4:00pm to 4:30pm. I hope to see you there!

    In these days I'm already working on a new project, this time on Data Warehousing and Analysis Services. More news on this blog in a few weeks...

    Programming Microsoft LINQ

    ·         Part I LINQ FOUNDATIONS

    o   1 LINQ Introduction

    o   2 LINQ Syntax Fundamentals

    o   3 LINQ to Objects

    ·         Part II LINQ to Relational Data

    o   4 LINQ to SQL: Querying Data

    o   5 LINQ to SQL: Managing Data

    o   6 Tools for LINQ to SQL

    o   7 LINQ to DataSet

    o   8 LINQ to Entities

    ·         Part III LINQ and XML

    o   9 LINQ to XML: Managing the XML Infoset

    o   10 LINQ to XML: Querying Nodes

    ·         Part IV Advanced LINQ

    o   11 Inside Expression Trees

    o   12 Extending LINQ

    o   13 Parallel LINQ

    o   14 Other LINQ Implementations

    ·         Part V Applied LINQ

    o   15 LINQ in a Multitier Solution

    o   16 LINQ and ASP.NET

    o   17 LINQ and WPF/Silverlight

    o   18 LINQ and the Windows Communication Foundation

    ·         Appendixes

    o   A ADO.NET Entity Framework

    o   B C# 3.0: New Language Features

    o   C Visual Basic 2008: New Language Features


  • Multiple Hierarchies from SQLCAT

    I'm still late with blog reading and sometimes I discover interesting things one or two months later than the original posting. I just read the SQLCAT analysis of the several solutions available to handle multiple parent-child hierarchies within a single dimension. The multiple hierarchies pattern described in my many-to-many paper has been used (with some variations) and now I have the "SQLCAT certification" that this model is faster than others!

    I think there is space for improvement in this area and the many-to-many space is still a relatively unexplored space. If you have experience adopting those models, please share your knowledge - and if you are shy, write me directly and I'll post the interesting data.


  • Many-to-Many Session at European PASS Conference 2008

    I'll be a speaker at the European PASS Conference 2008 next week in Dusseldorf. I will talk about advanced dimensional modeling using many-to-many relationship. The content is based on "The many-to-many revolution" paper I wrote two years ago. If you will attend that conference and you already used some of the models I described in the paper, I will be happy to get direct feedback from you.

    My session is scheduled on Wednesday at 14:30. Unfortunately, I will not have much time after the session because I will have to go to the airport - for this reason, contact me in advance if you want, I'll be at the conference starting from Monday.


  • LINQ to extract object permission from SSAS

    Yesterday one customer of mine encountered the issue described in KB933836 and posted also by Chris Sells.

    The issue is that the deployment of a SSAS database always overwrite object permissions defined on database objects like cubes. The "keep existing roles" setting of deployment wizard does not preserve those information. I needed to solve the issue and I simply made a simple program that creates an XMLA file with object permissions that I need to restore after database deployment. I used LINQ and XML Literals in Visual Basic - I wrote a book about LINQ and I have to find a reason for this, but I really think that LINQ saved my time.

    I will write a more complete article about this issue one day, may be... by now, if you have the same issue, you can contact me to get the code I wrote. If you don't need it, look at the readability of LINQ in a case like this.

    image 

    If I will receive some request, I will consider to publish this simple tool.

     


  • PASS Europe 2008 and many-to-many relationships

    It's a shame I haven't blogged for two months, but I've been really (I mean - REALLY) busy finishing the upcoming Microsoft Programming LINQ book. Now I'm coming back to my BI world and the first news is that I'll speak about SSAS models based on many-to-many relationships at the European PASS Conference 2008, April 14-16 2008, Neuss, Germany (Europe, of course...). For those of you who want to know what I'm going to talk about... you can read The Many-to-Many Revolution paper I wrote more than one year ago. It is based on SSAS 2005, but there is nothing to add with SSAS 2008 too.

    In reality I and Alberto developed some variation of the base models I described in the paper. In particular, we had great success with the Balance Reclassification model (sorry, we still don't have an article about it) that allows defining multiple parent-child hierarchies on a balance sheet, also using sign rollup operator (which is not natively supported if you have a many-to-many relationship with a parent-child hierarchy). Unfortunately, we desisted using it in a Gb-sized fact table for performance reason. But it is working well in many other typical scenarios, where balance sheet fact table has a granularity that gives a lower size of the fact table.

    Let me know if this is an argument (Balance Reclassification model) that is interesting for you - I could give some anticipation in a blog post or in a short article.


  • Many-to-Many Dimensions: Query Performance Optimization Techniques

    More than one year ago I published a paper about design patterns for many-to-many dimension relationships. Since then, I built other models and I got more experience about possible performance optimizations. Unfortunately, I still hadn't time to write about it and this topic requires verbose document and analysis to be reproducible and understandable by everyone. Luckily, now there is a white paper (Analysis Services Many-to-Many Dimensions: Query Performance Optimization Techniques) downloadable from Microsoft site that explore this world and offers an interesting set of scenarios and possible optimizations, with numbers that explain when, what and how you can expect to optimize a model based on one or more many-to-many dimension relationships.

    I already read the document and it is pretty good. Unfortunately, in the real world there are still complex scenarios that cannot be optimized with the techniques used here. I hope that in future versions the SSAS engine will provide some more advanced optimizations for similar cases. One of the first step could be simply automating the optimizations made "by hand" like the Matrix Relationship Optimization shown in the paper. Another would be the simplification for defining efficient aggregations when many-to-many relationships are involved (now you could create a lot of aggregations that are unused when M2M are involved, and sometimes users query a cube only using M2M relationships - you have to tune the aggregations manually).


  • Free Introducing LINQ book

    Usually I write in this blog about Business Intelligence without too much consideration for "traditional" access to SQL Server. However, in a parallel universe I also wrote a book about LINQ and now that we are in full Christmas mode, I hope you may be interested that you can download the Introducing Microsoft LINQ book for free!!

    I wrote this book with Paolo based on Beta 1. Some contents are now outdated, but general concepts and 80% of the syntax is still valid. And, remember, now it's free, so you shouldn't claim too much! We are currently working on a more advanced and complete book about LINQ (Programming Microsoft LINQ), which is expected to be available by mid-May 2008.

    Have a good read, a Merry Christmas and a Happy New Year!


  • Real Report Builder 2008 improvements for Analysis Services

    Brian Welcker (who is leaving SSRS team - good luck Brian!) has announced a long explanation of the Report Builder 2008 positioning.

    For me (and all of you who live in the multidimensional business...) the most important part of the post is this simple sentence:

    [...] However, for folks building reports against Analysis Services cubes, auto-generating a model is no longer required.

    This is really a good news who will drive Report Builder adoption in SSAS shops. I still haven't tested Report Builder 2008 very much, but this news will help me to find the time to do it.


  • Generate date dimension that support working days calculation

    I just wrote a T-SQL query based on CTE that generates support information to calculate working days in a period. Simply look at the WorkingDaySequential measure, which difference between two dates is the number of elapsed working days.

    In the real world:

    • The Holidays CTE could be replaced by a real table with user-inserted data.
    • The temporary [#Calendar] could be a real table, providing necessary data access from DM

    and probably much else.

    The following code is provided "as is", without any warranty about its behavior.

    UPDATE 2007-11-28 - fixed two missing conditions in the query

    SET STATISTICS IO ON
    GO
    
    SET DATEFIRST 7  -- Default is 7 (US / Sunday) 
    GO
    
    DECLARE @StartYear AS INT 
    DECLARE @EndYear AS INT 
    
    SET @StartYear = 2006; 
    SET @EndYear = 2015; 
    
    WITH    Holidays
              AS ( -- Italian Holidays (use year 1900 for recurrencies dates)
                   SELECT   Date = CAST('19000101' AS DATETIME)
                   UNION ALL
                   SELECT   Date = CAST('19000106' AS DATETIME)
                   UNION ALL
                   SELECT   Date = CAST('19000425' AS DATETIME)
                   UNION ALL
                   SELECT   Date = CAST('19000501' AS DATETIME)
                   UNION ALL
                   SELECT   Date = CAST('19000602' AS DATETIME)
                   UNION ALL
                   SELECT   Date = CAST('19000815' AS DATETIME)
                   UNION ALL
                   SELECT   Date = CAST('19001101' AS DATETIME)
                   UNION ALL
                   SELECT   Date = CAST('19001208' AS DATETIME)
                   UNION ALL
                   SELECT   Date = CAST('19001225' AS DATETIME)
                   UNION ALL
                   SELECT   Date = CAST('19001226' AS DATETIME)
                   UNION ALL
           -- Holidays changing date each year
                   SELECT   Date = CAST('20060417' AS DATETIME)
                   UNION ALL
                   SELECT   Date = CAST('20070409' AS DATETIME)
                 ) ,
            Years
              AS ( SELECT   YYYY = @StartYear
                   UNION ALL
                   SELECT   YYYY + 1
                   FROM     Years
                   Where    YYYY < @EndYear
                 ) ,
            Months
              AS ( SELECT   MM = 1
                   UNION ALL
                   SELECT   MM + 1
                   FROM     Months
                   WHERE    MM < 12
                 ) ,
            Days
              AS ( SELECT   DD = 1
                   UNION ALL
                   SELECT   DD + 1
                   FROM     Days
                   WHERE    DD < 31
                 ) ,
            DatesRaw
              AS ( SELECT   YYYY = YYYY,
                            MM = MM,
                            DD = DD,
                            ID_Date = YYYY * 10000 + MM