SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server
-
To make a long story short: the ADO.NET team is now responsible for the ADO.NET Entity Framework (including LINQ to Entities) and for LINQ to SQL (the latter was originally in the charge of a separate team, tied to the C# compiler).
There is an evident overlap between LINQ to SQL and LINQ to Entities and, since the first day, Microsoft has said that in the long run these two solutions would be merged into a single one. Now, the roadmap that is emerging is: Entity Framework will be improved by adding the features necessary to cover scenarios where LINQ to SQL is preferred today over LINQ to Entities and Entity Framework.
There are a lot of comments - I suggest you start here to get a good recap and pointers to many others.
My personal opinion is that LINQ to SQL is very good in some scenarios and should not be dropped until a good alternative (in EF?) is available. For example, I use LINQ to SQL to implement nightly processes that are part of ETL solutions. In these cases, I use LINQ to SQL to read data (especially configuration data, but sometimes also source data) and the SqlBulkCopy API to write data into destination tables. Having everything necessary in a single executable file, without external dependencies, is a big advantage for deployment (a single file to copy). Today LINQ to Entities would be slower, would involve more files and would require .NET 3.5 SP1 on production servers (the last point would not be a real issue in my case). There are of course other scenarios where something makes LINQ to SQL a better choice than the current version of Entity Framework.
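The shape of such a nightly process (read configuration, read source rows, write them to the destination in bulk) can be sketched as follows. This is only an illustration in Python with the standard `sqlite3` module, not the LINQ to SQL and SqlBulkCopy code itself: the table names (`etl_config`, `source_sales`, `dest_sales`), the configuration keys and the helper names are all hypothetical, and `executemany` merely stands in for a real bulk-copy API.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_demo_db():
    """Create an in-memory database with hypothetical demo tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE etl_config (name TEXT PRIMARY KEY, value TEXT);
        CREATE TABLE source_sales (id INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE dest_sales (id INTEGER PRIMARY KEY, amount REAL);
        """
    )
    conn.executemany(
        "INSERT INTO etl_config VALUES (?, ?)",
        [("batch_size", "2"), ("min_amount", "10")],
    )
    conn.executemany(
        "INSERT INTO source_sales VALUES (?, ?)",
        [(1, 5.0), (2, 20.0), (3, 50.0), (4, 7.5), (5, 100.0)],
    )
    conn.commit()
    return conn


def run_nightly_load(conn):
    """Read configuration, then copy qualifying source rows in batches."""
    # Step 1: read configuration data (name/value pairs).
    config = dict(conn.execute("SELECT name, value FROM etl_config"))
    batch_size = int(config["batch_size"])
    min_amount = float(config["min_amount"])

    # Step 2: read source rows lazily through a cursor.
    source = conn.execute(
        "SELECT id, amount FROM source_sales WHERE amount >= ? ORDER BY id",
        (min_amount,),
    )

    # Step 3: write to the destination in batches (stand-in for a bulk copy).
    written = 0
    for chunk in batched(source, batch_size):
        conn.executemany("INSERT INTO dest_sales VALUES (?, ?)", chunk)
        written += len(chunk)
    conn.commit()
    return written


if __name__ == "__main__":
    print(run_nightly_load(build_demo_db()))
```

The point of the pattern is that reading is done with a convenient query layer while writing goes through a batch-oriented path, and everything lives in one small program.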
I hope that the convergence of two partially overlapping frameworks turns out well, but at the same time it shouldn't penalize the current users of the "losing" part. This will take several releases of .NET to complete, and I hope that in the meantime the LINQ to SQL engine will keep evolving enough to hold its current position as a "light, LINQ-oriented DAL replacement for SQL Server".
|
-
PDC 2008 is coming to an end and there is some news for developers that involves BI too. The big news of this PDC is Windows Azure. Microsoft is entering the hosting service market but, unlike other players, it brings a new programming model “for the cloud”.
However, the way Microsoft presented these new technologies to the more than 6,000 developers attending PDC from all over the world was disappointing. Ray Ozzie spoke in two keynotes on two different days, and both seemed to be aimed at the not-so-technical press and analysts. This was the first PDC for Ray Ozzie, and probably nobody explained to him how highly skilled the PDC audience is: people who want to get the big picture but also the inner details. In these keynotes, nobody really explained what the roadmap is. OK, some day every application will be a service in the cloud, but today we live in a different world. What are the intermediate steps in the years to come? What should the transition from the present to the future look like? What is Microsoft’s vision for this migration?
I haven’t heard strong messages. I haven’t seen a clear roadmap. Like me, other attendees were frustrated. We had to understand what’s happening by reading between the lines, inferring information from breakout sessions without a precise guide. Not a good result, Microsoft. Believe me, these are the facts you should have emphasized during the keynotes.
· WPF: it’s here to stay. I would have asked attendees “how many of you still use Windows Forms?” and then said “Guys, I hope you have a strategy to migrate to WPF, we’re going there and we’re serious about it”. Visual Studio 2010 will have a WPF interface. Completely pluggable. This is an important message that goes beyond Visual Studio extensibility. It’s a signal to the market. WPF is ready for the big game. But only 2 minutes were dedicated to this.
· Oslo: this is the most revolutionary thing for developers and it will change the way they work, probably more than the transition from COM to .NET. And, guess what? No mention of Oslo in two keynotes. But, wait, there was a 90-minute keynote from Don Box and Chris Sells. And what happened? They had to shrink their presentation to 60 minutes, talked about Windows Azure and didn’t mention Oslo. This was premeditated. Why did they do so? I would like to know the answer. The result: a lot of developers (maybe the majority) are still wondering what the purpose of “M” is. It’s wonderful, but nobody stated a clear message about its positioning in the long-term strategy.
· .NET 4.0: the sessions had good information about it but, even in this case, more information, strategy and future directions during the keynotes would have been appreciated.
· PDC 2009: yes, there will be another PDC on 17-20 November, 2009. Still in Los Angeles (I would prefer Las Vegas, anyway). But this “short” distance means that something Microsoft is disclosing is still at an early stage. Otherwise it means that there are other new things coming soon. Maybe both. However, this might not be a good signal to the market: it sounds more like “wait another year before making a strong investment in anything”. At least, that is how it sounds to me.
So, this is my feedback about PDC 2008. Now, what’s coming for BI? There are several services, including the integration of SQL Data Services as a data source for Reporting Services and an incubation project about Data Mining provided “by the cloud”. The first becomes interesting once some public shared data becomes available in the cloud (imagine updated demographic data to be compared with internal sales data). The second is really interesting because it opens the door to a wider use of Data Mining tools. But today we can only play with it, because there is no information about the cost of such a service once it goes into production. However, the message is that the cloud is relevant for BI too.
|
-
If you are at PDC 2008, I'll be at the book signing for Programming Microsoft LINQ at the bookstore on Tuesday 28, during the coffee break between 3:00 and 3:30 PM. Paolo and I will be happy to meet you and receive your direct feedback about our LINQ book.
|
-
Some more considerations about Gemini. In the last few days, I've seen some interesting posts and comments.
Chris is still worried about the loss of control, and I agree with him that metadata is a key part of a complete BI solution, and today we are missing it - this is one of the reasons that inspired many choices of the SQLBI Methodology.
Chris still doesn't agree with Amir Netz, who commented on my blog and on his with another analogy (which explains the "analogy game" in the title). It is so interesting (you might agree or disagree - but it's a funny game) that I think it deserves more visibility than a blog comment gets.
In many ways, today's BI world behaves very much like a 1920's centralized communist regime. The central BI government owns all of the legal supply side. It believes it knows better than anyone what the citizens needs and wants and set up to provide everyone’s needs. While it does not have the capacity to handle anything close to the full needs of the population, it prefers that some will go hungry and barefoot rather than let them supply their own needs. In fact, it will prosecute anyone who shows any enterprise spirit offering an alternative supply route. It will demonize the free enterprises, call them “capitalist pigs”, “traitors of the cause”. The central government believes its suppression of the free enterprises is for the “common good of the motherland”, even at the cost of the suffering of the citizens, without realizing that the motherland is supposed to serve the citizens first and foremost.
The central regime will find itself with an endless attrition war against the population. The black market will thrive unless ruthless measures are taken. Back to our BI world – these iron fist measures are mostly unacceptable in today’s enterprises. Therefore the central IT government often finds itself without much teeth. It can wage a propaganda war but the citizens will mostly ignore it and will continue with their black market activities.
Gemini is the “glasnosts”. It is the relaxation of the central monopolization of the supply taps. It is about allowing free enterprises to operate, as long as they are not overstepping reasonable bounds. It is about oversight rather than suppression. Gemini is about trusting the citizens to do the right things once given a chance (presumed innocent unless proven guilty). In such a market the citizens and the government work together for the benefit of the population. This new marketplace is about creating efficiencies and encouraging creativity. It is about letting new ideas float and let the market pick the winners. The government will still hold firm control over the most essential services: military, judicial, education and healthcare. But at the same time, it will allow almost everything else to be generated by the free population. It will hold a regulatory and police force to verify that the bounds are not stepped over but at the same time, will be non-intrusive until violations occur.
So, which regime do you endorse?
Amir.
As I said, I share Chris's fears, but I'm still convinced that this is not the important point. Gemini is inevitable. The question is how to manage it and its possible misuse. If we are unable to do so, we will not have time to build "trusted" solutions, because all of our time will be lost trying to fix problematic Gemini-Excel reports. A strong metadata initiative from Microsoft would be a relief for many of us in this direction.
|
-
I looked at some of the Gemini comments written over the weekend. I'm sincerely worried about the possible abuse of this tool, just like everyone over there. However, I think that Microsoft showed Gemini so early (read my previous post about what Microsoft has not shown) just to establish a position in the market. There are other tools out there that are similar to Gemini but are proposed as a cheap alternative to classical DW/OLAP solutions. The fact that Microsoft will also be part of this market should make it clear to several customers that these two technologies are not necessarily in competition and could be complementary. Of course, the problem is that this technology is particularly subject to misuse and abuse. I liked the analogy made by Chris between Gemini and illegal drugs. What is the best way to control them? Legalization or prohibition? It's a really hard question. Considering that we are talking about software and you can't have rules and police to enforce prohibition, it seems to me that legalization is the only way to control it. However, my worst nightmare is another one: looking at some MS marketing presentation selling Gemini as the "final solution" to every BI need. Will Microsoft be able to instruct its sales reps and its partners about how to position the product correctly? In my opinion, this will be the most critical part of the Gemini debut.
|
-
I’m a little bit disappointed at not getting information about PerformancePoint v2 at the Microsoft BI Conference 2008. Why announce Gemini, which will ship in 2010, and not say a single word about PerformancePoint v2, which would probably ship earlier than that date?
In reality, the reason is clear to me. Gemini has been announced now to establish a position against competing products that are already on the market. Of course, Gemini will be much more integrated with the overall platform, but it will not be available to customers in production for at least 18 months, I think. Thus, this announcement is a tactical move for Microsoft.
However, what can we do today? We need to deliver “traditional” BI solutions, and the new release of PerformancePoint would be important in all those scenarios where the current version is not mature enough to meet existing requirements. The official word is probably “PerformancePoint will be part of the Office 14 wave, and it has still to be announced”. However, I would have expected a much more detailed explanation of the products coming in the near future.
Now, let’s summarize the Microsoft BI Conference 2008. Most of the attendees I talked with (and I agree) think that three keynotes with this kind of content in three days of conference were not a good idea. Only the keynote of the first day made sense. The others were unproductive and/or not important at all. The session about Gemini held yesterday by Amir Netz was far more visionary than the many buzzwords we heard in the following keynotes. Microsoft did a very good job of raising the technical level of the sessions, but the same was not done for the keynotes. Moreover, all these keynotes took time away from breakout sessions, allowing an attendee to see only 9 breakout sessions in three days. You can see the same number of breakout sessions in two days of a “typical” Microsoft technical conference. Several improvements are needed here, but the technical content was really improved over the previous edition.
The good news is that the third Microsoft BI Conference is already scheduled: October 6-8, 2009, in Seattle. Save the date, it should be a conference delivering real bits to attendees…
|
-
I just found this useful Add-In for Excel 2007 that checks the compatibility of a spreadsheet for Excel Services publication. I'd like to get feedback if you know of other similar tools and/or if you have experience with this one.
|
-
I got some more information about Gemini. As Mosha said, it is Analysis Services. As you know, SSAS has always been available also as a client tool, using the same engine as the server product to create local cubes. The Kilimanjaro release will add several features to this engine that, integrated with the client part of Gemini (which allows end users to define their own model), will give high-speed performance to cubes that are stored in memory without any pre-calculated aggregation.
According to Ashvini Sharma, these new features might not be available to traditional SSAS developers in the Kilimanjaro release. However, cubes created by end users will expose all SSAS metadata as usual. For this reason, these cubes will be immediately available to any SSAS-enabled client.
Now, it is questionable to call these entities cubes, since (but this is my speculation) they could have a very unusual design compared to traditional cubes.
Anyway, in the long term, all these new features (which have not yet been disclosed in detail) will be available at every level. In the first release only, the server instance of SSAS will be used by Excel Services. When you create a model in Excel, the local engine will be used. When the same spreadsheet is published to SharePoint, the Excel Services engine will use an instance of SSAS (or maybe more than one instance on several servers) to perform the calculations on these data.
Unfortunately, not all information about internal details can be disclosed now. However, it seems that SSAS will have a bright future.
|
-
We’ve seen it on stage. It’s Excel on steroids. I’m talking about the “big announcement” here at the Microsoft Business Intelligence Conference. The announcements are:
· Madison: the integration between SQL Server and DATAllegro, with release expected within 1H 2010. Look at it if you have a data warehouse of hundreds of TB or more.
· Kilimanjaro: a sort of “R2” version of SQL Server, expected to be released within 1H 2010. It includes self-service analysis (project code name Gemini) and self-service reporting.
The most interesting part is of course the Gemini project. It’s a column-based in-memory storage that seems to be part of Analysis Services (not many details for now) and that can be used locally with Excel (just like you can use local cubes, I believe). In the end, you have an add-in for Excel that allows you to build relationships between data coming from different sources and simply “pasted” into the model. No model design required. Relationships between data are inferred by the engine, and the navigation tool provided in Excel is an evolution of the PivotTable that you can then publish in SharePoint. In a few words, Excel is the reporting tool of the future. Just like today, but much more powerful.
I would like to use this user interface with existing SSAS cubes: the user interface for filtering data is so much better than today’s… It’s better not to show this to end users until it is released; in the meantime, we have to look at what it really does and what the real limits of this approach are. I suspect that data cleansing and more “traditional” BI solutions are not going to disappear, but this approach can integrate very well with them and satisfy customers who cannot afford a complete project lifecycle before something is delivered. More details in the next few days…
|
-
Today we published the draft of the paper "SQLBI Methodology at work". We applied the SQLBI Methodology to the well-known Adventure Works.
As usual, we look forward to getting your feedback in the dedicated forum.
|
-
The first cumulative update for SQL Server 2008 has been shipped. It can be downloaded here.
This cumulative update contains several hotfixes released for AS2005 (CU8, CU9, CU10) but not included in SQL Server 2008 RTM. There are also some other bug fixes.
|
-
The next Microsoft Business Intelligence Conference is coming and I just looked at the sessions to include in my agenda. Every time I do this, I find something in the session descriptions that anticipates an announcement coming in one of the keynotes. But this time, the session titles alone are enough to see that big news is coming.
CL211 New Horizons for Microsoft Business Intelligence with Self-Service Analysis Technologies presented by Donald Farmer and Amir Netz
PL308 New Horizons for Microsoft Business Intelligence with Self-Service Reporting presented by Lukasz Pawlowski, Carolyn Chau, Sean Boon, Roger Sanborn, Chris Baldwin, Thierry D'hers
I don't know if there are new products or new releases of existing products involved in these announcements. However, the keyword here is "self-service BI". Other vendors have already used this term in the past; now it's Microsoft's turn. I hope to see real products and not only buzzwords.
BTW: I will attend the conference in Seattle - drop me a line if you will be there and you'd like to discuss the SQLBI Methodology with us (Alberto is coming too). You can find posts about it using the Methodology tag.
|
-
This post is part of a Methodology discussion - other posts will follow. I will be happy to get your feedback!
I am proud to announce a public draft of the first paper about the SQLBI Methodology. Alberto Ferrari and I tried to define a consistent methodology that covers the construction of the back-end of a BI solution using Microsoft SQL Server and its complementary services. Now, we are looking for comments and feedback about it.
But, wait. Is this “yet another data warehouse methodology”? It depends on your point of view, because it strongly inherits concepts from both Kimball’s and Inmon’s methodologies, but it also includes concepts that are specific to serving a multidimensional database like Analysis Services. However, before you start reading the whole paper, in this post we make a quick comparison between the different methodologies we are talking about.
On the web, there are plenty of discussions regarding the Inmon vs Kimball debate. Both authors have fans that seem to believe that the choice between the two approaches to a data warehouse project is much more a religious war than a technical decision.
We (I’m writing these posts with Alberto Ferrari) do not want to start another war with this post, mainly because we do not believe that such a war has any reason to exist, as is the case with any religious war.
Both Inmon and Kimball developed methodologies that work well in different kinds of data warehouses. There are situations where the Kimball approach quickly produces an effective data warehouse and there are others where the Inmon approach leads to a cleaner solution. Both methodologies have drawbacks too, and this post is not the right place to list all of them.
What is really important, in our opinion, is something else. If you choose the wrong approach, you will discover (usually too late) that you cannot complete your data warehouse project on time and on budget. Moreover, since the requirements in the BI world change frequently, it is often the case that you have to choose the methodology when the real requirements are still not clear.
Having worked with both methodologies on different projects, we believe that a careful analyst should take an approach other than the “religious war”. For example, it would be a good idea to use the best of both methodologies, starting with the easier Kimball method but being ready to introduce Inmon structures if and when needed.
Our experience is that a clean analysis of the entire BI solution (including the data warehouse) can produce a very effective database and a set of ETL procedures that will let you get the best from both methodologies. Most importantly, this can change the general approach. The ability to change your mind very late in the construction of the data warehouse is very important, because it lets you follow the user requirements even when they change deeply.
If the whole ETL process is designed with a strong and clean architecture, then it is possible to switch from Kimball to Inmon in a smooth and easy way. Clearly, this is feasible if and only if you start, from the beginning, with a mixed approach in mind. This mixed approach is the one described in the first paper about the SQLBI Methodology, downloadable here.
As said at the beginning, since this is the first public draft of the paper, we will be glad to receive feedback about it. You can contact us directly and you can also participate in the dedicated SQLBI Methodology forum.
|
-
Bart De Smet just wrote a long post about LINQ predicates that can be defined without returning a boolean value. This is something I partially evaluated while writing the Programming Microsoft LINQ book, but in his post Bart goes very deep into this topic and shows a lot of interesting details and ideas.
|
-
This post is part of a Methodology discussion - other posts will follow. I will be happy to get your feedback!
When building a BI solution (like any kind of software project), it is normal to look for existing methodologies and best practices, to avoid pitfalls and get better results. Usually, a methodology is not tied to a particular technology and/or software product. However, considering the details and features that will be available in your solution might affect important decisions. If we apply these concepts to a BI solution, we might discover that the features of a particular product can have a wide impact on the overall architecture.
In this post, we want to consider the impact that the adoption of the SQL Server BI stack (and in particular of Analysis Services) can have on a methodology.
The first part of a modern BI solution is typically a data warehouse. Even if you adopt Analysis Services, you will usually have a relational star schema as a data source. This star schema will be part of a Data Mart extracted from the Data Warehouse, or sometimes will be part of the Data Warehouse itself. If we talk about relational star schemas, and more generally about Data Warehouse modeling, there are well-known best practices to follow. Nevertheless, the papers and books describing them are generally not tied to a particular platform; the only requirement is the use of a relational database. The model you create will not be affected by the particular vendor's product you use as a DBMS. If users are going to write their own queries against the Data Warehouse, everything is OK.
Now, when Analysis Services comes into the game, our previous choices might affect our final result. Analysis Services has several modeling capabilities that might change the relational Data Mart design we would otherwise have chosen. Just to name a few of these features:
- Many-to-Many Dimension Relationships
- Reference Dimensions
- Parent-Child Dimensions
- Joins to dimensions at different levels of granularity
- Data Source View (which allows the definition of views outside the database)
- MDX Scripts
At this point, several issues arise. Are reference dimensions a signal that a snowflake schema might be better than a star schema? Are parent-child dimensions a way to avoid building a fixed number of hierarchy levels corresponding to the maximum depth of the existing hierarchies? Should we prefer named queries in the Data Source View to regular (and centralized) views in the relational database? Can the power of MDX Scripts replace some part of the ETL processing?
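To make the parent-child question more concrete, here is a minimal sketch in Python of what "building a fixed number of hierarchy levels" means: each member of a parent-child table is expanded into one column per level, up to the maximum depth found. The sample geography, the function names and the choice to pad shorter paths by repeating the member itself are assumptions made for illustration; the sketch also assumes the hierarchy has no cycles.

```python
def path_to_root(member, parent_of):
    """Return the list of members from the root down to `member`."""
    path = [member]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    path.reverse()
    return path


def flatten_hierarchy(parent_of):
    """Turn a parent-child mapping into fixed-depth level lists.

    Every member gets exactly `max_depth` levels; shorter paths are
    padded by repeating the member itself (one possible convention).
    """
    paths = {m: path_to_root(m, parent_of) for m in parent_of}
    max_depth = max(len(p) for p in paths.values())
    return {m: p + [p[-1]] * (max_depth - len(p)) for m, p in paths.items()}


# Hypothetical parent-child dimension: member -> parent (None for the root).
geography = {
    "All": None,
    "Europe": "All",
    "Italy": "Europe",
    "Turin": "Italy",
    "Asia": "All",
}

if __name__ == "__main__":
    for member, levels in flatten_hierarchy(geography).items():
        print(member, levels)
```

With a native parent-child dimension this expansion is not needed in the relational model, which is exactly why the feature can influence the Data Mart design.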
Answering yes or no to each of the previous questions might affect our architecture, and in particular the relational design (but also the ETL). We didn’t mention the many-to-many relationships. When used in a simple way, they reflect the many-to-many dimension relationships that exist in a regular Kimball design. However, as shown in the “The Many-to-Many Revolution” paper, we might create a very atypical relational schema, just to satisfy our multidimensional modeling needs.
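A small sketch may help to see why a many-to-many relationship needs special treatment. In this Python illustration (the accounts, customers and function names are hypothetical, not taken from the paper), a bridge table links accounts to their holders: each customer sees the full balance of every account they hold, but the grand total must count each account only once, so it is not the sum of the per-customer values.

```python
# Hypothetical fact data: one balance per account.
balances = {"A1": 100, "A2": 50}

# Hypothetical bridge table: an account can have several holders.
bridge = [("A1", "Mark"), ("A1", "Paul"), ("A2", "Paul")]


def total_by_customer(balances, bridge):
    """Sum, for each customer, the balances of the accounts they hold."""
    totals = {}
    for account, customer in bridge:
        totals[customer] = totals.get(customer, 0) + balances[account]
    return totals


def grand_total(balances, bridge):
    """Sum each account reachable through the bridge exactly once."""
    accounts = {account for account, _ in bridge}
    return sum(balances[a] for a in accounts)


if __name__ == "__main__":
    print(total_by_customer(balances, bridge))
    print(grand_total(balances, bridge))
```

Here the per-customer values add up to 250 while the correct grand total is 150. A plain star schema with a naive join would report the former.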
Thus, we need to consider how these changes affect our methodology of choice. If we want to leverage the features of a particular product, we will need to make several exceptions to a standard methodology. Not having a guide for these exceptions often leads to inconsistent results, where several people on the team (or, sometimes, the same people over time) use different techniques to implement something that is not well described in the original methodology.
Moreover, if you extend these considerations to the client used to navigate OLAP cubes, there are more substantial differences. These differences impact both the user experience in terms of speed and ease of use and the format of the query sent to the server, forcing you to adapt the cube structure to the specific client.
For these reasons, over the last few years we have defined a set of rules, patterns and best practices that form a specific methodology for implementing a BI solution with the Microsoft SQL Server BI stack of services. We named it “SQLBI Methodology” and we will publish it on the SQLBI web site by September 2008.
|