
Jorg Klein

Jorg Klein, Microsoft-only BI consultant from the Netherlands

  • SSIS - Blowing-out the grain of your fact table

Recently I had to create a fact table with a lower grain than the source database. My source database contained order lines with a start and end date and monthly revenue amounts.

    To create reports that showed overall monthly revenue per year, lowering the grain was necessary. Because the lines contained revenue per month I decided to blow out the grain of my fact table to monthly records for all the order lines of the source database. For example, an order line with a start date of 1 January 2009 and an end date of 31 December 2009 should result in 12 order lines in the fact table, one line for each month.

    To achieve this result I exploded the source records against my DimDate. I used a standard DimDate:
[Screenshot: a standard DimDate table]
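The screenshot is gone, so here is a minimal sketch of the DimDate shape this post assumes; the exact definition is hypothetical, but the ActualDate, Month (YYYYMM) and DayOfMonth columns follow from the query below:

    -- Assumed shape of DimDate; only the columns used by the explode query are shown
    CREATE TABLE DimDate (
         DateId     int      NOT NULL -- surrogate key (hypothetical)
        ,ActualDate datetime NOT NULL -- the calendar date
        ,[Month]    int      NOT NULL -- month key in YYYYMM format, e.g. 200901
        ,DayOfMonth int      NOT NULL -- 1 through 31
    )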

The query below did the job; use it in an SSIS source component and it will explode the order lines to a monthly grain:

    SELECT OL.LineId
          ,DD.ActualDate
          ,OL.StartDate
          ,OL.EndDate
      FROM OrderLine OL
     INNER JOIN DimDate DD
        ON DD.Month BETWEEN (YEAR(OL.StartDate) * 100 + MONTH(OL.StartDate))
                        AND (YEAR(OL.EndDate) * 100 + MONTH(OL.EndDate))
     WHERE DD.DayOfMonth = 1

Some explanation about this query:

• I always want to connect each record to the first day of the month in DimDate, which is why this WHERE clause is used:

    WHERE DD.DayOfMonth = 1


• Because I want to join on the month (format: YYYYMM) of DimDate, I need to format the start and end date in the same way (YYYYMM):

    (YEAR(OL.StartDate)*100+MONTH(OL.StartDate))
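As a quick sanity check of the arithmetic: for a StartDate of 15 March 2009 the expression evaluates to 2009 * 100 + 3 = 200903, which matches the YYYYMM month key in DimDate:

    SELECT YEAR('2009-03-15') * 100 + MONTH('2009-03-15') AS MonthKey -- returns 200903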

    The source, order lines with a start and end date:
[Screenshot: source order lines]


The result, monthly order lines:
[Screenshot: exploded monthly order lines]

  • SSIS - Let the Excel connection manager pick the right column data types from an Excel source

The Excel connection manager scans the first 8 rows of a sheet to determine the data type of each column in your SSIS source component. So if an Excel sheet column has integers in the first 8 rows and a string value in the 9th row, your data flow task will crash when executed because SSIS expects integers.

    Fortunately you can change the number of rows that Excel will scan with the TypeGuessRows registry property.

    Change TypeGuessRows:

    1. Start Registry Editor by typing "regedit" in the run bar of the Start menu.

2. Search the registry (CTRL+F) for "TypeGuessRows".

3. Double-click "TypeGuessRows" and edit the value.

    Todd McDermid (MVP) commented the following useful addition:
    "Unfortunately, that reg key only allows values from 1 to 16 - yes, you can only increase the number of rows Excel will "sample" to 16."

    Robbert Visscher commented:
    "The reg key also allows the value 0. When this value is set, the excel connection manager scans every row to determine the data type for a column in your SSIS source component."

Thanks Robbert, I think setting it to 0 can be very powerful in some scenarios!

So the conclusion from Todd's and Robbert's comments is that a value from 0 to 16 is possible:
• TypeGuessRows 0: all rows will be scanned. This might hurt performance, so only use it when necessary.
• TypeGuessRows 1-16: a value between 1 and 16 is the regular range for this reg key; use it in normal scenarios.
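For reference, a minimal command-line sketch of the same change; it assumes the Jet 4.0 provider on a 32-bit system, and the exact key path varies with the provider (Jet vs. ACE) and Office version installed:

    REM Scan all rows when guessing Excel column types (0 = scan everything)
    reg add "HKLM\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel" /v TypeGuessRows /t REG_DWORD /d 0 /f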
  • SSIS – Unpack a ZIP file with the Script Task

A while ago I needed to unpack a couple of zip files from SSIS. There is no Microsoft SSIS task that contains this functionality, so I searched the Internet. It seems there are quite a few third-party tools that offer this functionality, and it's also possible to download custom SSIS tasks. I personally always try to avoid third-party tools and custom tasks, so I searched on.
It turns out there is a way to unzip files from SSIS with the Script Task: with some Visual Basic code using the Visual J# Library you can do the job. In this blog post I will use a Foreach Loop Container to loop through a folder that contains multiple zip files and unzip them one by one.

    Make sure you have the Microsoft Visual J# Redistributable Package installed because a reference to vjslib.dll (Visual J# Library) is needed in the Script Task. Download it here for free.

Drag and drop a Foreach Loop Container on the Control Flow and create three variables (SourceFolder, DestinationFolder and FileName) with scope on the Foreach Loop Container:
[Screenshot: Variables window]

    Now configure the Foreach Loop Container:
    - Enumerator: Foreach File Enumerator
    - Files: *.zip
    - Retrieve file name: Name and extension
[Screenshot: Foreach Loop Editor]

Next, click the + next to Expressions and add the following expression (sketched after the screenshot) to connect the SourceFolder variable to the Directory property of the Foreach Loop Container:
[Screenshot: Foreach Loop properties with the Directory expression]
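The expression itself was only visible in the screenshot; presumably it simply mapped the variable to the property, along these lines:

    Property: Directory    Expression: @[User::SourceFolder]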

Now go to the Variable Mappings and select the FileName variable on Index 0. This way we will be able to access the current file name when the Foreach Loop Container enumerates the zip files.
[Screenshot: Variable Mappings]

    Now drag and drop a Script Task on the Control Flow, inside the Foreach Loop Container:
[Screenshot: Control Flow with the Script Task inside the Foreach Loop Container]

Open the Script Task Editor and do the following:
- Set the ScriptLanguage to: Microsoft Visual Basic 2008
- Select our three ReadOnlyVariables using the new SSIS 2008 Select Variables window:
[Screenshot: Select Variables window]

    Now click Edit Script and copy/paste the following script:

    Imports System
    Imports System.Data
    Imports System.Math
    Imports Microsoft.SqlServer.Dts.Runtime
    Imports java.util.zip

    'The Script Task project template generates this class declaration (SSIS 2008, VB)
    Partial Public Class ScriptMain
        Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

        Public Sub Main()

            Try

                Dim strSourceFile As String
                Dim strDestinationDirectory As String

                'MsgBox("Current File: " & Dts.Variables("FileName").Value.ToString)

                'Both folder variables are expected to end with a path separator
                strDestinationDirectory = Dts.Variables("DestinationFolder").Value.ToString
                strSourceFile = Dts.Variables("SourceFolder").Value.ToString & Dts.Variables("FileName").Value.ToString

                Dim oFileInputStream As New java.io.FileInputStream(strSourceFile)
                Dim oZipInputStream As New java.util.zip.ZipInputStream(oFileInputStream)
                Dim sbBuf(1024) As SByte

                'Walk through all entries in the zip file
                While True

                    Dim oZipEntry As ZipEntry = oZipInputStream.getNextEntry()

                    If oZipEntry Is Nothing Then Exit While

                    If oZipEntry.isDirectory Then

                        'Recreate the directory structure of the zip file
                        If Not My.Computer.FileSystem.DirectoryExists(strDestinationDirectory & oZipEntry.getName) Then

                            My.Computer.FileSystem.CreateDirectory(strDestinationDirectory & oZipEntry.getName)

                        End If

                    Else

                        'Copy the zip entry to the destination file, one buffer at a time
                        Dim oFileOutputStream As New java.io.FileOutputStream(strDestinationDirectory.Replace("\", "/") & oZipEntry.getName())

                        While True

                            Dim iLen As Integer = oZipInputStream.read(sbBuf)

                            If iLen < 0 Then Exit While

                            oFileOutputStream.write(sbBuf, 0, iLen)

                        End While

                        oFileOutputStream.close()

                    End If

                End While

                oZipInputStream.close()
                oFileInputStream.close()

            Catch ex As Exception

                Throw New Exception(ex.Message)

            End Try

        End Sub

    End Class


Now only one thing needs to be done: add a reference to vjslib.dll (Visual J# Library):
[Screenshot: Add Reference dialog with vjslib selected]

    Your unzip solution is ready now! For testing purposes you can uncomment the following line in the script to see the file name of each processed zip file in a message box at runtime:

    'MsgBox("Current File: " & Dts.Variables("FileName").Value.ToString)

[Screenshot: message box showing the current file name]



You can use this solution in many ways. For example, I used it in the solution below, where I download multiple zip files from an FTP server. These zip files contain CSVs that are used as a source for loading a data warehouse.
[Screenshot: complete control flow with FTP download and unzip]

     

  • SSIS - Lookup is case sensitive

A while ago I figured out that the lookup transformation is case sensitive.
I used a lookup to find dimension table members for my fact table records. This was done on a String business key like ‘AA12BB’. I attached a table to the error output, and after running the package I found one record in this table. This record had a business key like ‘Aa12BB’. I searched the dimension table for this missing record and it surprised me: it DID exist, but with the following business key: ‘AA12BB’. It seemed the lookup transformation is case sensitive. The next thing I tried was a T-SQL query in the Management Studio of SQL Server 2005. In the WHERE clause I referred to the business key ‘Aa12BB’. The query returned the record with business key ‘AA12BB’. Conclusion: SQL Server (with its default, case-insensitive collation) is not case sensitive, but the SSIS lookup component IS case sensitive… Interesting.

    Solution:
    After some research I found a few solutions for this interesting feature of the lookup transformation. Before I explain these solutions you must know something about the inner working of the lookup component.

    A lookup transformation uses full caching by default. This means that the first thing it does on execution, is loading all the lookup data in its cache. When this is done it works as expected, but with case sensitivity.

The first solution is to set the CacheType property of the lookup transformation to Partial or None; the lookup comparisons will now be done by SQL Server and not by the SSIS lookup component.
Another solution is to format the data before you do the lookup. You can do this using the T-SQL LOWER() or UPPER() functions. These functions can be used in a query or, for example, in a Derived Column SSIS component, as sketched below.
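A minimal sketch of this second approach, with hypothetical table and column names: upper-case the business key on both sides so the comparison no longer depends on case.

    -- Lookup query: cache the dimension with an upper-cased business key
    SELECT DimensionKey
          ,UPPER(BusinessKey) AS BusinessKey
      FROM DimProduct

    -- And upper-case the incoming key too, e.g. in a Derived Column
    -- with the SSIS expression: UPPER(BusinessKey)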

  • SSAS - Clear SSAS cache with an SSIS package

Often I see developers on different forums asking how they can clear the SSAS cache. You can achieve this by restarting SSAS, which is done quite often.
Restarting is not necessary though; it's possible to clear the cache of an SSAS database with an XMLA script. Use the script below, replacing "YourSsasDatabaseId" with the ID of your SSAS database and "YourSsasCubeId" with the ID of your SSAS cube.

    <ClearCache xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
      <Object>
        <DatabaseID>YourSsasDatabaseId</DatabaseID>
        <CubeID>YourSsasCubeId</CubeID>
      </Object>
    </ClearCache>

Please note that you must use the IDs of the database/cube and not the names! You can find the IDs of these objects in the properties in BIDS or in SSMS. As you can see in the screenshot below, the cube/database name is not always the same as the ID. This particular cube has ‘Finance' as its name and ‘DW' as its ID!

[Screenshot: cube properties showing the name ‘Finance' and the ID ‘DW']

You can run this script manually from SSMS or automatically using SSIS. To run it from SSMS, right-click your SSAS database and choose New Query > MDX. Although this is an XMLA script and not MDX, it can be executed as an MDX script.

Running this script from SSMS is useful, but I think most developers would want to clear the cache automatically. Fortunately SSIS has an ‘Analysis Services Execute DDL Task' that can execute this script!

[Screenshot: Analysis Services Execute DDL Task on the Control Flow]

    Just configure the task for your SSAS instance and copy/paste the script in the SourceDirect box as shown below.

[Screenshot: DDL Task editor with the script pasted in the SourceDirect box]

    With the possibilities SSIS offers you can now clear the cache anytime you want. For example you could run it right after you have processed the cube. A lot of developers restart the server after processing but with this script that is no longer necessary!

    Thanks to Jamie Thomson for sharing this script on his blog!

  • SSAS - Speed up dimensions using a NULL default cube measure

    Recently I faced some problems with the performance of SSAS dimensions. The cube users were using a large dimension with more than 100.000 members that didn't perform well.
They tried to add a leaf dimension member to an Excel 2007 pivot table. When dragging this dimension member onto the rows, they had to wait a very long time before the members returned from SSAS and showed on the screen.

    After some mailing with Chris Webb he thought this could have something to do with the default cube measure. It seems that when you query dimension members without picking a measure, SSAS takes the first measure from the first measure group as its default measure. So even when you only query a dimension, SSAS is still using a measure!

    You can find out which measure SSAS will take with the following query:

    SELECT [Measures].DefaultMember ON 0 FROM [YourCube]

In this case the default measure that SSAS picked was from a measure group that was not connected to the dimension that was giving the performance problems. This, plus the fact that returning 100.000 NULLs is faster than returning some big float number, explained the performance issue.

    Chris advised me to use a NULL calculation as default measure, as he explains on his blog. The only problem here is that you can't select a calculation as default measure in the cube properties in BIDS (only normal measures are allowed):

[Screenshot: default measure dropdown in the cube properties]

    Fortunately pasting this MDX statement in the calculations script (use the script view on the calculations tab) did the trick. Just paste it right under the CALCULATE command (or somewhere else):

    CREATE MEMBER CURRENTCUBE.MEASURES.UseAsDefaultMeasure
     AS NULL,
    VISIBLE = 1;

    ALTER CUBE CURRENTCUBE UPDATE DIMENSION Measures,
    DEFAULT_MEMBER = [Measures].UseAsDefaultMeasure;

    When you return to the form view it should look like this:

[Screenshot: calculations script with the UseAsDefaultMeasure member]

    The next time you deploy the cube, SSAS will use the [UseAsDefaultMeasure] calculation as its default measure.


    Test results:

    I now had a nice opportunity to test the cube performance with and without the NULL calculation as default cube measure. I cleared the cache before each query to get a good comparison.

    In SSMS I used the following query:

    SELECT [Measures].DefaultMember ON 0,
    [Dimension].[AttributeHierarchy].Members ON 1
    FROM [Cube]

    The results are shown in the bar-chart below:

    • The query with the NULL default measure took 13 seconds.
    • The query without the NULL default measure took 4 minutes and 35 seconds.
[Bar chart: query durations with and without the NULL default measure]

    Conclusion:

Using a NULL default measure can be much faster when querying dimension members without selecting a measure. In this particular case the results were returned more than 20 times faster.

    Thanks to Chris Webb for helping me out on this issue!

  • Start / Stop SQL Server, SSIS, SSAS, SSRS and SQL Server Agent at once with a batch file

A lot of developers have SQL Server and its different services running on their PC or notebook to develop or test BI solutions. Unfortunately this slows down your system quite a lot. To speed things up when not using SQL Server, I used to stop and start each service manually quite often.

    Recently I found out that it's possible to start and stop all services at once with a simple batch file. It now only takes a couple of seconds instead of a few minutes and some annoying steps.

Copy/paste the following into a .txt file and rename it to .bat to make it a batch file; execute it by double-clicking the file.

    START SCRIPT:

    NET START "SQL Server Agent (MsSqlServer)"
    NET START "MsSqlServer"
    NET START "MsSqlServerOlapService"
    NET START "ReportServer"
    NET START "SQL Server Integration Services"

    STOP SCRIPT:

    NET STOP "SQL Server Agent (MsSqlServer)"
    NET STOP "MsSqlServer"
    NET STOP "MsSqlServerOlapService"
    NET STOP "ReportServer"
    NET STOP "SQL Server Integration Services"
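Note that these are the service names of a default SQL Server instance; for a named instance they differ. A minimal sketch, assuming a hypothetical named instance called "DEV":

    NET START "SQLAgent$DEV"
    NET START "MSSQL$DEV"
    NET START "MSOLAP$DEV"
    NET START "ReportServer$DEV"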

  • SSIS - Decrease your fact table loading time up to 40%

    Replace the multiple "Lookup Error Output" Derived Columns and Union All's with a single Derived Column and get a performance boost...

Almost every BI developer needs to perform lookups while loading the data warehouse. When loading the fact tables, for example, lookups are used to retrieve the (surrogate) primary keys of the dimension tables that are connected to the fact table.

    I, and I think most developers, perform the DWH fact-dimension lookups in the following way:
    1. Each fact record contains business keys to the different dimensions.
    2. Each dimension record contains a business key and a surrogate key (integer).
    3. Each dimension contains one unknown member, with a surrogate key of value 0.
4. Match the dimension business key in the fact record to the business key in the dimension to retrieve the dimension surrogate key, using a Lookup Component. If the lookup found no match, point to the unknown member in the dimension (surrogate key 0).
5. Store the fact record in the DWH fact table with the retrieved surrogate keys of the dimensions.

Loading a fact table in the way described above would typically look like this:

[Screenshot: data flow with a Derived Column and Union All after each Lookup error output]

I have tested this solution with 313.341 records. This took 1 minute and 23 seconds.
This way of loading the fact table contains a lot of semi-blocking components: the Union All's. Read more about semi-blocking components in this blog post of mine from a little while ago. These components cause an unnecessary negative impact on performance, as you can read there.

There is a much more efficient way to load a fact table in a data warehouse:
1. Set all the Lookup Error Outputs to "Ignore Failure".
2. Delete all the Union All and Derived Column components.
3. Add a single Derived Column component to the end of the flow and add the unknown surrogate keys like this:

[Screenshot: Derived Column editor adding the unknown surrogate keys]
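The exact expressions were only visible in the screenshot; presumably each lookup output column gets a replacement along these lines (column name hypothetical), turning the NULL of an ignored lookup failure into the unknown member key 0:

    ISNULL(DimProductKey) ? 0 : DimProductKey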

    The data flow now looks like this:

[Screenshot: simplified data flow with a single Derived Column]

The loading of exactly the same number of records (313.341) now took just 45 seconds! This is a performance boost of almost 40%. Loading your fact table like this does not only decrease your loading time, it also takes less development time. So a win-win situation!

A minor downside might be that you can't easily see how many records failed a particular lookup component, but it's not hard to make a simple (SSRS) report which gives you this overview, as sketched below. It might be a good opportunity to include the information of all your fact tables in one single report; it will give you a much better overview of lookup failures during the load of your fact tables than all the separate SSIS data flow tasks.
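A minimal sketch of the kind of query such a report could run, with hypothetical table and column names; it relies on the convention from step 3 above that failed lookups end up on the unknown member with surrogate key 0:

    -- Count fact records that fell back to the unknown member of each dimension
    SELECT 'DimProduct' AS FailedLookup, COUNT(*) AS Records
      FROM FactSales
     WHERE ProductKey = 0
    UNION ALL
    SELECT 'DimCustomer', COUNT(*)
      FROM FactSales
     WHERE CustomerKey = 0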

  • MCTS - I passed the 70-556 “Microsoft Office PerformancePoint Server 2007” exam!

My last blog post is from a few months ago because I have been quite busy lately. Fortunately I have quite some topics that I want to blog about, starting with this short post about the latest Microsoft certification I've gained.

In March of this year I passed the 70-556 “Microsoft Office PerformancePoint Server 2007” exam. This makes me one of the 486 (September 2007) Microsoft Certified Technology Specialists worldwide! (Nr. of MCPs worldwide)

     


It wasn't a tough exam compared to the previous exams I took. Time was no problem, and there were only 44 questions, all of the multiple-choice type where only 1 answer is correct. The exam contained almost no focus on PerformancePoint Planning, so make sure you focus on the Monitoring and Analyzing parts of PPS while studying for this exam.

I used the Rational Guides to PPS books to study for the exam. In addition to the books I followed a course, and together this was enough to complete the exam successfully.

    Find more information here:
    Microsoft 70-556 Exam - MCTS Performance Point

    On the MCTS Exam for PerformancePoint Server 2007

    Preparation Guide for Exam 70-556

     

My Microsoft Transcript is now:

Microsoft Certification Status

Credential                                | Certification / Version                                     | Date Achieved
Microsoft Certified IT Professional       | Business Intelligence Developer                             | Mar 28, 2008
Microsoft Certified Technology Specialist | Office PerformancePoint Server 2007, Applications           | Jun 19, 2008
Microsoft Certified Technology Specialist | Microsoft SQL Server 2005 Business Intelligence Development | Oct 19, 2007

Microsoft Certification Exams Completed Successfully

Exam ID | Description                                                                               | Date Completed
556     | TS: Microsoft Office PerformancePoint Server 2007, Application Development               | Jun 19, 2008
446     | PRO: Designing a Business Intelligence Infrastructure by Using Microsoft SQL Server 2005 | Mar 28, 2008
445     | TS: Microsoft SQL Server 2005 Business Intelligence - Implementation and Maintenance     | Oct 19, 2007

  • SSRS – Matrix that adds a new column each time 5 rows are filled with data

What if you want a dynamic list of values in a matrix, but with a maximum of 5 rows? How do you create a matrix like this? I thought this would be an easy job, but I found out it was not really simple…

I tried to create a matrix like this for a dynamic list of countries. In this blog post I will explain how you can achieve this in a few simple steps.
1. You need to create an MDX query (I used an SSAS data source) that returns the list of countries with a numbering:


2. The next thing you need to do is create a matrix:

3. The last thing you need to do is the following:

    • Use the following expression for the row group: =(Fields!Country_Number.Value - 1) Mod 5
    • Use the following expression for the column group: =Floor((Fields!Country_Number.Value - 1) / 5)
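For example, with this numbering country number 7 gives row (7 - 1) Mod 5 = 1 and column Floor((7 - 1) / 5) = 1, so it lands on the second row of the second column, while country number 5 stays on the fifth row of the first column.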

    Result:

  • SSRS – Static column headers in a Matrix

How do you create a static column header centered above your dynamic columns? One way to try to achieve this is to place a textbox above your dynamic columns. One thing is for sure: the textbox will never be in the perfectly centered location, and what if the number of dynamic columns grows or shrinks?


The thing you need to do is create a static column group. You do this by adding a new column group to the matrix and giving it a static expression, for example: =”static”.
Now make it the top group by clicking Up for the static column group on the Groups tab of the matrix’s properties. You can also achieve this by just dragging the column group up in the layout view.

The result, a centered and perfectly aligned column header with the text “YTD” above some dynamic columns containing years:

  • SSIS - Convert various String date formats to DateTime with the Script Task

    The script below is really nice if you need to convert String dates that come in different (unexpected) formats.
    Just copy and paste the code in your Script Task (paste the function outside Main()) and it works right away.

    Supported date formats:

    • YYMMDD
    • YY-MM-DD
    • YYYYMMDD
    • YYYY-MM-DD

Of course it's possible to add code for more date formats yourself (a sketch of one possible extra branch follows after the function below). If you want to, copy and paste your code in a comment. I will then add the code to this blog post.

    ---------------------------------------------------------------------------------------------------------------------------

    Public Shared Function GetDateFromString(ByVal stringDate As String) As DateTime

        Dim datetimeResult As DateTime

        Try

            'Two-digit years below 80 are interpreted as 20xx, 80 and above as 19xx
            Dim centuryToAdd As Integer = 1900

            If (Convert.ToInt32(stringDate.Substring(0, 2)) < 80) Then
                centuryToAdd = 2000
            End If

            If (stringDate.Length = 6) Then
                'Format is: YYMMDD
                datetimeResult = New DateTime((centuryToAdd + Convert.ToInt32(stringDate.Substring(0, 2))), Convert.ToInt32(stringDate.Substring(2, 2)), Convert.ToInt32(stringDate.Substring(4, 2)), 0, 0, 0)
                Return datetimeResult
            End If

            If (stringDate.Length = 8) Then

                If (stringDate.IndexOf("-") > 0) Then
                    'Format is: YY-MM-DD
                    datetimeResult = New DateTime((centuryToAdd + Convert.ToInt32(stringDate.Substring(0, 2))), Convert.ToInt32(stringDate.Substring(3, 2)), Convert.ToInt32(stringDate.Substring(6, 2)), 0, 0, 0)
                    Return datetimeResult
                End If

                'Format is: YYYYMMDD
                datetimeResult = New DateTime(Convert.ToInt32(stringDate.Substring(0, 4)), Convert.ToInt32(stringDate.Substring(4, 2)), Convert.ToInt32(stringDate.Substring(6, 2)), 0, 0, 0)
                Return datetimeResult

            End If

            If (stringDate.Length = 10) Then
                'Format is: YYYY-MM-DD
                datetimeResult = New DateTime(Convert.ToInt32(stringDate.Substring(0, 4)), Convert.ToInt32(stringDate.Substring(5, 2)), Convert.ToInt32(stringDate.Substring(8, 2)), 0, 0, 0)
                Return datetimeResult
            End If

            'Fall back to the default .NET conversion for any other format
            Return Convert.ToDateTime(stringDate)

        Catch e As Exception
            'Swallow the error and fall through to the unknown date below
        End Try

        'No date format found: Return unknown (1/1/1900)
        datetimeResult = New DateTime(1900, 1, 1, 0, 0, 0)
        Return datetimeResult

    End Function


    ---------------------------------------------------------------------------------------------------------------------------
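As an illustration of the kind of extension the post invites, here is an untested sketch of one more branch, for MM/DD/YYYY; the "/" test keeps it from colliding with the YYYY-MM-DD check:

    If (stringDate.Length = 10 AndAlso stringDate.IndexOf("/") > 0) Then
        'Format is: MM/DD/YYYY (hypothetical extra format)
        Return New DateTime(Convert.ToInt32(stringDate.Substring(6, 4)), Convert.ToInt32(stringDate.Substring(0, 2)), Convert.ToInt32(stringDate.Substring(3, 2)), 0, 0, 0)
    End If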

    If you want to convert a String SSIS variable and load it into a DateTime SSIS variable, use the following code in your Script Task:

    Dts.Variables("someDateTimeVariable").Value = GetDateFromString(Dts.Variables("someStringVariable").Value.ToString)

     

  • MCITP - I passed the 70-446 “PRO: Designing a Business Intelligence Infrastructure by Using Microsoft SQL Server 2005” exam!

Today I passed the Microsoft IT Professional (MCITP) 70-446 “PRO: Designing a Business Intelligence Infrastructure by Using Microsoft SQL Server 2005” exam.
This makes me one of the 337 (January 2008) Microsoft Certified IT Professionals worldwide! (Nr. of MCPs worldwide)

[Image: MCITP logo]

To take the 70-446 exam, it's required to gain the 70-445 MCTS certification first. I passed the 70-445 exam in November last year.

    The first conclusion I can make is that, strangely, I found the 70-445 exam more difficult than the 70-446 exam. I scored 70% on the 70-445 exam and managed to score 93% on the 70-446 exam. Partially this is because I gained more work experience in the last months and partially it’s because the 70-446 exam contains easier question types.

    The exam topics of exam 70-446 (full list: Microsoft Learning)
The following list includes the topic areas covered on this exam. The percentage indicates the portion of the exam that addresses a particular skill.

• Planning BI Solutions (15 percent)
• Designing SSIS Solutions (21 percent)
• Designing SSRS Solutions (14 percent)
• Designing SSAS Solutions (22 percent)
• Deploying and Optimizing SSAS Solutions (15 percent)
• Designing Data Mining Solutions (13 percent)

If we compare this with exam 70-445 we see that data mining is still quite important. Further we see that SSRS gets less attention in exam 70-446.

Topic                 | 70-445 | 70-446
SSIS                  | 27%    | 21%
SSAS                  | 30%    | 37%
SSRS                  | 32%    | 14%
Data mining           | 12%    | 13%
Planning BI Solutions | -      | 15%

Data mining
Just like with exam 70-445, data mining is an important topic of the exam! It's important to know when to use the different algorithms that are shipped with SSAS.

    Training Kit
Unfortunately there is no training kit available for this exam. You can find the book in some online bookstores, but always with the message: not available. After some research I found out that this book will never come out. I think MS Press will wait until the 70-446 exam for SQL Server 2008 is available.

Number of questions and completion time
The exam contains 6 case studies with 9 or 10 questions each. You have 30 minutes for each case; any remaining time from a case won’t carry over to the next case, so just use your 30 minutes. The 30 minutes were enough for me. I did not study the entire case study in detail before I looked at the questions. I think this is the best approach: quickly scan the case study, then look at the questions, and then scan the case study for the information each question requires.
Just like the 70-445 exam, you will require a minimum score of 70% to pass.

     

    Question types
I already mentioned that the 70-446 exam question types were easier than the 70-445 question types. In this exam I only found the question type below, where just one answer can be selected. 70-445 had another 3 question types that were a lot harder than the one below.

[Screenshot: example of the single-answer question type]

Some useful tips

• Make sure you know exactly how many cube processing types there are and how they work.
• You need to be able to read database schemas.
• Know the different OLAP types; check this blog post of mine for more information.
• Data mining is important!

    What should you study
The lack of a training kit and other study materials designed for this exam makes studying difficult. What should you study and where should you start?

Microsoft does have a preparation guide for this exam on the Microsoft Learning website. On this page you’ll find the following interesting topics:

     

    Preparation tools and resources
    This section contains links to classroom courses, Microsoft E-learning resources, Microsoft Press books and practice tests.
    I did all the classroom courses and they helped me prepare for the exam! If you have the possibility to take those courses you definitely should!
    I personally don’t advise you to buy all the Microsoft Press books to study. It’s just too much information and it will cost a lot of study time.
Practice tests are very useful as preparation; I would advise everybody to purchase a set of preparation questions.

    Skills being measured
This list contains all the possible exam topics. If you want to prepare yourself efficiently you should just study every item on this list. Search on Google and in Books Online and you will find enough information! I think this way of preparing for the exam is the best way!

     

    How did I study
I studied all the topics of the Skills being measured list that you will find in the preparation guide. In addition to studying the Skills being measured, I followed all Microsoft Official Courses that are recommended as preparation for this exam:

    Course 2794: Designing a Business Intelligence Solution Architecture for the Enterprise Using Microsoft SQL Server 2005 (two days)

    Course 2795: Designing an ETL Solution Architecture Using Microsoft SQL Server 2005 Integration Services (two days)

    Course 2796: Designing an Analysis Solution Architecture Using Microsoft SQL Server 2005 Analysis Services (three days)

     

    Course 2797: Designing a Reporting Solution Architecture Using Microsoft SQL Server 2005 Reporting Services (two days)

    Links
Below are some links to other useful sites/weblogs about the 70-446 exam:

     

    If you have any questions, leave them as a comment and I will answer them, if I can. Also, if you have any information regarding the 70-446 MCITP exam, please leave a comment.

    Good Luck!

  • SSAS – MOLAP, ROLAP and HOLAP storage types

A big advantage of a BI solution is the existence of a cube. Data and aggregations are stored in an optimized format to offer very fast query performance.
Sometimes, a big disadvantage of storing data and aggregations in a cube is the latency that it implies. SSAS processes data from the underlying relational database into the cube. After this is done, the cube is no longer connected to the relational database, so changes to this database will not be reflected in the cube. Only when the cube is processed again will the data in the cube be refreshed.

    SSAS 2005 gives you the possibility to choose different storage types for the following objects:

    • Cubes
    • Partitions
    • Dimensions

MOLAP (Multidimensional Online Analytical Processing)
MOLAP is the most used storage type. It's designed to offer maximum query performance to the users. Data AND aggregations are stored in optimized format in the cube. The data inside the cube will refresh only when the cube is processed, so latency is high.

    ROLAP (Relational Online Analytical Processing)
ROLAP does not have the high-latency disadvantage of MOLAP. With ROLAP, the data and aggregations are stored in relational format. This means that there will be zero latency between the relational source database and the cube.
The disadvantage of this mode is the performance: this type gives the poorest query performance because no objects benefit from multidimensional storage.
     

    HOLAP (Hybrid Online Analytical Processing)
HOLAP is a storage type between MOLAP and ROLAP. Data will be stored in relational format (ROLAP), so there will also be zero latency with this storage type.
Aggregations, on the other hand, are stored in multidimensional format (MOLAP) in the cube to give better query performance. SSAS will listen to notifications from the source relational database; when changes are made, SSAS will get a notification and will process the aggregations again.
With this mode it’s possible to offer zero latency to the users, but with medium query performance compared to MOLAP and ROLAP.


The different storage types of SSAS:

Storage type | Data storage        | Aggregations storage | Query performance | Latency
MOLAP        | Cube                | Cube                 | High              | High
HOLAP        | Relational database | Cube                 | Medium            | Low (none)
ROLAP        | Relational database | Relational database  | Low               | Low (none)



Conclusion

SSAS offers three storage types that give you all the flexibility you need. You can choose between high performance and high latency on one side (MOLAP) and lower performance but low latency (ROLAP) on the other side. There is also the possibility to choose a way in between (HOLAP).

  • SSIS – Non-blocking, Semi-blocking and Fully-blocking components

    How can you recognize these three component types, what is their inner working and do they acquire new buffers and/or threads?

    Synchronous vs Asynchronous

The SSIS data flow contains three types of transformations: they can be non-blocking, semi-blocking or fully-blocking. Before I explain how you can recognize these types and what their properties are, it's important to know that all data flow components can be categorized as either synchronous or asynchronous.

• Synchronous components
The output of a synchronous component uses the same buffer as the input. Reusing the input buffer is possible because the output of a synchronous component always contains exactly the same number of records as the input. Number of records IN == Number of records OUT.

• Asynchronous components
The output of an asynchronous component uses a new buffer. It’s not possible to reuse the input buffer because an asynchronous component can have more or fewer output records than input records.

The only thing you need to remember is that synchronous components reuse buffers and are therefore generally faster than asynchronous components, which need a new buffer.

All source adapters are asynchronous; they create two buffers, one for the success output and one for the error output. All destination adapters, on the other hand, are synchronous.


    Non-blocking, Semi-blocking and Fully-blocking

In the table below the differences between the three transformation types are summarized. As you can see, it's not that hard to identify the three types.
There are a lot of large and complicated articles on the internet about this subject, but I think it's enough to look at the core differences between the three types to understand how they work and their (dis)advantages:

     

Property                                 | Non-blocking | Semi-blocking | Fully-blocking
Synchronous or asynchronous              | Synchronous  | Asynchronous  | Asynchronous
Number of rows in == number of rows out  | True         | Usually False | Usually False
Must read all input before it can output | False        | False         | True
New buffer created?                      | False        | True          | True
New thread created?                      | False        | Usually True  | True



All SSIS transformations categorized:

Non-blocking transformations: Audit, Character Map, Conditional Split, Copy Column, Data Conversion, Derived Column, Lookup, Multicast, Percent Sampling, Row Count, Script Component, Export Column, Import Column, Slowly Changing Dimension, OLE DB Command

Semi-blocking transformations: Data Mining Query, Merge, Merge Join, Pivot, Unpivot, Term Lookup, Union All

Blocking transformations: Aggregate, Fuzzy Grouping, Fuzzy Lookup, Row Sampling, Sort, Term Extraction
