
Davide Mauri

A place for my thoughts and experiences on SQL Server, Business Intelligence and .NET

  • DTLoggedExec 1.1.2008.4 SP2 released!

    A couple of hours ago I released SP2 of my DTLoggedExec tool.

    For those who don’t know it, it’s a DTEXEC replacement that executes SSIS packages and provides logging right from the engine rather than from the package itself.

    More info can be found here:

    http://dtloggedexec.codeplex.com/

    This SP2 release adds an important feature to the CSV Log Provider: it's now possible to store a personalized label in each log, making it easy to identify or group logs.

    Let's say, for example, that you have 10 packages in your ETL solution, and each time you have to load your data you need to execute all those 10 packages. In other words, you have a batch made of 10 packages. It would be nice if all the logs - one for each package - could be identified as a whole and grouped together, so that you can quickly find all the logs of a particular batch.

    This makes things easier when you want to know the overall time consumed by each batch execution.

    The new "ExecutionLabel" attribute will help to achieve this. A useful ExecutionLabel can be obtained using SQL Server Agents Token. For example:

    LABEL=$(ESCAPE_SQUOTE(JOBID))_$(ESCAPE_SQUOTE(STRTDT))_$(ESCAPE_SQUOTE(STRTTM))
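
    Once the CSV logs have been imported into the log database (via import-log-data.sql), getting the total time consumed by each batch becomes a simple GROUP BY on the label. The query below is only a hypothetical sketch: the table and column names are illustrative, not the actual schema used by DTLoggedExec.

    -- Hypothetical sketch: dbo.PackageExecutionLog, StartTime, EndTime and
    -- ExecutionLabel are illustrative names, not the real DTLoggedExec schema
    SELECT
        ExecutionLabel,
        MIN(StartTime) AS batch_start,
        MAX(EndTime)   AS batch_end,
        DATEDIFF(SECOND, MIN(StartTime), MAX(EndTime)) AS batch_duration_sec,
        COUNT(*)       AS packages_executed
    FROM dbo.PackageExecutionLog
    GROUP BY ExecutionLabel
    ORDER BY batch_start;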

    ChangeLog

    • Updated the CSV Log Provider so that it writes the status of the log file in the header.
      • An OPEN status means that the log file is still being written.
      • A CLOSED status means that the log file has been completely written.
      • A file can be loaded into the log database only if it is in the CLOSED state.
    • CSV log files have two additional rows in the header: one for the FileStatus and one for the ExecutionLabel values.
      • The file format has been updated from 3 to 4 (only the header section of the file has changed).
    • The CSV Log Provider now also displays the EndTime value on the console.
    • Updated import-log-data.sql to correctly load files with format 3 (the old one) and format 4 (the new one).
    • Updated the database schema to version 19 in order to store the new ExecutionLabel value.
    • Updated the samples to show how to use the new ExecutionLabel option.

    As usual the download is available for free here:

    http://dtloggedexec.codeplex.com/releases/81526/download/336490

  • SYS2 scripts updated (December 2012)

    I’ve updated my SYS2 scripts:

    • Added a new script to see how much buffer cache memory is used by each database (a sketch of the idea behind it is shown below)
    • Updated the sys2.stats script in order to have only one row per statistic
    • Updated the sys2.query_stats script to use the sys.dm_exec_plan_attributes DMV, to get better information about which database was used by the cached plans
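
    To give an idea of what the buffer cache script is based on, here's a minimal sketch of a per-database buffer cache query (not the actual SYS2 script): sys.dm_os_buffer_descriptors exposes one row for each 8 KB page currently in the buffer pool.

    SELECT
        DB_NAME(database_id) AS database_name,
        CAST(COUNT(*) * 8 / 1024.0 AS DECIMAL(10, 2)) AS buffer_cache_mb
    FROM sys.dm_os_buffer_descriptors
    WHERE database_id <> 32767 -- exclude the internal Resource database
    GROUP BY database_id
    ORDER BY buffer_cache_mb DESC;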

    As usual they are available from CodePlex:

    http://sys2dmvs.codeplex.com/

    Enjoy!

  • IBM DB2 and the “'DbProviderFactories' section can only appear once per config” error

    IBM doesn’t like MS. That’s a fact. And that’s why you can get your machine.config file (!!!) corrupted if you try to install IBM DB2 data providers on your server machine.

    If at some point, after having installed the IBM DB2 data providers, your SSIS packages, SSAS cubes or SSRS reports start to complain that

    'DbProviderFactories' section can only appear once per config

    you may want to check your machine.config, located in %runtime install path%\Config:

    http://msdn.microsoft.com/en-us/library/ms229697%28v=vs.71%29.aspx

    Almost surely you’ll find the IBM DB2 provider sitting in an additional DbProviderFactories section, all alone. Poor guy. Remove the duplicate DbProviderFactories section, merge everything into a single DbProviderFactories element, and after that everything will start to work again.
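
    For reference, in a healthy machine.config all providers live in a single DbProviderFactories element under system.data, roughly like this (a structural sketch only: the attribute values, and the IBM entry in particular, are illustrative):

    <system.data>
      <!-- there must be exactly ONE DbProviderFactories section -->
      <DbProviderFactories>
        <add name="SqlClient Data Provider" invariant="System.Data.SqlClient" ... />
        <!-- merge the IBM DB2 <add ... /> entry here, then delete the extra
             DbProviderFactories section it was sitting in -->
        <add name="IBM DB2 .NET Data Provider" invariant="IBM.Data.DB2" ... />
      </DbProviderFactories>
    </system.data>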

  • Execute an SSIS package in Sync or Async mode from SQL Server 2012

    Today I had to schedule a package stored in the shiny new SSIS Catalog store that can be enabled with SQL Server 2012. (http://msdn.microsoft.com/en-us/library/hh479588(v=SQL.110).aspx)

    Once your packages are stored there, they are executed using the new stored procedures created for this purpose. The script that gets executed when you run a package right from Management Studio, or through a SQL Server Agent job, will be similar to the following:

    Declare @execution_id bigint

    EXEC [SSISDB].[catalog].[create_execution] @package_name='my_package.dtsx', @execution_id=@execution_id OUTPUT, @folder_name=N'BI', @project_name=N'DWH', @use32bitruntime=False, @reference_id=Null

    Select @execution_id
    DECLARE @var0 smallint = 1
    EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id,  @object_type=50, @parameter_name=N'LOGGING_LEVEL', @parameter_value=@var0

    DECLARE @var1 bit = 0
    EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id,  @object_type=50, @parameter_name=N'DUMP_ON_ERROR', @parameter_value=@var1

    EXEC [SSISDB].[catalog].[start_execution] @execution_id
    GO

    The problem here is that the procedure simply starts the execution of the package and returns as soon as the package has been started…thus giving you the opportunity to execute packages asynchronously from your T-SQL code. This is just *great*, but what happens if I want to execute a package and WAIT for it to finish (and thus have a synchronous execution of it)?

    You just have to add the "SYNCHRONIZED" parameter to the package execution, before calling the start_execution procedure:

    exec [SSISDB].[catalog].[set_execution_parameter_value] @execution_id,  @object_type=50, @parameter_name=N'SYNCHRONIZED', @parameter_value=1

    And that’s it :)
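
    Since the execution is now synchronous, start_execution returns only when the package has finished, so you can check the outcome right away in the same batch (before the final GO). A minimal sketch - in catalog.executions, status 7 means succeeded and 4 means failed:

    IF NOT EXISTS (
        SELECT 1
        FROM [SSISDB].[catalog].[executions]
        WHERE execution_id = @execution_id
          AND status = 7 -- 7 = succeeded, 4 = failed
    )
        RAISERROR('Package execution did not succeed.', 16, 1);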

    PS

    Starting with RC0, the SYNCHRONIZED parameter is automatically added each time you schedule a package execution through SQL Server Agent. If you’re using an external scheduler, just keep this post in mind :)

  • DAX editor for SQL Server

    One of the major criticisms of DAX is the lack of a decent editor and, more in general, of a dedicated IDE like the ones we have for T-SQL or MDX.

    Well, this is no longer true. At the beginning of November 2011, a very interesting and promising Visual Studio 2010 extension was released on CodePlex:

    http://daxeditor.codeplex.com/

    IntelliSense, syntax highlighting and all the typical features offered by Visual Studio are now available for DAX too.

    Right now you have to download the source code and compile it yourself, but that’s it!

  • PASS Summit 2011 – Slides & Demos

    Just reading all the posts and tweets, it’s quite clear that this year’s PASS has been a great success. Despite the bad cold I’ve had since Monday :(, I enjoyed it a lot. The parties, the friends, the community and the great content make PASS an appointment that should not be missed.

    I also really enjoyed delivering my “Temporal Snapshot Fact Table” session. From the feedback I received, attendees enjoyed it too :)

    All the related material will soon be available here

    Temporal Snapshot Fact Table [BIA-406-S]
    http://www.sqlpass.org/summit/2011/Speakers/CallForSpeakers/SessionDetail.aspx?sid=1115

    and here

    Upgrading SSIS to Denali - Management Considerations and Best Practices [BIA-311-S]
    http://www.sqlpass.org/summit/2011/Speakers/CallForSpeakers/SessionDetail.aspx?sid=1762

    PS

    For all those who cannot wait to download the slides and demos from the PASS website, I’ve uploaded everything to my SkyDrive folder. If you need them, just send me an email and I’ll be happy to send you the public link :)

  • “Introduction to Databases” from Stanford University

    A brilliant idea and an opportunity that no one who works with databases (DBAs or developers) should miss:

    http://www.db-class.com/

    A free online course on databases:

    This course covers database design and the use of database management systems for applications. It includes extensive coverage of the relational model, relational algebra, and SQL. It also covers XML data including DTDs and XML Schema for validation, and the query and transformation languages XPath, XQuery, and XSLT. The course includes database design in UML, and relational design principles based on dependencies and normal forms. Many additional key database topics from the design and application-building perspective are also covered: indexes, views, transactions, authorization, integrity constraints, triggers, on-line analytical processing (OLAP), and emerging "NoSQL" systems.

    The instructor will be Professor Jennifer Widom, an ACM Fellow and a member of the National Academy of Engineering and the American Academy of Arts & Sciences; she also received the ACM SIGMOD Edgar F. Codd Innovations Award. I’ve already registered: I’m sure I’ll learn something new and useful and get a refresher on good old concepts…which is always a good thing.

    Praise to Stanford University for this excellent initiative! :)

    A big thanks to my friend and colleague Luca Zavaralla for pointing out this opportunity!

  • Running an application with a domain user even if you’re not in a domain

    Yesterday I discovered that it’s possible to run an application under the credentials of a domain user, even if you’re not in that domain. This is a very useful feature for me: being a consultant, I work with a lot of different customers, each one with its own domain and each one (of course) with a different user account for me.

    I cannot join all their domains so I have to work outside the domain, or I have to create a Virtual Machine with all the tools I need and then join their domain. This, unfortunately, means a lot of installation and maintenance work.

    But what I discovered yesterday simply changed my life :): to execute an application using a domain user, even if you’re not in a domain, all you have to do is use the /netonly option of the runas command!

    To launch Excel, for example:

    runas /netonly /user:THEDOMAIN\theuser "C:\Program Files (x86)\Microsoft Office\Office14\EXCEL.EXE"

    And that’s it! Now you can browse the cubes on SSAS (for example) using the software on your machine. Cool!
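
    The same trick works for Management Studio too, which is handy for connecting to the customer’s SQL Server or SSAS instances with their domain account (adjust the Ssms.exe path to your installation; the one below is just an example for SQL Server 2008 R2 on 64-bit Windows):

    runas /netonly /user:THEDOMAIN\theuser "C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\Ssms.exe"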

  • DTLoggedExec 1.1.2008.4 Service Pack 1 released

    Today I released the first Service Pack for DTLoggedExec (for those who don’t know what it is: DTLoggedExec is a DTEXEC replacement to run Integration Services packages):

    http://dtloggedexec.codeplex.com

    This Service Pack fixes some small problems with the .bat and .sql files that come with DTLoggedExec. All the fixes were already published as individual changesets (86188, 86299, 87778, 88124 and 91054); the Service Pack simply puts them all together for users’ convenience.

    You can download the full DTLoggedExec package with SP1 already integrated, or just Service Pack 1 itself, which you can apply manually to an existing installation (all you have to do is overwrite the existing files):

    http://dtloggedexec.codeplex.com/releases/view/70641

  • Connecting to DB2 from SQL Server 2005 x64 (and later) using IBM Client Access

    If you have to connect your SQL Server to a DB2 database server, you can quickly stumble upon this “nice” error:

    Cannot fetch a row from OLE DB provider "IBMDA400" for linked server.

    After spending some time trying to figure out what was wrong, I found this very complete and useful article:

    http://social.msdn.microsoft.com/Forums/en-US/sqldataaccess/thread/8514b4bb-945a-423b-98fe-a4ec4d7366ea

    The article is quite long and deals with all the problems you may find when trying to make SQL Server and DB2 work together, using SSIS or Linked Server.

    If you’re using a Linked Server and you get the aforementioned error, all you have to do is set these “magic” provider properties:

    USE [master]
    GO 
    EXEC master.dbo.sp_MSset_oledb_prop N'IBMDA400', N'AllowInProcess', 1
    GO  
    EXEC master.dbo.sp_MSset_oledb_prop N'IBMDA400', N'DynamicParameters', 1
    GO

    and – voilà – everything will start to work.
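
    As a quick smoke test, you can then run a pass-through query against the linked server (the linked server name and the DB2 library/table below are just placeholders):

    SELECT *
    FROM OPENQUERY([DB2LINKEDSERVER], 'SELECT * FROM MYLIB.MYTABLE FETCH FIRST 10 ROWS ONLY');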

  • SQL Server 2011 (Denali) CTP3 Download Links

    As already announced all over the blogosphere, the public CTP of SQL Server 2011 has been released. Unfortunately, it seems there is no “official” page where you can download *everything* related to that release, and you have to dig through the net to find little treasures in tweets and blog posts. To avoid wasting my time again, I’ve created a little list of all the downloads available right now:

    SQL Server 2011 CTP3
    https://www.microsoft.com/betaexperience/pd/SQLDCTP3CTA/enus/default.aspx

    SQL Server 2011 CTP3 - Feature Pack
    http://www.microsoft.com/download/en/details.aspx?id=26726

    SQL Server 2011 CTP3 - PowerPivot
    http://www.microsoft.com/download/en/details.aspx?id=26721

    SQL Server 2011 CTP3 - Report Builder
    http://www.microsoft.com/download/en/details.aspx?id=26780

    SQL Server 2011 CTP3 - Semantic Language Statistics
    http://www.microsoft.com/download/en/details.aspx?id=26724

    SQL Server 2011 CTP3 - Reporting Services Add-in for Microsoft® SharePoint®
    http://www.microsoft.com/download/en/details.aspx?id=26779

    SQL Server 2011 CTP3 - Master Data Services Add-in For Microsoft® Excel®
    http://www.microsoft.com/download/en/details.aspx?id=26725

    SQL Server 2011 CTP3 - SQL Server Developer Tools, Codename "Juneau"
    http://msdn.microsoft.com/en-us/data/hh297027

    If you want to check what’s new here’s the link for you:

    What's New (SQL Server "Denali")
    http://msdn.microsoft.com/library/bb500435(SQL.110).aspx

     

    If you find some other download link not listed here, please leave a comment so that I can update the list.

  • Conferences after summer vacation

    After summer vacation I’ll be traveling around Eastern Europe for a while, and then I’ll head west to Seattle for the PASS Summit. Here’s my schedule for the upcoming conferences, in case someone is interested in attending a local one:

    SQLU Summit
    21 - 23 September
    Budapest - Hungary
    http://www.sqlu.com

    SQL Tune In
    26 - 27 September
    Zagreb - Croatia
    http://www.sqltunein.com/en/

    Bleeding Edge
    29 - 30 September
    Gozd Martuljek - Slovenia
    http://www.bleedingedge.si/en/about-the-conference

    PASS Summit 2011
    10 - 14 October
    Seattle, USA
    http://www.sqlpass.org/summit/2011/

    For those who are interested in the “Temporal Snapshot Fact Table” session I’ll deliver at the PASS Summit, but cannot be there and live in Europe, I’ll be delivering an extended version at the SQL Tune-In conference in Zagreb, as part of the full-day seminar:

    BI From the Trenches – Real World solutions for Real World Problems
    http://www.sqltunein.com/en/program/predkonferencijski-dan/bi-from-the-trenches

  • How does the number of indexes built on a table impact performance?

    We all know that putting too many indexes (I’m talking about non-clustered indexes only, of course) on a table may produce performance problems, due to the overhead each index adds to all insert/update/delete operations on that table.

    But how much? I mean, we all agree – I think – that, generally speaking, having many indexes on a table is “bad”. But how bad can it be? How much will performance degrade? And on a concurrent system, how much can this situation also hurt SELECT performance? If SQL Server takes more time to update a row in a table because of the number of indexes it also has to update, it also means that locks will be held longer, slowing down the perceived performance of all the queries involved.

    I was quite curious to measure this, also because when teaching it’s far more impressive and effective to show attendees a chart with the measured impact, so that they can really “feel” what it means!

    To run the tests, I created a script that creates a table (with a clustered index on the primary key, which is an identity column), loads 1,000 rows into it (inserting all 1,000 rows with a single INSERT instead of issuing 1,000 single-row INSERTs, in order to minimize the transaction-handling overhead that would otherwise skew the measurement), and measures the time taken to do it.

    The process is then repeated 16 times, each time adding a new index to the table, using the table’s columns in a round-robin fashion. The tests are run against different row sizes, so it’s possible to check whether performance changes depending on row size. A simplified sketch of the test harness is shown below.
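
    Here is a simplified sketch of what such a test harness can look like (this is not the original TestIndexPerformance.sql; the table layout, row width and timing granularity are illustrative):

    SET NOCOUNT ON;

    IF OBJECT_ID('dbo.TestTable') IS NOT NULL
        DROP TABLE dbo.TestTable;

    -- eight CHAR(10) columns give roughly an 80-byte payload per row
    CREATE TABLE dbo.TestTable (
        id  INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
        c01 CHAR(10), c02 CHAR(10), c03 CHAR(10), c04 CHAR(10),
        c05 CHAR(10), c06 CHAR(10), c07 CHAR(10), c08 CHAR(10)
    );

    DECLARE @i INT = 0;
    DECLARE @col sysname, @ddl NVARCHAR(MAX), @start DATETIME2;

    WHILE @i <= 16
    BEGIN
        -- from the second iteration on, add one more non-clustered index,
        -- picking the columns in a round-robin fashion
        IF @i > 0
        BEGIN
            SET @col = 'c0' + CAST(((@i - 1) % 8) + 1 AS VARCHAR(1));
            SET @ddl = N'CREATE INDEX ix_' + CAST(@i AS NVARCHAR(2))
                     + N' ON dbo.TestTable (' + QUOTENAME(@col) + N');';
            EXEC (@ddl);
        END;

        TRUNCATE TABLE dbo.TestTable;
        SET @start = SYSDATETIME();

        -- load 1,000 rows with a single INSERT ... SELECT
        INSERT INTO dbo.TestTable (c01, c02, c03, c04, c05, c06, c07, c08)
        SELECT TOP (1000) 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x'
        FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b;

        PRINT CAST(@i AS VARCHAR(2)) + ' index(es): '
            + CAST(DATEDIFF(MICROSECOND, @start, SYSDATETIME()) / 1000.0 AS VARCHAR(20))
            + ' ms';

        SET @i += 1;
    END;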

    The results are interesting, although expected. This is the chart showing how long it takes to insert 1,000 rows into a table that has from 0 to 16 non-clustered indexes.

    [Chart: time to insert 1,000 rows vs. number of non-clustered indexes]

    Each test has been run 20 times in order to obtain an average value. The values have been cleaned of outliers caused by unpredictable performance fluctuations due to machine activity.

    The test shows that in a table with a row size of 80 bytes, 1,000 rows can be inserted in 9.05 msec if no indexes are present on the table, while the value grows to 88 (!!!) msec when you have 16 indexes on it.

    [Chart: measured insert times for the 80-byte row size, 0 to 16 non-clustered indexes]

    This means an impact on performance of 975%. That’s *huge*!

    Now, what happens with a bigger row size? Say we have a table with a row size of 1,520 bytes. Here’s the data, from 0 to 16 indexes on that table:

    [Chart: measured insert times for the 1,520-byte row size, 0 to 16 non-clustered indexes]

    In this case we need nearly 22 msec to insert 1,000 rows into a table with no indexes, but more than 500 msec if the table has 16 active indexes! Now we’re talking about a 2410% impact on performance!

    Now we have a tangible idea of the impact of having (too?) many indexes on a table, and of how row size also affects performance. That’s why the golden rule of OLTP databases, “few indexes, but good ones”, is so true! (And in fact, last week I saw a database with tables with a 1,700-byte row size and 23 (!!!) indexes on them!)

    This also means that overly heavy denormalization is really not a good idea (we’re always talking about OLTP systems, keep that in mind), since performance gets worse as row size increases.

    So be careful out there, and keep in mind that “equilibrium” is the key word for a database professional: equilibrium between read and write performance, between normalization and denormalization, between too few and too many indexes.

    PS

    The tests were run on a VMware Workstation 7 VM with 2 CPUs and 4 GB of memory. The host machine is a Dell Precision M6500 with an i7 Extreme X920 quad-core HT 2.0 GHz CPU and 16 GB of RAM. The database is stored on an Intel X25-E SSD drive, uses the Simple recovery model, and runs on SQL Server 2008 R2. If you want to run the tests on your own, you can download the test script here: TestIndexPerformance.sql

  • PASS Summit 2011 Spotlight sessions and pre-conference seminars announced!

    Spotlight sessions and pre-conference seminars have been revealed, and you can read the complete lists here: Spotlight Sessions and Pre-Conference Sessions.

    Just reading that list, I’m already looking forward to this great event: if this is just a taste of all the sessions to come… well, PASS Summit 2011 is an event that cannot be missed!

    Besides this, I’ll be presenting a spotlight session of my own, as already mentioned here, but I’m also very proud and happy to announce that I’ll co-present the session

    Upgrading SSIS to Denali - Management Considerations and Best Practices

    with my friend and colleague Rushabh Mehta. As you may know, he is also the President of PASS, so you can imagine how happy I am to have the honor of co-presenting with him. Thanks Rushabh!

  • Data Quality and Master Data Management with Microsoft SQL Server 2008 R2 free eBook

    Greg Low has already posted about it, so I’m a bit late here, but nonetheless I’d like to share with you all that we’ve just released a free eBook dedicated to Data Quality and Master Data Management with Microsoft SQL Server 2008 R2.

    Since the very first time I heard about the Master Data Management idea, I fell in love with it. And if you’re doing BI, you know what I mean. The hardest part of a BI project is cleansing, normalizing and checking the data. If we could have correct data right from the beginning, that would be a dream! That’s why I really believe in MDM and its implementation.

    Microsoft has started to follow that road too, so it was natural for us to start playing with Master Data Services right from the beginning. Writing a book about it was another natural step.

    We really enjoyed writing this book, and we’re already seeing customers interested in applying the described techniques, which work whether you’re using MDS or not. The book, in fact, also focuses on Data Quality, which is a much broader topic than MDM and MDS. It also covers the use of T-SQL, SSIS and various well-known algorithms useful for cleansing and normalizing data.

    Here’s the link to the page where you can find more information and download the eBook:

    http://www.solidq.com/ce-en/News/Pages/Data-Quality-and-Master-Data-Management-with-Microsoft-SQL-Server-2008-R2.aspx

    Enjoy!

    PS

    Just noticed that we already have a review. Here’s the link:

    http://blog.ashdar-partners.com/2011/06/quick-review-of-solidqs-latest-e-book.html
