THE SQL Server Blog Spot on the Web

Welcome to - The SQL Server blog spot on the web Sign in | |
in Search

Jamie Thomson

This is the blog of Jamie Thomson, a freelance data mangler in London

  • First release of my own personal T-SQL Code Library on github

    Like many (most???) T-SQL developers I keep a stash of useful code that I’ve garnered down the years because I know its all going to come in useful at some point in the future. It includes code I’ve written myself and also code that others have shared on their own blogs. For example my code library includes the following:

    I’ve never seen the point of keeping one’s code library to one’s self, might as well share it in case anyone else might find it useful, so up to now I’ve kept my collection of scripts publicly available on SkyDrive (go see it if you like).

    That’s all fine and dandy but I figured this could be improved. SkyDrive is a file sharing site and whilst it includes a nice code viewer/editor it is not an ideal solution for storing code, code should be stored in a version control system (e.g. Git, TFS, Subversion, etc..). I opted to make my code library available on Github at because it provides:

    • file version history
    • releases
    • ability for anyone else to fork my code library and build upon it to maintain their own code library
    • lots of tools necessary for modern code development

    and moreover all the cool kids seem to be using Github so I figured I’d give it a bash as well.

    The code library exists as a collection of views, functions and stored procedures in an SSDT project. I’m a massive fan of SSDT so there were many reasons for my choosing to do this but the overriding reason was that SSDT provides a single binary (i.e. a dacpac file) containing the entire code library that can be distributed as easily as emailing the file to someone. Deploying a dacpac is pretty simple and so is a great method for sharing T-SQL code.

    What’s in my T-SQL code library?

    In this first release, not much. There are only nine objects though I hasten to add that this is only a first release and I have a backlog of stuff that I need to add in there. One of the many advantages of using SSDT is that it makes it easy to add extended properties to describe the objects and the code library includes a view that surfaces all of that extended property information:

    SELECT schema_name,object_name,object_type,CodeLibraryDescription
    FROM jt.[vwCodeLibraryDescriptions]


    How do you install the code library?

    Download the two binaries:

    • master.dacpac
    • TSQLCodeLibrary.dacpac

    and store them together in a folder. Open a command prompt at that folder and type:

    "%ProgramFiles(x86)%\Microsoft SQL Server\110\DAC\bin\SqlPackage.exe"
     /TargetDatabaseName: <prefered_database_name>

    (replacing <your_sql_instance> with the name of the SQL Server instance where you want to create the code library and <prefered_database_name> with whatever you want the database to be called. Get rid of the line feeds as well, they are just used here for clarity)

    This will create a SQL Server database containing my code library:


    If any of the code in my code library proves useful to you then that’s great however my wish here is that some of you other folk out there feel motivated to share your own code in a similar manner. If you do so please post a comment below and let me know.


  • Schema Compare or Publish [SSDT]

    Yesterday on Twitter Ryan Desmond asked Is there a good read for #SSDT regarding deploying changes via schema compare vs solution deployment?


    I don’t know of any article that covers this so in this blog post I offer my opinion on the subject.

    First some background. When building databases offline using the database project type (.sqlproj) in SSDT you have two options for deploying the DDL code in your project into a physical database:

    • Schema Compare
    • Publishing

    Under the covers both do the same basic operation; build a dacpac from your project, compare it to the target database, build a script that will make the requisite changes to the target database, execute that script.

    Ryan was asking which of these one should use. I suggested that publishing was a better option and here are two reasons why:

    1. Publish will include your pre and post deployment scripts as well whereas Schema Compare will not. (And if your retort is that you cannot run those scripts more than once then you’re doing it wrong, rewrite them.)
    2. If the debug target for your project is configured correctly then a publish operation can be triggered simply by pressing F5. That’s massively more productive than the point-and-click nature of Schema Compare. Its even better if you have multiple SSDT projects in your solution as you can publish all of them with a single key stroke.

    Does anyone out there have a different opinion? Let me know in the comments.


  • Beware the ß

    I stumbled upon an interesting little nuance of SQL Server’s behaviour over the past few days that I was not aware of and I figure its worth talking about it here so others are aware. It concerns the handling of the german character “ß” which I understand to be german shorthand for “ss” (I’m no expert on language or linguists so don’t beat me up if I’m wrong about that).

    In short, two NVARCHAR values that differ only by one using “ß” and the other using “ss” will be treated as the same. This is easily demonstrated using the following code snippet:

    SELECT 'Oktoberstrasse'
    SELECT N'Oktoberstrasse'

    which returns:


    (Collation on my database is set to ‘SQL_Latin1_General_CP1_CI_AS’)

    Notice that casting the values as NVARCHAR (which is what the N'' notation does) causes SQL Server to treat them as the same. Obviously this is going to cause a problem if you need to treat those as distinct values (such as inserting into a column with a unique key upon it – which is the problem I encountered that caused me to stumble across this)

    There is a bug submission to Connect regarding this issue at 'ß' and 'ss' are NOT equal in which a member of the SQL Server team says:

    Our current behavior follows the SQL/ISO standard and unless those standards are updated with the latest changes we don't intend to change the behavior in SQL Server. Changing existing SQL Server behavior has lot of implications and today we rely on Windows for all of our windows collation sorting capabilities. If and when in the future Windows adopts these new rules / Unicode standard we will incorporate it in SQL Server.

    In other words, SQL Server is merely following the behaviour as defined by the International Standards Organisation so its not a bug in SQL Server as such, just a nuance that one needs to be aware of. And now you are.


  • Really useful Excel keyboard shortcuts

    I love me a good keyboard shortcut and there’s some I would like to use in Excel all the time if only I could darn well remember them.

    CTRL+ Spacebar   Select the entire column containing the currently selected cell(s)
    Shift + Spacebar   Select the entire row containing the currently selected cell(s)
    CTRL + ‘+’   Insert cells
    CTRL + ‘-’   Delete cells

    If you combine these you’ll find they become really powerful. For example, CTRL+Spacebar followed by CTRL + ‘+’ inserts a new column into a worksheet (which is what I wanted to do this morning when I began googling this).

    I figured the only way I will ever ingrain these into my muscle memory is if I write them down somewhere, nowhere better than on my blog.

    If you’re a fellow keyboard shortcut fetishist and want to share any obscure ones that you know of then I’m all ears, please put them in the comments below.


  • Linqpad – bring the power of LINQ to SQL Server development

    One of my biggest laments about being a SQL Server developer is that the tools provided by Microsoft to do our thang compare woefully to the feature rich, (largely) bug-free, “it-just-works” impression that one gets from seeing the tools available to “other” developers in the Microsoft ecosystem (I’m thinking of our appdev brethren here, the ones that use C#, Javascript, and all that jazz). I could write a long blog post complaining about this (and perhaps one day I will) but in this blog post I want to shine a light on a tool called Linqpad.

    Linqpad enables you to issues queries written in LINQ against a database and in that respect is an alternative to SQL Server Management Studio (SSMS). What is LINQ? It stands for Language Integrated Query and is a technology that Microsoft brought out a few years ago for writing queries inside .Net code. The great thing about Linqpad is that it enables one to write LINQ queries without having to know anything about .Net.

    In the screenshots below I show a simple query against a database but written two ways, one using T-SQL that anyone reading this blog post will likely know, and one using LINQ:


    Some things to notice here. The two queries look very similar in that they contain the same keywords {SELECT, FROM}. Second thing to notice is that the FROM clause comes before the SELECT clause and if you know anything about the logical order of execution of a SELECT query you’ll realise that this intuitively makes sense. Lastly the table is called [dbo].[BulletinLine] but in the LINQ query its called [BulletinLines], its been pluralised (a convention that is common to .Net developers) and there’s no [dbo] prefix. Other than those things its intuitively clear that these two queries are doing exactly the same thing and its worth pointing out that under the covers the LINQ query is converted into a T-SQL query.

    So OK, if you accept that LINQ can do pretty much anything that a T-SQL SELECT query can do the next obvious question is “Why should I bother when T-SQL already does what I need?” The answer, in a word, is productivity. Or, to put it another way, intellisense works properly. Let’s say for example I want to select a subset of all the columns, intellisense comes to our aid:


    One might well retort “well that works in SSMS as well” but in my experience intellisense in SSMS is, at best, flaky. In some circumstances it simply doesn’t work and the worst part of this is that its often inexplicable as to why. (In case you can’t tell, intellisense in SSMS drives me up the wall and I’m sure I’m not the only one.)

    Some other nice things about LINQ. Here’s the equivalent of a WHERE clause to filter on [BulletinId]=6:


    If you don’t know LINQ then (in my opinion) its not intuitively obvious what’s going on here. What the above query is doing can effectively be described as:

    Take the collection of BulletinLines, filter in WHERE BulletinId equals 6

    Where this gets really powerful is the ability to stack these things up like so:


    Take the collection of BulletinLines, filter in WHERE BulletinId equals 6. From the resultant collection filter in WHERE Colour=”White”

    If we only want the top 3:


    Take the collection of BulletinLines, filter in WHERE BulletinId equals 6. From the resultant collection filter in WHERE Colour=”White”. From the resultant collection take the first 3

    I love how expressive this is and when you get fully conversant with LINQ its wonderfully intuitive too. If I haven’t convinced then, well, that’s OK. If you’re not convinced but do want to be convinced then check out Why LINQ beats SQL. The overriding argument there is “LINQ is in most cases a significantly more productive querying language than SQL”. Make sure you check out the section entitled “Associations” too which demonstrates how you can filter based on what is known about an entity in other tables, when you grok what’s going on there you’ll realise the power that LINQ brings.

    If you want to give this a go then go and download Linqpad now from Its free however some of the more advanced features (such as intellisense) only light up when you pay for the pro or premium edition which are priced at $39 & $58 respectively for a perpetual license, a bit more than the app generation are used to paying but hardly bank-breaking either.

    Are any other SQL developers out there using Linqpad? Please share your experiences in the comments below, I’d really like to read them.


  • Don’t learn SSDT, learn about your databases instead

    Last Thursday I presented my session “Introduction to SSDT” at the SQL Supper event held at the offices of 7 Digital (loved the samosas, guys). I did my usual spiel, tour of the IDE, connected development, declarative database development yadda yadda yadda… and at the end asked if there were any questions. One gentleman in attendance (sorry, can’t remember your name) raised his hand and stated that by attempting to evangelise all of the features I’d missed the single biggest benefit of SSDT, that it can tell you stuff about database that you didn’t already know.

    I realised that he was dead right. SSDT allows you to import your whole database schema into a new project and it will instantly give you a list of errors and/or warnings pertaining to the objects in your database. Invalid references (e.g a long-forgotten stored procedure that refers to a non-existent column), unnecessary 3-part naming, incorrect case usage, syntax errors…it’ll tell you about all of ‘em! Turn on static code analysis (this article shows you how) and you’ll learn even more such as any stored procedures that begin with “sp_”, WHERE clauses that will kill performance, use of @@IDENTITY instead of SCOPE_IDENTITY(), use of deprecated syntax, implicit casts etc…. the list goes on and on.

    I urge you to download and install SSDT (takes a few minutes, its free and you don’t need SQL Server or Visual Studio pre-installed), start a new project:


    right-click on your new project and import from your database:


    and see what happens:


    You may be surprised what you discover. Let me know in the comments below what results you get, total number of objects, number of errors/warnings, I’d be interested to know!


  • Clone an Azure VM using Powershell

    In a few months time I will, in association with Technitrain, be running a training course entitled Introduction to SQL Server Data Tools. I am currently working on putting together some hands-on lab material for the course delegates and have decided that in order to save time in asking people to install software during the course I am simply going to prepare a virtual machine (VM) containing all the software and lab material for each delegate to use. Given that I am an MSDN subscriber it makes sense to use Windows Azure to host those VMs given that it will be close to, if not completely, free to do so.

    What I don’t want to do however is separately build a VM for each delegate, I would much rather build one VM and clone it for each delegate. I’ve spent a bit of time figuring out how to do this using Powershell and in this blog post I am sharing a script that will:

    1. Prompt for some information (Azure credentials, Azure subscription name, VM name, username & password, etc…)
    2. Create a VM on Azure using that information
    3. Prompt you to sysprep the VM and image it (this part can’t be done with Powershell so has to be done manually, a link to instructions is provided in the script output)
    4. Create three new VMs based on the image
    5. Remove those three VMs


    The script has one pre-requisite that you will need to install, Windows Azure Powershell. You also need to be a Windows Azure subscriber which, if you're reading this blog post, I'm assuming you already are.

    Simply download the script and execute it within Powershell, assuming you have an Azure account it should take about 20minutes to execute (spinning up VMs and shutting the down isn’t instantaneous). If you experience any issues please do let me know.

    There are additional notes below.

    Hope this is useful!



    • Obviously there isn’t a lot of point in creating some new VMs and then instantly deleting them. However, this demo script does provide everything you need should you want to do any of these operations in isolation.
    • The names of the three VMs that get created will be suffixed with 001, 002, 003 but you can edit the script to call them whatever you like.
    • The script doesn’t totally clean up after itself. If you specify a service name & storage account name that don’t already exist then it will create them however it won’t remove them when everything is complete. The created image file will also not be deleted. Removing these items can be done by visiting
    • When creating the image, ensure you use the correct name (the script output tells you what name to use):


    • Here are some screenshots taken from running the script:



    • When the third and final VM gets removed you are asked to confirm via this dialog:


    Select ‘Yes’

  • SQL Relay 2013R2 – Bigger, faster, SQL-ier

    Fresh from a successful UK tour in June the SQL Relay team are at it again. This time there are ten venues all around the UK between 11th and 29th November.


    Head to to sign-up. I’ll be at the London one, hope to see you there!


  • SSDT gotcha – Moving a file erases code analysis suppressions

    I discovered a little wrinkle in SSDT today that is worth knowing about if you are managing your database schemas using SSDT. In short, if a file is moved to a different folder in the project then any code analysis suppressions that reference that file will disappear from the suppression file. This makes sense if you think about it because the paths stored in the suppression file are no longer valid, but you probably won’t be aware of it until it happens to you. If you don’t know what code analysis is or you don’t know what the suppression file is then you can probably stop reading now, otherwise read on for a simple short demo.

    Let’s create a new project and add a stored procedure to it called sp_dummy.


    Naming stored procedures with a sp_ prefix is generally frowned upon and hence SSDT static code analysis will look for occurrences of this and flag them. So, the next thing we need to do is turn on static code analysis in the project properties:


    A subsequent build causes a code analysis warning as we might expect:


    Let’s suppose we actually don’t mind stored procedures with sp_ prefixes, we can just right-click on the message to suppress and get rid of it:


    That causes a suppression file to get created in our project:


    Notice that the suppression file contains a relative path to the file that has had the suppression placed upon it. Now if we simply move the file within our project to a new folder notice that the suppression that we just created gets removed from the suppression file:


    As I alluded above this behaviour is intuitive because the path originally stored in the suppression file is no longer relevant but you’re probably not going to be aware of it until it happens to you and messages that you thought you had suppressed start appearing again. Definitely one to be aware of.



  • Enforce SSIS naming conventions using BI-xPress

    A long long long time ago (in 2006 in fact) I published a blog post entitled Suggested Best Practises and naming conventions in which I suggested a bunch of acronyms that folks could use to prefix object names in their SSIS packages, thus allowing easier identification of those objects in log records, here is a sample of some of those suggestions:


    If you have adopted these naming conventions (and I am led to believe that a bunch of people have) then you might like to know that you can now check for adherence to these conventions using a tool called BI-xPress from Pragmatic Works. BI-xPress includes a feature called the Best Practices Analyzer that scans your packages and assess them according to some rules that you specify. In addition Pragmatic Works have made available a collection of these rules that adhere to the naming conventions I specified in 2006


    You can download this collection however I recommend you first read the accompanying article that demonstrates the capabilities of the Best Practices Analyzer. Pretty cool stuff.


  • SSIS ReportingPack v1.1.1.0 – Execution Stats visualised in a Gannt chart via usp_ssiscatalog

    There are many SSRS reports in SSIS Reporting Pack and the one I’ve perhaps spent most time on is the executable_duration report which provides this visualisation:


    I happen to think this is really useful because it shows the start time and duration of each executable relative to all the other executables in an execution. It does it by leveraging the execution_path property of an executable (which I am on record as saying is, in my opinion, the best feature in SSIS 2012).

    I have wanted to provide a similar visualisation in usp_ssiscatalog for some time and today I have finally gotten around to implementing it. With this new release one can call up the executable stats using this query:

    EXEC usp_ssiscatalog
    @a                      =  
    ,   @exec_execution         =  
    ,   @exec_events            =  
    ,   @exec_errors            =  
    ,   @exec_warnings          =  
    ,   @exec_executable_stats  =  

    And the returned dataset will include a new column called Gannt. This screenshot demonstrates the value that this provides:


    On the left we have the execution_path of each executable and on the right is a bar indicating the start time and duration relative to all the other executables. Collectively these bars provide a nifty Gannt chart-alike view of your execution thus allowing you to easily identify which executables are taking up the most time in your executions.

    In addition there is a similar view for the list of executions that is gotten by executing:

    EXEC usp_ssiscatalog @a='execs'

    That return a new column, relative_duration:


    which simply provides a bar for each execution indicating how long it took relative to all the other executions.

    I hope these new visualisations in usp_ssiscatalog prove useful to you. If they do (or even if they don’t) please let me know, I love getting feedback and I don’t get nearly as much as I would like for SSIS Reporting Pack. Get the goods from SSIS Reporting Pack v1.1.1.0.


  • Learn from me about SSDT in London, March 2014

    Microsoft released SQL Server Data Tools (SSDT) along with SQL Server 2012 in the Spring of 2012. Since then I’ve noticed an upward tick in both the number of organisations that are using SSDT and the number of questions that are getting asked about it on forums. There is some confusion about what SSDT actually is (Microsoft hasn’t helped there), why people should be using SSDT and, most importantly, how to make best use of it.

    Its clear that people want to learn more about SSDT so I have joined forces with Technitrain to offer a 2–day training course, in London, in March 2014 called Introduction to SQL Server Data Tools



    The course will cover:

    Day 1

    • SSDT Positioning
    • IDE Tour
    • Connected Database Development
    • Declarative, offline, database development
    • Publishing
    • Continuous Integration (CI) and Continuous Deployment

    Day 2

    • Data Publishing
    • Refactoring
    • Database unit testing
    • References and composite projects
    • Database Drift
    • Code analysis

    If this sounds like your bag then please sign up on the Technitrain website.


  • SSISReportingPack with usp_ssiscatalog v1.1 available – first release with integrated installer

    Since December 2010 I have been maintaining a suite of reports that can be used atop the SSIS Catalog in SSIS 2012, I called it the “SSIS Reporting Pack”. While building those reports I realised that there was also a need to make it easier to issue common but ad-hoc queries against the SSIS Catalog and that realisation gave rise to a stored procedure called sp_ssiscatalog which I first released in November 2012. Both of these pieces of open source software are hosted at however they have always been distributed separately, with separate install mechanisms for each.

    Until today that is. As of version 1.1 a single installer exists that will provide all the artifacts required for both the reports and usp_ssiscatalog:


    Head to to get the goods. I have updated the home page in an attempt to make the installation instructions clearer for both the reports and usp_ssiscatalog.


    You probably want to know what new features are in this release. Well, not all that much if I’m honest. Most notable are:

    • A fix for a user reported issue where the reports would fail to deploy due to error The definition of the report '/folder-project-package' is invalid. The value expression for the text box 'breadcrumbFolder' referes to a non-existing report parameter 'FolderId'".
    • The eagle-eyed of you may have noticed that usp_ssiscatalog is a new name for this stored procedure, it used to be called sp_ssiscatalog. The reason for the change is simply to follow best-practice for naming of SQL Server stored procedures where any such object in a user database should not be prefixed “sp_”. The previous name came about because the original intention was for this stored procedure to live in [master], as it transpired this would not have been a wise decision however the name of the stored procedure had already been established. Any call to sp_ssicatalog will still work for the foreseeable future however I recommend that you should change to use usp_ssiscatalog instead, if you can.

    The big story here is that the reports and usp_ssiscatalog are now delivered in the same installer and to prove it here is a screenshot of the installer:


    This lays a foundation for future enhancements that I want to make. Watch this space.


  • An ETL joke #fridayfun

    I received a text from my bank recently where they appear to have mistaken me from someone with a rather strange first name. As an ETL developer by trade there really was only one possible response:



    This one amused me given the hype around Hadoop at the moment:


    Have a good weekend folks!


  • Declarative ETL

    T-SQL is a declarative programming language. This means that a T-SQL developer defines what (s)he wants the query to do, not how to do it. The secret sauce that turns the “what” into the “how” is a piece of software called the query optimiser and there are tomes dedicated to how one can make the query optimiser dance to their will. Generally the existence of a query optimiser is a good thing as in most cases it will do a better job of figuring out the “how” than a human being could*.

    SSIS dataflows on the other hand are an imperative programming language**, the data integration developer dataflow builds a data pipeline to move data exactly as (s)he desires it to happen. In SSIS, There is no equivalent of the query optimiser.

    I’ve often pondered whether there is an opportunity for someone to build a declarative data pipeline, that is, a method for a data integration developer to define what data should be moved rather than how to move it. Over the last few months I’ve come to the realisation that Power Query (formerly known as Data Explorer) actually fits that description pretty well. If you don’t believe me go and read Matt Masson’s blog post Filtering in Data Explorer in which he talks about query folding:

    [Power Query] will automatically push filters directly to the source query

    Even though we selected the full table in the UI before hiding the columns certain columns, we can see the source query only contains the columns we want

    “filters” aren’t the only type of query folding that Data Explorer can do. If you watch the queries, you’ll see that other operations, such as column removal, renaming, joins, and type conversions are pushed as close to the source as possible.

    Let’s not idly dismiss this. Query folding is very impressive and is (I think) analogous to predicate pushdown that is done by the query optimiser (for more on predicate pushdown read Nested Views, Performance, and Predicate Pushdown by Dave Wentzel). Does Power Query then have the equivalent of a query optimiser? I would say yes, it does although its more accurate to say that its the underlying query language called M for which this query optimiser exists!

    So, we have a declarative data integration language which is surfaced in Power Query and hence can only (currently) push data into an Excel spreadsheet. Imagine if M were extended were extended into a fully-fledged data integration tool that could move data between any heterogeneous source, would that constitute “declarative ETL” and is there a need for such a tool?

    I’ll leave that one out there for you to ponder. Comments would be most welcome.


    *OK, you might debate whether the query optimiser can do a better job than a human but let’s save that for another day.

    **We can debate whether or not SSIS dataflows are a programming language or not but again, let’s save that for another day. For the purposes of this discussion just go with me, OK.

This Blog


Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement