
Jamie Thomson

This is the blog of Jamie Thomson, a freelance data mangler in London

  • Archiving Azure Automation logs to Azure BLOB Storage

    For the past few months at work I’ve been diving headlong into a service on Microsoft Azure called Azure Automation which is, to all intents and purposes, a service for hosting and running Powershell scripts (properly termed “runbooks”). I’ve got a lot to say and a lot to share about Azure Automation and this is the first such post.

    Each time Azure Automation runs a runbook (a running instance of a runbook is termed a job) it stores a log of that job; however, that log isn’t readily available as a log file on a file system as one might be used to. There are two ways to access the log: via the Azure Management Portal at http://manage.windowsazure.com or programmatically using Powershell cmdlets. Both are handy, however I wanted the logs available as a file in Azure BLOB Storage so that they could be easily viewed and archived (in the future we’ll be building Hive tables over the top of the logs – remind me to blog about that later once it’s done). I have written a runbook called Move-AzureAutomationRunbookLogs that will archive logs for a given runbook and for a given time period into a given Azure BLOB Storage account and I’ve made it available on the Azure Automation gallery:

    [screenshot]

    https://gallery.technet.microsoft.com/scriptcenter/Archive-Azure-Automation-898a1aa8

    It does pretty much what it says on the tin so if you’re interested hit the link and check it out. For those that are simply interested in reading the code I’ve provided a screenshot of it below too (suggestions for how to copy-paste Powershell code into Live Writer so that it keeps its formatting are welcome!!). There’s a screenshot of the output too if that’s the sort of thing that floats your boat.

    After this archiving runbook has run you’ll find two folders under the root folder that you specified:

    [screenshot]

    and inside each of those is a folder for each runbook whose logs you are collecting (the folder contains a file per job). The “job” folder contains a short summary of each job, whereas the “joboutput” folder contains the good stuff – the actual output from the run.
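    If you just want a feel for the moving parts before diving into the gallery code, here is a minimal sketch of the kind of calls involved, runnable from an ordinary Azure PowerShell session once you’ve authenticated (the account, container and runbook names are hypothetical and $storageKey is assumed to hold your storage account key – this is not the actual gallery code):

    # Context for the target storage account ($storageKey is assumed to be set already)
    $context = New-AzureStorageContext -StorageAccountName "mystorageaccount" `
                                       -StorageAccountKey $storageKey

    # Grab the jobs for one runbook and push each job's output stream into BLOB storage
    $jobs = Get-AzureAutomationJob -AutomationAccountName "MyAutomationAccount" `
                                   -RunbookName "SomeRunbook"
    foreach ($job in $jobs)
    {
        $output = Get-AzureAutomationJobOutput -AutomationAccountName "MyAutomationAccount" `
                                               -Id $job.Id -Stream Any
        $localFile = Join-Path $env:TEMP ("{0}.txt" -f $job.Id)
        $output | Out-File $localFile
        Set-AzureStorageBlobContent -File $localFile -Container "runbooklogs" `
                                    -Blob ("joboutput/SomeRunbook/{0}.txt" -f $job.Id) `
                                    -Context $context -Force
    }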

    One last point, if you’re going to use this runbook I highly recommend doing so in conjunction with a technique outlined by Eamon O’Reilly at Monitoring Azure Services and External Systems with Azure Automation. Eamon’s post describes a technique for running operations such as these on a schedule and it’s something that I’ve found to be very useful indeed.

    Hope this is useful!

    @Jamiet

    [screenshot]

    [screenshot]

  • An appeal for SSDT developers – Red Gate want to hear from you

    I've been contacted by Filippa Vajda from Red Gate Software who wants me to pass on the following:

    I’m a UX Specialist at Red Gate, working on a plugin for Visual Studio.

    I need some help in finding some people for some SSDT research. We really need to speak to some SSDT users to have a 30 minute – 1 hour max. chat on a few things we’ll be working on.

    I was wondering if you’d be able to either point me at some people to contact, or put some sort of appeal on your blog for any SSDT users interested in helping us out and getting a £25 or $/euro equivalent as a thank you?

    Hence this blog post. If you’d like to speak to Filippa then simply reply to this blog post with some contact details and I’m sure Filippa will be in touch. Hopefully the opportunity to provide feedback on a future product should be incentive enough; if not, the 25 notes aren’t to be sniffed at.

    @Jamiet

  • Visual Studio is free for developers. How about SQL Server?

    Big announcements for developers in the Microsoft ecosystem today: Visual Studio is now free for folks that like to write code outside of enterprises (i.e. students, open source devs, home tinkerers etc…)

    Visual Studio Community 2013 is a new, free and fully featured edition of Visual Studio, available today, with access to the full Visual Studio extensibility ecosystem and support for targeting any platform, from devices and desktop to web and cloud services.

    http://blogs.msdn.com/b/somasegar/archive/2014/11/12/opening-up-visual-studio-and-net-to-every-developer-any-application-net-server-core-open-source-and-cross-platform-visual-studio-community-2013-and-preview-of-visual-studio-2015-and-net-2015.aspx

    As usual though, the most maligned developers in the Microsoft ecosystem, SQL Server devs, have been forgotten. There’s no full-featured version of SQL Server available for them, is there? Add this to the ever-growing list of evidence that Microsoft doesn’t give a rat’s ass about the SQL Server dev community any more:

    • T-SQL is a decrepit, sad excuse for a development language, with only two* significant changes (TRY…CATCH, windowing functions) this century
    • At Microsoft’s *developer* conference earlier this year there were 185 sessions. How many of those were devoted to SQL Server? One! (OK, two).
    • Archaic, buggy, IDEs for SSIS & SSRS
    • Ubiquitous “Won’t fix” responses on Connect

    Here’s a suggestion for you, Microsoft. Throw us a bone and make SQL Server Developer Edition free too. Doing so isn’t going to save the community wads of cash (it can be bought today for only $59.95 on the Microsoft Store, so I’m sure it doesn’t make up much of the double-digit annual growth that they’re currently enjoying) but that’s not the point; it would send out a message. It would tell us in the SQL Server community that Microsoft still cares about us, that they still want us to use their stuff. That we’re not a forgotten breed. As a SQL Server developer I haven’t felt like that for quite some time.

    A pipe dream? Let’s see.

    @Jamiet 

    *Just my opinion. Feel free to disagree in the comments below.

  • Hey, Windows 10, fix this!

    So I hear that in Windows 10 they’re making improvements to the DOS window. You’ll be able to resize it now. And use Shift plus the arrow keys to select text. That’s great. Stunning, in fact. Now here are some more things they might want to have a go at while they’re at it.

    Make the environment variable editor bigger

    Would it really be a great hardship to put a drag handle onto this thing?

    [screenshot]

    Make PATH editing work. All the time.

    A 1024 character limit? Really? Thanks for completely obliterating my %PATH% in the meantime!

    [screenshot]
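    In the meantime, if you want to check how close you are to that limit (or edit %PATH% without going anywhere near that dialog) PowerShell will do it. A quick sketch – the folder being appended is just an example, and the last line needs an elevated prompt:

    # Machine-level PATH and its length
    $machinePath = [Environment]::GetEnvironmentVariable("Path", "Machine")
    "{0} characters" -f $machinePath.Length

    # Append a folder without ever touching that tiny textbox
    [Environment]::SetEnvironmentVariable("Path", $machinePath + ";C:\Tools", "Machine")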

     

    I don’t think I’m asking for the world. Am I?

     

    If anyone has installed Windows 10, feel free to ask them to fix this too. Thank you kindly.

    @Jamiet

  • Passing credentials between Azure Automation runbooks

    Update: Turns out there's an even easier way of achieving this than the method I've described in this blog post. Joe Levy explains all in the comments below. 

    I’ve been doing a lot of work lately using Azure Automation to run (what are essentially) Powershell scripts against Azure. Recently the ability for those scripts to authenticate against your Azure subscription using a username and password was provided (see Authenticating to Azure using Azure Active Directory) and it basically involves a call to Get-AutomationPSCredential; here’s a handy screenshot (stolen from the aforementioned blog post) to illustrate:

    [screenshot]

    That’s all fine and dandy, however you may find that you want to modularise your runbooks so that you have lots of smaller discrete code modules rather than one monolithic script. If you do so you’re probably not going to want to make a call to Get-AutomationPSCredential each time (for a start, such calls make it hard to unit test your runbooks), hence you may prefer to pass the credentials between your runbooks instead.

    Here is how I do this. In your calling runbook get the credentials, extract the username and password and pass them to the called runbook:

    workflow CallingRunbook
    {
        $AzureCred = Get-AutomationPSCredential -Name "MyCreds"
        $AzureUserName = $AzureCred.GetNetworkCredential().UserName
        $AzurePassword = $AzureCred.GetNetworkCredential().Password
       
        CalledRunbook -AzureUserName $AzureUserName -AzurePassword $AzurePassword
    }

    In the called runbook use the passed-in values to authenticate to Azure

    workflow CalledRunbook
    {
        param(
            [String]$AzureUserName,
            [String]$AzurePassword
        )
        $SecurePassword = $AzurePassword | ConvertTo-SecureString -AsPlainText -Force
        $AzureCred = New-Object System.Management.Automation.PSCredential `
                -ArgumentList $AzureUserName, $SecurePassword
        Add-AzureAccount -Credential $AzureCred
    }
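    For completeness, kicking the calling runbook off from outside the portal looks something like this (the account and runbook names are hypothetical):

    # Start the parent runbook; it fetches the credential asset and hands the
    # username/password down to CalledRunbook as shown above
    $job = Start-AzureAutomationRunbook -AutomationAccountName "MyAutomationAccount" `
                                        -Name "CallingRunbook"
    $job.Id   # keep hold of this if you want to poll Get-AzureAutomationJob for status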

    Job done!

    [screenshot]

    A quick and dirty blog post but one which folks should find useful if they’re using Azure Automation. Feedback welcome.

    @Jamiet

  • Code Navigation (10 days of SSDT – Day 10)

    SQL Server Data Tools (SSDT) was released in Spring 2012 and I have been using it to build SQL Server databases nearly every day since then. In that time I’ve learnt a lot about how to make best use of the tool and I want to share that experience in a new series of blog posts called “10 days of SSDT”. I shall be publishing a different blog post every day for 10 days, each day revealing something that you may not know about SSDT. I hope you enjoy the series and contribute via the comments!

    One of the aspects of T-SQL development that really grates on me is that the features of both the language and the tooling are light years behind those available to our brethren who write in “higher order” languages like C#, VB, Java etc… One of the tooling features that folks who live in Visual Studio take for granted is code navigation; happily that feature is now available in Visual Studio for T-SQL developers when one installs SSDT.

    Be aware that SSDT’s code navigation features only work when you are working offline using SSDT projects.

    There are only two keyboard shortcuts you need to learn to master code navigation (and please do use keyboard shortcuts, don’t wimp out and use the mouse):

    • F12
    • Shift+F12

    Go To Definition

    Go To Definition allows you to position your cursor inside a reference to an object, press F12, and SSDT will show you the definition of that object. Sounds simple, and it is, but it’s wonderfully useful when you get into the swing of using it. The following screenshots are an attempt to convey it.

    Notice here the cursor is positioned inside a reference to a table called [datawarehouse].[DimCurrency]:

    [screenshot]

    Simply pressing F12 at that point opens the definition of that table:

    [screenshot]

    Find All References

    Find All References does the opposite of Go To Definition, it shows everywhere that an object is referenced. If you’ve ever been afraid to remove an object from a database because you didn’t know what was using it (let’s face it, we’ve all been there) then this feature can be a lifesaver.

    Take the previous example of table [datawarehouse].[DimCurrency]. If I again position the cursor inside a reference to that object and this time press Shift+F12, SSDT will search through my database model and present a list of all the objects where it is referred to:

    [screenshot]

    In the screenshot immediately above we see that this table is referred to 51 times. Those references may be in stored procedures, views, functions, triggers, constraints or extended properties. Double-clicking on any of the entries in that list takes you directly to the definition of the referring object.

    Summary

    That really is it. Two very simple features but ones which can have a massively positive effect on your productivity.

    Hopefully this goes without saying, but code navigation does not work if you are referencing objects using dynamic SQL; this is because SSDT sees that dynamic SQL string as just that, a string, rather than a reference to another object in the model. Just bear that in mind.

    Code navigation is top of the list of features that drive my preference for developing in offline SSDT rather than online in SSMS.

    @Jamiet 

    If you want to learn more about SSDT then come along to my training course Introduction to SQL Server Data Tools in association with Technitrain on 23rd/24th February 2015 in London. There is a discount if you register before the end of 2014.


  • A story of scale-out on HDInsight

    Lately I have been doing a lot of work using Microsoft’s Hadoop offering, HDInsight. I suspect a lot of people who read my blog are unfamiliar with what Hadoop actually is so I thought I’d recount a recent test I did that exhibits the scale-out nature of processing data on Hadoop.

    The test used a mapreduce job (written in Java) to process an incoming CSV file and load it into an HBase database on that same cluster (HBase is a NoSQL data store; for the purposes of this blog post that’s all you need to know right now). Mapreduce is a technique for doing data processing on Hadoop’s scale-out architecture whereby incoming data is split according to some expression over the incoming rows and then rows with the same result of that expression are combined using some sort of aggregation. The splitting and aggregation can be done on multiple nodes in the Hadoop cluster, hence I like to refer to mapreduce as distributed GroupBy.

    The actual work that our mapreduce job is doing isn’t important, but for your information it’s combining many rows of data pertaining to a single customer into a JSON document and loading that document into HBase. Upon successful completion of the job HBase contained 3,125,000 JSON documents.

    TL;DR

    I loaded 1.6GB of raw source data (62.5m rows) into HBase using HDInsight; the quickest I was able to complete this was 34m2s. The key to speeding up our throughput was to (a) use a bigger HDInsight cluster and (b) split our input data into multiple files, thus forcing the processing to be distributed over more map tasks. With more performance tuning (aka scaling out to more HDInsight nodes) I am confident we could get this much lower.

    Note that it is possible to specify the number of tasks that the map phase uses rather than letting Hadoop guess how many it should use; for this test I chose not to specify that. In other words, splitting the incoming data over multiple files is not a necessity, it was just a trick I pulled to affect the mapreduce job.
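    For anyone who hasn’t submitted a mapreduce job to HDInsight before, the whole thing is driven from PowerShell using the HDInsight cmdlets. Here is a rough sketch (the cluster name, jar, class name and wasb paths are all hypothetical) including where you would hint at the number of map tasks if you did want to go down that route:

    # Define the mapreduce job: the jar, the driver class and its input/output arguments
    $jobDef = New-AzureHDInsightMapReduceJobDefinition `
                  -JarFile "wasb:///jars/customer-json-loader.jar" `
                  -ClassName "com.example.CustomerJsonLoader" `
                  -Arguments "wasb:///input/usuals.csv", "wasb:///output/" `
                  -Defines @{ "mapred.map.tasks" = "20" }   # a hint only, Hadoop may ignore it

    # Submit it to the cluster and wait for it to finish
    $job = Start-AzureHDInsightJob -Cluster "my-hdinsight-cluster" -JobDefinition $jobDef
    Wait-AzureHDInsightJob -Job $job -WaitTimeoutInSeconds 7200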

    The detail

    I generated a 1.6GB (1,677,562,500B) file containing 62,500,000 rows. On the first run I used an HDInsight cluster that had 2 worker nodes. The entire mapreduce job took ~1hr13m50s. The map phase took ~58m and the reduce phase took ~1hr6m (so clearly they overlapped – that is because the reduce phase starts as soon as the first map task completes and, as you will see below, the map tasks completed at wildly different times).

    Even though the reduce phase took longer, it’s actually the map phase which caused the long execution time. To try and pinpoint why it took so long I dove into the logs that Hadoop produces. Unless you tell Hadoop otherwise it determines how many tasks it should spin up in the map phase, and in this case it determined it needed 4 map tasks:

    [screenshot]

    I’ve highlighted the elapsed times for each; note the 4th is much lower. This would explain why the reduce phase took so long: it started as soon as the first map task completed but then had to wait ~52 minutes until all the other map tasks were complete.

    Each one of those tasks has its own task log and from those task logs I found the following information:

    Processing split: wasb://etl@dhnlsjamiet.blob.core.windows.net/nls/20141009T130423423_20140101000000_usuals_1_0/input/usuals_1_0_20140101000000.csv:0+536870912
    Processing split: wasb://etl@dhnlsjamiet.blob.core.windows.net/nls/20141009T130423423_20140101000000_usuals_1_0/input/usuals_1_0_20140101000000.csv:536870912+536870912
    Processing split: wasb://etl@dhnlsjamiet.blob.core.windows.net/nls/20141009T130423423_20140101000000_usuals_1_0/input/usuals_1_0_20140101000000.csv:1073741824+536870912
    Processing split: wasb://etl@dhnlsjamiet.blob.core.windows.net/nls/20141009T130423423_20140101000000_usuals_1_0/input/usuals_1_0_20140101000000.csv:1610612736+66949764

    The numbers at the end represent the byte ranges that each task is processing (the first one starts from byte 0 as you would expect). Notice the last one (1610612736+66949764). That means it is starting from byte 1610612736 and processing the next 66949764 bytes. Given that task is the 4th task of 4, it shouldn’t surprise you to know that if you add those two numbers together they come to 1677562500, which is exactly the same size as the input file. In other words, the logs tell us exactly what we should expect: the input data has been split over the 4 tasks that Hadoop deemed necessary to process this file.

    Notice that the first 3 tasks each processed 536,870,912B whereas the 4th processed only about 12% of that, 66,949,764B. This would explain why the 4th task completed so much quicker than the others. The data has not been split evenly, and clearly that’s a problem because one of the map tasks completed so much quicker than the others, which ultimately means the reduce phase has to sit around waiting for all the data – the uneven split of the data has caused inefficient use of our cluster.
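    As an aside, the arithmetic stacks up: 536,870,912 bytes is 512MB, which looks like the default split size on this cluster’s wasb storage, and dividing the file size by it tells you exactly how many map tasks to expect:

    $fileSize  = 1677562500   # bytes in the input file
    $splitSize = 536870912    # 512MB, per the "Processing split" log entries above
    [math]::Ceiling($fileSize / $splitSize)   # 4 - three full splits plus one small remainder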

    We can infer some things from this:

    • The less data that a task has to process, the less time that task takes to complete (pretty obvious)
    • If we can increase the number of tasks, the data will be distributed more uniformly over those tasks and they should complete much quicker (and in roughly the same amount of time) due to having less data to process and less competition for resources.

    Thus I ran the same test again changing only one variable: the number of nodes in the HDInsight cluster – I increased it from 2 to 20. I hoped that this would increase the number of map tasks. Unfortunately the job failed (my fault, I left some output files lying around from a previous run and that caused a failure) however it got as far as completing the map phase, which is pretty much all I cared about:

    [screenshot]

    As you can see there were still only 4 tasks and they actually took longer. So, we didn’t achieve more tasks and thus we didn’t speed the job up. That’s not good. I can’t explain right now why they took longer; the same number of tasks (4) distributed over a greater number of nodes (20) would, you would think, be slightly quicker due to less resource contention. Bit of a weird one.

    I wondered if splitting the input file into lots of smaller files would make a difference so I split that file into 20 equally sized smaller files and ran the job again on the 2-node cluster. This time we got 20 tasks:

    [screenshot]

    Which is great, however the map phase failed due to out-of-memory issues:

    [screenshot]

    So, I uploaded those same 20 files to the 20-node cluster and ran again. We got 20 tasks in the map phase and, thankfully, this time they all completed successfully. The entire job (map + reduce) completed in 34m2s (less than half the time taken on the 2-node cluster when loading the single file); the map phase completed in 10m34s and the reduce phase took 24m46s. The overlap there is only 1m18s and that’s because the durations of the map tasks were more uniformly distributed due to the data being separated over more tasks. Here are the 20 map tasks with durations:

    [screenshot]

    That has been a bit of a braindump but I figured it might be interesting to anyone starting out on the path of doing data processing on Hadoop. Please post questions in the comments.

    @Jamiet

    UPDATE: Thanks to Paul Lambert I've found this very useful primer on attempting to artificially set the number of mappers and reducers: HowManyMapsAndReduces

  • Declarative, Model Driven Development (10 days of SSDT – day 9)

    SQL Server Data Tools (SSDT) was released in Spring 2012 and I have been using it to build SQL Server databases nearly every day since then. In that time I’ve learnt a lot about how to make best use of the tool and I want to share that experience in a new series of blog posts called “10 days of SSDT”. I shall be publishing a different blog post every day for 10 days, each day revealing something that you may not know about SSDT. I hope you enjoy the series and contribute via the comments!

    I’ve used the word “model” a few times in this series already without expanding on it too much, but it’s worth doing so because it’s actually fundamental to how SSDT works; a model underpins everything that is done in SSDT.

    In the context of SSDT a model is a representation of a database, pure and simple. In SSDT that model essentially exists in two places: in memory, when one has an SSDT project open or is working in connected development mode, and in a .dacpac file which gets produced when an SSDT project gets built. It may help to think of a .dacpac as a representation of the model, stored in a file. (If you want to learn more about dacpacs specifically refer to my January 2014 blog post Dacpac braindump - What is a dacpac?)

    To illustrate the point, if one takes a look inside a .dacpac (which is actually just a zip file with a different extension) one will notice a file called model.xml:

    [screenshot]

    Open that up and you will see that it’s just a definition of lots of database objects represented as XML:

    [screenshot]
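    If you don’t fancy renaming and unzipping files by hand, a few lines of PowerShell will crack a dacpac open and pull model.xml out (the path is hypothetical and the element names are as they appear in the dacpacs I’ve looked at):

    Add-Type -AssemblyName System.IO.Compression.FileSystem   # needs .NET 4.5

    $dacpacPath = "C:\temp\MyDatabase.dacpac"
    $zip = [IO.Compression.ZipFile]::OpenRead($dacpacPath)
    try
    {
        $entry  = $zip.Entries | Where-Object { $_.Name -eq "model.xml" }
        $reader = New-Object IO.StreamReader($entry.Open())
        [xml]$model = $reader.ReadToEnd()
        $reader.Dispose()

        # Every <Element> is one declared object: SqlTable, SqlView, SqlProcedure, etc...
        $model.DataSchemaModel.Model.Element |
            Select-Object Type, Name |
            Format-Table -AutoSize
    }
    finally
    {
        $zip.Dispose()
    }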

    There is an important point to make here. The model only states the definition of an object as it has been defined, it does not state how SSDT should go about creating that object (there are no ALTER TABLE statements here, for example). This notion of only defining the structure of an object, not how to create it, is known as a declarative approach to database development. One is declaring the intended state of a database, nothing else. The beauty of SSDT is that the tool can take your model and from that work out what it needs to do to get a given physical database into the same state as declared in the model. That is the fundamental underpinning of SSDT and is an important concept to grasp if one intends to use SSDT.
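    You can see that principle in action without even opening SSDT: point SqlPackage.exe (it ships with the DAC Framework) at a dacpac and a target database using /Action:Script and it will work out the differences and write out the T-SQL it would run, without touching the target. The paths and names below are hypothetical:

    # A typical SQL Server 2014 location for SqlPackage.exe; adjust for your install
    $sqlPackage = "C:\Program Files (x86)\Microsoft SQL Server\120\DAC\bin\SqlPackage.exe"

    & $sqlPackage /Action:Script `
                  /SourceFile:"C:\temp\MyDatabase.dacpac" `
                  /TargetServerName:"(localdb)\Projects" `
                  /TargetDatabaseName:"MyDatabase" `
                  /OutputPath:"C:\temp\MyDatabase_upgrade.sql"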

    @Jamiet 

    If you want to learn more about SSDT then come along to my training course Introduction to SQL Server Data Tools in association with Technitrain on 23rd/24th February 2015 in London. There is a discount if you register before the end of 2014.


  • Querying a dacpac from Linqpad

    I have been playing around with the DAC Framework over the past few days, namely the DacAPI which is a .Net API that one can use to programmatically traverse the contents of a dacpac.

    [screenshot]

    If one prefers one can open a new C# project to start playing with the API and interrogating a dacpac but I find it a lot easier (and more fun) to use one of my favourite tools, Linqpad. Linqpad allows you to write C# code without the heavyweight ceremony of a project in Visual Studio. Stated another way, it allows you to write C# scripts rather than programs. In this post I’ll explain how to use Linqpad to interrogate a dacpac.

    First of all I’ll assume you’ve downloaded Linqpad (it’s free); open it up and switch your query to using C# Statements:

    [screenshot]

    Press F4 to bring up the query properties where you can add references to the assemblies that you need. If you have Visual Studio 2013 installed then you’ll find the DLLs at <Visual Studio Install Dir>\Common7\IDE\Extensions\Microsoft\SQLDB\DAC\120 (if you are using a previous version of Visual Studio or if you installed the DAC Framework in isolation then they may be somewhere else; check this forum thread for details), add them here:

    [screenshot]

    You need to reference

    • Microsoft.SqlServer.Dac.dll
    • Microsoft.SqlServer.Dac.Extensions.dll

    Switch to the Additional Namespace Imports tab and enter the following:

    • Microsoft.SqlServer.Dac
    • Microsoft.SqlServer.Dac.Model
    • System.Collections.Generic
    • System.Text

    [screenshot]

    You’re now good to go to start writing some code. Here’s a starter for ten that dumps out a CREATE TABLE script for each table in the dacpac:

    var model = new TSqlModel(@"C:\dacpacs\Mydacpac.dacpac");
    var allTables = model.GetObjects(DacQueryScopes.All, ModelSchema.Table);
    var tableScripts = from t in allTables
                       select t.GetScript();
    tableScripts.Dump(); //Dump() is a Linqpad extension method that simply displays the contents of the object in the Results window

    [screenshot]
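    Incidentally, if Linqpad isn’t your thing the same API can be driven from PowerShell. Here is a rough equivalent of the snippet above (the DLL locations assume the Visual Studio 2013 path mentioned earlier):

    $dacDir = "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\SQLDB\DAC\120"
    Add-Type -Path (Join-Path $dacDir "Microsoft.SqlServer.Dac.dll")
    Add-Type -Path (Join-Path $dacDir "Microsoft.SqlServer.Dac.Extensions.dll")

    $model  = New-Object Microsoft.SqlServer.Dac.Model.TSqlModel("C:\dacpacs\Mydacpac.dacpac")
    $tables = $model.GetObjects([Microsoft.SqlServer.Dac.Model.DacQueryScopes]::All,
                                [Microsoft.SqlServer.Dac.Model.ModelSchema]::Table)
    $tables | ForEach-Object { $_.GetScript() }   # the CREATE TABLE script for each table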

    Cool stuff. Have fun!

    @Jamiet

  • Connected development (10 days of SSDT – Day 8)

    SQL Server Data Tools (SSDT) was released in Spring 2012 and I have been using it to build SQL Server databases nearly every day since then. In that time I’ve learnt a lot about how to make best use of the tool and I want to share that experience in a new series of blog posts called “10 days of SSDT”. I shall be publishing a different blog post every day for 10 days, each day revealing something that you may not know about SSDT. I hope you enjoy the series and contribute via the comments!

    In Day 1 – Online and Offline I explained how SSDT could be used to query and affect existing databases:

    I can browse through the objects on those servers and interrogate them just as I would in SSMS. Within SSDT/SSOX however there is one important difference, when I make changes to the database I am not affecting the database directly, I am affecting an in-memory model of the database. This approach has two important advantages,

    • I can view the impact of a change (e.g. if I change a column name will it break any views that reference it) without actually making the change
    • I can make many changes to the model and then push them into my actual database en masse and SSDT will work out the best way of doing that. I don’t have to write the code that deploys my code.

    These capabilities are worth exploring in more detail, which I’ll do by comparing the schema editing experience in SSMS & SSDT. Here you can see that I have a copy of venerable old Northwind which I’m going to use to demo this:

    [screenshot]

    Let’s say, hypothetically, I want to change the name of the CustomerID field; SSMS has a UI that enables me to do this and when it does so it will alter the definition of all the affected objects such as foreign keys that reference that column. That’s quite useful, however it does somewhat shield you, the developer, from knowing the intricacies of what SSMS is doing under the covers. In SSDT this scenario is a little different; we browse to the table in SQL Server Object Explorer (SSOX – it’s one of the integral components of SSDT) and select ‘View Designer’:

    [screenshot]

    This launches us into SSDT’s Table Designer (which we have seen before in Day 4 – Table Designer) where we can go ahead and make our change:

    [screenshot]

    Let’s explain what’s going on here. We’ve changed the name of a column and before we’ve actually saved our change SSDT shows us which objects will be impacted by that change. For example, the first error in that list is:

    Computed Column: [dbo].[Quarterly Orders].[CustomerID] contains an unresolved reference to an object. Either the object does not exist or the reference is ambiguous because it could refer to any of the following objects: [dbo].[Customers].[CustomerID], [dbo].[Customers].[Customers]::[CustomerID] or [dbo].[Orders].[Customers]::[CustomerID].

    Double clicking on that error opens up the definition of the affected object and gives us a red squiggly indicating exactly where the error is:

    [screenshot]

    Those are two reasons why I like the SSDT way: (1) it shows you the implications of your change as you type rather than after you try and commit the change and (2) you can jump straight to affected objects so you can change them, plus you get nice red squigglies and intellisense too:

    [screenshot]

    The third, and what I think of as the best, benefit though can’t easily be demonstrated with screenshots; that is, when I make any changes I’m not actually making changes to the physical database. In actuality SSDT has built an offline model of the database underneath the covers and it is that model to which I am making changes. I can go on making changes (probably by double clicking on all the errors in the error list) and each one of those changes is made to the offline model, not to the physical database.

    Eventually I will reach a state where all the errors have been solved and I am ready to push all the model changes back to the actual database. To do that I simply hit the Update button that will appear at the top of each edited DDL script:

    [screenshot]

    and when I do so a dialog appears with a pseudo code description of all the changes that I have made:

    [screenshot]

    (Note that this is the same pseudo code that we talked about previously in Day 7 - Data Tools Operations Window)

    I can hit the Generate Script button to build a SQL script that will make all those changes for me or simply hit Update Database to push all the changes up.

    In summary, SSDT allows you to queue up a series of changes to a database by affecting an offline model rather than the database itself. If you’re making a simple change that doesn’t affect anything else then the benefit here is negligible but if your changes are more substantial than that then this can be a really useful feature.

    The last note on this feature is that when SSDT was first released this feature was known as PowerBuffer although I haven’t heard that word mentioned much (if ever) since then. The only reason I mention it here is so that if you hear the word you will know what is being referred to. If you are interested there is useful PowerBuffer documentation on MSDN: How to: Update a Connected Database with Power Buffer.

    @Jamiet

    If you want to learn more about SSDT then come along to my training course Introduction to SQL Server Data Tools in association with Technitrain on 23rd/24th February 2015 in London. There is a discount if you register before the end of 2014.


  • Data Tools Operations Window (10 days of SSDT – Day 7)

    SQL Server Data Tools (SSDT) was released in Spring 2012 and I have been using it to build SQL Server databases nearly every day since then. In that time I’ve learnt a lot about how to make best use of the tool and I want to share that experience in a new series of blog posts called “10 days of SSDT”. I shall be publishing a different blog post every day for 10 days, each day revealing something that you may not know about SSDT. I hope you enjoy the series and contribute via the comments!

    SSDT provides feedback on the actions that it is undertaking via the Data Tools Operations window. This is a window that sits within SSDT’s Visual Studio shell and looks something like this:

    [screenshot]

    Whenever SSDT publishes an SSDT project (in the offline development scenario) or makes changes to an existing database (in the connected development scenario) progress will be reported in the Data Tools Operations window. For each operation reported in the Data Tools Operations window you have the ability to view:

    • Preview – A description of the changes that SSDT is intending to make to the target database
    • Script – The T-SQL script that SSDT builds
    • Results – Essentially the output from running the aforementioned script

    Here is a typical screenshot of the Preview:

    [screenshot]

    As you can see, it provides a readable, pseudo-code description of what actions SSDT is going to take.

    The Data Tools Operations window is a very useful addition to SSDT and it’s worth highlighting in this dedicated blog post so that people are aware of its capabilities and, importantly, that it even exists.

    If you want to learn more about SSDT then come along to my training course Introduction to SQL Server Data Tools in association with Technitrain on 23rd/24th February 2015 in London. There is a discount if you register before the end of 2014.


  • SQL Server Object Explorer (10 days of SSDT - Day 6)

    SQL Server Data Tools (SSDT) was released in Spring 2012 and I have been using it to build SQL Server databases nearly every day since then. In that time I’ve learnt a lot about how to make best use of the tool and I want to share that experience in a new series of blog posts called “10 days of SSDT”. I shall be publishing a different blog post every day for 10 days, each day revealing something that you may not know about SSDT. I hope you enjoy the series and contribute via the comments!

    SQL Server Object Explorer (SSOX) is a crucial feature of SSDT and is a window that sits within SSDT’s Visual Studio shell; it is made visible from the View menu or by using the keyboard shortcut CTRL+\, CTRL+S.

    [screenshot]

    As a SQL Server developer that chooses to use SSDT I spend most of my time within SSOX. Think of it as a gateway to both your offline database source code and your physical deployed databases.

    In Day 1 – Online and offline I said

    SSDT supports two distinct ways of working, online and offline. Those two distinct usage scenarios are most obvious with SQL Server Object Explorer

    This is demonstrated by the existence of two top-level nodes within SSOX, they are called:

    • SQL Server
    • Projects

    You can see both in the screenshot above. The SQL Server node is a list of SQL Server instances that you have registered within SSOX (in the screenshot above I have registered one instance called “(localdb)\Projects”) which you can then expand thus allowing you to browse all the databases and server level objects upon that instance; an easy way to think of this SQL Server node is as a hybrid of the Registered Servers and Object Explorer windows in SQL Server Management Studio (SSMS). You can register any edition of SQL Server (Express through to Enterprise) and also Windows Azure SQL Database (WASD) instances too.

    The Projects node of SSOX provides a logical view over SSDT projects that you have open within the Solution Explorer window. I personally am a big fan of SSDT projects (see Day 3 – What is a Project?) so this Projects node is where I spend most of my time. The screenshot below depicts the interaction between Solution Explorer and SSOX:

    [screenshot]

    Notice there is a project in Solution Explorer called Day04 – Table Designer and there is a node under Projects within SSOX of the same name (highlighted by the red arrow). That project has two files called Product.table.sql & Product.PrimaryKey.sql, both of which are open in the centre of the screen (note that SSDT does not stipulate that filenames must reflect the name of the object defined within them, but it is useful if they do). Notice also that table [dbo].[Product] and its primary key both appear in SSOX (depicted with the blue lines) however in there they appear as you might expect them to appear in SSMS. They are displayed in a hierarchical view of the database along with appropriate icons to signify the object type (table, primary key, etc…). This is the power of the Projects node in SSOX, it provides a logical view of the objects defined in the files of the project. If you are so inclined you can even define all of your objects within a single file in Solution Explorer (note I’m not recommending you do that) and SSOX will still show the same logical view over them all.

    Displaying a logical view of your databases and database projects is the basic premise of SSOX but it does have a number of other features too. For example there is a button on the SSOX toolbar labelled Group by Schema:

    [screenshot]

    This changes the logical view to make Schemas a high-level node from which you can drill in to find all your database objects contained therein:

    [screenshot]

    If you have a lot of schemas in your database then this can be really useful (I just wish they had an option to hide all the built-in schemas such as db_accessadmin, db_backupoperator etc…). A nice side-effect of grouping by schema is that you can right-click on one of the nodes underneath a schema in order to create a new object and the DDL script for that object will place it into the respective schema rather than the default [dbo].

    SSOX has a few other features that can be launched from the right-click menu but I’ll leave investigation of those as an exercise for the reader; we have covered the main features of SSOX herein, if you have any questions please feel free to ask them in the comments section below.

    @Jamiet

    If you want to learn more about SSDT then come along to my training course Introduction to SQL Server Data Tools in association with Technitrain on 23rd/24th February 2015 in London. There is a discount if you register before the end of 2014.


  • Azure Automation Pain Points

    In April of this year Microsoft announced a new offering on their Azure platform called Azure Automation (which I briefly talked about at Azure Automation – the beginning of cloud ETL on Azure?). Azure Automation allows one to run Powershell scripts on Azure.

    I have been using Azure Automation for the past couple of months on my current project and, as the offering is still in preview, have been asked by the product team to share with them some experiences of using it, specifically the pain points. As is my habit I’d like to share those experiences here on my blog too and given that (a) anyone can sign up and use Azure Automation and (b) I’m not under NDA I see no reason not to do that. Hence, here is the transcript of an email I sent to a member of the Azure Automation team yesterday:


    Hi,

    Thank you for the invite to email you with my pain points regarding AA so far.

    Biggest thing has to be the developer experience. I consider myself a developer rather than a sysadmin-type-person so the develop -> deploy -> test -> fix cycle is important to me. As a developer it's important that everything is source controlled and so I choose to develop my runbooks locally in my Powershell editor of choice and then upload to Azure Automation using scripts (which are, of course, written in Powershell). These same deployment scripts will be the ones that we eventually use to deploy to our production environment (when we reach that stage - we're not there yet) and are themselves source-controlled. Eventually we will get to the point where each time someone checks a change into the master branch we automatically push to the production environment (aka Continuous Deployment).

    The problem I have is that Azure Automation is not well suited to this type of quick development cycle. Firstly, because there are activities in my runbooks that don't work outside of Azure Automation (i.e. Get/Set-AutomationVariable) I can't run my runbooks locally. Hence, each time I make a change to a runbook (of which I currently have 6) I run my deployment script so that I can test it. Deploying then publishing a runbook takes in the region of 20 seconds; multiply that by 6 and I've got a long wait between making a change and getting to the stage where I can test that change.

    Worse than that though is the time it takes to run a runbook. I know Azure Automation is currently in preview (and hence can kind of excuse what I'm about to say) but I still find the length of time that a runbook spends in the “starting” status inordinately long. I want to be able to fire off a runbook and have it start executing immediately - currently I have to wait a long time until the runbook actually starts executing (and I'm talking minutes, not seconds).

    Put all that together and you've got a situation where testing just a single change to a runbook takes minutes. That is unacceptable to be honest - I need it to be seconds.

    You might say that I could use the in-browser editing experience but that means I'm not editing the master copy of my runbook that goes into source control, and it doesn't solve the problem of the massive start up time.

    Hope this is useful to you.

    Regards
    Jamie Thomson

     


     

    I was asked to describe my pain points, which is why that email comes over as rather derogatory; however I should point out that I have found the overall capabilities of Azure Automation to be quite impressive. It’s a very capable platform for automating the management of Azure resources and I am looking forward to it becoming generally available very soon.
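    For context, when I say “deploying then publishing a runbook” I mean something like the following pair of calls per runbook (the account name and path are hypothetical) – this is what those ~20 seconds per runbook are spent on:

    # Upload the local .ps1 as the draft definition of the runbook, then publish it
    Set-AzureAutomationRunbookDefinition -AutomationAccountName "MyAutomationAccount" `
                                         -Name "Move-AzureAutomationRunbookLogs" `
                                         -Path "C:\src\runbooks\Move-AzureAutomationRunbookLogs.ps1" `
                                         -Overwrite
    Publish-AzureAutomationRunbook -AutomationAccountName "MyAutomationAccount" `
                                   -Name "Move-AzureAutomationRunbookLogs"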

    @Jamiet

  • Extended property support in SSDT (10 days of SSDT – Day 5)

    SQL Server Data Tools (SSDT) was released in Spring 2012 and I have been using it to build SQL Server databases nearly every day since then. In that time I’ve learnt a lot about how to make best use of the tool and I want to share that experience in a new series of blog posts called “10 days of SSDT”. I shall be publishing a different blog post every day for 10 days, each day revealing something that you may not know about SSDT. I hope you enjoy the series and contribute via the comments!

    Extended properties are a feature of SQL Server that have existed since SQL Server 2000, although I assume that you probably don’t use them as I rarely see them being used on my travels. Nonetheless extended properties are a useful way of supplying extra information about your database. Most often they are used for documentation purposes, however I have also used them to drive workflow in an ETL process (e.g. I toggle the loading of a table by changing one of its extended properties). SQL Server Data Tools (SSDT) includes good support for extended properties and in this blog post I’ll cover what support it provides.

    First is SSDT’s table designer. Notice in this screenshot that we have the option to choose what information is displayed per column and that “Description” is one of those chosen columns:

    [screenshot]

    If I enter a value into that Description for a column then check out what happens:

    [screenshot]

    SSDT creates the DDL that creates an extended property named MS_Description on the column in question. You can create your own extended properties as well of course but unfortunately only extended properties called MS_Description will get surfaced in the UI (and even then only for columns). If you wish to create your own then simply create them as per the code above, using GO to separate them if you have more than one defined in the same script.

    One nice aspect to this support is that SSDT treats extended properties as schema rather than data, hence if I change the name of the column like so:

    [screenshot]

    then an error appears, along with a red squiggly, informing you of an invalid reference in the extended property definition. Note that refactoring is supported for extended properties so if you change the name of a referenced object using refactor rename then the extended property will get updated accordingly.

    At publish time SSDT will take care of creating the extended property if it doesn’t exist or modifying it if it does, just like any other schema object. Here’s the output from publishing my project to LocalDB:

    [screenshot]

    [screenshot]

    OK, that’s pretty much everything there is to know about extended property support in SSDT. Any questions? Let me know.

    @Jamiet

    If you want to learn more about SSDT then come along to my training course Introduction to SQL Server Data Tools in association with Technitrain on 23rd/24th February 2015 in London. There is a discount if you register before the end of 2014.


  • Table designer (10 days of SSDT – Day 4)

    SQL Server Data Tools (SSDT) was released in Spring 2012 and I have been using it to build SQL Server databases nearly every day since then. In that time I’ve learnt a lot about how to make best use of the tool and I want to share that experience in a new series of blog posts called “10 days of SSDT”. I shall be publishing a different blog post every day for 10 days, each day revealing something that you may not know about SSDT. I hope you enjoy the series and contribute via the comments!

    Every database contains tables, therefore, not surprisingly, one of the most fundamental features within SSDT is the table designer, which I think of as being split into three areas – Script, Columns & Child objects:

    [screenshot]

    Each table definition is stored as DDL in a text file with a .sql suffix; however, SSDT understands when a file contains a table definition and hence by opening up such a file the table designer will (by default) get displayed (you do have the option to right-click on a file and just show the raw script). The DDL for the table is shown in the Script section of the table designer while the Columns and Child objects sections are a graphical representation of that DDL. The significance of having a graphical UI over the script is that the two stay in sync with each other; any change you make in one will get reflected in the other. For example, if I were to check the ‘Allow Nulls’ checkbox for the [Id] field the NOT NULL in the script would immediately change to NULL. If I define an index in my script then that will appear under Indexes in the Child objects section.

    It’s intuitive to think that the Table Designer displays DDL from only one file but actually that’s not the case; it’s a lot smarter than that. Take the following example where I have a table defined in one file (Product.table.sql) and the primary key for that table defined in another file (Product.PrimaryKey.sql):

    [screenshot]

    Notice how the primary key [pkdboProduct] is still displayed in the Child objects section even though it is defined in another file. Notice also how I can choose which file the script section displays:

    [screenshot]

    That’s the basic functionality of the Table Designer.

    @Jamiet

    If you want to learn more about SSDT then come along to my training course Introduction to SQL Server Data Tools in association with Technitrain on 23rd/24th February 2015 in London. There is a discount if you register before the end of 2014.

