|
|
|
|
Rob Farley - Owner/Principal with LobsterPot Solutions (a MS Gold Partner consulting firm), Microsoft MVP (SQL Server) and leader of the SQL User Group in Adelaide, Australia. Rob is also a Director of PASS, and runs training courses around the world in SQL Server and BI topics.
-
I posted a few hours ago about a reflection of the Summit, but I wanted to write another one for this month’s T-SQL Tuesday, hosted by Chris Yates.  In January of this year, Adam Jorgensen and I joked around in a video that was used for the SQL Server 2012 launch. We were asked about SQLFamily, and we said how we were like brothers – how we could drive each other crazy (the look he gave me as I patted his stomach was priceless), but that we’d still look out for each other, just like in a real family. And this is really true. Last week at the PASS Summit, there was a lot going on. I was busy as always, as were many others. People told me their good news, their awful news, and some whinged to me about other people who were driving them crazy. But throughout this, people in the SQL Server community genuinely want the best for each other. I’m sure there are exceptions, but I don’t see much of this. Australians aren’t big on cheering for each other. Neither are the English. I think we see it as an American thing. It could be easy for me to consider that the SQL Community that I see at the PASS Summit is mainly there because it’s a primarily American organisation. But when you speak to people like sponsors, or people involved in several types of communities, you quickly hear that it’s not just about that – that PASS has something special. It goes beyond cheering, it’s a strong desire to see each other succeed. I see MVPs feel disappointed for those people who don’t get awarded. I see Summit speakers concerned for those who missed out on the chance to speak. I see chapter leaders excited about the opportunity to help other chapters. And throughout, I see a gentleness and love for people that you rarely see outside the church (and sadly, many churches don’t have it either). Chris points out that the M-W dictionary defined community as “a unified body of individuals”, and I feel like this is true of the SQL Server community. It goes deeper though. It’s not just unity – and we’re most definitely different to each other – it’s more than that. We all want to see each other grow. We all want to pull ourselves up, to serve each other, and to grow PASS into something more than it is today. In that other post of mine I wrote a bit about Paul White’s experience at his first Summit. His missus wrote to me on Facebook saying that she welled up over it. But that emotion was nothing about what I wrote – it was about the reaction that the SQL Community had had to Paul. Be proud of it, my SQL brothers and sisters, and never lose it.
|
-
So far, my three PASS Summit experiences have been notably different to each other. My first, I wasn’t on the board and I gave two regular sessions and a Lightning Talk in which I told jokes. My second, I was a board advisor, and I delivered a precon, a spotlight and a Lightning Talk in which I sang. My third (last week), I was a full board director, and I didn’t present at all. Let’s not talk about next year. I’m not sure there are many options left. This year, I noticed that a lot more people recognised me and said hello. I guess that’s potentially because of the singing last year, but could also be because board elections can bring a fair bit of attention, and because of the effort I’ve put in through things like 24HOP... Yeah, ok. It’d be the singing. My approach was very different though. I was watching things through different eyes. I looked for the things that seemed to be working and the things that didn’t. I had staff there again, and was curious to know how their things were working out. I knew a lot more about what was going on behind the scenes to make various things happen, and although very little about the Summit was actually my responsibility (based on not having that portfolio), my perspective had moved considerably. Before the Summit started, Board Members had been given notebooks – an idea Tom (who heads up PASS’ marketing) had come up with after being inspired by seeing Bill walk around with a notebook. The plan was to take notes about feedback we got from people. It was a good thing, and the notebook forms a nice pair with the SQLBits one I got a couple of years ago when I last spoke there. I think one of the biggest impacts of this was that during the first keynote, Bill told everyone present about the notebooks. This set a tone of “we’re listening”, and a number of people were definitely keen to tell us things that would cause us to pull out our notebooks. PASSTV was a new thing this year. Justin, the host, featured on the couch and talked a lot of people about a lot of things, including me (he talked to me about a lot of things, I don’t think he talked to a lot people about me). Reaching people through online methods is something which interests me a lot – it has huge potential, and I love the idea of being able to broadcast to people who are unable to attend in person. I’m keen to see how this medium can be developed over time. People who know me will know that I’m a keen advocate of certification – I've been SQL certified since version 6.5, and have even been involved in creating exams. However, I don’t believe in studying for exams. I think training is worthwhile for learning new skills, but the goal should be on learning those skills, not on passing an exam. Exams should be for proving that the skills are there, not a goal in themselves. The PASS Summit is an excellent place to take exams though, and with an attitude of professional development throughout the event, why not? So I did. I wasn’t expecting to take one, but I was persuaded and took the MCM Knowledge Exam. I hadn’t even looked at the syllabus, but tried it anyway. I was very tired, and even fell asleep at one point during it. I’ll find out my result at some point in the future – the Prometric site just says “Tested” at the moment. As I said, it wasn’t something I was expecting to do, but it was good to have something unexpected during the week. Of course it was good to catch up with old friends and make new ones. I feel like every time I’m in the US I see things develop a bit more, with more and more people knowing who I am, who my staff are, and recognising the LobsterPot brand. I missed being a presenter, but I definitely enjoyed seeing many friends on the list of presenters. I won’t try to list them, because there are so many these days that people might feel sad if I don’t mention them. For those that I managed to see, I was pleased to see that the majority of them have lifted their presentation skills since I last saw them, and I happily told them as much. One person who I will mention was Paul White, who travelled from New Zealand to his first PASS Summit. He gave two sessions (a regular session and a half-day), packed large rooms of people, and had everyone buzzing with enthusiasm. I spoke to him after the event, and he told me that his expectations were blown away. Paul isn’t normally a fan of crowds, and the thought of 4000 people would have been scary. But he told me he had no idea that people would welcome him so well, be so friendly and so down to earth. He’s seen the significance of the SQL Server community, and says he’ll be back. It’ll be good to see him there. Will you be there too?
|
-
PASS Summit is coming up, and I thought I’d post a few things.
Summit Old...
At the PASS Summit, you will get the chance to hear presentations by the SQL Server establishment. Just about every big name in the SQL Server world is a regular at the PASS Summit, so you will get to hear and meet people like Kalen Delaney (@sqlqueen) (who just recently got awarded MVP status for the 20th year running), and from all around the world such as the UK’s Chris Webb (@technitrain) or Pinal Dave (@pinaldave) from India. Almost all the household names in SQL Server will be there, including a large contingent from Microsoft. The PASS Summit is by far the best place to meet the legends of SQL Server. And they’re not all old. Some are, but most of them are younger than you might think.
...Summit New...
The hottest topics are often about the newest technologies (such as SQL Server 2012). But you will almost certainly learn new stuff about older versions too. But that’s not what I wanted to pick on for this point.
There are many new speakers at every PASS Summit, and content that has not been covered in other places. This year, for example, LobsterPot’s Roger Noble (@roger_noble) is giving a presentation for the first time. He’s a regular around the Australian circuit, but this is his first time presenting to a US audience. New Zealand’s Paul White (@sql_kiwi) is attending his first PASS Summit, and will be giving over four hours of incredibly deep stuff that has never been presented anywhere in the US before (I can’t say the world, because he did present similar material in Adelaide earlier in the year).
...Summit Borrowed...
No, I’m not talking about plagiarism – the talks you’ll hear are all their own work.
But you will get a lot of stuff you’ll be able to take back and apply at work. The PASS Summit sessions are not full of sales-pitches, telling you about how great things could be if only you’d buy some third-party vendor product. It’s simply not that kind of conference, and PASS doesn’t allow that kind of talk to take place. Instead, you’ll be taught techniques, and be able to download scripts and slides to let you perform that magic back at work when you get home. You will definitely find plenty of ideas to borrow at the PASS Summit.
...Summit Blue
Yeah – and there’s karaoke. Blue - Jason - SQL Karaoke - YouTube
|
-
I got asked about calling home from the US, by someone going to the PASS Summit. I found myself thinking “there should be a blog post about this”... The easiest way to phone home is Skype - no question. Use WiFi, and if you’re calling someone who has Skype on their phone at the other end, it’s free. Even if they don’t, it’s still pretty good price-wise. The PASS Summit conference centre has good WiFI, as do the hotels, and plenty of other places (like Starbucks). But if you’re used to having data all the time, particularly when you’re walking from one place to another, then you’ll want a sim card. This also lets you receive calls more easily, not just solving your data problem. You’ll need to make sure your phone isn’t locked to your local network – get that sorted before you leave. It’s no trouble to drop by a T-mobile or AT&T store and getting a prepaid sim. You can’t get one from the airport, but if the PASS Summit is your first stop, there’s a T-mobile store on 6th in Seattle between Pine & Pike, so you can see it from the Sheraton hotel if that’s where you’re staying. AT&T isn’t far away either. But – there’s an extra step that you should be aware of. If you talk to one of these US telcos, you’ll probably (hopefully I’m wrong, but this is how it was for me recently) be told that their prepaid sims don’t work in smartphones. And they’re right – the APN gets detected and stops the data from working. But luckily, Apple (and others) have provided information about how to change the APN, which has been used by a company based in New Zealand to let you get your phone working. Basically, you send your phone browser to http://unlockit.co.nz and follow the prompts. But do this from a WiFi place somewhere, because you won’t have data access until after you’ve sorted this out... Oh, and if you get a prepaid sim with “unlimited data”, you will still need to get a Data Feature for it. And just for the record – this is WAY easier if you’re going to the UK. I dropped into a T-mobile shop there, and bought a prepaid sim card for five quid, which gave me 250MB data and some (but not much) call credit. In Australia it’s even easier, because you can buy data-enabled sim cards that work in smartphones from the airport when you arrive. I think having access to data really helps you feel at home in a different place. It means you can pull up maps, see what your friends are doing, and more. Hopefully this post helps, but feel free to post comments with extra information if you have it. @rob_farley
|
-
SQL Server Reporting Services plays nicely. You can have things in the catalogue that get shared. You can have Reports that have Links, Datasets that can be used across different reports, and Data Sources that can be used in a variety of ways too. So if you find that someone has deleted a shared data source, you potentially have a bit of a horror story going on. And this works for this month’s T-SQL Tuesday theme, hosted by Nick Haslam, who wants to hear about horror stories. I don’t write about LobsterPot client horror stories, so I’m writing about a situation that a fellow MVP friend asked me about recently instead. The best thing to do is to grab a recent backup of the ReportServer database, restore it somewhere, and figure out what’s changed. But of course, this isn’t always possible. And it’s much nicer to help someone with this kind of thing, rather than to be trying to fix it yourself when you’ve just deleted the wrong data source. Unfortunately, it lets you delete data sources, without trying to scream that the data source is shared across over 400 reports in over 100 folders, as was the case for my friend’s colleague. So, suddenly there’s a big problem – lots of reports are failing, and the time to turn it around is small. You probably know which data source has been deleted, but getting the shared data source back isn’t the hard part (that’s just a connection string really). The nasty bit is all the re-mapping, to get those 400 reports working again. I know from exploring this kind of stuff in the past that the ReportServer database (using its default name) has a table called dbo.Catalog to represent the catalogue, and that Reports are stored here. However, the information about what data sources these deployed reports are configured to use is stored in a different table, dbo.DataSource. You could be forgiven for thinking that shared data sources would live in this table, but they don’t – they’re catalogue items just like the reports. Let’s have a look at the structure of these two tables (although if you’re reading this because you have a disaster, feel free to skim past). Frustratingly, there doesn’t seem to be a Books Online page for this information, sorry about that. I’m also not going to look at all the columns, just ones that I find interesting enough to mention, and that are related to the problem at hand. These fields are consistent all the way through to SQL Server 2012 – there doesn’t seem to have been any changes here for quite a while. dbo.Catalog The Primary Key is ItemID. It’s a uniqueidentifier. I’m not going to comment any more on that. A minor nice point about using GUIDs in unfamiliar databases is that you can more easily figure out what’s what. But foreign keys are for that too… Path, Name and ParentID tell you where in the folder structure the item lives. Path isn’t actually required – you could’ve done recursive queries to get there. But as that would be quite painful, I’m more than happy for the Path column to be there. Path contains the Name as well, incidentally. Type tells you what kind of item it is. Some examples are 1 for a folder and 2 a report. 4 is linked reports, 5 is a data source, 6 is a report model. I forget the others for now (but feel free to put a comment giving the full list if you know it). Content is an image field, remembering that image doesn’t necessarily store images – these days we’d rather use varbinary(max), but even in SQL Server 2012, this field is still image. It stores the actual item definition in binary form, whether it’s actually an image, a report, whatever. LinkSourceID is used for Linked Reports, and has a self-referencing foreign key (allowing NULL, of course) back to ItemID. Parameter is an ntext field containing XML for the parameters of the report. Not sure why this couldn’t be a separate table, but I guess that’s just the way it goes. This field gets changed when the default parameters get changed in Report Manager. There is nothing in dbo.Catalog that describes the actual data sources that the report uses. The default data sources would be part of the Content field, as they are defined in the RDL, but when you deploy reports, you typically choose to NOT replace the data sources. Anyway, they’re not in this table. Maybe it was already considered a bit wide to throw in another ntext field, I’m not sure. They’re in dbo.DataSource instead. dbo.DataSource The Primary key is DSID. Yes it’s a uniqueidentifier... ItemID is a foreign key reference back to dbo.Catalog Fields such as ConnectionString, Prompt, UserName and Password do what they say on the tin, storing information about how to connect to the particular source in question. Link is a uniqueidentifier, which refers back to dbo.Catalog. This is used when a data source within a report refers back to a shared data source, rather than embedding the connection information itself. You’d think this should be enforced by foreign key, but it’s not. It does allow NULLs though. Flags this is an int, and I’ll come back to this. When a Data Source gets deleted out of dbo.Catalog, you might assume that it would be disallowed if there are references to it from dbo.DataSource. Well, you’d be wrong. And not because of the lack of a foreign key either. Deleting anything from the catalogue is done by calling a stored procedure called dbo.DeleteObject. You can look at the definition in there – it feels very much like the kind of Delete stored procedures that many people write, the kind of thing that means they don’t need to worry about allowing cascading deletes with foreign keys – because the stored procedure does the lot. Except that it doesn’t quite do that. If it deleted everything on a cascading delete, we’d’ve lost all the data sources as configured in dbo.DataSource, and that would be bad. This is fine if the ItemID from dbo.DataSource hooks in – if the report is being deleted. But if a shared data source is being deleted, you don’t want to lose the existence of the data source from the report. So it sets it to NULL, and it marks it as invalid. We see this code in that stored procedure. UPDATE [DataSource] SET [Flags] = [Flags] & 0x7FFFFFFD, -- broken link [Link] = NULL FROM [Catalog] AS C INNER JOIN [DataSource] AS DS ON C.[ItemID] = DS.[Link] WHERE (C.Path = @Path OR C.Path LIKE @Prefix ESCAPE '*')
Unfortunately there’s no semi-colon on the end (but I’d rather they fix the ntext and image types first), and don’t get me started about using the table name in the UPDATE clause (it should use the alias DS). But there is a nice comment about what’s going on with the Flags field. What I’d LIKE it to do would be to set the connection information to a report-embedded copy of the connection information that’s in the shared data source, the one that’s about to be deleted. I understand that this would cause someone to lose the benefit of having the data sources configured in a central point, but I’d say that’s probably still slightly better than LOSING THE INFORMATION COMPLETELY. Sorry, rant over. I should log a Connect item – I’ll put that on my todo list. So it sets the Link field to NULL, and marks the Flags to tell you they’re broken. So this is your clue to fixing it. A bitwise AND with 0x7FFFFFFD is basically stripping out the ‘2’ bit from a number. So numbers like 2, 3, 6, 7, 10, 11, etc, whose binary representation ends in either 11 or 10 get turned into 0, 1, 4, 5, 8, 9, etc. We can test for it using a WHERE clause that matches the SET clause we’ve just used. I’d also recommend checking for Link being NULL and also having no ConnectionString. And join back to dbo.Catalog to get the path (including the name) of broken reports are – in case you get a surprise from a different data source being broken in the past. SELECT c.Path, ds.Name FROM dbo.[DataSource] AS ds JOIN dbo.[Catalog] AS c ON c.ItemID = ds.ItemID WHERE ds.[Flags] = ds.[Flags] & 0x7FFFFFFD AND ds.[Link] IS NULL AND ds.[ConnectionString] IS NULL; When I just ran this on my own machine, having deleted a data source to check my code, I noticed a Report Model in the list as well – so if you had thought it was just going to be reports that were broken, you’d be forgetting something. So to fix those reports, get your new data source created in the catalogue, and then find its ItemID by querying Catalog, using Path and Name to find it. And then use this value to fix them up. To fix the Flags field, just add 2. I prefer to use bitwise OR which should do the same. Use the OUTPUT clause to get a copy of the DSIDs of the ones you’re changing, just in case you need to revert something later after testing (doing it all in a transaction won’t help, because you’ll just lock out the table, stopping you from testing anything). UPDATE ds SET [Flags] = [Flags] | 2, [Link] = '3AE31CBA-BDB4-4FD1-94F4-580B7FAB939D' /*Insert your own GUID*/ OUTPUT deleted.Name, deleted.DSID, deleted.ItemID, deleted.Flags FROM dbo.[DataSource] AS ds JOIN dbo.[Catalog] AS c ON c.ItemID = ds.ItemID WHERE ds.[Flags] = ds.[Flags] & 0x7FFFFFFD AND ds.[Link] IS NULL AND ds.[ConnectionString] IS NULL; But please be careful. Your mileage may vary. And there’s no reason why 400-odd broken reports needs to be quite the nightmare that it could be. Really, it should be less than five minutes. @rob_farley
|
-
A non-SQL MVP friend of mine, who also happens to be a client, asked me for some help again last week. I was planning on writing this up even before Rob Volk (@sql_r) listed his T-SQL Tuesday topic for this month. Earlier in the year, I (well, LobsterPot Solutions, although I’d been the person mostly involved) had helped out with a merge replication problem. The Merge Agent on the subscriber was just stopping every time, shortly after it started. With no errors anywhere – not in the Windows Event Log, the SQL Agent logs, not anywhere. We’d managed to get the system working again, but didn’t have a good reason about what had happened, and last week, the problem occurred again. I asked him about writing up the experience in a blog post, largely because of the red herrings that we encountered. It was an interesting experience for me, also because I didn’t end up touching my computer the whole time – just tapping on my phone via Twitter and Live Msgr. You see, the thing with replication is that a useful troubleshooting option is to reinitialise the thing. We’d done that last time, and it had started to work again – eventually. I say eventually, because the link being used between the sites is relatively slow, and it took a long while for the initialisation to finish. Meanwhile, we’d been doing some investigation into what the problem could be, and were suitably pleased when the problem disappeared. So I got a message saying that a replication problem had occurred again. Reinitialising wasn’t going to be an option this time either. In this scenario, the subscriber having the problem happened to be in a different domain to the publisher. The other subscribers (within the domain) were fine, just this one in a different domain had the problem. Part of the problem seemed to be a log file that wasn’t being backed up properly. They’d been trying to back up to a backup device that had a corruption, and the log file was growing. Turned out, this wasn’t related to the problem, but of course, any time you’re troubleshooting and you see something untoward, you wonder. Having got past that problem, my next thought was that perhaps there was a problem with the account being used. But the other subscribers were using the same account, without any problems. The client pointed out that that it was almost exactly six months since the last failure (later shown to be a complete red herring). It sounded like something might’ve expired. Checking through certificates and trusts showed no sign of anything, and besides, there wasn’t a problem running a command-prompt window using the account in question, from the subscriber box. ...except that when he ran the sqlcmd –E –S servername command I recommended, it failed with a Named Pipes error. I’ve seen problems with firewalls rejecting connections via Named Pipes but letting TCP/IP through, so I got him to look into SQL Configuration Manager to see what kind of connection was being preferred... Everything seemed fine. And strangely, he could connect via Management Studio. Turned out, he had a typo in the servername of the sqlcmd command. That particular red herring must’ve been reflected in his cheeks as he told me. During the time, I also pinged a friend of mine to find out who I should ask, and Ted Kruger (@onpnt) ‘s name came up. Ted (and thanks again, Ted – really) reconfirmed some of my thoughts around the idea of an account expiring, and also suggesting bumping up the logging to level 4 (2 is Verbose, 4 is undocumented ridiculousness). I’d just told the client to push the logging up to level 2, but the log file wasn’t appearing. Checking permissions showed that the user did have permission on the folder, but still no file was appearing. Then it was noticed that the user had been switched earlier as part of the troubleshooting, and switching it back to the real user caused the log file to appear. Still no errors. A lot more information being pushed out, but still no errors. Ted suggested making sure the FQDNs were okay from both ends, in case the servers were unable to talk to each other. DNS problems can lead to hassles which can stop replication from working. No luck there either – it was all working fine. Another server started to report a problem as well. These two boxes were both SQL 2008 R2 (SP1), while the others, still working, were SQL 2005. Around this time, the client tried an idea that I’d shown him a few years ago – using a Profiler trace to see what was being called on the servers. It turned out that the last call being made on the publisher was sp_MSenumschemachange. A quick interwebs search on that showed a problem that exists in SQL Server 2008 R2, when stored procedures have more than 4000 characters. Running that stored procedure (with the same parameters) manually on SQL 2005 listed three stored procedures, the first of which did indeed have more than 4000 characters. Still no error though, and the problem as listed at http://support.microsoft.com/kb/2539378 describes an error that should occur in the Event log. However, this problem is the type of thing that is fixed by a reinitialisation (because it doesn’t need to send the procedure change across as a transaction). And a look in the change history of the long stored procs (you all keep them, right?), showed that the problem from six months earlier could well have been down to this too. Applying SP2 (with sufficient paranoia about backups and how to get back out again if necessary) fixed the problem. The stored proc changes went through immediately after the service pack was applied, and it’s been running happily since. The funny thing is that I didn’t solve the problem. He had put the Profiler trace on the server, and had done the search that found a forum post pointing at this particular problem. I’d asked Ted too, and although he’d given some useful information, nothing that he’d come up with had actually been the solution either. Sometimes, asking for help is the most useful thing you can do. Often though, you don’t end up getting the help from the person you asked – the sounding board is actually what you need. @rob_farley
|
-
Massive thanks to all the people that have been shouting about this event already. I’ve seen quite a number of blog posts about it, and rather than listing some and missing others, please assume I’ve noticed your blog and accept my thanks. But in case this is all news to you – the next 24 Hours of PASS event is less than a fortnight away (Sep 20/21)! And there’s lots of info about it at http://www.sqlpass.org/24hours/fall2012/ (Don’t ask why it’s “Fall 2012”. Apparently that’s what this time of year is called in at least two countries. I would call it “Spring”, personally, but do appreciate that it’s “Autumn” in the Northern Hemisphere...) Yes, I blogged about it on the PASS blog a few weeks ago, but haven’t got around to writing about it here yet. As always, 24HOP is going to have some amazing content. But it’s going to be pointing at the larger event, which now less than two months away. That’s right, this 24HOP is the Summit 2012 Preview event. Most of the precon speakers are going to be represented, as are half-day session presenters, quite a few of the Spotlight presenters and some of the Microsoft speakers too. When you look down the list of sessions at http://www.sqlpass.org/24hours/fall2012/SessionsbySchedule.aspx, you’ll find yourself wondering how you can fit them all in. Luckily, that’s not my problem. For me, it’s just about making sure that you can get to hear these people present, and get a taste for the amazing time that you’ll have if you can come to the Summit. I see this 24HOP as the kind of thing that will just drive you crazy if you can’t get to the Summit. There will be so much great content, and every one of these presenters will be delivering even more than this at the Summit itself. If you tune into Jason Strate’s 24HOP session on the Plan Cache and are impressed – well, you can get to a longer session by him on that same topic at the Summit. And the same goes for all of them. If you’re anything like me, you’ll find yourself looking at the Summit schedule, wishing you could get to several presentations for every time slot. So get yourself registered for 24HOP and help yourself make that decision. And if you can’t go to the Summit, tune in anyway. You’ll still learn a lot, and you might just be able to help persuade someone to send you to the Summit after all (before the price goes up after Sep 30).
|
-
Four years ago, I was preparing to speak at TechEd Australia. I’d been asked to give a session on “T-SQL Tips and Tricks”, but I’d pushed back and we’d gone with “T-SQL Tips and Techniques” instead. I hadn’t wanted to show Tricks, because despite being a fan of ‘magicians’ (like Tommy Cooper) I feel like the trickery should disappear with the understanding of the technique used. This month, Mike Fal asks about Trick Shots, and I’m reminded of some of the things I do with T-SQL, and that session I gave nearly four years ago.  So I gave a talk, in which I covered 15 T-SQL Tips (probably more – I definitely threw a lot of stuff in there). They included things like the Incredible Shrinking Execution Plan, using the OUTPUT clause to return identity values on multiple rows, short-circuiting GROUP BY statements with unique indexes, and plenty more. There are a lot more things that I cover these days – you can’t exactly stay still and remain current – but still I like to maintain that there shouldn’t be trickery with T-SQL. The common thread going through many of the tips, along with every class I teach about T-SQL, is the importance of the execution plan. That’s where you can see what’s actually going on, and hopefully it can explain some of the magic that you see. Of course, there’s more to it than that, but getting your head around the relationship between queries and plans can definitely help demystify situations. Take recursive CTEs, for example. In the piece of code below (against old AdventureWorks, to which I added a covering index to avoid lookups), we see a sub-query used within a CTE (which is all about giving a name to the sub-query so it can be referenced later), and then in the second half of the UNION ALL statement, still within the sub-query, we see the CTE name used (where I’ve made it bold). Despite the fact that we haven’t even finished using it yet. This functionality has been around for a long time, but yet many people are not used to it, and see it as a trick. WITH OrgChart AS ( SELECT 1 AS EmployeeLevel, * FROM HumanResources.Employee WHERE ManagerID IS NULL UNION ALL SELECT o.EmployeeLevel + 1, e.* FROM OrgChart AS o JOIN HumanResources.Employee AS e ON e.ManagerID = o.EmployeeID ) SELECT * FROM OrgChart; There isn’t a trick here, and it comes down the principle of pulling sub-query definitions (including CTEs and non-indexed views) into the outer query. You see, the OrgChart here isn’t a database object, it’s simply a nested sub-query. You might imagine that it looks a bit like this, where I’ve replaced the OrgChart reference with a copy of the query itself.  ...but I’m not a big fan of this kind of representation, because it’s a bit strange to see that “WHERE ManagerID IS NULL” bit in there repeatedly. Are we really going to be getting that row out over and over again? I’ve seen people try to demonstrate this something like: SELECT o2.EmployeeLevel + 1 AS EmployeeLevel, e.* FROM ( SELECT o1.EmployeeLevel + 1 AS EmployeeLevel, e.* FROM ( SELECT 1 AS EmployeeLevel, * FROM HumanResources.Employee WHERE ManagerID IS NULL ) AS o1 JOIN HumanResources.Employee AS e ON e.ManagerID = o1.EmployeeID ) AS o2 JOIN HumanResources.Employee AS e ON e.ManagerID = o2.EmployeeID ; , but this isn’t right either, because this query is putting the data onto the end of rows, whereas we really do need a UNION ALL. The easiest way of showing what’s going on is to look at the execution plan.  Look at the first operator called – it’s an Index Spool (over on the left). It gets its data from a Concatenation between data that comes from an Index Seek, and a join between a Table Spool and another Index Seek. This sounds all well and good, but that Table Spool is empty. There’s nothing on the spool at all. At least, until the Concatenation operator returns that first row to the Index Spool. When this happens, the Table Spool can serve another row. The Nested Loop happily takes the row, and requests any matching rows from the Index Seek at the bottom-right of the plan, and the Concatenation operator happily passes these rows back to the Index Spool, at which point the Table Spool has more rows it can serve up again. The Index Spool controls all this. At some point, the system has to realise that there’s no more data that’s going to be served up. The Table Spool doesn’t just sit waiting for rows to appear, nor does the spooling behaviour cause the Table Spool to suddenly get kicked off again. This is all handled because the Concatenation operator keeps getting prodded (by the Index Spool) that there’s more data that’s been pushed onto it. The Table Spool doesn’t know (or even care) if the rows it’s handed over are going to end up back on the spool – after all, it doesn’t know if those employees are also managers, it just serves up the data that appears on it, when requested. The recursive CTE is not magic. It doesn’t do any kind of trickery. It’s just a loop that feeds its data back into itself. And of course, to understand this properly, you should make sure you know to read plans from the left, revealing the Index Spool which really runs this query. Learn to read execution plans. There are a bunch of resources out there (such as other posts of mine, stuff by Grant Fritchey, and more), but above all, just start opening them up and seeing how your queries run. You’ll find that a lot of the ‘tricks’ you think are in T-SQL aren’t really tricks at all, it’s just about understanding how your queries are being executed. And before you ask, I won’t be at TechEd Australia this year.
|
-
This isn’t a typical post for me. Regular readers will know that I tend to write about technical things, or things that are coming up, or something about how we do things around my company, LobsterPot Solutions – whereas this post is a response to the latest T-SQL Tuesday. I haven’t missed one yet, so I’m moving a little out of my comfort zone and posting anyway. I’m also feeling short on time at the moment, but I digress...  The challenge this time wasn’t just blogging, it was to make notes about what you did on a particular day, and then blog about that. The day was July 11, which wasn’t really a typical day (my oldest son’s birthday), but I figure that the way my work life fitted around everything else is quite typical, and worth writing about. First, some information about me, which is probably needed as a premise. I’m married with three great kids – boys who are 15 (now) and 12, and a girl who is 4. I live in Adelaide, Australia, where I run a business (LobsterPot Solutions) and the local SQL Server User Group. I’m a PASS Director, and also very involved in my local church. I’ve also been annoyingly sick recently, which sees me playing catch-up on a lot of things. This is why I don’t feel like I have time for things right now, so it’s a matter of working out priorities and focussing on things that need my attention. My portfolio on the PASS board is organising the 24HOP events, which are semi-annual virtual conferences, the next one being in September. This one is traditionally a Summit Preview event, showcasing many of the great speakers that will be presenting at the main Summit in November. I’m not speaking this year, but presented at 24HOP last year, and it’s a really good event, building interest and excitement for the content that will be heard. So at the moment, there is a bunch of organising for that to be done, and with the PASS HQ staff being based in Canada, I was responding to emails and in discussions around that pretty much as soon as I was awake (along with responding to a couple of people from the SQL community who had technical questions they’d sent my way). It was school holidays, so despite it being my son’s birthday, he wasn’t up yet (he’s 15 – sleeping-in rates quite high on his personal todo list, as it did for me when I was his age). Once he was awake we gave him presents, and I got on with some more things. I’d taken the boys out the afternoon before to celebrate, because I was expecting to spend the Wednesday with a particular client on the far side of town. That trip got postponed, but didn’t reduce the list of things that needed to get done (although it did mean I didn’t have any meetings with clients arranged). Of course, there are those things that tend to pop up on todo lists, like getting to the Post Office. I’d done enough stuff at home before heading into the city so that the Post Office was open and I could tick that off. Eventually I got into the city, where I was then spreading my time between staff and clients. My role in the company is such that I don’t tend to get heavily involved in many projects. I support projects, and I try to be available for my staff, but there are too many things going to have me dedicated to one or two projects. I went through quotes and requirements specifications, reviewed other documentation, discussed some client relationships, and spent time with one of my guys looking at the priorities of both myself and some of my staff. I had a support call with Microsoft to follow up on, went through upcoming dates with the user group venue, signed an updated procurement document for one of our clients, arranged some meetings, fielded a call from a partner company asking about licensing, and an older client asking about upgrade paths... ...all done mostly from a cafe. Being in the cafe brought a couple of interruptions, but welcome ones. When there are things that need doing, it can be really beneficial to be working from a cafe, because people I know will go past. This particular day, a guy from a client we’ve recently engaged waved hello to me, but was clearly in a rush, and another client who’s keen for us to help up-skill their people said hello and gave an update on things. This type of distraction is welcome because even when I’m busy and feeling like I don’t want distractions, this type of thing helps move some things along. (As I write this, I’m working from my office in town. Today, the distractions are slightly less welcome.) I know that I spend all day moving from one thing to another, but this seems to work for me. I have a todo list in Trello that seems to mostly work, so that when each thing (be it on the list or not) I can pick up whatever is next. I’m not scared of distractions, because I figure that the time someone calls is probably best for them. (Making time for people is really high on my priority list.) By the end of the day, I’d also spent time at a client where I’m filling in for one of my guys who’s on leave at the moment, and had got updates on things from each of my staff members. On the way home, I stopped by one of our local pasta restaurants to get the birthday dinner that Sam wanted. Later that night I got some more emails sent. Having a smart phone is both a blessing and a curse for running a business. Despite the fact that it’s often an interruption, it does mean I’m more available to my staff and clients, and anyone else who needs me – and I wonder how I’d do without it. I read my bible on it, I listen to music on it, it tracks my walking/running. I take pictures of my family with it, read the news (often via Twitter), and even use it to play games when I choose to be distracted. So that was my day. Fairly typical in many ways.
|
-
So I closed an application that runs when Windows starts up, but it doesn’t have a Start Menu entry, and I was trying to find it. Ok, I could’ve run regedit.exe, navigated through the tree and found the list of things that run when Windows starts up, but I thought I’d use PowerShell instead. PowerShell presents the registry as if it’s a volume on a disk, and you can navigate around it using commands like cd and dir. It wasn’t hard to find the folder I knew I was after – tab completion (starting the word and then hitting the Tab key) was a friend here. But unfortunately dir doesn’t list values, only subkeys (which look like folders). PS C:\Windows\system32> dir HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Run PS C:\Windows\system32> Instead, I needed to use Get-Item to fetch the ‘Run’ key, and use its Property property. This listed the values in there for me, as an array of strings (I could work this out using Get-Member). PS C:\Windows\system32> (Get-Item HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Run).Property QuickSet SynTPEnh Zune Launcher PS C:\Windows\system32> Ok, so the thing I wanted wasn’t in there (an app called PureText, whicih lets me Paste As Text using Windows+V). That’s ok – a simple change to use HKCU instead of HKLM (Current User instead of Local Machine), and I found it. Now to fetch the details of the application itself, using the RegistryKey method GetValue PS C:\Windows\system32> (Get-Item HKCU:\SOFTWARE\Microsoft\Windows\CurrentVersion\Run).GetValue('PureText') "C:\Users\Rob\Utilities\PureText.exe" PS C:\Windows\system32> And finally, surrounding it in a bit of code to execute that command. That needs an ampersand and the Invoke-Expression cmdlet. PS C:\Windows\system32> '& ' + (Get-Item HKCU:\SOFTWARE\Microsoft\Windows\CurrentVersion\Run).GetValue('PureText') | Invoke-Expression A simple bit of exploring PowerShell which will makes for a much easier way of finding and running those apps which start with Windows.
|
-
MERGE is very cool. There are a ton of useful things about it – mostly around the fact that you can implement a ton of change against a table all at once. This is great for data warehousing, handling changes made to relational databases by applications, all kinds of things. One of the more subtle things about MERGE is the power of the OUTPUT clause. Useful for logging.  If you’re not familiar with the OUTPUT clause, you really should be – it basically makes your DML (INSERT/DELETE/UPDATE/MERGE) statement return data back to you. This is a great way of returning identity values from INSERT commands (so much better than SCOPE_IDENTITY() or the older (and worse) @@IDENTITY, because you can get lots of rows back). You can even use it to grab default values that are set using non-deterministic functions like NEWID() – things you couldn’t normally get back without running another query (or with a trigger, I guess, but that’s not pretty).  That inserted table I referenced – that’s part of the ‘behind-the-scenes’ work that goes on with all DML changes. When you insert data, this internal table called inserted gets populated with rows, and then used to inflict the appropriate inserts on the various structures that store data (HoBTs – the Heaps or B-Trees used to store data as tables and indexes). When deleting, the deleted table gets populated. Updates get a matching row in both tables (although this doesn’t mean that an update is a delete followed by an inserted, it’s just the way it’s handled with these tables). These tables can be referenced by the OUTPUT clause, which can show you the before and after for any DML statement. Useful stuff.   MERGE is slightly different though. With MERGE, you get a mix of entries. Your MERGE statement might be doing some INSERTs, some UPDATEs and some DELETEs. One of the most common examples of MERGE is to perform an UPSERT command, where data is updated if it already exists, or inserted if it’s new. And in a single operation too. Here, you can see the usefulness of the deleted and inserted tables, which clearly reflect the type of operation (but then again, MERGE lets you use an extra column called $action to show this).  (Don’t worry about the fact that I turned on IDENTITY_INSERT, that’s just so that I could insert the values) One of the things I love about MERGE is that it feels almost cursor-like – the UPDATE bit feels like “WHERE CURRENT OF …”, and the INSERT bit feels like a single-row insert. And it is – but into the inserted and deleted tables. The operations to maintain the HoBTs are still done using the whole set of changes, which is very cool. And $action – very convenient. But as cool as $action is, that’s not the point of my post. If it were, I hope you’d all be disappointed, as you can’t really go near the MERGE statement without learning about it. The subtle thing that I love about MERGE with OUTPUT is that you can hook into more than just inserted and deleted. Did you notice in my earlier query that my source table had a ‘src’ field, that wasn’t used in the insert? Normally, this would be somewhat pointless to include in my source query. But with MERGE, I can put that in the OUTPUT clause.  This is useful stuff, particularly when you’re needing to audit the changes. Suppose your query involved consolidating data from a number of sources, but you didn’t need to insert that into the actual table, just into a table for audit. This is now very doable, either using the INTO clause of OUTPUT, or surrounding the whole MERGE statement in brackets (parentheses if you’re American) and using a regular INSERT statement.  This is also doable if you’re using MERGE to just do INSERTs. In case you hadn’t realised, you can use MERGE in place of an INSERT statement. It’s just like the UPSERT-style statement we’ve just seen, except that we want nothing to match. That’s easy to do, we just use ON 1=2.  This is obviously more convoluted than a straight INSERT. And it’s slightly more effort for the database engine too. But, if you want the extra audit capabilities, the ability to hook into the other source columns is definitely useful. Oh, and before people ask if you can also hook into the target table’s columns... Yes, of course. That’s what deleted and inserted give you.
|
-
Some people question the ethics of writing about my company on my blog. But I don’t have a problem with it. Is it ethical to ignore the concerns of others and to just do what I want? Anyway – it’s relevant for this month’s T-SQL Tuesday, which Chris Shaw is hosting on the topic of ethics, so I’m going to write a few things about integrity and ethics. When I hire people to work at LobsterPot Solutions, the main thing I look for is a high level of integrity. I can handle a certain level of technical unfamiliarity – that’s something that can be learned. But I can’t teach integrity in the same way. I can foster it, and I can ensure that it’s an established ‘value’ in the company. It’s the number one thing. I know a sysadmin company (it wasn’t LobsterPot) that recently lost a bunch of client data – they asked us to help them recover some of the lost data. It was a bad technical mistake they made, but that’s forgivable. They tried really hard to get the data back, and managed to salvage a lot of it, which demonstrated a better level of integrity. When I look at ethics and integrity, it’s not about mistakes, it’s more about the response to mistakes. Their integrity was brought into question over putting the client into a situation where they could’ve lost data – but this wasn’t a question of integrity, it was an honest mistake. The integrity question comes about their response. I like to look at the knee-jerk reaction. If you make a mistake, do you cover it up, or do you try to resolve it? Unfortunately, I think there’s often an element of both, but if the motivation is to protect the other people involved, then I would expect that the integrity level is okay. Ethics is about good and bad behaviour, and of course, once that definition has been made, choosing good. I’m not going to try to do the definition thing – I think people generally have a feel for what is good and bad (even with data it should be clear – you don’t get to steal client data, etc). The application of ethics comes down to integrity. At LobsterPot, we try to do the right thing all the time. This doesn’t mean that we don’t make mistakes – but our first reaction is always to do the right thing by the other parties involved, even if it costs us money. It has to. We don’t get a choice on this. And if I don’t think that integrity is your driving influence for every decision you make, then I’m sorry, but I’m just not going to hire you.
|
-
I don’t tend to find that anything about me gets described as “Fabulous”. It’s not a word I ever use myself, so I was slightly amused to see it here.  It’s going to be an interesting week (in May, Monday 14th to Saturday 19th), in which I have two stints in classrooms (the three day course in downtown Chicago and this pre-con), plus two presentations at the main SQLSaturday event. I’ll be in a room freshly vacated by Jes Borland (I suspect you'll probably be still able to smell the ‘squee’), and from the look of things, I’ll be delivering a solid 2.5 hours of material, with an intermission of 15 minutes. Mind you, with people in the other rooms like Argenis, Ted and Erin, I’m not sure I’ll have much of a crowd. It might be more like “An Intimate Afternoon with Rob Farley” with whoever’s left. The course will be the highlight of my week. I love teaching this course – it’s a great time to be able to get people in a room for a few days and go through ways to make queries better. More effective. It has “Advanced T-SQL” in the title, but I really try to focus on the “Effectiveness” aspect. Yes, we’ll look at a bunch of advanced features, and your T-SQL arsenal will grow, but the idea is to arm you with the information you need to be able to have more effective T-SQL. Advanced is only better when it’s more effective. It’s going to be a really fun few days, as I stretch your thinking and make you look at T-SQL in a new way.  The pre-con is going to be very different. We’ll be going through the syllabus of the new 70-461 exam, teaching you about all the ins and outs of the various features, leaving you in a position to be able to confidently take the exam. I’ve sat this exam in beta, but of course I can’t use any inside knowledge I gained from that to teach this. There’s a lot of stuff to get through. Each of the four sections has four or five bullet points underneath, and even more sub-points under that. We’ll be pushing through a lot of things, and a lot of the more basic stuff will certainly be skimmed through – but we’ll be looking deeper into a lot of the new things, and making sure that you get all the concepts on the exam. I can’t offer a proper guarantee that you’ll pass – some people just take exams badly. But as we’ll also be looking at a bunch of exam technique aspects, I think you'll be fine. The two sessions that I’m doing at the SQLSaturday #119 are two of my favourite talks. One of them is on SARGability. I remember doing this talk with a bunch of MVPs in the room (and a certain cloudy Microsoft employee), and even they said “Ooh – I didn’t know that” afterwards. SARGability – the ability to use indexes effectively – is such a significant aspect of querying, and a pet topic of mine (as regular readers will realise). SQL Server even provides a bunch of methods you can use to improve the SARGability, even if you can’t tweak the queries themselves. Very cool stuff. And did I mention I won’t have slides? The other is on Analytic Functions (a talk which I’m currently doing around Australia and New Zealand at SQLSaturday 135, 136, 138, 139 and 140). There are eight new Analytic Functions in SQL 2012, plus some new enhancements to the OVER clause. I’ll be running through these, and I’m sure you’ll leave the room with new ideas to try to enhance your reporting and data analysis. See you there?
|
-
This coming Tuesday sees a midweek SQL Saturday hit Adelaide. LobsterPot’s a sponsor, as are a bunch of other companies. An all day event, with two tracks featuring some of the best material you’ve ever seen presented. I’m presenting too. The thing I really want to draw your attention to is that we have two sessions from Paul White. If you haven’t heard of Paul, click the link to have a look at his blog. When you’ve picked your jaw up and some of the mind-blowing information he likes to write about, imagine yourself sitting in sessions by him. I’ve just got back from Wellington where I heard him for an hour, and knew that giving him two sessions was completely the right choice. Everyone left the room wishing that he could’ve gone on longer, and I suspect Paul will be somewhat caught up for questions for the rest of the day, as people try to pick his brain about some of the Query Optimizer things he can teach. He’s been scheduled for the morning so that attendees can have plenty of opportunity to see him around for the rest of the day. It’s his first time ever to Australia, so it’s completely brilliant to have him come to Adelaide for this event. I shouldn’t suggest that the other sessions won’t be excellent though. Peter Ward’s session about the $10000 question has been very well received at SQLSaturday events so far, as have the sessions by Roger Noble and Paul te Braak. Combining these with Microsoft’s Raja N presenting about the Database Consolidation Appliance, some excellent local speakers, a terrific sponsor-session from Fusion-io (and a couple of spots from me), and I’m sure you’ll agree that this event is definitely worth getting to. Numbers are limited and being a free event it may well sell out. So get yourself registered (but I’d recommend joining PASS first, which gives you a bunch of extra benefits and there’s no extra effort involved), and I’ll see you there. It’s in less than a week!
|
-
SQL 2012 brings us a bunch of new analytic functions, together with enhancements to the OVER clause. People who have known me over the years will remember that I’m a big fan of the OVER clause and the types of things that it brings us when applied to aggregate functions, as well as the ranking functions that it enables. The OVER clause was introduced in SQL Server 2005, and remained frustratingly unchanged until SQL Server 2012. This post is going to look at a particular aspect of the analytic functions though (not the enhancements to the OVER clause). When I give presentations about the analytic functions around Australia as part of the tour of SQL Saturdays (starting in Brisbane this Thursday), and in Chicago next month, I’ll make sure it’s sufficiently well described. But for this post – I’m going to skip that and assume you get it. The analytic functions introduced in SQL 2012 seem to come in pairs – FIRST_VALUE and LAST_VALUE, LAG and LEAD, CUME_DIST and PERCENT_RANK, PERCENTILE_CONT and PERCENTILE_DISC. Perhaps frustratingly, they take slightly different forms as well. The ones I want to look at now are FIRST_VALUE and LAST_VALUE, and PERCENTILE_CONT and PERCENTILE_DISC. The reason I’m pulling this ones out is that they always produce the same result within their partitions (if you’re applying them to the whole partition). Consider the following query: SELECT YEAR(OrderDate), FIRST_VALUE(TotalDue) OVER (PARTITION BY YEAR(OrderDate) ORDER BY OrderDate, SalesOrderID RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), LAST_VALUE(TotalDue) OVER (PARTITION BY YEAR(OrderDate) ORDER BY OrderDate, SalesOrderID RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY TotalDue) OVER (PARTITION BY YEAR(OrderDate)), PERCENTILE_DISC(0.95) WITHIN GROUP (ORDER BY TotalDue) OVER (PARTITION BY YEAR(OrderDate)) FROM Sales.SalesOrderHeader ; This is designed to get the TotalDue for the first order of the year, the last order of the year, and also the 95% percentile, using both the continuous and discrete methods (‘discrete’ means it picks the closest one from the values available – ‘continuous’ means it will happily use something between, similar to what you would do for a traditional median of four values). I’m sure you can imagine the results – a different value for each field, but within each year, all the rows the same. Notice that I’m not grouping by the year. Nor am I filtering. This query gives us a result for every row in the SalesOrderHeader table – 31465 in this case (using the original AdventureWorks that dates back to the SQL 2005 days). The RANGE BETWEEN bit in FIRST_VALUE and LAST_VALUE is needed to make sure that we’re considering all the rows available. If we don’t specify that, it assumes we only mean “RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW”, which means that LAST_VALUE ends up being the row we’re looking at. At this point you might think about other environments such as Access or Reporting Services, and remember aggregate functions like FIRST. We really should be able to do something like: SELECT YEAR(OrderDate), FIRST_VALUE(TotalDue) OVER (PARTITION BY YEAR(OrderDate) ORDER BY OrderDate, SalesOrderID RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FROM Sales.SalesOrderHeader GROUP BY YEAR(OrderDate) ; But you can’t. You get that age-old error: Msg 8120, Level 16, State 1, Line 5 Column 'Sales.SalesOrderHeader.OrderDate' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. Msg 8120, Level 16, State 1, Line 5 Column 'Sales.SalesOrderHeader.SalesOrderID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Hmm. You see, FIRST_VALUE isn’t an aggregate function. None of these analytic functions are. There are too many things involved for SQL to realise that the values produced might be identical within the group. Furthermore, you can’t even surround it in a MAX. Then you get a different error, telling you that you can’t use windowed functions in the context of an aggregate. And so we end up grouping by doing a DISTINCT. SELECT DISTINCT YEAR(OrderDate), FIRST_VALUE(TotalDue) OVER (PARTITION BY YEAR(OrderDate) ORDER BY OrderDate, SalesOrderID RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), LAST_VALUE(TotalDue) OVER (PARTITION BY YEAR(OrderDate) ORDER BY OrderDate, SalesOrderID RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY TotalDue) OVER (PARTITION BY YEAR(OrderDate)), PERCENTILE_DISC(0.95) WITHIN GROUP (ORDER BY TotalDue) OVER (PARTITION BY YEAR(OrderDate)) FROM Sales.SalesOrderHeader ;  I’m sorry. It’s just the way it goes. Hopefully it’ll change the future, but for now, it’s what you’ll have to do. If we look in the execution plan, we see that it’s incredibly ugly, and actually works out the results of these analytic functions for all 31465 rows, finally performing the distinct operation to convert it into the four rows we get in the results.  You might be able to achieve a better plan using things like TOP, or the kind of calculation that I used in http://sqlblog.com/blogs/rob_farley/archive/2011/08/23/t-sql-thoughts-about-the-95th-percentile.aspx (which is how PERCENTILE_CONT works), but it’s definitely convenient to use these functions, and in time, I’m sure we’ll see good improvements in the way that they are implemented. Oh, and this post should be good for fellow SQL Server MVP Nigel Sammy’s T-SQL Tuesday this month.
|
|
|
|
|
|