THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | Join | Help
in Search

Adam Machanic

Adam Machanic, Boston-based independent database consultant, writer, and speaker, shares his experiences with programming, performance tuning, and optimizing SQL Server 2000, 2005, and 2008, in conjunction with related technologies such as .NET.

  • T-SQL Tuesday #002: The Roundup

    According to T-SQL Tuesday rules as ratified by me in the first and second T-SQL Tuesday posts, the T-SQL Tuesday host is supposed to post a roundup within two days of the end of the event. So a reasonable person should expect a roundup to be posted by the second Thursday of the month. It gives me no pleasure to admit that I've been completely unreasonable and have totally dropped the ball. I'm twothreefour weeks late. (I actually started the post two weeks ago. That's bad. And now I'm forced to finish it because tomorrow is T-SQL Tuesday #003. That's very bad.) I have lots of excuses, but none of them are any good. So I hope that everyone who participated will accept my apology.

    Now that the unpleasant part of this post is behind us, let's visit the positive side. The event generated 24 great posts! Apparently everyone out there can relate to being confused by something in SQL Server.

    And since I'm three weeks late, this month's T-SQL Tuesday invitation has already been posted by the host, the much-better-organized-than-I-am Rob Farley (who sent me a link to the post over two weeks before he published it). Once you're done reading all of the great posts listed below, visit Rob's blog and get ready for this month's event.

    The posts below are listed roughly in the order in which comments showed up on my blog on the day in question. I've also included a couple of stragglers who were late getting their posts in. If I'm three weeks late with the roundup I can surely forgive someone who posted their entry 24 hours behind schedule.

     


    Rob Farley, who as I mentioned before is quite well organized, kicked things off with a cool and quite puzzling puzzle. What happens when you use a HAVING clause without a GROUP BY clause? Check out the post, then read the solution in his followup.


    Kalen Delaney's post popped up next, puzzlingly on an almost identical topic to that of Rob's. Kalen gives us a bit of a history lesson about some strange situations with non-aggregated columns that used to be possible in older versions of SQL Server.

     

    Sankar Reddy is so puzzled by SQL Server that he posted not just one, but two posts as part of the event. In his first entry, he talks about an issue with transactional replication (always a fun technology to manage). In his second post he discusses the mysterious Sort Warning event.

     

    Jonathan Kehayias is yet another double-poster for the event. In his first post he describes a SQL Server 2008 to 2005 downgrade process and how he worked around the issue of doing the migration for 80 databases. In the second post he shows the twists and turns one can take when trying to do something simple like find out what SQL a request is executing.

     

    Stephen Horne took the opposite end of the "puzzle" challenge, posting an actual puzzle. Can you write an unbeatable tic-tac-toe game in T-SQL? Give it a shot, it's fun!

     

    Michael Swart comes up with a difficult puzzle of his own. Can you reverse-engineer an MD5 hash? You shouldn't, in theory, be able to do so (at least, easily), but here are some hints... And if you still can't figure it out, read the solution.

     

    Another Michael--Michael Coles--jumped the gun a bit, posting his first T-SQL Tuesday post a day early. Its topic is the ever-puzzling task of creating properly-ordered build scripts. After I mentioned that his post wasn't technically eligible for the event, Michael pulled out a second great post, this time on the topic of SQL Server's often-misunderstood extended properties feature.

     

    David Leibowitz brings a nice post on a topic that many SSRS developers get wrong: how to properly do multivalue parameters with stored procedures.

     

    Another SSRS-related issue comes up in the puzzle posted by Brian Garrity. How do you tell whether FMTONLY is turned on?

     

    Beginning Spatial provided a few spatial puzzles for those of you interested in that area.

     

    Mladen Prajdic knows that there are many ways subqueries can be misunderstood... And shows us another. How many times is that subquery really evaluated?

     

    Grant Fritchey may be scary, but is certainly not afraid of SQL Server; he says in his post that it was difficult to think of a puzzling situation. But after some brainstorming he came up with an interesting post about the query optimization process.

     

    Alexander Kuznetsov continues his series of posts on defensive database programming and busts one of the most common patterns I've seen--the IF EXISTS(...) UPDATE ELSE INSERT pattern. A very nice read.

     

    Steve Jones did a great job of describing one of the most common puzzling situations SQL programmers can encounter when working with less-than-perfectly-designed databases: implicit conversion issues.

     

    Brad Schulz is up next with a T-SQL Tuesday first: a fairy tale! He uses the story to describe, step-by-step how to write a complex query and why the query optimizer throws exceptions when you try to do various things illogically. A very nice post.

     

    Can anything strange happen when you reconfigure the "max server memory" setting? Ask Aaron Nelson... He was certainly puzzled by the outcome of this seemingly-innocent action.

     

    Quick, which ASCII character is this: " "?

     

    Despite being late with this roundup post, I did managed to pull together my own entry for the event. If you work with non-typed XML on a regular basis, you might want to read my post to get a bit of a performance boost when using the Nodes method.

     

    Andrew Hogg contributed a post in which he describes the trickery required to query the various DMVs in order to properly do dynamic partitioning. In the end, five views are needed to satisfy what at first seems like an easy query.


    John Dunleavy's post is on the topic of join syntax. Remember that deprecated *= outer join syntax? Apparently such abominations never go away, and John tells us all about how to get rid of the offending code.

     

    Jorge Serrada had a seriously confusing situation involving shared access rights to a program and the requirement that certain XML files get distributed to end-user's computers before they could run the thing and ... Read his post to find out how he resolved this mess.

     

    Closing out the event is Jason Brimhall, who shares an interesting tale of woe: his UDF worked fine in SSMS and not so fine in SQL Agent. An all-too-familiar situation. Read his post to find out how he solved it.



    ...And that's it for thislast month's event! Head over to Rob's blog right away to read up on this month's edition--you have just over 24 hours from the time I hit Publish until your submissions are due. Get writing! And I'll see you next time.


  • Paul Randal and Kimberly Tripp - Boston Area - March 29-April 2

    If you've been looking for advanced training in the northeast, your wait is over. Paul and Kimberly will be gracing us with one of their famous "SQL Server Immersion" events the week of March 29.

    This course will cover storage engine internals, indexing and performance strategies, and of course in-depth sections on database maintenance from the guy who wrote DBCC. You can't go wrong there. A complete outline is available on the course web site.

    If you've never visited the Boston area, the spring is a great time to do it--we usually have great weather right in between the winter freeze and the overly-humid summer. The course location is Cambridge, MA, right near MIT: a really cool area with lots to see and do, easy access to downtown Boston via public transportation, and the weather should be perfect for some brisk evening walks around the Charles river after days of having your head pumped with SQL knowledge.

    If you register soon you can take advantage of the "early bird" special, $600 off the $3100 course fee. Use registration code "EARLYBIRD" to take advantage.

    Hope to see many of you there!


  • T-SQL Tuesday #002: Is it XML, or Not?!?

    The query optimizer is a finicky thing, and sometimes it doesn't understand exactly what you're trying to do until you give it a bit more information. The situation I'm going to describe in this post is one such case. By providing the optimizer with ever-so-slightly more data, it's possible to make some XML processing over 300 times faster.

    Here's the situation: The XML I was working with was stored in a table, but not typed as XML. Rather, it had been typed as VARCHAR(MAX). This presents an interesting conundrum for the optimizer: should the query be optimized as though operations are being done on a string, or on XML?

    If you would like to follow along with the examples below, here's some DDL and an INSERT statement to populate a test table using AdventureWorks data:

    CREATE TABLE #myXML
    (
        x VARCHAR(MAX)
    )
    GO

    INSERT #myXML
    (
        x
    )
    SELECT
        x
    FROM
    (
        SELECT
            *
        FROM AdventureWorks.Production.Product
        FOR XML PATH ('Product')
    ) AS y (x)
    GO

    Running this code will put one row into the temp table, with an XML document containing all of the product data from the AdventureWorks Production.Product table. And now, just for kicks, what if we want to pull all of the ProductIDs out of that document? Simple enough...

    WITH
    theXML
    (
        x
    ) AS
    (
        SELECT
            CONVERT(XML, x)
        FROM #myXML
    )
    SELECT
        node.value('ProductID[1]', 'INT')
    FROM theXML
    CROSS APPLY x.nodes('/Product') AS nodes(node)

    This code isn't especially interesting or puzzling in and of itself. It converts the document(s) in the table to XML, runs them through the .nodes() function to produce one row per product node, and pulls out the first ProductID attribute found. If you've actually run the code on your end at this point, you know why I was puzzled: This code takes a full 20 seconds to run on my end. Which is a bit extreme, considering that there are only 504 products in the AdventureWorks database. In the real situation, the documents were several times bigger, and 20 seconds was over an hour in some cases. And that just wouldn't do.

    And so much head-scratching ensued. And teeth gnashing. And cursing of the SQL Server programmability team. You know, a typical day at the office.

    I pulled apart my code, put it back together again, and considered writing a CLR UDF to do the processing. But then I tried something on a whim:

    DECLARE @x XML =
        (
            SELECT TOP(1)
                x
            FROM #myXML
        )

    SELECT
        node.value('ProductID[1]', 'INT')
    FROM @x.nodes('/Product') AS nodes(node)

    And--shocker--this code returns all 504 ProductIDs seemingly instantly. (Actually, it takes around 28 milliseconds on my end.)

    So was a cursor and document-by-document processing the answer? At first, it seemed so. But after further messing around I noticed something: adding TOP(1), so that only a single row was processed, wasn't taking too long. Could it be that the query processor was doing a lot more work than necessary, like converting the text to XML 504 times?

    The TYPE directive can be used with FOR XML to make your query return the XML document typed as XML rather than typed as a string. Perhaps it would work here?

    WITH
    theXML
    (
        x
    ) AS
    (
        SELECT
            CONVERT(XML, x)
        FROM #myXML
        FOR XML PATH(''), TYPE
    )
    SELECT
        node.value('ProductID[1]', 'INT')
    FROM theXML
    CROSS APPLY x.nodes('/Product') AS nodes(node)

    The empty path expression is needed because the TYPE directive only works in conjunction with a valid FOR XML mode. Using an empty path has zero net effect on the actual XML produced in this case. But using the TYPE directive has quite a huge effect: A reduction in query time to around 58 milliseconds on my end. The 30,000% speedup I promised you earlier.

    So why does this work? A quick peek at the plans indicates that I was correct. Here's the first part of the plan for the first version of the query:

      |--Compute Scalar(DEFINE:([Expr1023]=[Expr1022]))
           |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1004], XML Reader with XPath filter.[id]))
                |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1004]))
                |    |--Compute Scalar(DEFINE:([Expr1004]=CONVERT(xml,[tempdb].[dbo].[#myXML].[x],0)))
                |    |    |--Table Scan(OBJECT:([tempdb].[dbo].[#myXML]))
                |    |--Filter(WHERE:(STARTUP EXPR([Expr1004] IS NOT NULL)))
                |         |--Table-valued function

    The second version is almost exactly the same, but for one additional iterator:

      |--Compute Scalar(DEFINE:([Expr1024]=[Expr1023]))
           |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1005], XML Reader with XPath filter.[id]))
                |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1005]))
                |    |--UDX(([Expr1004]))
                |    |    |--Compute Scalar(DEFINE:([Expr1004]=CONVERT(xml,[tempdb].[dbo].[#myXML].[x],0)))
                |    |         |--Table Scan(OBJECT:([tempdb].[dbo].[#myXML]))
                |    |--Filter(WHERE:(STARTUP EXPR([Expr1005] IS NOT NULL)))
                |         |--Table-valued function

    Notice the "UDX" iterator? That's an XML iterator that handles the conversion to typed XML. And in the first case, we don't get one, even though we've "technically" converted the string to XML at that point.

    This story was puzzling, and somewhat arcane, but it has a moral that stretches far beyond this simple example: Only by giving the query optimizer much more information than any rational person might think necessary did we get a plan that does the right thing. And that is quite often the case when working with SQL Server. CHECK constraints, foreign keys, UNIQUE constraints, the DISTINCT keyword, GROUP BY, APPLY, and various other constructs are more than just ways to define your requirements or the output you're looking for. They can be used to provide information to the query optimizer to help it determine the best way to process your data. Information that can make your query return in a second instead of an hour. Information that will make your users happy and your project a success.

    The secret to writing high performance T-SQL? Step out of your human mind. Un-puzzle. Be the optimizer. And until next month, thank you for reading this entry in T-SQL Tuesday #002!


  • New England Data Camp v2.0 (SQL Saturday #34): The Almost-Final Schedule

    The January 30 event is now starting to shape up quite nicely! Below you will find the current schedule of sessions for the day. We still have some room, so if you haven't registered already, now is the time to do so. Please make sure to register online if you're going to attend so that we can get a fairly accurate headcount for food, bags, etc.


    Start TimeBusiness IntelligenceData DevelopmentDatabase AdministrationPerformance Tuning
    08:15 AMSlava Kokaev
    Introduction to SSIS - Developing SSIS Packages
    Scott Abrants
    Automating Database Deployments with Visual Studio
    Mike Walsh
    As a DBA, Where Do I start?
    Bala Subra
    SQL Server Performance Tuning Tips
    9:30 AMAyad Shammout
    Introducing PowerPivot (BI Self-Service) for Excel
    Charles Hyman
    USING VSTS Database Edition and 2010 Preview
    Aaron Bertrand
    Management Studio Tips & Tricks
    Grant Fritchey
    Best Practices for Working With Execution Plans
    11:00 AMBala Subra
    SQL Server Reporting Services
    Matt Van Horn
    Developing with SQL Azure
    Roman Rehak
    SQL Server Development Tips and Tricks
    Mike Walsh
    You Can Improve Your Own SQL Code
    12:15 PMMichael Ruland
    Sponsor Lunch: Turbocharge your ETL with expressor
       
    01:00 PMSunil Kadimdiwan
    Visualize your data on a Map
    Steve Simon
    Data Access Layers: A Cornucopia
    Ayad Shammout
    Building High Availablity SQL Server environment
    Adam Machanic
    Dynamic Management Master Class: 15 Powerful DMVs
    02:30 PMMike Litchfield
    Designing and Implementing an ETL Framework
    Jim O'Neil
    "Dallas": Microsoft's Data Marketplace
    Varsham Papikian
    Use PowerShell and SQLCMD - be more effective
    Dean Richards
    Tuna Helper for SQL Server DBA's
    03:45 PMSteve Simon
    Data Mining.. Making $mart financial decisions
    Matt Van Horn
    Pivot Data Visualization
    Peter Tassmer
    Storage Design For SQL Server
     

     

    Please note that registration opens at 7:45 a.m. We hope you'll show up on the early side to make the first session and avoid having to rush. We'll be providing coffee and some breakfast items to compel you to get out of bed and join us.

    Please let me know if you have any questions. Looking forward to seeing many SQLblog readers at the event!


  • Invitation for T-SQL Tuesday #002: A Puzzling Situation

    Welcome to the second installment of T-SQL Tuesday, the monthly SQL Server blog party! Last month's kickoff event was a great success, and I'm really excited for this month's theme. I hope that we see even more bloggers, with even more great posts this time around--so please spread the word!

     

    Theme

    Have you ever found yourself unable to figure out the intricacies of how some piece of code works? Ever been confused by the results you've gotten back from a query, only to find out that something totally unrelated was going on? Or have you ever been compelled to wile away your spare time working on a "challenge" posted by some blogger?

    For this month's T-SQL Tuesday, I'm asking participants to write a blog post on a "puzzling" topic, along the lines of some of the following ideas:

    • Describe a confusing situation you encountered, and explain how you debugged the problem and what the resolution was
    • Show a piece of code that doesn't behave as most people might expect, and illustrate the reasoning behind the discrepancy
    • Create a challenge for your readers to solve
    As always, even given the event's name the posts are not limited to T-SQL! Any component of, or software product related to SQL Server, is fair game. MDX, SSIS, LINQ to SQL, Entity Data Model, NHibernate, and any other software product that deals with SQL Server data can be featured in your post. Be creative!

     

    Rules

    As before, any blogger can--and should--feel free to participate in this event. In order to make the event slightly more international, I'm changing the time range from PST-based to UTC-based. So the rules are:

    • Your post must go live between 00:00:00 UTC and 23:59:00 UTC on Tuesday, January 12, 2010
    • Your post must link back to this one, and it's recommended that you clearly identify the post as a T-SQL Tuesday post
    • You are responsible for ensuring that a trackback or comment appears here so that I can find the posts

    Follow the rules, and your post will be included in the roundup to be posted on January 13 or 14. Don't follow the rules, and it won't show up there. Simple as that!

     

    Twitter

    Follow the event on Twitter by watching for the #TSQL2sDay hash tag.

     

    Additional Notes

    If you put in a comment and your post doesn't follow the other rules, I will ignore your post. (Really sorry to have to add this, but a couple of people felt the need to do that last time and it violates the spirit of the event. Especially the person who tried to spoof a trackback, without any kind of link in the initiating post. If you're not going to link back to the invitation post, please don't bother participating, now or ever.)

    Per the rules outlined last time, I'm hosting again this month, and starting next month the event will rotate around to other blogs. I'll control the schedule even once the event leaves here, so please let me know if you're interested in hosting--either with a comment, a Twitter DM, or an e-mail--and I'll add you to the list. To host, you must have participated in two previous T-SQL Tuesday events, and your blog must have had at least one post a month for the prior six months.

    I'm also collecting topic ideas, and would appreciate your sending them over as you think of them so that we can ensure that we have plenty of material to keep this event running for years to come.

    Enjoy, and as always let me know if you have comments, questions, concerns, etc.


  • [OT] Auto-Responders: Making E-Mail Suck, One Reply at a Time

    Do either of these look familiar:

    "Suzanne is out of the office and will return December 12."

    "Hi! I'm on vacation, and may not have access to e-mail. If you need immediate assistance, please contact ..."

    Or how about this one (seen in December):

    "I'm on vacation from July 3-7 ..."

    Or the worst one I've seen recently, just three letters long:

    "OOF"

    Auto-responders are a nuisance, an annoyance, and destroy the asynchronous nature of e-mail that makes it such a useful communication mechanism. Here's why:

    Too Much Information Disclosed

    Does every single person who e-mails you really need to know where you are and what you're doing with your time? Get over yourself; no one needs to know whether you're on vacation or doing a seminar in China. It's information that doesn't need to be shared and, arguably, can be considered a violation of your own privacy and security. I don't know about all of my readers, but I get a lot of e-mails from a lot of people I don't know and have no reason to trust. What better time to rob your home than when you're off touring Machu Picchu? And what better way to communicate this fact to a complete stranger than via an auto response?

    Are You Really That Important?

    Is every single person who e-mails you sitting on the edge of their chair, repeatedly hitting the refresh button hoping that your reply will soon arrive? No, I didn't think so. And if you answered yes, get over yourself. If your reply is "delayed", no one is going to care. As a matter of fact...

    The Asynchronous Nature of E-Mail

    E-mail is, by its very nature, asynchronous. Which means that a reply will always be delayed. We send off an e-mail and expect a reply at some later date. If someone had something truly important to talk to you about, they would talk to you about it--on the phone. Or maybe start an instant message session. Did I mention that you should get over yourself?

    As an aside, sending an auto response telling me that reply "will be delayed" has an interesting side-effect. It subconsciously tells me that if I don't receive an auto response from you, that I should expect a reply right away. So next time you wait before replying to one of my e-mails, I might just wonder why you're snubbing me.

    Technology Obviates the Need for Auto-Response

    I know that at least 75% of you have in your pocket, right now, an iPhone, Palm, Blackberry, or Droid. You can--and do--check e-mail no matter where you are. And you can--and will--pull it out while you're out of the office, on your vacation or anywhere else, and reply to any important e-mails that have come in. So forget about telling everyone that you're out of the office. Does it make you feel important? You need to ... get over yourself and quietly use that technology you've invested in, rather than wasting my time with pointless auto responses.

    "But, but, but..."

    I can already feel the wrath of my readers coming to bear as a result of this post. And there are a few (very few) valid reasons to use auto-responders:

    • Your employer has a rule that says you have to do it. Okay. Bad idea to argue with the person who pays your bills.
    • You really are going away, for a long time, to a place where you really can't access your e-mail.
    • You have an SLA or some other binding reason that a reply absolutely must go out after any e-mail is received.

    Barring these reasons, you have no excuse. Sorry.

    A Plea for 2010

    Please turn off your auto-responders. Especially for e-mail addresses that are subscribed to lists. There's almost nothing worse than an auto response until you get auto responses from someone you didn't even e-mail (you know who you are). If you must use auto-responders, please follow some basic rules:

    • Don't tell me where you're going, or that you'll be on vacation. Just say that you're out of the office. Honestly, nine times out of 10, I don't care.
    • Don't use an auto-responder if you will be checking e-mail in any way at least once every 24 hours. Yes, this may be less than the 5-minute interval you usually work with, but no one knows that. A delay of a day or more is perfectly acceptable--and expected--for most e-mails.
    • Set your auto-responder to automatically turn itself off the day you return. That way you can avoid sending a response when you actually are back in the office, which makes you look like a total moron.
    • If you regularly provide "immediate assistance", set up an immediate assistance e-mail alias and have your customers e-mail you there. Then you can simply redirect those e-mails when you're not available. Problem solved, and no one needs to delete your auto responses.
    • Don't tell me that response will be delayed. I've already figured that much out.
    • Get over yourself. You're just not that important. Sorry.
    Thanks to those who've read this far. I wish you all a fantastic 2010 (even those of you who use auto-responders).

  • Improving Your Public Speaking: "Confessions of a Public Speaker"

    Speaking is one of those activities where there is always something to tweak or improve. Whether you've just finished your first talk or your thousandth, after you're done I guarantee you'll look back and find at least a couple of things that you'll wish went better. Changes you can make for the next time will always be on your mind. As a speaker, part of your job is to realize that no talk is going to be perfect, but with work you can make your talks better and better, with fewer modifications required each time.

    Making things trend upward, at least for me, involves two components: lots of practice, and reading up on how others do a great job. As part of that second component--reading--I've purchased a number of books on the topic of speaking. While some of them contain useful tidbits, the majority of them--even the great ones--are painfully boring. And most of them have nothing very useful to share; some of the advice even borders on counter-productive.

    With all of that said, I'm happy to report that a new book published by O'Reilly and Associates bucks these trends and delivers what I've been looking for. "Confessions of a Public Speaker", by Scott Bercun, is highly entertaining and packed with great tips that apply equally well to both new and experienced speakers. Bercun does a great job of debunking a number of speaking myths (should you really consider what the audience would look like naked?) and teaches both what to do and what not to do by sharing many humorous anecdotes from his own and others' speaking experiences.

    At around 200 pages, the book is a breeze to get through, and if you're anything like me you will not want to put it down. Upon receiving the book in the mail I ripped open the package planning to skim the first few pages... and ended up reading half of the book in the first sitting. This has never happened to me with a speaking book before, and that alone is testament to the greatness that has been achieved here.

    Bercun focuses on conference and other public-venue speaking, so this book is perfect for just about anyone reading this blog post. Whether you're speaking at a user group, SQL Saturday, Code Camp, or major conference, the advice in the book will apply to your situation. Especially interesting is a chapter in which a number of disaster scenarios are outlined, with advice on both how to react in the best possible way and how to avoid them to begin with (note to self: don't go out drinking the night before doing a big talk).

    The book is easy to digest, the content is top-notch, and the text is just as long as it needs to be to make its point--just like a great talk, and just like a good blog post, so I'll stop here. If you want to improve your public speaking, pick up a copy of this book right away.


  • Spam Fighting and Lessons Learned

    Here on SQLblog, we take spam seriously. Actually, I should rephrase that: I take spam really, really, really seriously, some other people take spam somewhat seriously, and some don't really seem to care. It all balances out in the end. We've done a pretty good job keeping the site clean and mostly spam-free. We have a few different automated anti-spam tools, but it's not enough. I do manual sweeps though most of the site once or twice a week and pick up anything that managed to slip through the cracks. I'm constantly updating and refining our rules to try to keep the manual work as quick and painless as possible. I take it as a point of pride that we have one of the lowest spam rates I've ever seen on a large blog site with anonymous comments enabled.

    And all of that work, and my pride, is still not enough. A couple of days ago I discovered, amidst a mild rush of panic, that spammers were getting their links on the site by creating new user accounts and putting the links in their profiles. Many of these accounts had obvious names that immediately signaled that something was up. A user account called "69 Sexy" on a site like SQLblog is a dead giveaway. Some of them were obvious but in stranger ways; we had over 80 accounts created by spammers advertising various fake Christmas tree sites. Christmas tree spammers? Seriously?

    Being the kind of guy I am, and given the amount of work I put in fighting spam on the site, there was no way I was going to let these accounts live. So I hit the database and started hacking around trying to figure out how to stop them. "Hacking around" is not generally a good thing when you're trying to do database work, but the design that underlies Community Server is, shall we say, not the greatest I've ever seen. After several minutes of prodding and false starts, I found the table where the profile information is stored and thought about how to do the cleanup.

    I looked at a few of the spam accounts and found that they shared some common qualities: They all had "interesting" keywords in their profiles, but none of them had associated posts in blog comments or the forums--these, if they ever did exist, would have been taken care of by one of our other anti-spam measures.

    I did a few queries and discovered that there are five or six core tables in which most user data is stored, and about 30 satellite tables. My fear was that I might catch a real, valid account using one of the spam keywords (some valid word could have an embedded fragment of a spam keyword), and I decided to let referential integrity protect me. I wrote a script that started a transaction, did the deletes from the core tables in a TRY block, and if any exception occurred a rollback would fire in the CATCH block. I tested this logic using my own account (safely inside of a nested transaction to make absolutely certain I didn't delete myself!), and was pleased to see that the transaction was rolled back as expected.

    And from that point on, it was kind of fun. I thought of all sorts of keywords and ran my script against them, removing scores of user accounts from the site. Every few keywords I would check the users list and find a few more. Goodbye, "forex traders". Goodbye, "payday loans". And see you later, "wire frame Christmas trees". (What the hell is a wire frame Christmas tree? I may never know.)

    Of course, I wasn't using these keywords in their full, natural form. If I had a user with "forex trader" in his profile, I might delete all users where I found the string "forex", and maybe all users where the string "trade". The work was done using code along the lines of: Profile LIKE '%' + @string + '%'. I figured that it didn't matter if I caught some non-spam users, thanks to my diligent work with TRY-CATCH and the database's referential integrity constraints.

    The next day, I received an e-mail from one of the site's bloggers. He was locked out of his account, unable to access his blog. And so I jumped back into the database and made a startling discovery: The database, as it turns out, has a few referential integrity constraints. And it also lacks a few referential integrity constraints in places where a normal person might expect to see them. Oops.

    Turns out that I deleted the user accounts for about a quarter of our bloggers, and about 50 users who had left perfectly valid comments in blog posts. Luckily, we're almost as serious about backups as we are about spam, and a quick database restore and an hour and a half of scripting later all of the deleted user data was restored.

    The morals of this story, for me: First and foremost, assumptions aren't worth much. This is something we all know, yet it's a trap I fall into over and over and over. Some lessons are more difficult to learn than others. Second, and more technical, don't trust a database design until you fully understand its ins and outs. Otherwise, surprises can and will pop up. This is another lesson I've learned countless times and seem to keep hitting. Something in me really wants to believe that most database designers do the right thing. But the reality is, they don't.

    And the final lesson of the day is not to get too caught up in pride or determination. I wanted those spammers out, and I wanted them out that moment. I wasn't willing to wait a bit longer or think a bit more carefully through my solution. And I paid the price for that rash decision making. Those spam links weren't actually hurting anything, they were merely causing me irritation. In retrospect, I could have handled the situation in a much more thoughtful manner.

    In the end, no real harm was done. A valuable lesson or two was learned. And I got a topic for a blog post out of the deal. Not bad. And one day those spammers will feel my wrath... In a cool, level-headed way.


  • T-SQL Tuesday #001 (Date/Time Tricks): The Roundup

    Wow! The response to the first T-SQL Tuesday was truly amazing. We ended up with 20 great posts, from all over the world. If you didn’t participate this time, fear not—we’ll be doing this every month from now on so there is plenty of time to jump in. And don’t forget our Twitter hashtag, #TSQL2sDay, which you can follow to keep up with T-SQL Tuesday even when it’s not the 2nd Tuesday of a month.

    Here are the posts for this month, in the order in which trackbacks/comments were received on the invitation blog post:

     

    daveb_bigger Kicking things off was Dave Ballantyne, who shared “Age calculation with SQL Server”. If you’ve ever tried to calculate someone’s age and run into leap year issues, read this post.

     

    batfinkRob Farley shared a post entitled “A date dimension table with computed columns”. Rob’s offering should help those of you who work with date dimensions in Analysis Services.

     

    m2_normal

    Next up was Mike Walsh, with an intro post called “Dates and Time”. Mike covers many of the basics such as ISO date formats and whether you should cluster a date column. Definitely a good refresher!

     

    avatar.aspx_thumb_2ED1E6ADBluedog67’s “Introduction to Effective Dating” has nothing to do with getting a girl’s phone number. Instead, he discusses the ins an outs of “effective dates”—in other words, the date that a piece of data became reality. A very interesting post!

     

    DPP_0043

    Alan Wood brings us “Return a range of dates between 2 days”, a post in which he describes a function that does exactly what the title says it will. I often use calendar tables for this, but there are definitely times that a function makes more sense. Good stuff.

     

    summitcloud_logo1_bigger.gif SummitCloud was not happy with some of the built-in SSRS features for Gantt charts, and posted an elaborate workaround in “Project Time Reporting Hack in SSRS”. A must-read if you need to create pretty reports based on time intervals.

     

    biggerpicture

    Brad Schulz is feeling rather paranoid… Various dates are giving him night sweats and other forms of panic. In “Friday the 13th 2009: Cluster of Terror” he uses a calendar table an T-SQL to evaluate one popular day of doom. Scary!

     

    me_bigger Speaking of scary, next up is the Scary DBA himself, Grant Fritchey, with a cautionary tale—how not to do date and time queries. Read Grant’s post and then check your code base. Are you doing something that could cause problems?

     

    IMG_4848

    Jack Corbett is not sure whether he’s a wise man or a wise guy, but he does know how to retrieve date rages. He shares his insights in his example-filled T-SQL Tuesday post.

     

     

    avatar.aspxAndy Leonard loves SSIS more than any man should love a computer program, and it shows. In “A Couple SSIS Date Expressions” he reveals the true nature of how to manipulate dates in our favorite Microsoft ETL program.

     

    meSmall_normal

    Mladen Prajdic thought that there was already plenty of date/time information out there—why reinvent the wheel? He shares a large number of links to previously-written posts in “SQL Server Date and Time fun from all around”.

     

     

    kalen_d Ever the internals geek, Kalen Delaney uses her knowledge to answer a number of pressing date and time-related questions in “My Datetime FAQ”. If you’ve ever wondered about the significance of the year 1753, read this post.

     

    SeanJenSm_bigger

    Sean and Jen McCown are the Midnight DBA team, a couple that apparently prefers work to sleep. In “Remix! Optimized: Query by Hour, Day, Week, or Month” Jen shares the secrets of querying your data using a variety of granularities. Cool!

     

    jamie_dancing_bigger

    Jamie Thompson may be the SSIS Junkie, but his post is all about “Unambiguous date formats” in T-SQL. If you work with applications that are used by people in different countries, read this.

     

    adam_machanic

    I’m the guy writing this blog post, and like the Midnight DBAs I apparently also prefer work to sleep—it’s currently almost 1:00 a.m. and I’m starting to wonder about just how fuzzy my brain is going to be in the morning. Luckily, I shared “Exploring ‘Fuzzy’ Interval Islands Using SQLCLR”, so it may be a non-issue.

     

    deleteme_bigger

    Allen Kinsel would like to thank the PASS program committee. And what does this have to do with date and time tricks? Not much, but he sneaks through the door by providing a link to a video about time zones. Okay, Allen, that’ll do.

     

    ArmyPicture_bigger

    Jonathan Kehayias is interested in better tracking data across date and time ranges. He doesn’t ramble much (at all) in his post “Splitting Date/Time Ranges and Intersections”. Useful stuff, for sure.

     

    d832c4d8fb6071a0d4f16f005f3d2810

    Stuart Ainsworth surprised me with his post. Why? Because it’s the only one that covered data modeling, one of the more interesting aspects of the date and time question. His very interesting post is titled “Date/Time Issues and Data Modeling”.

     

    johnsterrett_bigger

    I’m not quite certain that John Sterrett’s post is on topic for this month’s T-SQL Tuesday, but we’ll let it slide. His title is “Disk usage monitoring with Data Collector”, and he does at least mention time intervals. Good enough?

     

    Dad_and_D_bigger

    Aaron Nelson kept things quite true to theme, posting a grab-bag of “Date, Time, tricks with the DateTime Data Type”. His post includes a number of snippets—perhaps you’ll find something that will help you solve a problem you’re currently facing?

     

    … And that’s it for this month’s T-SQL Tuesday! Next month will once again be hosted on this blog. Watch for the invitation post around January 4th. And please leave a comment here if you have any topic suggestions—I could definitely use some help coming up with a great list so that we can keep things rolling.

    Enjoy!

     

    Update: There are a few posts by people who didn’t properly link back to the original. I’ll give them a one-time pass, because it’s the holiday season and I’m a nice guy. Click through to read posts by Alexander Kuznetsov, Peter Larsson, and Rajib Bahar.


  • T-SQL Tuesday #001: Exploring "Fuzzy" Interval Islands Using SQLCLR

    When working with time intervals, we often want to ask a couple of basic questions:

    • Which time periods are not covered by our intervals? These are known as "gaps".
    • What are the time ranges that we are fully covering? These are known as "islands".

    If you're unfamiliar with "gaps" and "islands" I highly recommend reading some of Itzik Ben-Gan's recent work in SQL Server Magazine. He's had a great series going on the topic. But one thing that he hasn't found a good T-SQL solution for is a problem that I call "fuzzy islands."

    When is an island fuzzy? When it doesn't necessarily have a fixed end time. For an example of this, consider a store, selling a number of products. Management might want to see a report showing when each product was selling. This is, effectively, an island question. The goal is to find all of the covered ranges during which sales occurred. But running such a report, you might find that way too much data is returned. A given product may have sold units on Monday, Tuesday, and Thursday, but for some reason no one bought one on Wednesday. Creating a new island every time there is a small gap will create a 300-page report where a 1-page dashboard might suffice--not a good user experience, nor a good way of representing the data. The solution? Introduce a bit of fuzziness--a rule that says, for instance, that a gap is only a gap if it's longer than 7 days.

    Answering the fuzzy islands question is not a very difficult thing to do in T-SQL. The basic algorithm follows:

    1. Find all of the "start" dates or times. These are simply those dates or times for which a previous date or time in the fuzzy interval does not exist. So if a row is dated 2009-12-08 and our fuzzy granularity is 7 days, we know we have a start date if there is no other covered data from the end of November.
    2. For each start date identified, find the minimum date greater than the start date. This is done by looking ahead rather than behind, so if our date is 2009-12-08 and we have a granularity of 7 days, we'll look forward until December 15th.
    3. Optionally, add the fuzzy factor to the end date. This is something that I think is a good idea, as it introduces the concept of an "active" interval--a period over which, for example, a product is considered to have been selling. The interval shouldn't necessarily terminate the day that the last sale occurred. But of course this depends on the situation in question.

    The following query produces a fuzzy islands report for each product in the AdventureWorks (or AdventureWorks2008) Production.TransactionHistory table. You can modify the @active_interval variable to tweak the fuzziness and change the output.

    --Find all "active" product time ranges, meaning that the product
    --has sold within the previous 7 days
    DECLARE @active_interval INT = 7

    SELECT DISTINCT
    t_s.ProductID,
    t_s.TransactionDate AS StartDate,
    DATEADD
    (
    dd,
    @active_interval,
    (
    SELECT
    MIN(t_e.TransactionDate)
    FROM Production.TransactionHistory AS t_e
    WHERE
    t_e.ProductID = t_s.ProductID
    AND t_e.TransactionDate >= t_s.TransactionDate
    AND NOT EXISTS
    (
    SELECT *
    FROM Production.TransactionHistory AS t_ae
    WHERE
    t_ae.ProductID = t_s.ProductID
    AND t_ae.TransactionDate BETWEEN
    DATEADD(dd, 1, t_e.TransactionDate)
    AND DATEADD(dd, @active_interval, t_e.TransactionDate)
    )
    )
    ) AS EndDate
    FROM
    (
    SELECT DISTINCT
    ProductID,
    TransactionDate
    FROM Production.TransactionHistory
    ) AS t_s
    WHERE
    NOT EXISTS
    (
    SELECT *
    FROM Production.TransactionHistory AS t_ps
    WHERE
    t_ps.ProductID = t_s.ProductID
    AND t_ps.TransactionDate BETWEEN
    DATEADD(dd, -@active_interval, t_s.TransactionDate)
    AND DATEADD(dd, -1, t_s.TransactionDate)
    )
    ORDER BY
    ProductID,
    StartDate
    GO

    Running this query you'll find that it works... But the results are returned a bit more slowly than we might desire--15 to 16 seconds on my end. Looking at the query plan, the reason for this becomes quite obvious: Lots and lots of table scans. How can we eliminate all of the overhead?

    SQLCLR to the rescue. The best way to solve this problem--at least until the SQL Server team adds proper OVER clause support (LAG and LEAD, specifically)--is to use a cursor algorithm. We could do this in a T-SQL cursor, but why bother? Cursor logic in SQLCLR is much, much faster.

    To solve the problem, I implemented an enumerator, called active_products_enumerator. The enumerator is initialized using a SqlDataReader and an "active interval" -- the number of days we're allowing for fuzziness. The DataReader is expected to return rows ordered by ProductID and TransactionDate. The enumeration process uses the following algorithm:

    1. If the current ProductID is not the same as the previous ProductID, return an end date for the previous interval and start a new one
    2. If the current period date is greater than the previous period date plus the active interval, return an end date for the previous interval and start a new one
    3. Otherwise, continue

    Following is the MoveNext method for the enumerator (the complete code is attached to this post so that you can run it on your end without my bombarding you with a gigantic code-filled post):

    public bool MoveNext()
    {
        try
        {
            current_results = null;

            while (r.Read())
            {
                new_ProductID = r.GetSqlInt32(0);
                new_period_date = r.GetDateTime(1);

                if (new_ProductID != ProductID)
                {
                    if (ProductID != 0)
                    {
                        current_results = new results(
                            ProductID,
                            StartDate,
                            period_plus_interval);
                    }

                    ProductID = new_ProductID;
                    StartDate = new_period_date;
                    period_plus_interval = new_period_date.AddDays(activeInterval);

                    if (current_results != null)
                        return (true);
                }
                else
                {
                    if (period_plus_interval < new_period_date)
                    {
                        current_results = new results(
                            ProductID,
                            StartDate,
                            period_plus_interval);

                        StartDate = new_period_date;
                    }

                    period_plus_interval = new_period_date.AddDays(activeInterval);

                    if (current_results != null)
                        return (true);
                }
            }

            //return the last row of data
            if (ProductID != 0)
            {
                current_results = new results(
                    ProductID,
                    StartDate,
                    period_plus_interval);

                //set this to 0 so we don't return another row
                ProductID = 0;
            }

            if (current_results != null)
                return (true);
            else
            {
                r.Dispose();
                return (false);
            }
        }
        catch
        {
            r.Dispose();
            throw;
        }
    }

    Put into a table-valued function, as the attached code does, this algorithm will return the same data as the T-SQL query in under a third of a second on my end--around 45 times faster than the T-SQL version.

    Note that I've played some games with a loopback connection to get this whole thing to work. That's a topic for a future blog post, so stay tuned. In the meantime, please realize that you'll have to catalog the assembly with EXTERNAL_ACCESS permission to make this happen.

    This post was created for T-SQL Tuesday, the revolving SQL Server blog party, hosted this month by... me. Enjoy!


  • New England Data Camp v2.0 (SQL Saturday #34) - January 30, Waltham MA - Registration and Call for Speakers Open

    Last week I very quietly announced, via Twitter, that we've officially launched the site for New England Data Camp v2.0, the second go at a full-day event jointly hosted by both the New England SQL Server Users Group and the Southern New England SQL Server Users Group. The event will take place on January 30, 2010, at Microsoft's Waltham, Massachusetts office.

    If you missed last year's v1.0, you can check out the fantastic session list on my blog post for the event. We had just over 200 attendees and feedback was quite positive. This year's event promises to be even bigger and better, as we've partnered with SQL Saturday, which should smooth some of our rough edges.

    We are planning to have up to 25 sessions in four or five tracks, and our public call for speakers is open and ready for your submissions. Never presented before? No problem--shake off your stage fright and give it a try. I promise, you will have a good time. The Data Camp audience is laid back and happy to learn from you, whether or not you're a seasoned presenter.

    Just want to attend? That's fine, too. Our registration page is ready and waiting to reserve your spot at the event.

    We're also looking for a few sponsors to help defer the cost of the event. If your company would like to increase awareness amongst the best and brightest data professionals in New England, you can do no better than to sponsor this event. Information on sponsorship plans is available on our Sponsors page.

    Looking forward to seeing many SQLblog readers there! Please let me know if you have questions, concerns, comments, etc.


  • Who is Active? v9.57: Fast, Comprehensive DMV Collection - What's Really Happening on Your Server?

    Happy December, SQLblog readers! My gift to you, just in time for the holidays: The newest "official" release of your favorite SQL Server activity monitoring stored procedure.

     

    Click here to download Who is Active? v9.57

     

    Since the last release--v8.82, from August of this year--I've made a number of modifications to the script, resulting in six interim "beta" releases. The uniting theme of all of these changes is more, better quality data, faster.

    More data.

    • Two new core options were added at the behest of users: @show_sleeping_spids and @show_system_spids. These options cause the procedure to return information that wasn't previously available via Who is Active, about--you guessed it--sleeping and system sessions, respectively.
    • Another major change was adding a feature so that the script now shows blocking sessions whether or not they're included in the default filter criteria. This is not something you turn on or off--it just happens--and will ensure that if you're debugging a blocking scenario you'll automatically have all the information you need.
    • And for you query plan geeks out there, the full wait stats collection mode (see the "Faster data" section below) now returns node identifier information with CXPACKET waits--helping you to track progress of tasks as a plan is executed.
    • Finally, a small modification. The online help system (@help=1) now returns information about both the available input parameters as well as all of the available output columns.

    Better data.

    • A few minor bugs were fixed, mostly having to do with the evils of MARS and the fact that the DMVs don't properly deal with MARS sessions in many cases.
    • Workarounds were also added for inconsistencies in how the DMVs report SQL handles, even without MARS.
    • Two features were added to help you get only the data you need when you need it, and not the data you don't: 
      • Dynamic sort ordering, via a parameter called @sort_order, lets you pass in a list of columns and column directions by which to sort the output.
      • And "not" filters, implemented using parameters @not_filter and @not_filter_type, work exactly the opposite of regular filters. These are useful in those cases where you have a bunch of sessions that aren't of interest, and you don't want them cluttering your output.
    • Finally, I've changed the default output column order to something I think is a bit more useful. Don't like my selections? No problem--override me using the @output_column_list parameter.

    Faster data.

    • This is the area in which I made the most modifications. A monitoring tool borders on useless when it takes a minute or more to return key metrics when your server is on fire, and alas, previous versions of Who is Active were doing just that for some users.
    • I've made major changes to the core queries in this version of Who is Active, bringing down query times from minutes to a few seconds in many cases.
    • In conjunction with these changes I added a new lightweight wait stats collection mode, which is the new default. This mode collects only the top non-CXPACKET wait, giving preference to blocked waits, so that you can see the worst problems without having to sort through a lot of output that may or may not apply.
      • If you miss the complete stats collection mode, it's still there--simply set @get_task_info=2 when calling the procedure.

    I'm quite happy with this release, and I hope that it will help people quickly solve a number of tricky SQL Server problems. As always, your feedback is very much appreciated! Leave me a comment here, e-mail me (my address is in the script), or track me down at a conference. Most of the features in the past few versions are a direct result of requests I've gotten from users.

    A huge thank you to those who tested and gave me feedback since the last version! Aaron Bertrand, Rajiv Jain, Michelle Ufford, Uri Dimant, and everyone else, I really am thankful for your efforts. To these people and all of my readers, I wish you a happy December and a prosperous 2010.

    Until next time, enjoy!

     

    Click here to download Who is Active? v9.57


  • Invitation to Participate in T-SQL Tuesday #001: Date/Time Tricks

    T-SQL Tuesday is the SQL Server blogosphere's first recurring, revolving blog party. The idea is simple: Each month a blog will host the party, and about a week before the second Tuesday of the month a theme will be posted. Any blogger that wishes to participate is invited to write a post on the chosen topic. The event is called "T-SQL Tuesday", but any post that is related to both SQL Server and the theme is fair game. So feel free to post about SSIS, SSRS, Java integration, or whatever other technologies you're working with in conjunction with SQL Server. Even if your post includes no T-SQL we still want to see it.

    The posts must go live on the second Tuesday of the month, by (or before) 23:59 PST. Each post must link back to the host blog's post, and the blogger should make sure that a trackback has been generated and successfully posted to the host blog; if not, a comment should be posted on the host blog with a link, so that the host blogger can find the post. The host blogger will then take all of the posts and within a day or two will compile a "roundup"--a brief summary with links--so that readers can find all of the relevant posts.

    Why do this? Several reasons: It's fun, it's going to generate a lot of interesting content, and it's going to bring lots of bloggers into the fold and improve our overall community by creating a lot of links between blogs. Any blogger is invited to participate, whether the blog has been live for six days or six years. If you can write, you should join in the fun.

    Rules of engagement for hosts: Although anyone can join in the party, hosting has a slightly higher bar: Your blog must have been active in the last six months ("active" is defined as at least one post per month), and you must have participated in T-SQL Tuesday events on at least two prior occasions. Since that's impossible until we've had two events, I'll host the first two myself. After that it will move around to as many blogs as we can get involved--and I hope that a number of you out there decide to go for it. The host blogger is not required to write a participating post to be included in the roundup, but is free to do so if he or she chooses. Want to host? Contact me through my blog--I'll keep a waiting list and control it centrally so that it doesn't devolve into anarchy too quickly.

    So that's that. If you bothered to read the title of this post you're aware that the topic for this month is Date/Time Tricks. Write a blog post that talks about dates and times--this can be based around T-SQL programming, data modeling, ETL, reporting, or whatever else you're using dates and times for (and who isn't?). Make sure that your post goes live on Tuesday, December 8, between 00:00 and 23:59 PST. Make it clear in your post that it's for T-SQL Tuesday, and make sure it links back here. Finally, make absolutely certain that a comment shows up here so that I can find the post.

    Questions? Comments? Post here.

    Enjoy! I'm looking forward to the start of a great blog tradition in the SQL Server community.


  • Demos: Connections, Fall 2009, Las Vegas

    Thanks to everyone who attended my sessions this week at SQL Server Magazine Connections in Las Vegas. Attached to this post you will find the demos for the following sessions:

    SDV301: Best Practices for Exception Handling and Defensive Programming in SQL Server 2005 and 2008

    As developers, we sometimes become lax about dealing with error and exception conditions by the time our code gets down to the data level. Exceptions can feel like something that only application code needs to worry about, until you realize that in SQL Server they can have a tremendous effect on your transactions and your data integrity. Learning to properly handle them is, therefore, of paramount importance to those of us who write data-centric applications. SQL Server 2005 greatly improved exception handling options by adding support for the structured TRY/CATCH syntax, but there is a lot more to the story than just that feature. In this session, we'll delve into the ins and outs of exceptions in both SQL Server 2005 and SQL Server 2008, starting with the database engine itself: types of exceptions, when and why they're thrown, and how the server treats them. Next, learn how to configure and throw your own custom exceptions, as well as how to leverage the SQL Server exceptions infrastructure with a variety of exception handling and defensive programming techniques both with and without the TRY/CATCH syntax. Most importantly, we review the effect of exceptions on transactions, and how to take programmatic control over the outcome of your transactions in the face of an exception.

     

    SDV303: T-SQL Power! Learning to Harness the Under-Used OVER Clause

    First introduced in SQL Server 2005, the OVER clause is an ANSI SQL enhancement that gives you tremendous control when dealing with aggregations. By using the OVER clause, query writers can simultaneously aggregate columns based on multiple groups. The feature also enables the query engine to provide windowing mechanisms for ranking and row numbering. Leveraging these powerful language enhancements allows you to solve a surprisingly large number of difficult query problems—including custom paging schemes, data de-duplication, "top-N" problems, and complex statistical calculations. Even better, this feature can be creatively applied to help with performance optimization of certain tough queries. In this session, you will learn all of these techniques and see why, after applying the OVER clause in dozens of projects since the release of SQL Server 2005, I consider it to be one of the most powerful T-SQL features available.

     

    SDV210: What Happened? Auditing, Tracking, and Change Monitoring Technologies in SQL Server 2008

    Regulatory bodies...end-users...your boss. They all want answers. Many questions are easy enough to deal with: "Did someone drop my view?" Others are a bit trickier: "What was the previous value of this row?" And some are seemingly impossible: "Who selected the data from this table over the past week?" For many DBAs, the answer to some or all of these questions is often "Umm..." But don't blame yourself; getting this information in SQL Server has never been especially easy—until now. SQL Server 2008 ships with several new technologies designed to help you track and report on exactly what happened, who did it, and when. In this session, you will learn about SQL Server 2008's Change Tracking, Change Data Capture, and SQL Server Audit features, each of which provides a distinct set of capabilities and has specific strengths and weaknesses. Looking at each of these technologies in turn, you will see how they work and where you might want to leverage them in your SQL Server infrastructure. If you're used to saying "Umm..." get ready to say "I'll be right back with the answer."

     

    Enjoy!

     

     

     


  • PASS Summit 2009: Extended Events Demos

    Thanks to everyone who attended my talk on Advanced Extended Events at the PASS Summit this last week in Seattle. I just went to the PASS web site and noticed that they still haven't put the demos up for download. So just in case you're dying to play around with this hot technology as soon as humanly possible, I thought I should post here. No warranties, guarantees, etc.

    Enjoy!


More Posts Next page »

This Blog

Syndication

News

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement