THE SQL Server Blog Spot on the Web
Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | Join | Help
in Search

Aaron Bertrand

  • When was my database / table last accessed?

    A frequently asked question that surfaced again today is, "how do I see when my data has been accessed last?"  SQL Server does not track this information for you.  SELECT triggers still do not exist.  Third party tools are expensive and can incur unexpected overhead.  And people continue to be reluctant or unable to constrain table access via stored procedures, which could otherwise perform simple logging.  Even in cases where all table access is via stored procedures, it can be quite cumbersome to modify all the stored procedures to perform logging.

    SQL Server 2008 will offer Server Auditing for all actions, and this can be logged to a file, or to the Windows Application or Security Log.  You can do something as narrow as record when a specific login queries AdventureWorks.Person.Address.City, and as wide as recording information about every query against every database on the entire instance.  Here is a quick sample that audits all select queries against Person.Address in the AdventureWorks sample database:

    USE master;
    GO
    CREATE SERVER AUDIT Test_Server_Audit
        TO FILE ( FILEPATH = 'C:\Audits\' );
    GO
    ALTER SERVER AUDIT Test_Server_Audit
        WITH (STATE = ON);
    GO

    USE AdventureWorks;
    GO
    CREATE DATABASE AUDIT SPECIFICATION Test_Database_Audit
        FOR SERVER AUDIT Test_Server_Audit
        ADD (SELECT ON Person.Address BY PUBLIC)
        WITH (STATE = ON);
    GO

    SELECT *
        FROM Person.Address;
    GO

    SELECT *
        FROM fn_get_audit_file('C:\Audits\*', NULL, NULL);
    GO

    USE AdventureWorks;
    GO
    ALTER DATABASE AUDIT SPECIFICATION Test_Database_Audit
        WITH (STATE = OFF);
    GO
    DROP DATABASE AUDIT SPECIFICATION Test_Database_Audit;
    GO
    USE master;
    GO
    ALTER SERVER AUDIT Test_Server_Audit
        WITH (STATE = OFF);
    GO
    DROP SERVER AUDIT Test_Server_Audit;
    GO

    For those of us who don't want to wait for SQL Server 2008 and cannot use stored procedures to log select activity, there is another answer: the DMV sys.dm_db_index_usage_stats, introduced in SQL Server 2005.  By showing the last read and write to a table, this DMV allows us to answer the questions we couldn't before:
    • when was database x accessed last?
    • when was table y accessed last?
    We can answer the question about access to a database simply by aggregating the data in the DMV to the database level:

    USE AdventureWorks;
    GO

    SET ANSI_WARNINGS OFF;
    SET NOCOUNT ON;
    GO

    WITH agg AS
    (
        SELECT
            last_user_seek,
            last_user_scan,
            last_user_lookup,
            last_user_update
        FROM
            sys.dm_db_index_usage_stats
        WHERE
            database_id = DB_ID()
    )
    SELECT
        last_read = MAX(last_read),
        last_write = MAX(last_write)
    FROM
    (
        SELECT last_user_seek, NULL FROM agg
        UNION ALL
        SELECT last_user_scan, NULL FROM agg
        UNION ALL
        SELECT last_user_lookup, NULL FROM agg
        UNION ALL
        SELECT NULL, last_user_update FROM agg
    ) AS x (last_read, last_write);

    Switching focus to each table is accomplished by adding the object name to the GROUP BY (and as Jerry pointed out, this will require SP2 to use OBJECT_SCHEMA_NAME(), otherwise you can join against sys.tables and sys.schemas):

    USE AdventureWorks;
    GO

    SET ANSI_WARNINGS OFF;
    SET NOCOUNT ON;
    GO

    WITH agg AS
    (
        SELECT
            [object_id],
            last_user_seek,
            last_user_scan,
            last_user_lookup,
            last_user_update
        FROM
            sys.dm_db_index_usage_stats
        WHERE
            database_id = DB_ID()
    )
    SELECT
        [Schema] = OBJECT_SCHEMA_NAME([object_id]),
        [Table_Or_View] = OBJECT_NAME([object_id]),
        last_read = MAX(last_read),
        last_write = MAX(last_write)
    FROM
    (
        SELECT [object_id], last_user_seek, NULL FROM agg
        UNION ALL
        SELECT [object_id], last_user_scan, NULL FROM agg
        UNION ALL
        SELECT [object_id], last_user_lookup, NULL FROM agg
        UNION ALL
        SELECT [object_id], NULL, last_user_update FROM agg
    ) AS x ([object_id], last_read, last_write)
    GROUP BY
        OBJECT_SCHEMA_NAME([object_id]),
        OBJECT_NAME([object_id])
    ORDER BY 1,2;


    One word of note is that sometimes an UPDATE can look like a simultaneous read and write.  For example:

    USE AdventureWorks;
    GO
    UPDATE Person.Address SET City = City + '';
    GO
    SELECT *
        FROM sys.dm_db_index_usage_stats
        WHERE database_id = DB_ID()
        AND index_id = 1
        AND [object_id] = OBJECT_ID('Person.Address');
    GO

    See that for index_id 1, last_user_scan and last_user_update are identical and fairly recent.

    Another note is that unless a view is indexed, you cannot reliably track access to a view -- instead the references to the underlying tables are updated in the DMV.

    UPDATE - Mike C# and dave ballantyne brought up a great point that applies to all DMVs: the values do not survive a SQL Server restart, or detach/attach, or even Auto-Close. So, if you restart your server and then want to see when something was last accessed, all objects will either be NULL or very recent. One way to work around this is to create a SQL Server Agent job that polls the DMV periodically, and stores a snapshot of the data. This way you can have a running history of "last access" and maybe roll it up once per day (or whatever granularity is suitable).

    Even when SQL Server 2008 is released, auditing of some kind will be required if you want more information, such as a history of who ran which queries.  And if you are looking for more details about information that has been added, updated or deleted, you are going to want to look into the Change Tracking and/or Change Data Capture features.  But in the meantime, this DMV provides a quicker and much lighter-weight approach to at least determining when your data was accessed last.
  • Performance / Storage Comparisons : MONEY vs. DECIMAL

    As you may already know, I am not a big fan of the MONEY data type, because of its inflexibility, accuracy problems, and the expectations the name of the type evokes in new users.  If I had my way, MONEY would become a synonym for DECIMAL in SQL Server 2008 (allowing for specific precision and scale), and be removed in the following version.  Of course there are people out there that either don't feel as strongly as I do, or feel the opposite -- that MONEY should be here to stay.

    After a recent discussion about the pros and cons of using MONEY vs DECIMAL for storing currency (and even non-currency) data, curiosity got the better of me.  One of the arguments for the MONEY data type was performance.  No supporting data was provided, of course.  So I decided to conduct some tests myself.  I wanted to measure how MONEY compared to DECIMAL data types both in their original implementations and using new technologies available in SQL Server 2005 (VARDECIMAL storage format) and SQL Server 2008 (page and row compression).

    The person arguing for MONEY showed the space used by MONEY compared to the same information stored in a DECIMAL(20,4) column.  Not all that surprisingly, the latter was slightly larger.  But is that the whole story?  No, for two reasons.  One is that the *performance* of these choices was not compared, and the other is that DECIMAL(20,4) is not a very realistic requirement for storing currency data.  Unless you are storing the pricing information for luxury yachts or aircraft carriers, in which case you can probably drop the decimal places altogether and use INT or BIGINT.  For the rest of us, a better choice would be DECIMAL(8,2) or DECIMAL(10,2).

    I created 11 databases, each with a single table containing a single column:

     

      Keeping these tables in separate databases allowed for isolation of several factors and measurements, including database level settings, log growth, data file size and even backup time.

      Next, I populated the table in each database with approximately 390,000 rows of varying length decimal data (based on calculations against object_id from a triple cross join of sys.objects on itself), and measured the insert times and storage requirements.  Here is how they stacked up:

       

      Then I performed an update that affected all rows, making sure that roughly 20% of the rows would have a significant change in significant digits (e.g. by adding 1,000,000).  Here is the performance comparison, as well as how the data and log were affected:

       

      Next I compared the time and cost of performing a SELECT COUNT(*) with a WHERE clause against the column:

       

      And finally, I performed native and compressed backups of each database, comparing execution time and output size: 

       

      The following chart summarizes everything performance-wise.  The orange with the dot means that database performed the best; the x on the red background means it performed the worst.

       

      And this chart summarizes all things size-wise: 

       

      Of course, there is nothing overly definitive here.  DECIMAL(10,2) with row compression enabled got the most "first place" metrics, while MONEY with no compression and VARDECIMAL types never finished near the top of the class.  But you can judge from the results for yourself, and make decisions based on your own priorities.

      [UPDATE]
      Alex asked for some metrics on more complex operations like SUM().  I ran some tests using both SUM() and AVG().  The logical reads of course are the same as all the others, and the scan costs remained unchanged as well.  But as for the observed performance of both calculations (compute scalar cost was identical for both operations), see the following chart:

      Again, this was an average over 10 tests.  Note that I did not append these results to the summary charts I delivered above.  And sorry about the slightly different-looking screen shot.
      [/UPDATE]

      Please take into account that these tests were performed on a dual-core laptop computer, and the database files were created on external storage. There are many other tests I could have run to glean more performance and storage data, against a much larger data set, and using production-class hardware, but for the scope of this post I just wanted to glance at the most basic operations.  I repeated these tests 10 times from start to finish, so each metric taken is an average of 10 tests (in a lot of cases they were the same every time).

      This was a very tedious exercise to perform.  If you would like to perform your own tests, with your own sample data, and on your own hardware, I will more than gladly share my scripts.  I'd post them here right now, but they are scattered and not distribution-friendly at this point. 


    • Call a spade a spade! (SQL injection, or IIS vulnerability?)

      In a recent blog post, Dancho Danchev mis-labeled a recent IIS vulnerability as a "massive SQL injection attack."

      Let's be honest here.  Yes, this alert needs attention.  But this is not a new SQL injection vulnerability.  It is simply an exploit in IIS that lets malicious users access your source code.  If your database is already open to SQL injection attacks by anyone who can access the file system on your web servers, then yes, SQL injection is just waiting for the next vulnerability to your file system.  However, if you protect your database server(s) from SQL injection in the first place, then no IIS vulnerability will magically become known as a SQL injection attack.

      Never mind that half the IIS servers in the world probably don't even connect to SQL Server, and of the remainder, not all are vulnerable to SQL injection.  The ones that are vulnerable are that way because the web developers and/or DBAs have been sloppy and allowed for practices that help make SQL injection possible.

      Call it what it is; don't sensationalize it.  And instead of trying to create panic, provide a little education!  How do you prevent an IIS vulnerability from becoming a SQL injection attack?  There are plenty of things you can do.  Some of them are pretty obvious, or have been discussed previously, but I'll recap the ones on my list:

      • Do not expose your SQL Server to the Internet directly
        While in some cases you can't avoid this (shared database servers at a hosting provider, for example), if your server-side code yields a public address, or enough information that the public address can be easily determined, then you are opening yourself up.  All someone needs is read access to your config file or ASP page in order to obtain credentials to access your SQL Server from anywhere.  Talk to your network administrator about keeping SQL Server behind your firewall.

      • Make your passwords strong
        Ideally, your applications will use Windows authentication, but if you must use mixed authentication modes, then make sure your SQL Authentication passwords are "strong" passwords.  It is very hard to be completely immune to a dictionary attack, but you can make it much more difficult by using a 16-character password with mixed case and alphanumerics, like '$QL$erver_r0ck$!', as opposed to an "easier" password like 'tweetybird.'

      • Follow the principle of least privilege
        Do not use sa as the login in the connection strings for your application.  Use a low-privileged user that can only execute (certain) stored procedures.  There is no reason someone should be able to add a query like "SELECT * FROM sys.objects" to your server-side code, or launch extended procedures like xp_cmdshell, or drop objects, because that user should not have sufficient access to do so... the application user should not be sa or db_owner.  Lock down your applications, and only give them the rights they need.

        Similarly, do not use a domain administrator or otherwise privileged user as the service account.  This would mean that anything that runs under the context of SQL Server has free reign over your server or even entire network, using a variety of tools like extended procedures.

      • Always use stored procedures, or at least parameterized statements
        If you build ad hoc SQL in your applications, then you are asking for SQL injection attacks, and I strongly suggest you become familiar with using stored procedures or parameterized queries.  Otherwise, all input becomes suspect, since it is very easy to use comments or semi-colons to change the meaning of queries or to append additional queries to be executed.  With a query that uses strongly typed parameters, however, this technique becomes fruitless.  This does not mean something like:

        sql = "EXEC dbo.foo @param1 = '" & Request.QueryString("bar") & "'"
        conn.execute(sql)

        This is still vulnerable to SQL injection, because I can now call the page using ?bar=';drop table blat;--

        Instead you should use a command object and pass the inputs to parameters.  (This also prevents you from having to escape apostrophes in names like O'Hagan, delimit date literals correctly, etc.)
      • Use TRY/CATCH to return more generic error messages
        In order to prevent revealing your database structure, do not let errors like foreign key violations or other errors bubble up to the application.  This just gives your potential attacker more information about your database structure than they need to have.  Instead use error handling to say "That user does not exist" instead of the default error message SQL Server provides -- which gives specific table and column information back to the user.  If you are using ASP.Net, then you can make sure that you turn CustomErrorsMode to "On" or "RemoteOnly" and set compilation debug to "false"...

      • Do not store passwords in your Users table
        A lot of web applications store usernames and passwords so that their users can log in to the application.  Instead of storing a password in plain text, which can then be read easily by anyone who manages to gain read access to the Users table, store a hash of the password (using MD5 or something similar).  When the user attempts to login (hopefully via SSL), you use the same technique to hash their entry and compare the hashed values, instead of a clear text comparison.  Even if the user has read access to the stored procedure that implements the hash, all they can do with it is try and try and try... they cannot reverse engineer the data if you use a proper hashing technique.

    • Don't want to call CSS / PSS to get a cumulative update? You don't have to!

      Bob Ward posted a blog entry today where he explains the process of getting a cumulative update for SQL Server 2005 without having to call Microsoft's support team (they are called CSS now but you may remember them as PSS).  This is a great evolution in the process, which used to be very difficult (you had to open a support case and be deemed eligible to be issued a CU by a support engineer), and a few months ago they made it a little easier, allowing you to submit a form and have an engineer review it (without a phone call or a formal case), and up to a day later, you got an e-mail providing the download link(s).  Now, at least for post-SP2 cumulative updates, the e-mail containing the download is almost immediate.

      Makes me wonder why they don't just make it downloadable like a service pack, but in any case, this is a great step forward.  You can see his post here:

      http://blogs.msdn.com/sqlreleaseservices/archive/2008/04/15/cumulative-update-7-for-sql-server-2005-service-pack-2-2.aspx


    • Using System Restore for Cumulative Update Rollback?

      Just wanted to post a brief warning about expecting to be able to roll back a cumulative update or hotfix by reverting to a restore point.  In short: don't do it!  These two features are *NOT* designed to work together.  There are various complications with this and other methods of removing a cumulative update.  Hopefully you won't be in a situation where you need to remove a cumulative update, but if you do, I will go over a few things you should be aware of.

      Binaries vs. Uninstall

      System restore is not necessarily going to remove binaries from locations on your disk that Windows isn't deemed to "own."  So what you may end up with in some scenarios is a system that has a cumulative update (partially) installed, but since the program is no longer registered in add/remove programs, there is no longer the ability to remove it correctly.  Eventually this may lead to a wipe and reinstall.  I strongly recommend using the CU uninstaller utility in Add/Remove Programs instead of trying to "roll back" like you might do with shareware and less complex, non-service-type applications.

      Removing "old" setup files

      You may be very anal about cleaning up old files from your system that are no longer needed.  Lets say you install SP2 on SQL Server 2005, then you install CU5.  If you install CU6, you may think it is safe to blow away the CU5 files.  Hold on, tiger!  If you need to uninstall CU6, it is going to hang at some point, and leave you in a bit of a pickle.  What happens is it rolls back to SP2 (or whatever release/SP level you were at before applying any CUs) and then tries to apply CU5.  If it can't find the install files (because you deleted them!) then you will be stuck at SP2 and I am not sure currently if the uninstall will fail and rollback, or if it will succeed and leave you at SP2 instead of CU5.  I have heard that you might get the big hourglass in this scenario, and if so, you may eventually kill it -- potentially leaving your system in a bad state.  So, my recommendation here is, leave those installer files there.  This should be a very last resort for reclaiming disk space, and with a little foresight, you should run the installers from a system other than C:.

      Cluster Scenario

      Personally, I would highly recommend testing CU functionality on a throw-away cluster (e.g. virtualized) and doing your best to not have to remove a CU once it has been baked into an important cluster.  I have successfully removed a CU from one cluster in the past, but I am sure it is not a very highly tested scenario.  Both Geoff Hiten and I have had cases where attempting to install an SP or CU onto a cluster took on one node but not the other, and this made it impossible to back out the installation *or* to supercede it with the next SP or CU.  Geoff found a work-around after a very lengthy process; I gave up on MS Support after several months, and resigned to rebuilding the cluster from the ground up.

      Summary

      In a session in Seattle today, we were told that there would be a KB article forthcoming on how to deal with rolling back SP and CU installs.  When I hear about it, I will post something.


    • SQL Server 2005 SP3 is publicly confirmed

      Microsoft has committed publicly to releasing Service Pack 3 for SQL Server 2005.  Francois Ajenstat first talked about it very recently here.  While there were some casual mentions of the release in some responses to Connect items, I wanted to be sure it was truly public knowledge before I gave it higher visibility.  You should expect it in Q4 of this year, after SQL Server 2008 is released.

    • I'll trade you consistency for meeting your deadline

      While playing with the new Policy-Based Management (PBM) features of SQL Server 2008 the other day, I came across a really annoying syntax implementation that is going to trip up a lot of people unless it is fixed.  We all know how DATEADD works:

      SELECT DATEADD(DAY, 1, GETDATE());

      Sadly, because an expression for a condition is validated using C# and not T-SQL, the syntax needs to be slightly different:

      SELECT DATEADD('DAY', 1, GETDATE());

      Note the single quotes around 'DAY'... this syntax, obviously, will not work in T-SQL anytime soon:

      Msg 1023, Level 15, State 1, Line 1
      Invalid parameter 1 specified for dateadd.

      So, imagine I am creating and testing a big WHERE clause in a query window, and once I have it right, I want to copy and paste it into the expression editor for a condition, because I want to use that WHERE clause to enforce or monitor some policy.  Now I need to save two versions of it; the original T-SQL version (should I need to modify it later), and the condition-compatible version -- after meticulously adding single quotes by hand (or writing my own parser that will add them for me).  For example, would you really want to be going through large expressions like this, and "correcting" them?

      DATEADD('DAY',1-DATEPART(DW,GETDATE()),DATEDIFF('DAY',0,DATEADD('MINUTE',DATEDIFF('DAY',0,GETDATE()),GETDATE())))

      And this, only if I know in advance that the DATEADD syntax needs to be different (it is not the only function affected, by the way -- DATEPART() and DATEDIFF() have the same restrictions).  Note that because I forgot to put single quotes around 'DW', I will get an error message.  The error message that I receive is far less than helpful, and is the same regardless of where or how often I forgot single quotes, or even if I left out one of the parentheses:

      Error parsing 'DATEADD('DAY',1-DATEPART(DW,GETDATE()),DATEDIFF('DAY',0,DATEADD('MINUTE',DATEDIFF('DAY',0,GETDATE()),GETDATE())))'. Make sure string constants are enclosed in single quotes and facet properties are prefixed with '@' sign.

      It would be great to have a helpful addendum, indicating at least the first syntax error encountered, for example "Incorrect syntax near 'DW'"...

      I espoused about much of this on Connect, of course:

      PBM : Expressions like DATEADD require inconsistent and unintuitive syntax

      If you think consistency is important, please consider voting. Unfortunately, due to time constraints around the looming RTM date, it is unlikely this will be fixed. But one can hope.

      There are some other cases in the past where some decision was made on a developer's computer somewhere, and by the time the decision came into question, it was too late to correct.  Well, I have three examples that come to mind immediately, and one of them was, in fact, changed at the last minute.

      DATETIME2
      This data type covers pretty much everything, doesn't it?  Why not be consistent with the INT or CHAR types (BIGDATETIME or DATETIME(MAX))?  Do you envision the need for, say, EVENBIGGERDATETIME?  Yet the name implies that they are leaving room for a higher-scale or higher-precision date/time data type (DATETIME3?).  Or maybe the name was chosen by an employee that was recruited from Oracle.

      TIMESTAMP
      This was a very unfortunate naming blunder, since the data type is equivalent to ROWVERSION and has nothing to do with date or time at all.  But the implication given by name alone leads a lot of people to add a TIMESTAMP column to their table, only later to seek help in the forums, asking how to display their TIMESTAMP values as DATETIME, or perform a WHERE clause to filter by DATETIME.  We have asked repeatedly that this alias for ROWVERSION be marked as deprecated, and yet in sys.types in SQL Server 2008, TIMESTAMP appears, but ROWVERSION does not.  :-(

      DATE and TIME as CLR types
      When SQL Server 2005 was in beta, the SQL team thought it would be great to use DATE and TIME as a way to demonstrate the power of the new CLR types.  Unfortunately, the new types did not play well with the other DATETIME data types, the new SSMS GUI, or built-in functions like DATEADD and DATEDIFF.  Thankfully, a rather large group of us were so vocal in our complaints, that before ship date, they agreed to cut the feature until they could get it right.  There are still a few issues with the types as implemented in SQL Server 2008, but trust me, we are in much, much, much better shape now.


    • Why is the 8.3 filename convention still biting us?

      It is 2008.  Not 1992.  Why are we still intentionally creating filenames that conform to the limitations of ancient DOS and Windows 3.1 naming standards?  This came up today in one of the SQL Server newsgroups, where a user was trying to find the data file for the Northwind database he was sure had just installed.  Why couldn't he find it?  Because he was searching for Northwind.mdf.  Silly user!  It's obviously going to be Northwnd.mdf.  You should know that if i follows o, or if there are two e's, then you drop the second and all subsequent vowels to make an 8.3 filename.  However, if you have a double o, two e's, or i before e, then you drop the first vowel that appears.  *smacks forehead*

      If they couldn't call it Northwind.mdf because of the 8.3 limitation, then why didn't they pick a name they could still spell completely?  Eastwind and Westwind come to mind.  I wonder if I should file a suggestion on Connect to finally name the SQL Server executable correctly?  I'm sure there are still people who try to find sqlserver.exe or sqlsrvr.exe or sqlsrver.exe...


    • SQL Server 2008 is getting close... I can smell it!

      No, this is not a "funny funny, ha ha!" April Fool's Day post (though I do find those entertaining, and considered posting one myself).  This is for real, and it almost made me tingle a little.  I saw this comment from Eric Kang in a Connect item yesterday morning:

      "This is a known issue that we have addressed in a release candidate build."

      Now, had he posted that this morning, I might have assumed he was taking liberties because of the date.  But since it was yesterday, I have to believe that it is genuine.  If they're already talking about RC builds, we should see them soon... which means, barring any serious show-stoppers, RTM will be just around the corner. 

      I'm a little nervous that I'm still finding significant usability issues in CTP6, and since the code is already frozen, these are getting swept under the rug, or deferred to the next release, or maybe we'll see them in a service pack.  Or maybe they will address some of them, quietly, in the next CTP / RC release.

      In any case, as with all software, if the pros outweigh the cons, then push forward, right?  Even with little blemishes here and there, I am excited about starting to seriously work with the product, and hope to have it out in production in short order.


    • Is SQL Server "keeping up" with MySQL?

      I read a comment in a Connect suggestion today that implied that implementing their suggestion would allow SQL Server to "keep up with MySql."  Regardless of the actual suggestion, that made me laugh out loud... and since I was on the train, that got me some weird looks.

      The actual suggestion was to add row constructors, e.g.

      INSERT dbo.Table (col1, col2) VALUES (1, 2), (3, 4), (5, 6), (7, 8);

      Of course, SQL Server 2008 adds support for this construct.  Better late than never.  However, my impression is that this is the exception rather than the rule.  When did MySQL finally add stored procedures?  How about transaction support, SSIS-like capabilities, a scheduled job management system, and a relatively easy-to-use consolidated GUI management tool? 

      "LIMIT" notwithstanding, if SQL Server's functional specs weren't already light years beyond MySQL with SQL Server 2005, it will certainly leave them in the dust when SQL Server 2008 hits the streets.

      Now, this isn't a slam on MySQL really, but rather I am questioning the narrow perception of many MySQL users.  I am curious if you can provide a list of features (aside from "LIMIT") that MySQL has that SQL Server doesn't, or that MySQL had first.  Perhaps my perception is the one that is narrow.


    • Which to use: "<>" or "!="?

      For some, the answer is easy.  Since Tibor reminded me several months ago that <> == standard and != <> standard (and yes, I used that notation on purpose), I have been making a conscious effort to use only <> going forward.  Similarly, I have tried to avoid GETDATE() in favor of CURRENT_TIMESTAMP.  (Of course this doesn't work in most of my systems where we use UTC, and occasionally get bitten by some sys admin accidentally switching a server to EST/EDT or to observe daylight savings time under GMT -- so there I always use GETUTCDATE()). While I have no intention of going back into old code just to change it, I have been making little updates here and there when I have been in there for other reasons...

      In general, I am not immediately concerned about portability.  Companies don't just change their database platforms overnight, and when they do decide to move, it's going to require a lot of changes.  Certainly code is easier to change than data types and table structure, but little syntax things like this are still going to remain a small part of the puzzle for the majority of database applications I have observed.

      I have no problem using IDENTITY (sorry Celko), UPDATE FROM (sorry Hugo), and little things like RAND() and NEWID() (I have no idea who to apologize to here).  These offer me things that the standard can't... for example, in order to convert most of my UPDATE FROMs to standards-compliant UPDATEs, I would need to use a cursor instead of a set-based solution, and in several situations the performance would not be acceptable.

      I also love some of the new syntax that is not supported elsewhere (e.g. DECLARE @foo INT = 5;) and MERGE (which has rough equivalents in some other vendors, and which adheres to the standard, but has extensions that don't)... I will not be afraid to use these when we move to SQL Server 2008.  Others have expressed opposition to anything that deviates from the standard, but I believe that if the standard is limiting, and a vendor introduces something that makes our lives easier, why should I avoid it based on principle?  Those concerned with portability can just make it a priority to avoid non-standard syntax... however, in any vendor implementation, all but the simplest database applications will undoubtedly have to deviate from the standard to some degree.  In fact, I would wager that it is impossible to develop a real-world application of any complexity whatsoever, on a popular RDBMS (meaning Oracle, MySQL, SQL Server, DB2, etc), and adhere solely to ANSI-92.  Part of the problem is that no vendor is 100% compliant, but even more so, all vendors have attractive and tempting alternatives that improve development, maintenance, performance, or all three.

      Are there Microsoft deviations that don't make sense?  Of course.  The prime example that comes to mind is that Microsoft's implementation of TOP is horrible.  And because they have stale tools that would take too much effort to change / replace, they are *still* inadvertently force-feeding this mythical idea that in order to save a view definition with an ORDER BY clause, all you need to do is add TOP 100 PERCENT to the SELECT list.  And people are still surprised / angry that selecting from such a view without an order by clause on the outer query does not always return the rows in the order they expect.  The OVER() clause is a better alternative to TOP in certain scenarios, but it is not complete (and won't be in SQL Server 2008, either).

      We even start people off with the general idea that following the standard is not important.  Look at MS Access, the starter database for most people in the Microsoft realm.  It has very little ANSI compliance, and supports all kinds of Microsoft-specific extensions like FIRST(), VBA, macros, etc.  And when moving to SQL Server, a database platform made by the same vendor, the level of effort required will often exceed that of a SQL Server -> Oracle or MySQL -> DB2 conversion.  So it is not surprising to me that people generally have little concern for adhering to the standards, not just because in reality they rarely *need* to do so, but also because they've been "brought up" that way, so to speak.

      Yes, best practice should dictate to follow the standard.  But often it just doesn't seem worth it.

      Anyway, back to the title.  I am using <> now, to help ease the pain down the road, in the odd event that I write something that lasts longer than the company's commitment to the Microsoft platform.  Though, deep down, I prefer != ... mostly because it seems less Mickey Mouse-ish.  And by Mickey Mouse, I mean, of course, pre-.NET Visual Basic.  :-)


    • Increased limits for columns / indexes / statistics

      I am still on the fence about whether this is a good thing or a bad thing.  However, in the next build of SQL Server 2008 you will be able to get your hands on, you will find that some element limits have changed, just a bit.  With the introduction of sparse columns, you can now jam 30,000 columns into a table.  No, that is not a typo; I really meant to type 30,000 columns (up from 1,024).  You can also create 30,000 statistics (up from 2,000), and 1,000 indexes (up from 250).

      This room comes at a cost, though.  For tables with sparse columns, the maximum row length drops from 8,060 bytes to 8,018 bytes.

      I question whether there is a practical application for this in the real world.  There probably is, but I must be in the wrong market segment to understand what it could possibly be.

      In any case, I'm sure this news makes someone out there happy.


    • I would rather see SP3, but I'll take CU6 instead...

      Cumulative Update 6 for SQL Server 2005 is now available.  This is build 3228.

      http://support.microsoft.com/kb/946608

      Looks mainly like fixes to replication, reporting services, and analysis services.  But there appear to be some fun access violations and arithmetic overflow issues that are being fixed, too.

    • 16 Changed Dynamic Management Views in SQL Server 2008

      Yesterday Denis Gobo told you about 33 new DMVs in SQL Server 2008; previously, I had noticed that a few of the existing DMVs from 2005 had changed slightly.  So, Denis sparked my curiosity, and I ran the following query on a recent build of SQL Server 2008, which had a linked server pointing to SQL Server 2005:

      SELECT s1.vn, s1.cn
      FROM
      (
              SELECT
                      vn = OBJECT_NAME([object_id]),
                      cn = name
              FROM
                      master.sys.all_columns
              WHERE
                      OBJECT_NAME([object_id]) LIKE 'dm[_]%'
      )
              s1
      LEFT OUTER JOIN
      (
              SELECT
                      vn = v.name,
                      cn = c.name
              FROM
                      [SQL2005_LinkedServer].master.sys.all_columns c
              INNER JOIN
                      [SQL2005_LinkedServer].master.sys.all_views v
              ON    
                      c.[object_id] = v.[object_id]
              WHERE
                     v.name LIKE 'dm[_]%'
      )
          s2
      ON
              s1.vn = s2.vn
              AND s1.cn = s2.cn
      WHERE
          s2.vn IS NULL
          AND EXISTS
          (
              SELECT 1
                  FROM [SQL2005_LinkedServer].master.sys.all_views
                  WHERE name LIKE 'dm[_]%'
                  AND name = s1.vn
          )
      ORDER BY
          vn,cn;

       

      Here are the results; 25 new columns across 16 DMVs.  Several seem to have to do with resource governor, but the one I think I like the best is sys.dm_os_sys_info.sqlserver_start_time ... this goes back to a suggestion I made on Ladybug, which was a bug/suggestion system that pre-dates Connect.

      ViewColumn
      dm_db_file_space_usage database_fragment_id
      dm_exec_cached_plans pool_id
      dm_exec_query_memory_grants group_id
      dm_exec_query_memory_grants ideal_memory_kb
      dm_exec_query_memory_grants is_small
      dm_exec_query_memory_grants pool_id
      dm_exec_query_resource_semaphores pool_id
      dm_exec_requests group_id
      dm_exec_sessions group_id
      dm_fts_active_catalogs is_importing
      dm_os_buffer_descriptors numa_node
      dm_os_memory_cache_entries pool_id
      dm_os_schedulers quantum_length_us
      dm_os_sys_info sqlserver_start_time
      dm_os_sys_info sqlserver_start_time_ms_ticks
      dm_os_tasks parent_task_address
      dm_os_worker_local_storage broker_address
      dm_repl_schemas re_colattr
      dm_repl_traninfo begin_time
      dm_repl_traninfo commit_time
      dm_repl_traninfo error_count
      dm_repl_traninfo is_known_cdc_tran
      dm_repl_traninfo session_id
      dm_repl_traninfo session_phase
      dm_tran_active_transactions filestream_transaction_id

    • List of SQL Server 2005 post-SP2 builds

      For years, ever since I started posting SQL Server 2000 build lists (and later, SQL Server 2005 build lists) over at aspfaq.com, I have been complaining that Microsoft makes it very hard for us to stay on top of these things, and suggested on numerous occasions that they publish a list of builds that have been released, even if only through support cases.

      Well, it looks like they're finally listening.  Here is a list of all of the builds that have been released for SQL Server 2005, since service pack 2:

      http://support.microsoft.com/kb/937137/

      Now what I would suggest next is that whenever they create a summary KB article like this, they place it somewhere prominent (e.g. on the service pack x download page).  This way, people don't have to know to take it upon themselves to go search for post-SP builds that will fix any unexpected problems that the service pack itself introduces (I'll give you a hint: sounds like "tainted Jets fans").

      So, people who have been painstakingly keeping track of builds manually (myself on this blog, and others like sqlserver-qa.net, it seems like we have much less work cut out for us in the future.  The main problem is that it's impossible to keep updating our own blog entries or reference sites, especially during periods of high activity for the build team (currently we're not in one of those).  Hopefully they will keep improvin