|
|
|
|
Aaron is a senior consultant for SQL Sentry, Inc., makers of performance monitoring and event management software for SQL Server, Analysis Services, and Windows. He has been blogging here at sqlblog.com since 2006, focusing on manageability, performance, and new features; has been a Microsoft MVP since 1997; tweets as @AaronBertrand; and speaks frequently at user group meetings and SQL Saturday events.
-
This time around, I've got five wish list items for T-SQL / DDL, and then one bonus - a parallelism bug. My thinking on the former is that, since we are either about to start (or are just finishing) locking down the code for SQL Server 2012, this is the first moment where some of these suggestions can be seriously considered without the "but we're about to ship" excuse. :-) The last one is just an incorrect results bug that needs to be fixed. I'll start with the T-SQL / DDL wishes:
Ordered Set Functions
For a while I've been pretty vocal about getting grouped concatenation into SQL Server - functionality that exists in MySQL (GROUP_CONCAT) and more recently added to Oracle 11gR2 (LISTAGG). (And here I thought we were the only pencil-necks using an R2 moniker.) I've often promoted this Connect item when the question comes up on forums or #sqlhelp. This can be solved by a grander implementation - ordered set functions. This comes directly from the standard and supports grouped concatenation (with predictable ordering, unlike many of the workarounds that exist today, such as FOR XML PATH), and a slew of other functionality. The item you should vote for is one that Itzik Ben-Gan just raised this week (and please see this document for more in-depth background):
Connect #728969
Create or Replace
I'm pretty sure I've pimped this one before, but in the interest of driving up items that might make the T-SQL / DDL feature list for the *next* version of SQL Server... we should have the ability to specify in the DDL that we want to create this thing if it doesn't exist, and alter if it does. This avoids the need to re-script permissions, worry about breaking existing dependencies, or build complicated batch logic with a stub create, dynamic SQL, etc. I would gladly take this for simple modules only (procedures, triggers, functions, non-indexed views) as the logic around completely re-designing, say, a table can quickly become more complex.
Connect #127219
Native Regular Expression Support
I would love to see RegEx as a first class citizen, and at least in V1 don't expect this to be fully optimized the way LIKE can be. I just get so sick of writing complex and wordy expressions using PATINDEX / SUBSTRING / STUFF / REVERSE while switching between 0-based and 1-based in my head. And I . Since other CLR functionality is slowly making its way into the top T-SQL tier (spatial, FORMAT(), TRY_PARSE(), etc), I think it's high time we can perform RegEx as well without each of us having to write our own complicated assemblies, deploying and maintaining them, all that assuming we can assure the boss or the IT guys that it's ok to use CLR.
Connect #261342
Create Table As
It would be really nice to have DDL that just says "create a table, but make it a copy of that other table." Today it is easy enough to do this using the tools but to do it in an automated way of any kind is foolhardy - the complexity of the code to generate a CREATE TABLE statement when anything complex is involved can be astounding - consider dependencies, constraints, keys, permissions, triggers, computed columns, indexes, etc. Some folks recommend PowerShell for this but I just think you are trading a different syntax for the same complex code.
Connect #124506
Dynamic Pivot
The PIVOT and UNPIVOT commands as they exist today expect that you know all of the potential values up front. What if you are feeding a report that is more than happy to accept as many columns as you're willing to provide? What ends up happening is we use really ugly workarounds with dynamic SQL after scanning the table and seeing all the potential pivot values (year-month combinations, let's say) that may be represented. I'd like to see this be a little more reactive to the data returned by the query, though it does step into the land of unstructured data at least a little bit...
Connect #127071
...and one of these things is not like the others...
Another parallelism bug
While at least one of the related bugs has recently been fixed, I've previously talked about a few potential perils of parallelism
(say that three times fast!). Last week at the MVP Summit I encountered
an unfortunate scenario in SQL Server 2012, where incorrect and in fact
unpredictable results can come out of using SUM() OVER() with the new
windowing functions when combined with parallelism. There are some
decent workarounds but they're not very intuitive (and it may be
difficult to notice that you're affected).
Connect #728932
|
-
Microsoft recently released Windows Server "8" Beta for download. When you launch setup you'll be asked if you want to install Server Core or Server with a GUI:
While I will definitely be leaning toward Server Core for production, I chose the GUI installation for now just to feel my way around for a bit on a local VM. Also because it would be tough to demo all of SQL Server 2012's features from a Server Core VM.
Next during setup you will be asked if you want to upgrade or create a new installation. The upgrade is pre-selected, so be careful about pressing Enter here since - if you are installing on a VM - there is unlikely to be a pre-existing version of Windows on the VHD you just finished allocating:
From there installation itself was pretty simple, a lot like previous versions of Windows. The slowest part was "Expanding files" - I'm on SSD so I can't even imagine how slow this is going to seem for you platter folks. Just a little warning. Though I did experience a snafu when installing VMWare Tools (I run VMWare Fusion) - the VM froze up. I've also heard about some issues with the VMWare Tools for VMWare Workstation. So, depending on your virtualization platform, your mileage may vary.
Two things needed to be changed immediately. One was the resolution - the default is 1024 x 768 and this is horribly unusable. Once I was up to 1680x1050 (right-click the desktop and choose "Screen Resolution"), I wanted my Start Menu back. So I ran the following command provided in Mikael Nystrom's blog post (there is also a PowerShell version):
reg.exe add HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer /v RPEnabled /d 0 /t REG_DWORD /f |
Sadly, these tricks were written for the Consumer Preview, but they do not seem to work for server. If you've enbled the Start Menu on server, please let me know how you did it and I will update this post (and my VM).
One other thing I wanted to change was file extensions. I tried to create a .ps1 script on my desktop and it was actually called .ps1.text. This is one of those "let's cater to dummies" features that I've always hated, and lack of disciplined extensions is actually one of the things I like less about Mac OS. Anyway Windows Explorer now has a ribbon, and some of these settings are much easier to get to compared to the old Tools / Folder Options path:
Once Windows Server 8 was up and running (almost) the way I like, it was time to install SQL Server 2012 RC0. Remembering my trials with an earlier preview of Windows 8, and after reading Allan Hirt's blog post, I knew that I would have to manually install .NET Framework 3.5 in order to install SQL Server 2012. Allan outlines how to do this with the command line:
dism /online /enable-feature /featurename:NetFx3 /source:d:\sources\sxs\ |
I did this, and it reported success:
However, after restarting the system, when I went into the Server Roles and Features wizard, the .NET 3.5 feature was still not enabled. So I stepped through the wizard to add the feature, and rebooted again.
Once I rebooted, I validated that .NET 3.5 was correctly installed according to Server Manager. I went through the SQL Server setup and, once I had made all of my feature and other configuration selections, it took about 12 minutes to install SQL Server. When it was finished, it had placed a whole slew of tiles onto the Metro start page:

So, a lot of cleanup necessary here. But I have SQL Server, Management Studio and I can start to play:
|
-
A long, long time ago in a galaxy you can reach out and touch right now, ORDER BY in a VIEW meant something. If you had a view like the following:
CREATE VIEW dbo.Rubbish AS SELECT TOP 100 PERCENT Trash FROM dbo.Garbage ORDER BY Trash;
|
And then you ran a query such as:
SELECT Trash FROM dbo.Rubbish;
|
You could reliably expect the rows to come back ordered by Trash in ascending order, even though it is relatively clear that the ORDER BY here serves a completely different and overloaded purpose - to dictate which rows should be filtered by TOP.
When SQL Server 2005 came around, changes to the optimizer meant that this behavior - which was never documented, by the way, hence never guaranteed - ceased to work as reliably as it had in previous versions. In fact in this case the optimizer sees TOP 100 PERCENT and simply throws out both TOP and ORDER BY. Compare the differences in the plan from the view with and without the ORDER BY:'
Notice how there is no sort operator in the query without ORDER BY? Because we didn't tell SQL Server what order we want, it's free to make its own choice. In this case it is choosing a clustered index scan which is not based on the Trash column. You can the end result of the difference in these plans by comparing the results from the select:
Obviously only the latter result honors the outer ORDER BY, and in fact both cases have discarded the inner TOP/ORDER BY combination.
People are surprised by this change, and in fact variations on the symptom come up in forum discussions all the time. Further to the confusion, that still exists today, early on there were some very vocal customers who screamed bloody murder and were very stubborn about changing their code to put the ORDER BY on the outermost query where it belongs. After upgrading to SQL Server 2005 they cried foul until Microsoft finally submitted - creating a trace flag that forced the optimizer to behave the old way. You can read about this in . Note that this trace flag is only valid in SQL Server 2005 and SQL Server 2008 - in SQL Server 2008 R2 and SQL Server 2012, the trace flag seems to be a no-op (at least in every case that I've tried). Even if you are still on 2005/2008, I highly recommend using this option as a last resort only, because it changes the way the optimizer works and this can have other undesirable impacts on your entire workload.
Hopefully none of this is news, as it has been discussed at length both online in general, and more specifically on Connect. Where it gets more interesting is that there have been several bugs reported that tools within Management Studio, such as the view designer, actually encourage this behavior and help perpetuate the myth. The issue I noticed this weekend is that, while most bugs have been closed as won't fix, one (#249248) has been closed as fixed. So I immediately fired up the latest build of Management Studio I have, right-clicked the views node, and selected New View. Once I added a table, what I saw was not very promising, as I was able to choose Sort Order and Sort Type:
So Management Studio for SQL Server 2012 still allows me to specify a sort order and sort direction for any column in the view! And what does it do to the SQL? It injects TOP 100 PERCENT / ORDER BY! Bad, bad, bad!
On the plus side, if I try to *save* this view, I at least get some kind of warning:
For Google/Bing-fu:
|
Warning: The ORDER BY clause is used only to determine the rows that
are returned by the TOP clause in the view definition. The ORDER BY
clause does not guarantee ordered results when the view is queried,
unless ORDER BY is also specified in the query itself. Click CANCEL
to discard your modifications. Click OK to save the view. |
It's great that they're finally trying to stop people from believing that ORDER BY in a view means anything about ordering the result. But the problem that's being overlooked here is that not everyone uses the view designer to create and save a view in a single session. They may be generating code for a view that they're going to cut & paste into a CREATE/ALTER view script elsewhere, to use in a CTE or derived query in a more complicated script, or to send to a colleague. In a lot of cases they are never going to see this warning, because this won't magically pop up when they try to use the syntax in other scenarios. So I left a comment on the Connect item with a little detail about that, because I'm unfortunately going to have to disagree that this issue has been "fixed." I'm not sure why the Sort options in the view designer can't just be hidden without changing the behavior of the tool in general - can't it just always behave as if I hadn't changed those options at all? I do acknowledge that there are some edge cases here, where someone really does want to create a view that selects the top (n) rows ordered by some criteria. So perhaps my suggestion to remove the columns altogether is a bit harsh. But when I choose a Sort Order for one or more columns, it should not just throw TOP 100 PERCENT into my query; the query that's produced at this point shouldn't be valid. Instead it should prompt me with a similar warning to what I get now when I click Save, with additional wording that implies, "If you really do mean to take some subset of the rows in the table, please add a TOP (n) clause to indicate the number or percentage of rows you want to return."
Agree with me? Please go comment and vote!
|
-
I read a statement on a forum recently that said something like:
|
IDENTITY columns are the primary key, and primary keys are clustered. |
Wow. My first thought was, "I hope people aren't learning from this." Unfortunately, I suspect a few already have, as no matter what you write on the Internet or say in passing, at least one person is going to believe you. I suspect the statement above came from someone's opinion, based on seeing a handful of tables in their life, which always had an IDENTITY column that was also the PK. While you may be able to attribute this to correlation (e.g. there are many IDENTITY columns that are also the clustered primary key), this is not necessarily causality (e.g. they are the clustered primary key because they are the IDENTITY columns, or vice versa). Besides, it is not a universally true statement anyway (since there are many variations possible where an IDENTITY column is not the primary key, where the primary key is not clustered, and so on).
This is just an exercise of how to dispel such a myth, and battle the blatant disregard for how someone's limited experience might differ from the true spectrum of reality, since they may need to be convinced with actual examples. (As an aside, I remember several years ago when I interviewed for a position at Microsoft, where the interviewer told me that you couldn't create a self-referencing foreign key. I still laugh about it that he refused to believe me until I grabbed his laptop and showed him. He wasn't the reason I didn't take the job, by the way.)
I'm not trying to be mean here, but statements intended to be perceived as fact should not be made so casually and without any background in the topic. And I've talked here before about why generalizations are dangerous - even when they're not about issues that can easily be proven one way or another, or where those generalizations appear to be true in all the cases you've seen. Whenever you hear someone say "always" or "never" when it comes to SQL Server, you should at least have your tin foil hat within reach, because it's possible someone is trying to brain-wash you. There are very few scenarios that are absolute, very few statements that are 100% true 100% of the time, and a large majority of statements can and should be suffixed with, "...but it depends on x, y and z." I know how much everyone hates "it depends" but, quite frankly, it almost always does.
Anyway, back to the specific issue at hand. Let's examine the example this person mentioned, and a few counter-examples:
A table with a clustered primary key on the identity column:
CREATE TABLE dbo.foo ( ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, name NVARCHAR(32) -- ,... other columns ); |
You can even leave out the CLUSTERED keyword, as the default implementation of a primary key in SQL Server is CLUSTERED:
CREATE TABLE dbo.foo ( ID INT IDENTITY(1,1) PRIMARY KEY, name NVARCHAR(32) -- ,... other columns ); |
So this is the table the person above based their conclusion. But what about tables that violate his definition?
A table with an IDENTITY column but no primary key or clustered index:
CREATE TABLE dbo.foo ( ID INT IDENTITY(1,1), name NVARCHAR(32) -- ,... other columns ); |
A table with an IDENTITY column but the primary key is on another column and is clustered:
CREATE TABLE dbo.foo ( ID INT IDENTITY(1,1), name NVARCHAR(32) PRIMARY KEY --CLUSTERED -- ,... other columns ); |
A table with an IDENTITY column but the primary key is on another column and the primary key is not clustered:
CREATE TABLE dbo.foo ( ID INT IDENTITY(1,1), name NVARCHAR(32) PRIMARY KEY NONCLUSTERED -- ,... other columns ); |
A table with an IDENTITY column represented by a unique clustered index, but the primary key is on another column:
CREATE TABLE dbo.foo ( ID INT IDENTITY(1,1), name NVARCHAR(32) PRIMARY KEY NONCLUSTERED -- ,... other columns ); CREATE UNIQUE CLUSTERED INDEX x ON dbo.foo(ID); |
So clearly there are plenty of variants where IDENTITY columns are not necessarily the clustered primary key.
Conclusion
If you want your IDENTITY column to be the primary key, say so. If you want the primary key to be clustered, say so. But most importantly, understand that just because you see something in one place, don't assume that it is always true, or that it is the rule you have to follow - even if you've only ever seen it to be true. Some other areas where this comes up: the "natural order" debate (where people assume that because some query without an ORDER BY returns in a specific order today, that it will continue to order that way tomorrow) and the "short-circuit" debate (where people believe that SQL Server will always process your query in the order it was written). I hope I've dispelled both of those myths in a previous Bad Habits topic and the short-circuiting one more recently in this question on dba.stackexchange; but if you believe any of these are universally true, please feel free to leave a comment. :-)
|
-
Last night Microsoft released two new cumulative updates for SQL Server 2008 R2: Cumulative Update 5 for SQL Server 2008 R2 SP1
My recommendation: You should test and plan for this update if SELECT @@VERSION is between 10.50.2500 and 10.50.2805.
Cumulative Update 12 for SQL Server 2008 R2 (RTM)
- KB #2659692
- Build 10.50.1810
- 3 fixes but none are directly related to the relational engine.
My recommendation: Consider applying if SELECT @@VERSION is between 10.50.1600 and 10.50.1809, you are affected by any of the issues highlighted in the KB article,
and you are avoiding Service Pack 1 intentionally (otherwise you should
install SP1 and CU5 for SP1).
My usual disclaimer: these updates are NOT for SQL Server 2008. Only apply to systems where @@VERSION returns 10.50.xxxx.
|
-
During SQL Server 2012 RC0 setup (specifically when upgrading), you may have noticed upgrade rules regarding Lock Pages in Memory (LPIM):

However, for most folks, these rules always pass, whether or not they are actually using LPIM. I wanted to run a few tests to demonstrate why this is - or at least in which situations the rule checks will fail. So I created two virtual machines running SQL Server 2008 R2 SP1 CU3 - one running Windows Server 2008 SP2 (x86), the other running Windows Server 2008 R2 SP1 (x64 obviously). Both machines have 4 CPUs and 8GB of RAM, and have been updated with all Windows Updates (except Internet Explorer) as of the time of writing. I set up SQL Server to run as an explicit user account (to make it easy to assign LPIM rights for the appropriate tests).
I will attempt to upgrade SQL Server under various scenarios (LPIM / AWE on x86, and LPIM / trace flag 845 on x64 - required for LPIM on Standard Edition). All of these tests will be run with LPIM enabled, but other settings will vary as follows:
On the 2008 box, to make sure that AWE can be used by SQL Server, you may first need to set this at the operating system level - this depends on the specific operating system, whether DEP is enabled, amount of RAM and other factors. It used to be quite easy - go into boot.ini, add the /PAE switch, and reboot. Starting with Windows Server 2008, this changed to use the command-line utility bcdedit. To ensure that AWE and PAE are explicitly set on this specific system (x86 box with 8GB RAM), you could run the following command at an elevated command prompt:
bcdedit /set pae ForceEnable |
Of course we don't have to do any of this on the x64 box - and in my case I didn't need to do this on the x86 box, either. But the matrix for the above can become quite complicated, and is just one more reason to get off of x86 - both Windows and SQL Server - as soon as you can! (Never mind that AWE is no longer supported in SQL Server 2012, nor is a cluster on WoW.)
Next, to set Lock Pages in Memory, we need to launch the Local Group Policy Editor (Start > Run > gpedit.msc) and add the SQL Server service account to Computer Configuration > Windows Settings > Security Settings > Local Policies > User Rights Assignment > Lock pages in memory:

And then set SQL Server to run as the matching account so that it inherits the LPIM rights. You can always change this later, but for a one-off test like this it can be just as easy to specify the service account during setup:
We can then enable trace flag 845 using SQL Server Configuration Manager - right click the appropriate service, choose Properties, and on the Advanced tab, add -T845 to the Startup Parameters list (make sure there is no space between the last ; and -T845):

Note that this is much easier with SQL Server 2012 client tools installed, even against older versions of SQL Server, as there is a new Startup Parameters tab to help you avoid messy typing inside that little combo box:
And finally, to enable SQL Server to utilize AWE on the x86 box, you can use the UI through Management Studio, or you can run the following code using sp_configure:
EXEC sp_configure 'show advanced options', 1; RECONFIGURE WITH OVERRIDE; GO
EXEC sp_configure 'awe enabled', 1; RECONFIGURE WITH OVERRIDE; GO
EXEC sp_configure 'show advanced options', 0; RECONFIGURE WITH OVERRIDE; GO |
(This is all just for testing purposes, but of course I need to point out that you wouldn't just enable AWE or LPIM without also adjusting min/max server memory appropriately and without proper consideration into what other services on the box will need memory.)
Restart each box and we should be ready for the first tests. We can verify LPIM / AWE via the following entries the SQL Server error log after restart:
For x64, there will be a line item that says "Using locked pages for buffer pool.":

When LPIM is not enabled (e.g. when the user rights assignment has not been set or the trace flag is not in use), the line above does not appear.
For x86, a message like "Using locked pages for buffer
pool." will not appear in the error log, even though LPIM is enabled in
this case. Instead you will see a line item about AWE, "Address Windowing Extensions is enabled":

On
x86 if LPIM is not enabled (e.g. if you haven't used the trace flag on
Standard Edition, have not enabled AWE, or have not set the user rights correctly), you will see
different error messages in the log, such as "SQL Server is not configured to use all of the available system memory. To enable SQL Server to use more memory, set the awe enabled option to 1 by using the sp_configure stored procedure." and "Addressing Windowing Extensions (AWE) requires the 'lock pages in memory' privilege which is not currently present in the access token of the process.":
One issue you might come across is the case where you've triple-checked everything - you know you've given the service account LPIM rights in the Group Policy editor, you've set the trace flag and you've restarted SQL Server, but the error log still seems to indicate that LPIM is not set. Please verify that you've entered the trace flag argument correctly. The following two entries are not the same, and only the first is valid:
...\mastlog.ldf;-T845; ...\mastlog.ldf; -T845; ----------------^ this space may make you scream |
Once you've verified that LPIM is up and running, let's go through the first few steps of SQL Server 2012 RC0 setup and see how the rule checks pan out.
32-bit systems
- AWE and trace flag 845 both enabled
If you have AWE enabled and the trace flag turned on, the AWE check fails:
|
--------------------------- Rule Check Result --------------------------- Rule "Is AWE Enabled Check for x86 installations" failed.
AWE is currently enabled on a 32-bit instance of SQL Server. This feature is no longer supported. You must disable AWE before upgrading the instance. --------------------------- OK ---------------------------
|
- AWE disabled, trace flag 845 enabled
If you disable AWE and restart the service, you will see the above message about using sp_configure in the error log, but both the AWE and the LPIM checks pass. This is because you didn't already have LPIM enabled, since AWE was not enabled. Even though you might have thought that you were using LPIM prior to the upgrade:
|
--------------------------- Rule Check Result --------------------------- Rule "LPIM Check for x86 installations" passed.
Lock Pages In Memory (LPIM) check for X86 succeeded. --------------------------- OK ---------------------------
|
- AWE enabled, trace flag 845 disabled
If you re-enable AWE and then turn off the 845 trace flag, you won't see any error messages in the error log, except for the "AWE is enabled" message. And you will see the same AWE rule fail upgrade you'll see if you have the trace flag turned on:

|
---------------------------
Rule Check Result
---------------------------
Rule "Is AWE Enabled Check for x86 installations" failed.
AWE
is currently enabled on a 32-bit instance of SQL Server. This feature
is no longer supported. You must disable AWE before upgrading the
instance.
---------------------------
OK
---------------------------
|
- AWE and trace flag 845 both disabled
If you then disable AWE and leave the trace flag disabled, you get the message in the log about not using all of the available memory, but again, like in case 2., both the AWE and LPIM upgrade rule checks pass:
|
---------------------------
Rule Check Result
---------------------------
Rule "LPIM Check for x86 installations" passed.
Lock Pages In Memory (LPIM) check for X86 succeeded.
---------------------------
OK
---------------------------
|
So for x86, I found no way to trigger an upgrade rule failure for the specific "LPIM Check for x86 installations" using RC0, but perhaps by RTM, this rule will also fail in addition to the AWE check in cases 1. and 3. above. In any case, I don't think it's that big of a deal since, if you are running x86, you won't be able to take advantage of AWE or LPIM anyway (more on that below). I think during the upgrade process you should be warned about this on any x86 machine with more than 4GB of RAM, regardless of your current AWE settings or trace flags. Not everyone is going to read the discontinued engine features topic or attend one of my "What's New in SQL Server 2012?" presentations, and they may not realize later that when they change those settings they are not going to have any effect.
Note that in SQL Server 2012, AWE is no longer supported. So it may very
well be the case that you were expecting your AWE settings (in
sp_configure, not in boot.ini) and LPIM to play well together after
upgrading to SQL Server 2012, but this is not the case - SQL Server will
no longer be able to see all of your memory, and this may have
significant impact on your workload. Of course if you have a machine
with 16GB of RAM and suddenly SQL Server will only see about 4GB, you
might not be too impressed with the "upgrade" - hence the warnings. Note
also that your experience with the RCs of SQL Server 2012 (all of which
are Evaluation Edition) may yield different results than what will
happen at RTM time (when you will be able to use a license key to
specify "normal" editions like Standard and perform an actual end-to-end
upgrade).
64-bit systems
- Trace flag 845 enabled
As expected, with both LPIM and TF845 enabled, the upgrade rule "LPIM Check for x64 Installations" passes (even though the "Using locked pages in memory" message does not appear in the error log). There is nothing to warn the user about because LPIM will continue to be honored:
|
---------------------------
Rule Check Result
---------------------------
Rule "LPIM Check for x64 installations" passed.
Lock Pages In Memory (LPIM) check for X64 succeeded.
---------------------------
OK
---------------------------
|
- Trace flag 845 disabled
If we restart SQL Server without the trace flag, we no longer see the "Using locked pages for buffer pool" message in the error log, but the LPIM check still succeeds - even though we're on Standard Edition and the trace flag is not set (which is the reason behind the warning):
|
--------------------------- Rule Check Result --------------------------- Rule "LPIM Check for x64 installations" passed.
Lock Pages In Memory (LPIM) check for X64 succeeded. --------------------------- OK ---------------------------
|
This last case is the most concerning to me. The point of the warning is to let me know that I have the LPIM permission set for the service account, but that it won't be honored without the trace flag because I'm running Standard Edition. In this specific case, I think the LPIM upgrade rule should fail, even though my current configuration also prevents me from using LPIM.
Conclusion
I quite sincerely hope this is my last 32-bit related blog post. For one, the x86 installation of SQL Server took at least five times longer than the x64 equivalent. x86 RIP!
In any case, this is just one way Microsoft is helping to prevent customers from shooting off their own feet. Especially in the x86 case where their current workloads are dependent upon AWE; it needs to be clear to them that after the upgrade to SQL Server 2012, their configuration isn't what they think it is...
|
-
I see a lot of people suggest while loops instead of cursors in situations where row-based processing is required (or, at least, where folks think that row-based processing is required). Sometimes the justification is that constructing a while loop is simpler and more straightforward than constructing a cursor. Others suggest that a while loop is faster than a cursor because, well, it isn't a cursor. Of course the underlying mechanics still represent a cursor, it's just not explicitly stated that way using DECLARE CURSOR.
The difficulty of writing a piece of code should not be the primary factor in avoiding that type of code. I use MERGE - while I will likely never memorize that syntax, I know it is safer and less likely to cause deadlocks than typical "UPSERT" methodologies. Looking up the syntax diagram in Books Online, or using templates or snippets, are good workarounds to having to avoid coding constructs because they're "too hard." The same is true for cursors - sure the syntax is cumbersome, but templates or snippets can do most of the work for you.
Besides, is it that much simpler to set up and use a WHILE loop? And is it really that much more efficient?
Simplicity
I asked a few colleagues to take this code sample and change it from an explicit cursor to a WHILE loop:
DECLARE @schema_name SYSNAME, @object_name SYSNAME, @index_name SYSNAME @s NVARCHAR(MAX) = N'';
DECLARE indexes CURSOR LOCAL STATIC FORWARD_ONLY READ_ONLY FOR SELECT s = OBJECT_SCHEMA_NAME(o.[object_id]), o = o.name, i = i.name FROM sys.objects AS o INNER JOIN sys.indexes AS i ON o.[object_id] = i.[object_id] INNER JOIN ( SELECT [object_id], index_id, row_count = SUM(row_count) FROM sys.dm_db_partition_stats GROUP BY [object_id], index_id ) AS s ON o.[object_id] = s.[object_id] AND i.index_id = s.index_id WHERE o.is_ms_shipped = 0 AND i.index_id >= 1 ORDER BY s.row_count DESC, s, o, i;
OPEN indexes;
FETCH NEXT FROM indexes INTO @schema_name, @object_name, @index_name;
WHILE @@FETCH_STATUS = 0 BEGIN -- we're just concatenating here, but pretend we needed to, -- say, call a stored procedure for each row in the cursor:
SET @s += CHAR(13) + CHAR(10) + N'ALTER INDEX ' + QUOTENAME(@index_name) + ' ON ' + QUOTENAME(@schema_name) + '.' + QUOTENAME(@object_name) + ' REORGANIZE;';
FETCH NEXT FROM indexes INTO @schema_name, @object_name, @index_name; END
CLOSE indexes; DEALLOCATE indexes;
|
Before you crucify me about the code sample, or tell me that I could do this without a cursor or a while loop, please understand that I wasn't really trying to solve this problem, I was only trying to come up with a simple string-building exercise that can be accomplished with a cursor. I know that a cursor isn't required to just return a concatenated string, for example you can use FOR XML PATH:
DECLARE @s NVARCHAR(MAX);
SELECT @s = ( SELECT CHAR(13) + CHAR(10) + 'ALTER INDEX ' + QUOTENAME(i) + ' ON ' + QUOTENAME(s) + '.' + QUOTENAME(o) + ' REORGANIZE;' FROM ( SELECT TOP (1000000) s = OBJECT_SCHEMA_NAME(o.[object_id]), o = o.name, i = i.name FROM sys.objects AS o INNER JOIN sys.indexes AS i ON o.[object_id] = i.[object_id] INNER JOIN ( SELECT [object_id], index_id, row_count = SUM(row_count) FROM sys.dm_db_partition_stats GROUP BY [object_id], index_id ) AS s ON o.[object_id] = s.[object_id] AND i.index_id = s.index_id WHERE o.is_ms_shipped = 0 AND i.index_id >= 1 ORDER BY s.row_count DESC, s, o, i ) AS x FOR XML PATH(''), TYPE ).value('.[1]', 'NVARCHAR(MAX)');
|
Or even simpler concatenation:
DECLARE @s NVARCHAR(MAX) = N'';
SELECT @s += CHAR(13) + CHAR(10) + 'ALTER INDEX ' + QUOTENAME(i) + ' ON ' + QUOTENAME(s) + '.' + QUOTENAME(o) + ' REORGANIZE;' FROM ( SELECT TOP (1000000) s = OBJECT_SCHEMA_NAME(o.[object_id]), o = o.name, i = i.name FROM sys.objects AS o INNER JOIN sys.indexes AS i ON o.[object_id] = i.[object_id] INNER JOIN ( SELECT [object_id], index_id, row_count = SUM(row_count) FROM sys.dm_db_partition_stats GROUP BY [object_id], index_id ) AS s ON o.[object_id] = s.[object_id] AND i.index_id = s.index_id WHERE o.is_ms_shipped = 0 AND i.index_id >= 1 ORDER BY s.row_count DESC, s, o, i ) AS x;
|
(While it's true that these alternatives don't use an explicit cursor or while loop, so may appear to be "set-based," it is important to note that in these approaches the actual output order is not guaranteed. And if you think this is accomplished under the hood without a cursor-like operation, you're only fooling yourself. :-))
Again, the point of the exercise was not to eliminate the cursor, but to see if a simpler, more straightforward while loop could achieve the same result. One of the complications of the desired result is that I want the resulting output to order the indexes by row_count descending. So there is no unique, incrementing column or index to take advantage of, like you might be able to do if you have an IDENTITY column and you don't care about order otherwise. So this was a potential wrench I threw in intentionally because, in my experience, you will often care about order and you won't always have the luxury of a column inherent in the data set that supports the desired order of processing.
I'll show the four while loops that were submitted. They're all quite similar, using either #temp tables or @table variables to hold the intermediate data set before looping. I'm not going to reveal who submitted these approaches, but rather simply label them Colleague #1, #2, #3 and #4 (both to keep them straight and to protect the innocent). I hope they aren't offended that I took a few liberties with their syntax (mostly indenting and statement termination). These are all fairly similar constructs, and in fact I received a 5th contribution just prior to publishing - I didn't include it because it didn't have any elements not present below.
Colleague #1
DECLARE @indexes TABLE ( s SYSNAME, o SYSNAME, i SYSNAME, c BIGINT, rn INT );
INSERT @indexes(s, o, i, c, rn) SELECT s, o, i, c, rn = ROW_NUMBER() OVER (ORDER BY c DESC, s, o, i) FROM ( SELECT s = OBJECT_SCHEMA_NAME(o.[object_id]), o = o.name, i = i.name, c = s.row_count FROM sys.objects AS o INNER JOIN sys.indexes AS i ON o.[object_id] = i.[object_id] INNER JOIN ( SELECT [object_id], index_id, row_count = SUM(row_count) FROM sys.dm_db_partition_stats GROUP BY [object_id], index_id ) AS s ON o.[object_id] = s.[object_id] AND i.index_id = s.index_id WHERE o.is_ms_shipped = 0 AND i.index_id >= 1 ) AS x;
DECLARE @min INT, @max INT, @cur INT = 0, @s NVARCHAR(MAX) = N'';
SELECT @min = MIN(rn), @max = MAX(rn) FROM @indexes;
WHILE (@cur < @max) BEGIN SET @cur += 1;
SELECT @s += CHAR(13) + CHAR(10) + 'ALTER INDEX ' + QUOTENAME(i) + ' ON ' + QUOTENAME(s) + '.' + QUOTENAME(o) + ' REORGANIZE;' FROM @indexes WHERE rn = @cur; END
|
Colleague #2
SELECT RowID = ROW_NUMBER() OVER (ORDER BY s.row_count DESC, OBJECT_SCHEMA_NAME(o.[object_id]), o.name, i.name), s = OBJECT_SCHEMA_NAME(o.[object_id]), o = o.name, i = i.name INTO #Temp FROM sys.objects AS o INNER JOIN sys.indexes AS i ON o.[object_id] = i.[object_id] INNER JOIN ( SELECT [object_id], index_id, row_count = SUM(row_count) FROM sys.dm_db_partition_stats GROUP BY [object_id], index_id ) AS s ON o.[object_id] = s.[object_id] AND i.index_id = s.index_id WHERE o.is_ms_shipped = 0 AND i.index_id >= 1 ORDER BY s.row_count DESC, s, o, i;
DECLARE @CurrentRowID INT, @s NVARCHAR(MAX) = N'';
SELECT @CurrentRowID = MIN(RowID) FROM #Temp;
WHILE @CurrentRowID IS NOT NULL BEGIN SELECT @s += CHAR(13) + CHAR(10) + N'ALTER INDEX ' + QUOTENAME(i) + ' ON ' + QUOTENAME(s) + '.' + QUOTENAME(o) + ' REORGANIZE;' FROM #Temp WHERE RowID = @CurrentRowID;
SELECT @CurrentRowID = MIN(RowID) FROM #Temp WHERE RowID > @CurrentRowID; END
DROP TABLE #Temp;
|
Colleague #3
DECLARE @Temp TABLE ( RowID INT IDENTITY PRIMARY KEY, SchemaName SYSNAME, ObjectName SYSNAME, IndexName SYSNAME );
INSERT INTO @Temp (SchemaName, ObjectName, IndexName) SELECT s = OBJECT_SCHEMA_NAME(o.[object_id]), o = o.name, i = i.name FROM sys.objects AS o INNER JOIN sys.indexes AS i ON o.[object_id] = i.[object_id] INNER JOIN ( SELECT [object_id], index_id, row_count = SUM(row_count) FROM sys.dm_db_partition_stats GROUP BY [object_id], index_id ) AS s ON o.[object_id] = s.[object_id] AND i.index_id = s.index_id WHERE o.is_ms_shipped = 0 AND i.index_id >= 1 ORDER BY s.row_count DESC, s, o, i;
DECLARE @CurrentRowID INT, @s NVARCHAR(MAX) = N'';
SELECT @CurrentRowID = MIN(RowID) FROM @Temp;
WHILE @CurrentRowID IS NOT NULL BEGIN SELECT @s += CHAR(13) + CHAR(10) + N'ALTER INDEX ' + QUOTENAME(IndexName) + ' ON ' + QUOTENAME(SchemaName) + '.' + QUOTENAME(ObjectName) + ' REORGANIZE;' FROM @Temp WHERE RowID = @CurrentRowID;
SELECT @CurrentRowID = MIN(RowID) FROM @Temp WHERE RowID > @CurrentRowID; END
|
Colleague #4
DECLARE @Row INT = 1, @s NVARCHAR(MAX) = N'';
CREATE TABLE #IndexList ( RowID INT IDENTITY(1,1) ,SchemaName SYSNAME ,TableName SYSNAME ,IndexName SYSNAME );
INSERT INTO #IndexList ( SchemaName , TableName , IndexName ) SELECT s = OBJECT_SCHEMA_NAME(o.[object_id]), o = o.name, i = i.name FROM sys.objects AS o INNER JOIN sys.indexes AS i ON o.[object_id] = i.[object_id] INNER JOIN ( SELECT [object_id], index_id, row_count = SUM(row_count) FROM sys.dm_db_partition_stats GROUP BY [object_id], index_id ) AS s ON o.[object_id] = s.[object_id] AND i.index_id = s.index_id WHERE o.is_ms_shipped = 0 AND i.index_id >= 1 ORDER BY s.row_count DESC, s, o, i;
WHILE @Row <= (SELECT MAX(RowID) FROM #IndexList) BEGIN SELECT @s += CHAR(13) + CHAR(10) + N'ALTER INDEX ' + QUOTENAME(IndexName) + ' ON ' + QUOTENAME(SchemaName) + '.' + QUOTENAME(TableName) + ' REORGANIZE;' FROM #IndexList WHERE RowID = @Row;
SET @Row = @Row + 1; END
DROP TABLE #IndexList;
|
I validated that the resulting value of @s matched the cursor result in all four cases. I don't know about you, and maybe I'm just being dense, but I had to
think a lot harder about that than I would have with a cursor,
cumbersome syntax and all. As far as I'm concerned, based on these four submissions from esteemed colleagues, constructing a while loop to solve this problem isn't any easier or more straightforward than a cursor. Especially since a DECLARE CURSOR block can easily be constructed for you using Management Studio's templates (or the new snippets feature in SQL Server 2012).
But what about performance? Surely a cursor is slower than a while loop, because it's a cursor, right?
Performance
As I mentioned in a previous article, the cursor usually gets a bad rap for two reasons: (1) folks think a cursor is bad simply because it says DECLARE CURSOR, and (2) people use a cursor (or any row-by-row processing methodology) when it isn't necessary. In cases where it is the best option (e.g. running totals) or it is necessary (e.g. maintenance tasks, or cases where a stored procedure call must be made for every row or conditionally depend on processing that occurred for the previous row), people just say DECLARE c CURSOR FOR ... when they should usually be using the most efficient cursor declaration possible (LOCAL STATIC FORWARD_ONLY READ_ONLY). And yes, some will argue that FORWARD_ONLY READ_ONLY could be replaced by FAST_FORWARD, but I prefer the former, because the latter can't be combined with STATIC:
Msg 1048, Level 15, State 1, Line 1 Conflicting cursor options STATIC and FAST_FORWARD.
|
Now, you are more than welcome to do your own tests with the various
cursor options, to see which perform best in your scenario, but I've
always come back to the set of options described above. I put my cursor code above into a stored procedure, as well as the two "set-based" concatenations, and the four colleague-provided options. I then ran some performance tests where I measured just duration of each method, executing 1000 times (I ran the middle section three times, then took the average):
SET NOCOUNT ON; GO CREATE TABLE #stats ( rownum INT IDENTITY(1,1), procname SYSNAME, dt DATETIME2 ); GO
INSERT #stats(procname,dt) SELECT '-', SYSDATETIME(); GO EXEC dbo.while_cursor; GO 1000 INSERT #stats(procname,dt) SELECT 'cursor', SYSDATETIME(); GO EXEC dbo.while_xmlconcat; GO 1000 INSERT #stats(procname,dt) SELECT 'xml concat', SYSDATETIME(); GO EXEC dbo.while_simpleconcat; GO 1000 INSERT #stats(procname,dt) SELECT 'simple concat', SYSDATETIME(); GO EXEC dbo.while_colleague1; GO 1000 INSERT #stats(procname,dt) SELECT 'colleague 1 while', SYSDATETIME(); GO EXEC dbo.while_colleague2; GO 1000 INSERT #stats(procname,dt) SELECT 'colleague 2 while', SYSDATETIME(); GO EXEC dbo.while_colleague3; GO 1000 INSERT #stats(procname,dt) SELECT 'colleague 3 while', SYSDATETIME(); GO EXEC dbo.while_colleague4; GO 1000 INSERT #stats(procname,dt) SELECT 'colleague 4 while', SYSDATETIME(); GO
-- using the new LAG functionality in SQL Server 2012:
SELECT procname, duration = DATEDIFF(MILLISECOND, LAG(dt, 1, NULL) OVER (ORDER BY rownum), dt) FROM #stats ORDER BY rownum; GO
DROP TABLE #stats; GO
|
The results (the "winner" highlighted in green):
The XML and simple concatenation approaches aside, the cursor clearly outperformed all of the while loops. This may not necessarily be true in simpler scenarios, so I'm not suggesting that a cursor will always be as good as or better than a while loop. But by the same token, I think I've demonstrated that a while loop isn't always faster or easier to write and understand than a cursor.
I would love to hear your experience, especially any examples where a while loop clearly outperforms a cursor or where it is much easier to write or comprehend.
|
-
I see a lot of people getting really excited. There is a "virtual launch event" for SQL Server 2012 being held on March 7, 2012. You can read more about the event at http://sqlserverlaunch.com/. Let me throw out a dose of reality: if you are not on a TAP or otherwise going live with a private build or release candidate, you will not be installing and deploying SQL Server 2012 on March 7th. I promise. This date will not mark a release, go-live, or general availability. These launch events are marketing-centric sessions to get you excited about the product. Will you be able to download Express editions from the Microsoft web site, and other SKUs from MSDN or your volume licensing portal, shortly after that? Sure. The next day? Almost certainly not. I actually don't remember the lag between the launch event (we'll call it a soft launch) and the RTM availability for previous releases, so I'm not going to throw out any conjecture there. But from all previous launch events by Microsoft, across various product
lines, the CD/DVD/download has never been available at the same time as
the event. Let's not forget that this virtual launch event is not the only "launch event" in March - later in the month, both SQL Connections and SQL Bits are also serving as launch events. Launch and release are similar words, but not identical.
Just wanted to keep it real, and prevent people from getting too excited or planning deployments around this launch event. It is just a marketing event, folks. Worth attending, no doubt, but you won't get magic golden tickets with a download code on that day. PS If I end up being wrong, I'll just delete this post. :-)
|
-
This one is quite subjective, and I'm sure I will face plenty of
opposition - not only because it's a preference thing and many people
are married to their preferences, but also because it violates the
strict interpretation of the standard. Personally, I'm more worried
about the former than the latter - I have no concerns whatsoever that
SQL Server will eradicate the = notation for column aliases, nor do I
worry that the code I write needs to work when ported to Oracle, DB2,
MySQL, etc. (I highly doubt this will be the biggest problem in such an
event). If these concerns are important to you, you may want to stop
reading.
To be honest, there are four six different ways you can define the alias for a column in a SELECT statement - the alias being the column name the application or API sees when the result is returned from SQL Server. The methods I've seen used are:
SELECT 1 AS x; -- #1
SELECT x = 1; -- #2
SELECT 'x' = 1 ; -- #3
SELECT 1 x; -- #4
SELECT 1 AS 'x'; -- #5
SELECT 1 'x'; -- #6
|
(I've ignored [alias] and "alias" variations for brevity.) But I really want to discuss only two of these methods.
First, I want to discard item #3 entirely (which I've edited thanks to techvslife's comments). Why? Because using string literals as column aliases has been deprecated for some time now. You can see how prevalent this is in your environment with the following query:
SELECT [object_name], instance_name, cntr_value FROM sys.dm_os_performance_counters WHERE [object_name] LIKE '%:Deprecated Features%' AND instance_name LIKE 'String literals as column aliases%'; |
I'd also like to discard #4, simply because I find it completely unreadable. Without using either = or AS, it makes it very hard to interpret if that is intended to be an alias, or if x is actually a column name and the author simply forgot a comma. I feel the same way about leaving out AS to denote a table alias, or WITH to denote a table hint (maybe I'll treat those in another post).
And #5 and #6 I'm going to ignore because they're the same as #1 and #4 but with single quotes around the alias. Personally I find the single quotes around aliases to be distracting, making the select list tougher to read no matter which aliasing convention you use. All alias names are strings; why would I want to make it look like it is intended to be data? Also note that #5 and #6 are not defined in any standard, they just seem to be allowed into the syntax (I haven't studied the standards lately but I don't believe these forms are documented).
So that leaves #1 and #2.
I prefer #2 simply for readability reasons. Remember that even in cases where you're "the database guy," the T-SQL code you write is not consumed solely by you - it is also consumed by application developers who want to understand your query (and in a lot of cases, the priority is understanding the shape of the result set); it will also be consumed by your successors, should you move on to a different project, a different company, or worse. Let's take a look at a very simple example of a query against the Sales.SalesOrderHeader table in AdventureWorks2008R2. I've added some new columns using calculations, and you can see how different it can be for someone trying to read the code and determine what the column names are. On the left pay attention to how your eyes have to move all over the script to locate the alias names that have been provided; on the right, it is a much simpler scan down the left hand side. (Click for larger.)
Now, this all depends on whether you are already following somewhat
logical coding conventions in the first place. If your SELECT list looks
like stream of consciousness from James Joyce, which I see more often than I'd like, then whether you use AS or = isn't
going to matter at all. This is with word wrap turned on in the T-SQL editor, and is much worse with word wrap turned off.

Raise your hand if you've seen this kind of code (or have written such a mess yourself). I suggest always writing code with word wrap turned off - in T-SQL at least, there aren't
many cases where a line *needs* to go off-screen, and having word wrap
turned off will protect you from doing it inadvertently.
In SQL Server 2012, there are easier ways of determining the shape of a resultset from a stored procedure, as long as it is the *first* resultset; in this case, it doesn't matter how ugly your queries are, you never have to look at them to determine the names of columns, their data types, or the order they are returned:
SELECT name, system_type_name FROM sys.dm_exec_describe_first_result_set_for_object(OBJECT_ID(N'dbo.UsingAS'), NULL) ORDER BY column_ordinal;
|
Results:

That doesn't mean you should stop paying attention to formatting once you've upgraded to SQL Server 2012; your code should still be tidy and readable, this just gives you a quick way to examine result sets from code written by less motivated people, without having to refactor the code first (which is what I often end up having to do when I am trying to troubleshoot unreadable code). If I haven't convinced you to use = for column aliases, I hope I've at least convinced you to use AS if you're currently using method #3 or #4 above.
|
-
My favorite blog post Picking favorites is never easy. While I definitely feel like I had some more thought-provoking, controversial and laborious blog posts, my favorite would have to be my summary of 18456 errors: http://sqlblog.com/blogs/aaron_bertrand/archive/2011/01/14/sql-server-v-next-denali-additional-states-for-error-18456.aspx
When I initially wrote that post, it started as a quick listing of the new states introduced in SQL Server 2012 (involving contained databases). When I started testing various issues using both contained and non-contained databases, I quickly realized that there isn't really a comprehensive resource out there that lists the various states and helps decipher the root of the problem. There also isn't a lot of help about a more fundamental point: the error message that's relayed to the end user or application is intentionally vague. To find the real reason for the permissions issue, you need to look at the SQL Server error log.
In a more general sense, I am proud to have started getting serious again about the Connect digests. Even though most issues are way too late to consider for SQL Server 2012, I still think there is a lot of value in voting for items you think are important or worthwhile. There is even more value in expressing your business case for the change or fix - how specifically will your organization make better use of SQL Server, or in fact be able to justify a migration or an upgrade, if the issue in the Connect item is resolved? My most popular blog post
By a wide margin, my most popular post in 2011 was actually a post I wrote in 2010. The purpose was to both express and prevent frustration in the removal of evaluation editions of SQL Server:
Fun with Software : uninstalling SQL Server 2008 R2 Evaluation Edition
This process is unnecessarily complex, particularly if the expiration date has already passed. And since there's no straightforward or documented way to determine when an evaluation edition will expire, this is kind of a double-whammy for some folks. I have two related Connect items I'd like to point out:
- #257649 : Make expiration date of eval edition more visible
Written in February of 2007, I expressed the fact that we have to use all kinds of cumbersome workarounds to determine when an evaluation edition installation will expire. My favorite is one that I learned of a lot more recently: check the earliest create date in sys.logins. But it is an ugly hack not unlike checking create_date of tempdb in sys.databases to determine system uptime. They've exposed this value in DMVs in the meantime; why not store the expiration date somewhere easily accessible?
- #329054 : Setup : Please provide a utility that removes an instance
The steps you have to go through sometimes to remove an instance of SQL Server are infuriating. In this connect item, filed in February of 2008, Microsoft initially responded that they were going to create a utility for cleaning up instances. Then they closed it as a duplicate. Then they closed it as "done." Then I re-opened it after discovering that the utility was not in the plans. Then it was closed as fixed again. Then I re-opened it. Then they explained that the utility would not be produced, but a KB article would be published instead. Three months later, they advised it was in edit. Four more months later, I inquired about the status of the KB article. They waited a few days and then explained that, like the utility, the KB article that they promised also wouldn't be published. Since they still haven't fixed a lot of the issues with removal of an instance, I guess they'd rather see folks wipe and pave a machine than take the risk of using MSIZAP and/or following my instructions in the blog post above. This is an embarrassing demonstration of how Connect fails (now, there are many examples out there of how Connect succeeds, so don't take this as an overall summary).
While I'm glad that I continue to help people get out of this sticky situation, and it's nice that so many express their appreciation, I do sincerely hope that Microsoft considers fixing this issue. Adding your votes and, more importantly, your comments, to the above issues will help show Microsoft that they should make these usability concerns a priority for the release after SQL Server 2012. My Travels
I had some great travel experiences last year. For work, I went to England twice (I spoke at both SQL Bits 8 in Brighton, and SQL Bits 9 in Liverpool), spoke at the SQL Rally Nordic event in Sweden and on the SQL Cruise to Alaska, and spoke for the first time at the PASS Summit in Seattle. I also spoke at countless SQL Saturdays, a handful of user group meetings, and visited our offices in Charlotte multiple times. I also got to enjoy Negril, Jamaica, with my wife, for our 1st anniversary - we went to the same resort where we were married on 10/10/10. We also went to Canada multiple times. This year I have even more travel planned - next week I embark on SQL Cruise Caribbean, then in February I'll be taking a trip to Quebec City and attending the MVP Summit in Seattle. In March I'll be hosting a curling bonspiel up near North Bay, speaking at SQL Saturday #105 in Ireland, and speaking again at SQL Server Connections in Vegas (another first for me). In April, Nicole's sister is getting married on the Mayan Riviera. Sound like a busy four months? Yeah, I thought so too.
My Goals
At the end of 2010, I wrote a goals post for 2011. I did not meet all of them for various reasons, but rather than dwell on my failures (I'll be doing that quite enough on my own, thanks!), I'll treat a couple of successes: - I certainly met my speaking goals. I gave 20 presentations across 14 different events in 3.5 different countries (including a couple from my house). The .5 is Canada, since I believe I gave my SQL Cruise Alaska talk while we were floating west of Vancouver Island.
- I definitely had lulls in my blogging, but am happy with my overall results here. I also met my goal of personal blogging - I abandoned aaronbertrand.com three years ago, but started a new blog for topics non-SQL over at bertrandaaron.wordpress.com.
- I wanted to attend more sporting events. I think we did that - went to some Bruins games, and lots of Merrimack basketball and hockey games. This year we've already been to a Patriots' playoff game and have tickets to see the Bruins beat up on the Penguins in April.
I have plenty of items on my goals list for 2012, but not in a state to share just yet.
|
-
On Monday, I completely missed that the SQL Server Release Services team published two cumulative updates for SQL Server 2008. I'd blame jury duty, but that didn't start until Tuesday, and only lasted two days. Well, better late than never. Cumulative Update #8 for SQL Server 2008 Service Pack 2
Cumulative Update #3 for SQL Server 2008 Service Pack 3
As usual, I'll post my standard disclaimer here: these updates are NOT for SQL Server 2008 R2 (where @@VERSION will report 10.50.xxxx).
P.S. this is blog post #499. You're probably not counting, but I am. :-)
|
-
First of all, I want to congratulate both Kendal van Dyke (@SQLDBA) and James Rowland-Jones (@jrowlandjones) on their recent appointment to the PASS Board of Directors. I have no doubt they will go above and beyond to fulfill their duties and represent the SQL Server community at large. Some others feel that more deserving candidates were slighted by this process, and argue that the immediate runners-up in the previous election should get called upon to fill vacated roles. You can read about them here: Steve Jones: http://voiceofthedba.wordpress.com/2012/01/13/ethics-and-power/
Andy Warren: http://www.sqlandy.com/index.php/2012/01/pass-time-to-change-the-by-laws/
Steve's main argument is that the next highest vote recipients in the recent election should automatically fill vacated slots. He contends that Sri has done a lot of work for the community,
organizing SQL Saturdays and bringing SQL Rally to Dallas. I know it is
not his intention, but stating these things, in my mind, tends to
belittle the efforts of the men the board did choose. Kendal's resume is
quite similar to Sri's - he works tirelessly for the Orlando SQL Server
user groups, speaking at every SQL Saturday there, helping plan their
most recent SQL Saturday, and starting a user group (MagicPASS). He was
also a key player in bringing the first SQL Rally to Orlando, and having attended as both a speaker and a sponsor, I know that Kendal went to great lengths in making the event as flawless as possible. And James'
effort with SQLBits should not be overlooked either - I have been to multiple events over there and I can tell you first-hand that the UK SQL community
is quite happy with his efforts on their behalf. He has also served as
an international advisor to the PASS board since early 2011, and is
involved with many community groups in addition to PASS and SQLBits. Both have been recognized as SQL Server MVPs and constantly demonstrate their commitment to the community. I don't want to take away from Sri's accomplishments, but do want to have equivalent context for the other choices as well.
Andy simply argues that, while the by-laws allow the board to vote for whomever they please, they have traditionally appointed the person with the next highest votes from the previous election -- so that's what he expected to happen this time, too. This is a fair expectation, but not a perfect one: to me it seems similar to when people get bitten by the fact that ORDER BY in a view no longer guarantees order without an explicit ORDER BY on the outer query: they're relying on observed behavior vs. what is (or what isn't) written in stone. The process that happened here is not at odds with the way the by-laws are written, it just so happens that in previous incidents the board's decision happened to coincide with the community's vote. (He also suggests that they could have changed the size of the board in order to give James a vote and still take the next two highest vote-getters to fill the vacated seats.)
The idea Steve and Andy have is not without merit, but this is not how the by-laws are currently written (Thomas LaRock explains). I also think there are time limit issues at play. If these appointments happened in March, July, or October 2012, would the 2011 election results be as relevant? If we were to rely solely on community votes, at what point would we need to hold a new vote? I'm just trying to think of how the by-laws could be explicitly written to allow for the community to control every member of the board without having to hold elections all the time.
When I tried to think of real-life (well, non-tech) examples of this, I could only come up with two off the top of my head. - President of the United States
When the President becomes unable to serve, the Vice President - someone who the President just happened to choose as their running mate - takes his/her place. Not the runner-up in the Presidential race, who quite likely does not hold the same values as the person the people elected. That doesn't mean they are not capable of the position, or that they wouldn't have done a good job, it's just not how the law is written.
- Beauty Pageant Queen
When these ladies lose their crown, the runner-up takes their place. This is more in-line with what Steve and Andy are arguing for, though it is still not the community at large who determined the queen, runner-up, and so on, it was a smaller group of judges. As with the Vice President, the runner-up may not be the city's / country's / world's favorite choice (especially considering these people can change over time), but those are the rules.
In any case, this is part of the by-laws as they are currently written. There is no magic, voodoo or unethical behavior going on. The board voted 11-1 in favor of Kendal and James, and frankly I am quite satisfied that they did so in the best interests of the organization and the community it serves. There were only two spots to fill and at least five willing and capable people to choose from. I am also thankful that the circumstances surrounding their decision is private - while I agree that there needs to be some transparency, if people were not chosen due to personal or other reasons that are not public, they should stay that way. If there were such circumstances, the board could choose to inform those who were not chosen about why they were not chosen, and leave it up to the individual to make that known to the greater community if they so choose. Does that mean the by-laws don't need to be revisited and potentially re-written, such that both initial and replacement board members are decided by the community in some way? No. I think that the stir this has caused shows that the current process is imperfect and should be considered carefully before the next election. If they do change the by-laws to make the vacancy appointments more dependent on the previous election, I hope they factor in my point about timeliness.
|
-

This month's T-SQL Tuesday is being hosted by Dave Howard (@DaveH0ward), and the topic is "Second Chances." What he's letting us do is look back at all the previous T-SQL Tuesdays and pick a topic that we missed or maybe thought we needed another crack at.
I chose a recent topic that I missed due to travel, "Data Presentation," hosted by Robert Pearl in September 2011 (T-SQL Tuesday #22).
At my previous job, one of the most complicated pieces of presentation work we had to do was to provide users with data in their preferred time zone. We separated data by customer in separate databases, which would have made things relatively easy, if only all the users of each customer were in the same time zone. In that case, we could have just stored all the customer's data in that time zone.
But of course, customers had offices spread throughout the country and in fact the world, so we had to implement a way to show the same data, translated to local time, for users in different time zones. A complication to the requirement was that, based on the user's login to the web application, we needed to show the data in their preferred time zone even if they were using a computer in a different office - so we couldn't rely on the web browser's preferences or local machine's regional settings to determine the time zone. And in addition to these requirements, we also had to know the actual UTC offset for given dates, both in the past and in the future, since several of the supported time zones would observe daylight saving time.
In order to fulfill these requirements, we decided to implement the logic within the database. This way the application wouldn't have to know anything except who the user was, and the logic would work the same way regardless of which application was requesting the data (we also had several applications that weren't tied to the web application, that wouldn't always know the user, so each customer had a default to fall back on). The servers in the data center were all set to GMT and to *not* observe daylight savings time. This is actually on my informal best practices list because it prevents ever having to worry about data collisions or gaps twice a year.
So here is a skeleton of the schema and minimal data to demonstrate the functionality. First I'll let you create a dummy database to target:
USE [master]; GO
IF DB_ID('TSQLTuesdayAaron') IS NOT NULL BEGIN DROP DATABASE TSQLTuesdayAaron; END GO
CREATE DATABASE TSQLTuesdayAaron; GO
USE TSQLTuesdayAaron; GO
|
Now we need tables for TimeZones and UTCOffsets:
CREATE TABLE dbo.TimeZones ( TimeZoneID TINYINT PRIMARY KEY, Name VARCHAR(32) NOT NULL UNIQUE ); GO
INSERT dbo.TimeZones(TimeZoneID, Name) VALUES (1,'Eastern US Time'), (2,'Greenwich Mean Time');
CREATE TABLE dbo.UTCOffsets ( TimeZoneID TINYINT NOT NULL, [Date] SMALLDATETIME NOT NULL, Offset SMALLINT NOT NULL, PRIMARY KEY CLUSTERED(TimeZoneID, [Date]) ); GO
ALTER TABLE [dbo].[UTCOffSets] ADD CONSTRAINT [FK_utc_TZ] FOREIGN KEY([TimeZoneID]) REFERENCES [dbo].[TimeZones] ([TimeZoneID]);
|
For brevity, I'm just going to populate the transition days, e.g. when daylight savings time changes. In actuality, the table is truly a calendar table; each date from 2005-01-01 through 2030-12-31 is represented, once for each time zone:
INSERT dbo.UTCOffsets(TimeZoneID, [Date], Offset) VALUES (1,'2011-03-12',-5), (1,'2011-03-13',-4), (1,'2011-11-05',-4), (1,'2011-11-06',-5), (1,'2012-03-10',-5), (1,'2012-03-11',-4), (1,'2012-11-03',-4), (1,'2012-11-04',-5), (2,'2011-03-26', 0), (2,'2011-03-27', 1), (2,'2011-10-29', 1), (2,'2011-10-30', 0), (2,'2012-03-24', 0), (2,'2012-03-25', 1), (2,'2012-10-27', 1), (2,'2012-10-28', 0);
|
Now of course we need some customers and users:
CREATE TABLE dbo.Customers ( CustomerID INT PRIMARY KEY, Name NVARCHAR(32) NOT NULL UNIQUE, TimeZoneID TINYINT NOT NULL ); GO
ALTER TABLE [dbo].[Customers] ADD CONSTRAINT [FK_c_TZ] FOREIGN KEY([TimeZoneID]) REFERENCES [dbo].[TimeZones] ([TimeZoneID]); GO
INSERT dbo.Customers(CustomerID, Name, TimeZoneID) VALUES (1, N'Teks', 1), (2, N'MSDN', 2); GO
CREATE TABLE dbo.Users ( UserID INT PRIMARY KEY, Email VARCHAR(320) NOT NULL UNIQUE, CustomerID INT NOT NULL, TimeZoneID TINYINT NOT NULL ); GO
ALTER TABLE [dbo].[Users] ADD CONSTRAINT [FK_u_TZ] FOREIGN KEY([TimeZoneID]) REFERENCES [dbo].[TimeZones] ([TimeZoneID]);
ALTER TABLE [dbo].[Users] ADD CONSTRAINT [FK_u_Cust] FOREIGN KEY([CustomerID]) REFERENCES [dbo].[Customers] ([CustomerID]); GO
INSERT dbo.Users(UserID, Email, CustomerID, TimeZoneID) VALUES (1, 'foo@bar.com', 1, 1), (2, 'bar@foo.com', 1, 2), (3, 'foo@bar.net', 2, 1); GO
|
And then functions to determine the offset for a given day and user or customer:
CREATE FUNCTION [dbo].[GetUTCOffSet_ByUserID] ( @UserID INT, @Date SMALLDATETIME ) RETURNS SMALLINT AS BEGIN RETURN ( SELECT utc.Offset FROM dbo.UTCOffsets AS utc INNER JOIN dbo.Users AS u ON utc.TimeZoneID = u.TimeZoneID WHERE utc.[Date] = DATEDIFF(DAY, 0, @Date) AND u.UserID = @UserID ); END GO
CREATE FUNCTION [dbo].[GetUTCOffSet_ByCustomerID] ( @CustomerID INT, @Date SMALLDATETIME ) RETURNS SMALLINT AS BEGIN RETURN ( SELECT utc.Offset FROM dbo.UTCOffsets AS utc INNER JOIN dbo.Customers AS c ON utc.TimeZoneID = c.TimeZoneID WHERE utc.[Date] = DATEDIFF(DAY, 0, @Date) AND c.CustomerID = @CustomerID ); END GO
|
So now, given a date/time value and a user or company, we could determine the offset that should be used, depending on the known CustomerID or UserID:
SELECT dbo.GetUTCOffset_ByCustomerID(1, '20111106'); SELECT dbo.GetUTCOffset_ByCustomerID(2, '20111029'); SELECT dbo.GetUTCOffset_ByUserID(1, '20111106'); SELECT dbo.GetUTCOffset_ByUserID(2, '20111029'); SELECT dbo.GetUTCOffset_ByUserID(3, '20111106') | Once a user is logged in, we could simply store their timezone in a session variable, so we wouldn't have to retrieve that value more than once. For any date we needed to display, we could just pass in the date and the timezone to the following function:
CREATE FUNCTION [dbo].[GetPresentationDateTime] ( @TimeZoneID TINYINT, @DateTime SMALLDATETIME ) RETURNS SMALLDATETIME AS BEGIN RETURN ( SELECT DATEADD(HOUR, @Offset, @DateTime); ); END GO
SELECT dbo.GetPresentationDateTime(1, '20111106 04:32'); SELECT dbo.GetPresentationDateTime(2, '20111029 04:32');
|
We could have done all of the above in a single, overloaded function or procedure, but decided it would be better to make the access methods much more defined. You also might suggest DATETIMEOFFSET, but this solution was initially implemented in SQL Server 2005, long before the new date/time data types were introduced. Besides, I'm not sure how it would have helped, since most values were input without any TZ awareness (e.g. GETUTCDATE()), relying on the system clock to store UTC date/time correctly. Most data did not need to be timezone aware until presented to the user, and since it could be different for every user, the value of storing one user's TZ information with the data was pretty weak. The above didn't really help if we wanted to pull a report with lots of datetime values - we would have to call the scalar function for every row (and sometimes for multiple values) - in those cases we just wrote out a join the long way (if we didn't already know the offset) or used DATEADD inline when we did. Now of course, you're probably thinking this isn't isn't quite complete. And you'd be right. There's that funky time we haven't accounted for - the first couple of hours in each DST-observing time zone on the forward/back day, before the change is actually supposed to take place. We opted to defer that since most events happen during business hours, and the issue never came up with any of our customers. Certainly a concern you will want to take into account if you're thinking about this type of solution. And in the spirit of keeping with the topic, if I were to be presented with this same challenge today, I would probably use it as a second chance to get the job done a little better, rather than take the easy "I-solved-this-before" route.
|
-
Hide databases from users who shouldn't be able to see them
This is a long-standing request from Erland Sommarskog which I've highlighted in previous digests. But the underlying problem keeps coming up in multiple venues, so I thought it would be good to call attention to the item one more time. Some will argue that the contained database feature provides a solution for this, but that only works well if you want to restrict a user to exactly one database, and only works well if your application is compatible with the limitations of the feature. Please comment on the item and explain how this feature will help you in your environment.
#273830 : Need VIEW DEFINITION permissions per database
Contained Database users are people too
In playing with the contained database feature as a solution to Erland's concern above, I discovered an unfortunate bug: a database-level user (with password) who has connected to their contained database using SSMS will not enjoy most of the important IntelliSense features. I'm highlighting this Connect item not so that you can vote for it, but rather just to be sure you're aware of this limitation if you intend to utilize contained databases in the short term. As an side effect, I also discovered that there doesn't exist a straightforward way to set up a contained user that can bypass the password policy in place, unlike server-level logins (where you can say CHECK_POLICY = OFF). Personally I think they got this backwards - logins are the security entity where you want to make it harder to implement simple passwords. If you want a contained user with a simple password, you can create a server-level login, associate it with a database user, and then use sp_migrate_user_to_contained (note that I haven't tried this).
#717063 : SSMS : IntelliSense does not function for a contained user
#717069 : Contained User syntax does not support bypassing password policy
Please just go parallel, regardless of other factors
Paul White (@SQL_Kiwi) has asked for an option that is kind of the opposite of MAXDOP. I say "kind of" because he doesn't want to be able to say MINDOP x, but rather try to coerce the optimizer to use a parallel plan and then follow the same rules it normally would in determining the level of parallelism.
#714968 : Provide a hint to force generation of a parallel plan
Expose SHOW_STATISTICS through a DMV
Greg Low has proposed adding a DMV that would mirror DBCC
SHOW_STATISTICS output, making it easier to work with the results. I'm
all for this, as it can be quite a hassle to mix monitoring queries with
DBCC calls.
#611155 : DBCC SHOW_STATISTICS info should be available as a DMV
Check constraints during CHECKDB
Thanks to Ola Hallengren, they are considering adding the ability to
check all constraints (and, where appropriate, mark them as trusted) as a
part of the DBCC CHECKDB process (specifically, using the
EXTENDED_LOGICAL_CHECKS option). There are already plenty of votes, but
more votes (and, more importantly, comments about how this will help in
your environment) will help.
#508837 : Option to check constraints in DBCC CHECKDB
|
-
I’ve seen the question a few times now: “How do I compare two plans side-by-side in Plan Explorer?” Management Studio allows you to view multiple graphical plans that are visible simultaneously, either by generating plans for every statement in a batch, or by generating plans in separate query windows and then splitting the panes. However, at least in my estimation, a lot of the other metrics and properties are more valuable when performing comparisons. In SSMS this is tough, because you can only have one tooltip visible at a time (regardless of how many plans are visible). Same for the Properties panel – only a single plan or operator can populate this panel at one time. So even though you can compare high-level graphical details quickly, you will likely end up with multiple instances of SSMS to perform true side-by-side comparisons of the lower-level details.
In Plan Explorer (free download) it is very easy to compare a lot of the metrics side-by-side for two (or more) plans. Since Plan Explorer can generate estimated or actual plans for you, it’s as simple as entering multiple queries in the Command Text pane, and then generating an estimated or actual plan:

I recommend trying the Vertical Layout, so that the various plan-related tabs and the Statements Tree are a lot closer together (reducing mouse and eye movement):

Here is a quick comparison of the two default options (vertical on the left, horizontal on the right – click to enlarge):

(There is also a Custom Layout option if you wish to change from the default options.)
And you can still compare a lot of the other details about two different statements by looking at the other columns in the Statements Tree tab. For example, comparing costs, row counts and operation counts between estimated or actual plans can give you a much more accurate picture of the performance impact of different statements:

And you can switch between graphical plans quickly by using arrow keys or mouse clicks to switch between statements:

Since the other tabs are context sensitive, you can do this for any tab: simply highlight a tab (e.g. Top Operations) and, as you switch through the rows in the Statements Tree, the tab below will update accordingly.
For more complex plans, and depending on how much screen real estate you have, you may prefer to open two instances of Plan Explorer so that you can interact with the plans a little more independently. In this case you won’t be able to see percentages or other relative comparisons directly against one another, but for most eyeballing, it’s still better than what you get in Management Studio IMHO.
That all said, comparing two graphical plans side by side is definitely on the list for a future update; in the meantime, I hope the above is helpful.
|
|
|
|
|
|