THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Aaron Bertrand

Aaron is a Product Evangelist for SentryOne, makers of Plan Explorer and a performance monitoring and event management platform for the Microsoft Data Platform and VMware. He has been blogging here at sqlblog.com since 2006, focusing on manageability, performance, and new features, and also blogs at blogs.sentryone.com and SQLPerformance.com; has been a Microsoft MVP since 1997; tweets as @AaronBertrand; and speaks frequently at major conferences, user group meetings, and SQL Saturday events worldwide.

T-SQL Tuesday #20 : T-SQL Best Practices

T-SQL Tuesday, the invention of Adam Machanic (@AdamMachanic), is what he calls a recurring, revolving blog party. Each month, a new host picks a topic, invites bloggers of all levels to join in, and then after all the submissions are in, writes a summary of all of the entries. This encourages people to participate in the community more frequently, sometimes helping them get their feet wet in a topic they're not already familiar with, and I suspect that everyone involved learns a little something each time.

This month, T-SQL Tuesday #20 is hosted by Microsoft's Amit Banerjee (@banerjeeamit), and the topic he chose is T-SQL Best Practices. I like this topic choice because I have long been a proponent of using positive, consistent practices and kicking all those bad habits. I put out a number of "bad habits" posts here about a year ago, wrote a compressed chapter on this topic for the upcoming SQL Server MVP Deep Dives Volume 2 title, and will be presenting a similarly-themed PASS Summit 2011 session. It's certainly easy to point out bad habits, like SELECT *, and encourage folks to stay away from them. I think it's equally easy to point out some *good* habits that people can incorporate into their coding mantra, whether they be for performance reasons, readability / maintainability reasons, or both. So, my T-SQL Tuesday entry is going to list out a few of these good habits, and explain why I think you should be making them part of your natural coding style.


Always use statement terminators

I wrote about this back in 2009, where I explained that CTEs (starting in 2005) and MERGE (starting in 2008) require the previous statement to be properly terminated with a semi-colon.  I also explained that Microsoft will eventually not allow statements without a proper terminator at all (though I admit). SQL Server "Denali" introduces a new command which requires the previous statement to properly terminated: THROW (I wrote about this back in November).

As more and more keywords are added to Transact-SQL, you will find more and more cases where this requirement will be in place. I suspect that we'll all be tired of the syntax error messages and be doing this instinctively long before Microsoft makes it mandatory, but why not start now? As with many other habits, I don't advocate going back and correcting your entire codebase. But if you're refactoring code (for example, to move from RAISERROR to THROW in Denali), and you add those semi-colons on the statements that now precede those lines, you may as well add all the semi-colons applicable to the stored procedure.


Always set isolation level at query level, rather than as a table hint

On just about every system I touch, I see NOLOCK being used quite liberally, peppered onto every table in a query. In and of itself, the use of NOLOCK is okay - I give folks the benefit of the doubt, and assume that they have calculated the risk they're taking by using READ UNCOMMITTED (though I will often query why they aren't using snapshot, both out of curiosity and to perhaps plant a seed that might otherwise not be there). My point in the context of this post is that there are a couple of particular pitfalls to using NOLOCK at the table level. One is that the following syntax is often used:

FROM dbo.table NOLOCK
-- instead of
FROM dbo.table WITH (NOLOCK)

Without the WITH keyword, this can actually interpreted as a table alias (e.g. AS NOLOCK) instead of a table hint. Furthermore, if you have a complicated query against a lot of tables, and add WITH (NOLOCK) to each one, you're creating a lot of work for yourself if you later decide to transition to snapshot (hint, hint). Why not set the isolation level using:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

...which is much easier to change later should you need to, prevents you from missing the hint on one or more tables buried in your query, makes the query itself more readable, and better documents the actual intention of the code. Plus, if someone is on a mission later to change everything to snapshot isolation, they will likely just throw the SET statement at the beginning of each procedure. If there is already a SET statement at the top, they can override it, instead of having to pore through the code and remove every single NOLOCK...


Always prefix stored procedure calls with EXEC / EXECUTE

I see a lot of what I'll call "lazy" stored procedure calls. People write their applications or scripts with the following:

sp_who2;

Then they execute their code from their application (or maybe call it in a different way), and are surprised to suddenly see this error, which they don't see in Management Studio:

Msg 102, Level 15, State 1, Line 2
Incorrect syntax near 'sp_who2'.

The reason for the error is that, while you can call a stored procedure without an explicit EXEC when it is the first statement in the batch, you don't always have control over what *else* might be sent in the batch (SET commands, for example). I see this problem burn people quite a bit, so do yourself a favor: always prefix your stored procedure calls using EXEC or EXECUTE. At the very least, it prevents you from having to change your code when you do want something else to happen before that statement in the batch.

Always use schema / table prefix

Granted, I wrote about a portion of this topic in my "Bad Habits" series, but from the opposite angle ("stop avoiding the schema prefix"). As the use of schema becomes more common, I see more people getting confused when their object is not found or when the user complains that they are getting permission denied errors. Even if you only use the dbo schema for now, I strongly recommend getting into the habit of always properly schema qualifying your objects - whether at creation time or at query time.

Another thing I see a lot is the inconsistent use of table/alias prefix when referencing columns in a query. While it may be obvious to you which table a certain column came from, it is not going to be obvious to everyone reading the query - especially when they're troubleshooting an issue in real time. Using a sensible alias on each table name in the query will allow you to add an alias prefix to every column (both in the SELECT list and in the other clauses) with relatively little effort/ In turn, this will make your query more readable for everyone, and might even prevent errors later should the schema change.


Consider developing on a case-sensitive database

This is probably going to be my hardest sell here, but I think it can be very important to become disciplined in using consistent casing for object names, column names and data. Why? Well, you never know when your next deployment may target an instance that has a case sensitive collation - particularly if you distribute databases or T-SQL code as an ISV. IntelliSense (whether native or 3rd party) can assist with this greatly, making correct casing virtually automatic, but even when using this assistance there can still be many instances where you inadvertently (or even knowingly) decline the suggested code. While this may cost you a bit of development time up front, it will save you a lot of time later if you end up deploying mixed-case code to a case sensitive server.

So, if you typically develop using a case insensitive collation (e.g. Latin1_General_CI_AI), consider using the same core collation, but with the case sensitive option instead (e.g. Latin1_General_CS_AI). You could go a step further and say, "well, why not develop on several foreign language collations as well?" You might want to do that too, but I'm going to limit my advice to the more common scenario - where you are developing for the more common Western languages.


Above all else, use a consistent coding standard

In the end, it doesn't really matter that much whether you adopt some or all of the above practices - no matter how blue in the face I may get telling you that there are good reasons behind them. What is truly important is that you strive to develop consistent code, and understand the side effects of *not* choosing to implement some of these habits. Setting up a list of general rules for your team, and consistently following your own standards with no exceptions, can go a long way in streamlining development in the long run, and troubleshooting issues when things go wrong.

 

Comments

 

Arthur said:

All these are very valid points. Do you have any tools that help enforce the coding standards? My pain is most of developers use SSMS 2005 and unless they have proper plugins the word casing is not easy to follow (no auto-complete), the ; statement terminator is not getting placed when designing queries, etc.

July 12, 2011 9:50 AM
 

Geoff said:

The case sensitivity is an interesting point I hadn't considered before. I haven't run into a SQL Server instance that was case sensitive. However, that is generally my habit since I learned how to program in Unix back in the day.

I still refuse to put spaces in any of my file names.

July 12, 2011 9:56 AM
 

Daniel Gras said:

Thanks, I really like the isolation level tip!

July 12, 2011 10:07 AM
 

AaronBertrand said:

Arthur, it's a great question, and I do have plans to talk about enforcement in a future post. That said, I will tell you that there aren't many easy ways to truly *enforce* - it is still easy using a variety of tools for the developer to simply be stubborn about opting in to them.

July 12, 2011 10:10 AM
 

Louis Davidson said:

>>Always set isolation level at query level, rather than as a table hint<<

Of course there can be reasons to set the isolation level at the table level individually, but 99.9% of the time, I 100% agree :)

I really like this:

FROM dbo.table NOLOCK

I hadn't thought about this though a tip I would have added was always use AS in your aliases to make it more clear. I have seem too many people say:

SELECT column1 column2, column3

FROM   table

and not realize why column1 was giving them the wrong datatype (usually this is a really large query, of course.  Using AS makes it clear

SELECT column1 AS column2, column3

FROM   table

Either that is an alias, or you messed up

July 12, 2011 11:23 AM
 

Peter Schott said:

I think I'd agree with just about everything except the Case-Sensitive part. We did that because it was the default way back when. Then we tried to run our code on a Case Insensitive DB and it completely blew up. That was not a fun experience. Besides, we had to mask all of our searches because FIRSTNAME <> FirstName and similar things, which incurred some overhead due to UPPER() or LOWER() commands.

As for the Snapshot Isolation, I have some reservations about what this would do to our IO before enabling it. I like the idea and think it could benefit us in some ways, but not if it adversely affects the system in other ways.

July 12, 2011 12:28 PM
 

AaronBertrand said:

paschott, can you explain how your code blew up just by moving to a case insensitive collation? Code that works on case sensitive should work perfectly fine on case insensitive. My point was that going the other way can often introduce problems.

Also, the UPPER() & LOWER() are going to be necessary if you deploy your code to a case sensitive collation anyway (and if those operations really provide a noticeable impact on your performance, *shrug*). If you are 100% certain you will never need to support CS, then by all means, ignore my advice. But if there is any chance, *developing* (note I wasn't suggesting *running in production*) on a case sensitive collation will make you much better prepared.

July 12, 2011 12:31 PM
 

AaronBertrand said:

I agree about snapshot isolation. It's not the kind of thing where you should just go switch all of your NOLOCKs to snapshot; you definitely need to test the impact as it can cause serious stress on tempdb. But the concern is definitely on the performance side, because consistency / accuracy is guaranteed to be no worse than NOLOCK. :-)

July 12, 2011 12:41 PM
 

Chuck Rummel said:

paschott, my memory might be fading on this detail but I don't recall when a case sensitive collation was ever the default during installation, and later installers had to carry the sql-based CI default forward even after the addition of Windows collation sequences.  Using the binary sort order used to give a performance boost back in the days of 7.0 but since 2000 the difference has been minimized, and you gained the ability to alter collations at the column level.

However, to me there's a difference between how collations effect your object names (e.g. FROM table vs. FROM Table) vs. logic and data integrity (e.g. PK on a case sensitive LastName column will allow Smith and smith as unique values while case insensitive would not).  It's the logic and integrity that I would consider the bigger impact when changing collations, as if you have two tables whose name only differs by case I'd question why.

July 13, 2011 10:38 PM
 

IL said:

Aaron,

I agree with Chuck about developing on a case-sensitive database. Examples (and links) are needed to understand the topic you are talking about. Are you really considering switching on case-sensitive naming of database objects? Is it how C# (case senstivie)/VB.NET (case-insensitive) naming approaches are different? If yes, I think I've never heard that it is possible with SQL Server too. And I don't think that switching to case-sensitive collation even on a server level could make object names Table1 and table1 different. So is the object of your topic case-sensitive object names or data?

July 14, 2011 2:28 AM
 

AaronBertrand said:

My intention is more about forcing habits of always using the correct case name for objects, columns etc. Data is a different story of course.

The issue is that if you create dbo.myTable and then say SELECT * FROM dbo.MyTable, on a case sensitive database, you will get message 208 (invalid object name). If you use the wrong case for a column name, you will get a similar message (207).

I'm not worried about having two tables, one named dbo.table and one named dbo.Table. I'm worried that when I develop on a case sensitive server, am not careful about casing my objects correctly, and deploy my code to some yokel's case sensitive server, if I cased it incorrectly in even one line of code, I'm going to have to answer to it.

The flip side of the coin is that a case sensitive database also requires that data matches... so if you say "WHERE name = 'Bob'" and it was stored as "bob", you get no rows. This can also cause problems that require workarounds like paschott suggested, but there are other workarounds too - persisted computed columns with the value lower-cased, or simply collate the string columns where you're going to require matching using a case insensitive collation. It's a whirlwind because until your database is contained in Denali this can cause issues with tempdb collation conflicts. I think the safest approach (again, if your code is deployed to many environments, some of which may be case sensitive) is to set SQL Server itself to case insensitive, your development database to case sensitive, and all the (n)varchar columns to case insensitive.

I think I will write another blog post about this issue alone because I am having way too much stream of consciousness to make a more concise comment.

July 14, 2011 9:51 AM
 

Paul Hunter said:

Liked the post ~ especially the part on using a consistent coding standard.  If it's not written down then it's not a standard.  I'll only let developers loose on my dev server (SQL or otherwise) AFTER they sign off that they have read, understand and will follow the standards.  Some grumble that I'm "constraining their creativity", but most actually apreciate the standard.  The standard is very detailed and includeds naming standards (singlar nouns), column names (no abbreviations unless they're industry standards (FIPS, APR, etc.), relationship names, index names, scriptable object names (no prefix [sp_, v_] but suffix [_get, _find, _sav, _del, _v]), case (I prefer lower case), spacing, tabs, matching case for parameters, columns, variables, etc.  I had not considered the issue with case-sensitivity, but the standards address this.  The problem comes in the enforcment area.  I understand that PBM can help with some of it and coding assitance in the form of intellisense really help.

July 16, 2011 12:57 PM
 

Easwaran said:

Great Post!!!

July 26, 2014 1:35 AM
New Comments to this post are disabled

About AaronBertrand

...about me...

This Blog

Syndication

Archives

Privacy Statement