THE SQL Server Blog Spot on the Web


James Luetkehoelter

Nearly any SQL topic presented at times in a slightly eclectic manner.

5 things every SQL developer should know like the back of their hand...

Since I had a post on 5 things a DBA should know, I thought I'd follow up with a quick post about what *I* think any developer should know...

1) Always prefix object names - I can't stress this enough. Even if you have no schemas in place (SQL 2005 schemas, that is), always reference objects with at least DBO.[objectname].  If done consistently when calling a table, view, stored procedure, function, etc., you can see a *visible* difference in performance. The problem is that if you simply write your code as...

SELECT LNAME,FNAME FROM PERSON

...you potentially create two operations on the server. First it checks your security context to see if you own the object, then it falls back to the alias DBO (incidentally, if you're logged in with sysadmin or db_owner privileges, you resolve to DBO automatically - and that is NOT a justification for having an application connect as sysadmin or db_owner). Say I'm logged in as the SQL user 'James'. In the previous example, the query engine first has to check whether 'James.PERSON' exists; if not, it will use DBO.PERSON. Why waste that time? I would go so far as to always qualify column names as well:

SELECT PERSON.LNAME,PERSON.FNAME FROM DBO.PERSON

Of course a table alias is fine in the query ('p.LNAME'). This is more from a potential-ambiguity perspective when performing joins, but there is a bit of a performance benefit (probably more measurable than visible).
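Putting both halves of the tip together - schema-qualified objects plus alias-qualified columns - a minimal sketch might look like this (the procedure name is hypothetical, just to show the same rule applies to EXEC):

```sql
-- Schema-qualified table with a short alias; columns qualified via the alias.
SELECT p.LNAME, p.FNAME
FROM dbo.PERSON AS p
WHERE p.LNAME = 'Luetkehoelter';

-- The same rule applies to procedure calls (usp_GetPerson is illustrative):
EXEC dbo.usp_GetPerson @LastName = 'Luetkehoelter';
```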

2) Understand set-based logic - Many will blindly say "avoid TSQL cursors like the plague," but to me the real thing to absorb is set-based vs. iterative logic. SQL (in all its many flavors - this isn't just a SQL Server issue) is a set-based language. It isn't designed or optimized for taking a list of records and parsing through them line by line (before you say it, yes, even set-based logic is implemented by a form of cursor, now officially called the default result set - some of us remember it as a firehose cursor). Sometimes iterating through a list of records is unavoidable, and sometimes you need to do it in TSQL - that's why cursors exist. However, if you do need iteration, taking the set out to another language (ADO.NET, for example) to process can be more effective.

What I see more often is simply a lack of understanding of set-based operations - many developers tend to think iteratively. When working with SQL-based databases, gaining a working knowledge of set-based logic can be a huge step towards moving away from TSQL cursors.
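To make the contrast concrete, here is a sketch of the same change done both ways, assuming a hypothetical STATUS column on dbo.PERSON:

```sql
-- Iterative version: a cursor issuing one UPDATE per row.
DECLARE @ID int;
DECLARE person_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT PERSON_ID FROM dbo.PERSON WHERE STATUS = 'PENDING';
OPEN person_cur;
FETCH NEXT FROM person_cur INTO @ID;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.PERSON SET STATUS = 'ACTIVE' WHERE PERSON_ID = @ID;
    FETCH NEXT FROM person_cur INTO @ID;
END
CLOSE person_cur;
DEALLOCATE person_cur;

-- Set-based version: one statement, one pass, same result.
UPDATE dbo.PERSON
SET STATUS = 'ACTIVE'
WHERE STATUS = 'PENDING';
```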

3) Access objects in a consistent order - In SQL Server specifically, pessimistic locking is the default (only SQL 2005 offers optimistic locking, and that has a cost). What's the practical issue? If someone is reading data, no one can write to that data. Given the locking mechanism used, this can result in plenty of blocking, the biggest "hidden" performance cost in SQL Server.

What's worse is the possibility of deadlocks - SQL simply chooses a victim, and the other session goes through (don't ask me for the exact algorithm - it isn't as simple as FIFO). To avoid deadlocks and minimize blocking, always access objects in the same order. For example, instead of this:

Session 1:

UPDATE PERSON...

UPDATE ADDRESS...

SELECT PERSON...

SELECT ADDRESS...

Session 2:

UPDATE ADDRESS...

UPDATE PERSON...

SELECT ADDRESS...

SELECT PERSON...

...which will probably result in blocking between sessions 1 and 2, all sessions performing these similar types of actions should do things in the same order - that way locks are generally released more quickly for other sessions, and the likelihood of deadlocks is lowered dramatically.

By minimizing blocking (and that's a 500-page book in and of itself), you can make a serious performance improvement in your SQL code. This is true in general, but especially true in SQL Server due to pessimistic locking.
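A sketch of the convention in transaction form - both sessions agree to touch PERSON before ADDRESS (the column names here are purely illustrative):

```sql
-- Convention: every transaction touches dbo.PERSON first, then dbo.ADDRESS.
BEGIN TRAN;
    UPDATE dbo.PERSON  SET LNAME = 'Smith'   WHERE PERSON_ID = 1;
    UPDATE dbo.ADDRESS SET CITY  = 'Madison' WHERE PERSON_ID = 1;
COMMIT TRAN;
-- If session 2 follows the same PERSON-then-ADDRESS order, it simply waits
-- for these locks to release rather than holding ADDRESS while requesting
-- PERSON - the classic cycle that produces a deadlock.
```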

4) Use consistent SARGs - Search arguments (known to the database geek as SARGs - not really an acronym, is it?) can cause havoc with the Query Optimizer. If you aren't consistent in how you access objects in your WHERE clauses, the QO can come up with some very strange results. For example, if you have a compound index on LNAME, FNAME but write "WHERE FNAME = 'James' AND LNAME = 'Luetkehoelter'", there is a slight possibility that the index won't be used. There are a lot of factors involved, but if those factors are in place, this simple ordering is the straw that breaks the camel's back. What's worse, it makes it very difficult for a DBA to tune indexes and such if the code isn't predictable. Internal query plans are affected, as are statistics on columns and indexes, and auto-parameterization of ad-hoc queries becomes all but impossible.

Just decide what the order should be for any WHERE clause, and stick with it. Even if it isn't the "optimal" order, consistency will avoid some very ugly tuning situations.
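For the compound-index example above, the consistent form would lead with the index's first column every time (the index name is hypothetical):

```sql
-- Compound index on (LNAME, FNAME): write predicates in that same order,
-- leading with the first index column, in every query that touches it.
CREATE INDEX IX_PERSON_LNAME_FNAME ON dbo.PERSON (LNAME, FNAME);

SELECT p.LNAME, p.FNAME
FROM dbo.PERSON AS p
WHERE p.LNAME = 'Luetkehoelter'
  AND p.FNAME = 'James';
```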

5) Avoid the temptation of XML - OK, so SQL Server 2005 has a native XML datatype, and XML indexes, and support for XQuery 1.0. This all means that data could be stored as XML and searched and retrieved as individual documents, even leveraging indexes. I really think these are great features, however...

Accessing XML is slow compared to standard relational data. While it is attractive from a development standpoint to use XML to store data (and I think there is a place for this), XML IS NOT BY DESIGN INTENDED FOR PERMANENT STORAGE AND RETRIEVAL. Some might see this as controversial - to me, it is fact. I have used the XML data type for "dehydrating" an object instance, or for storing historical or original XML documents. I would never advocate the XML data type for continual read, write and update. Even with all of the improvements in 2005 (and coming in 2008), it is still a type of data being "shoe-horned" into a relational storage structure. If you must use XML, try to minimize continual operation on the data *within* SQL Server.
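A sketch of the acceptable pattern - archiving an original document in an xml column and pulling a value out only occasionally (table and element names are made up for illustration):

```sql
-- "Dehydrating" a document into an XML column for occasional retrieval,
-- not for continual read/write/update as a hot path.
CREATE TABLE dbo.PERSON_HISTORY
(
    HISTORY_ID   int IDENTITY PRIMARY KEY,
    ORIGINAL_DOC xml NOT NULL
);

INSERT dbo.PERSON_HISTORY (ORIGINAL_DOC)
VALUES ('<person><lname>Luetkehoelter</lname><fname>James</fname></person>');

-- Occasional XQuery lookup; fine for an archive, costly if done constantly.
SELECT ORIGINAL_DOC.value('(/person/lname)[1]', 'varchar(50)') AS LNAME
FROM dbo.PERSON_HISTORY;
```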

Those are some of my "must knows" - what would you add?

Published Friday, September 28, 2007 12:33 PM by James Luetkehoelter


Comments

 

Alexander Kuznetsov said:

Hey James,

I loved this post. Actually for new development in 2005 I don't create any objects in dbo schema at all. Tables are in data and staging schemas, procedures are in Writers and Readers schemas, so the developers have no choice but to qualify all the objects.
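[A sketch of the layout Alexander describes, assuming SQL 2005 and illustrative object names - every reference then has to be schema-qualified:]

```sql
-- Tables live in data/staging; procedures live in Writers/Readers.
CREATE SCHEMA data;
CREATE SCHEMA staging;
CREATE SCHEMA Writers;
CREATE SCHEMA Readers;

CREATE TABLE data.PERSON (PERSON_ID int PRIMARY KEY, LNAME varchar(50));

-- With nothing in dbo, a Readers procedure must say data.PERSON explicitly.
```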

September 28, 2007 5:20 PM
 

James Luetkehoelter said:

Hi Alexander,

That's a nice usage of 2005 schemas. That brings up another thing that I try to get people to avoid relying on - users having a default schema. I personally always set it to DBO. Yes, having a default schema for a user does eliminate the double lookup if used correctly, but that's the kicker - it must be used correctly. If you err on the side of caution, it just makes sense to always qualify.
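[For reference, pinning a user's default schema in 2005 is a one-liner - the user name here is just the example from the post:]

```sql
-- Make name resolution predictable by defaulting the user to dbo.
ALTER USER James WITH DEFAULT_SCHEMA = dbo;
```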

J.

September 28, 2007 10:27 PM
 

JAHAN said:

Thank You Guys for your useful tips

September 29, 2007 4:01 PM
 

mark.inmon said:

You forgot a main part that I know most of the people who read this would not leave out: the use of primary and foreign keys. I work with many medical systems, and most don't have any keys at all. I ask myself why no one has been killed using these systems. I'm waiting on the lawsuit when someone gets a limb removed who needed some other procedure done, because of how the database is designed.

October 3, 2007 9:09 AM
 

James Luetkehoelter said:

Hi Mark,

I actually thought about putting in primary/foreign key issues, but there are some systems that deliberately avoid them for legitimate reasons (assuming this is handled properly at the application layer).

For high, high volume transactional systems, sometimes leaving out the foreign key lookups can improve performance. Others might leave them out to try to make the database design "platform agnostic".

Personally, I'd always include primary and foreign key constraints at a minimum...

October 3, 2007 9:53 AM
 

Nick67 said:

Hi James,

'...there are some systems that deliberately avoid them' for obfuscation purposes :(

Simply Accounting used to have an MS Access backend, with ~100 tables with 8-character names and utterly no relationships between them. It makes integrating any other app with it, without paying $1000s for an SDK, very difficult indeed - and I have no idea if the SDK would be at all useful. The columns also had 8-character names. Extremely good obfuscation techniques indeed.

February 14, 2008 4:58 PM
 

Jack Corbett said:

I really like Alexander's use of schemas.  I have been looking for a good way to use them and that looks promising.

February 29, 2008 12:14 PM


About James Luetkehoelter

I am passionate about what I do - which is DBA, development, IT and IT business consulting. If you don't know me, haven't met me or have never heard me speak, I'm a little on the eccentric side. One attendee recently described me as being "over the top". Yup, that about says it - because I only speak on topics that I'm passionate about.