
Louis Davidson

Sixth pillar – Well Performing

I guess eventually I had to admit that performance matters. Whenever I speak, and in all of my books, I try to stress over and over that performance is NOT the only thing. The fact is, far too much time is spent trying to make database applications run faster when the real goal should be to architect data structures that solve the problems of the user in a natural way, with proper integrity.

But performance is important. Extremely important. Like a fancy sports car, it is the second thing that a user will notice about the system… the first will be the UI (the paint job).

The problem is that it isn’t the most important thing. Just like a sports car, the most important thing is that everything works. If the seats aren’t bolted in and the steering wheel works backwards, it doesn’t matter that the car can do 100 miles an hour in first gear.  I initially defined well performing as “Gives you answers fast”, and in the context of performance that is true, but in reality it is a very simplistic attitude.  Taken out of context, it is definitely NOT a good explanation of well performing. Maybe: “Performs all required operations in a minimal amount of time, within the realities of physics.”

When considering performance, there are many facets to the problem:

  • Initial Database design – Sometimes the problem to be solved simply cannot be solved quickly, but often it is just poorly designed data structures that get in the way. Designing the database with the concepts of Normalization close at hand is the first step in getting things right. The relational engine of SQL Server is named relational because it likes data to be formed in a relational manner. The concepts of Normalization are the backbone of relational design (sorry, I have to make mention of Normalization every time I blog, I think).
  • Concurrency – Hey, if all I ever had to work on were single-user systems, life would be easy.  But the fact is, on the main database system I work with, there is far more than one user. And most of the “users” aren’t humans (no, they aren’t zombies either) but rather machines, reading in data from mail, the internet, and phone calls and processing it.  Actual human beings work slowly compared to the rate at which a computer can pump in data. And to make sure that the data isn’t complete rubbish by the time the humans actually look at it, we have to use locks, and locks slow things down… But the people want to feel like they come first… it is a difficult task, but good design and decent throttling mechanisms can be used to make it happen (the first sketch after this list shows one common way to reduce reader/writer blocking). (And don’t get me started on all of the moving parts, disk, memory, etc., etc. Beyond data-level locks, you have hardware/resource locks called latches that make certain the laws of physics are honored.)
  • Indexing – The first thing most people think of when they are adjusting performance, and for good reason.  An index can turn a full table scan into a single-row operation, simply by adding a primary key or unique constraint (usually part of the logical database design) or any other unique or non-unique indexes that the queries may need.  But indexes aren’t free, and you have to be careful not to pile on useless and irrelevant indexes “just in case.” (See the second sketch after this list.)
  • Hardware – SQL Server can run on simple, cheap hardware, and in most situations you wouldn’t notice if it were run on a server that could easily be a file server. Unfortunately, as you start to need “real” power, the hardware configuration cannot stay so… simplistic.  Adding CPU power and RAM is simple, but the worst part of the process is disks.  Until solid state drives really hit it big (and even afterwards, in many ways), disk drive speed and redundancy are big factors in making your server run fast.
  • Good code that accesses your data – No matter how well your database is designed, if you code in a crappy manner, using loops where set-based queries would work better, lots of unnecessary temp tables, and non-SARGable search arguments like WHERE datediff(day, columnName, getdate()) > 1 (which cannot use an index on columnName), you are hosed (the second sketch after this list shows a rewrite).  If you have the foresight to use stored procedures, or at least an architecture where you can adjust the code being executed from your app, you can go back and tune poorly performing code later in the process as you need to (in case you get stuck with programmers who aren’t really relational “yet”)… if not, well, good luck (and get ready to throw away a lot of bucks on hardware).
  • Adjusted Database Design – sometimes you just can’t use the design you have, and as a good designer/architect, you have to be big enough to admit that.  Denormalization can be a solution (especially if you are trying to optimize reports…) but often you just have to go in and adjust the design in other ways.  The better your code/encapsulation layer, the easier it is to do (it might take a bit longer, but it will be straightforward/safe).
  • Etc – Network speed, faulty hardware, poorly performing applications…I could go on for days, but I won’t.  The fact is, there are many problems that can make performance bad, and sometimes they are database related, and sometimes not.
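
To make the concurrency point a little more concrete, here is a minimal sketch of turning on SQL Server's row-versioning options so that readers stop blocking writers; it is only one of several possible mitigations, and the database and table names (SalesDb, dbo.SalesOrder) are hypothetical:

    -- Hypothetical database; changing READ_COMMITTED_SNAPSHOT needs the
    -- database to be otherwise idle, or the ALTER will sit and wait.
    ALTER DATABASE SalesDb SET READ_COMMITTED_SNAPSHOT ON;

    -- Optionally allow full SNAPSHOT isolation for sessions that ask for it.
    ALTER DATABASE SalesDb SET ALLOW_SNAPSHOT_ISOLATION ON;

    -- With READ_COMMITTED_SNAPSHOT on, this read returns the last committed
    -- version of each row instead of waiting on writers' locks.
    SELECT OrderId, Total
    FROM   dbo.SalesOrder
    WHERE  CustomerId = 42;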
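
And a second sketch for the indexing and “good code” points, again with hypothetical table and column names: the same filter written two ways, only one of which can take advantage of an index on the date column.

    -- A non-unique index to support filtering on the order date.
    CREATE INDEX IX_SalesOrder_OrderDate ON dbo.SalesOrder (OrderDate);

    -- Non-SARGable: the column is wrapped in a function, so the optimizer
    -- cannot seek on the OrderDate index and ends up scanning instead.
    SELECT OrderId
    FROM   dbo.SalesOrder
    WHERE  DATEDIFF(day, OrderDate, GETDATE()) > 1;

    -- SARGable rewrite of the same predicate (order dates at least two days
    -- in the past): the bare column is compared to a precomputed boundary,
    -- so the index on OrderDate can be used for a seek.
    SELECT OrderId
    FROM   dbo.SalesOrder
    WHERE  OrderDate < DATEADD(day, -1, CAST(GETDATE() AS date));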

I am not numb to the fact that performance is an issue.  It is really a major pain point, because it changes so much with all of the aforementioned factors, but also because it is negotiable.  Data integrity is not a negotiable factor. You can’t say “bah!” to users when they complain that “their numbers aren’t adding up”, but “the server seems kind of slow” can easily be written off as whining (mostly because it usually is).  The problem is, whether your company admits it to itself or not, time is money, and if a user has to do a task 200 times a day and performance issues make it take 10 seconds instead of 4 seconds, it doesn’t seem like much… but those 6 extra seconds, 200 times over, add up to 1200 seconds a day, which is not a small amount of time in the least.  And if there are 100 people doing these same repetitive tasks… that’s a lot of time (and not uncommon in, say, a call center).

So what’s to be done?  Planning, testing, and having proper environments for testing performance are great things, but to do it right you will probably need to double your spending on hardware so you have both a production and a “pre-production” environment in which to test your designs under load.  Most smaller companies (particularly if hardware isn’t the backbone of what they do) don’t have that.  Production is where the load gets tested, and they hope for the best.

Well, stop hoping for the best, and design for great performance. Whether you can test performance or not, you can still be prepared by building your database for performance and, just as important, for tuning in the easiest manner possible.

Published Tuesday, October 13, 2009 8:57 PM by drsql


Comments

AllenMWhite said:

One of my biggest peeves is the mantra "denormalize for performance".  Most frequently, writing better queries (selecting only needed columns, etc.), creating proper indexes, eliminating iterative processes, etc. will yield far better performance improvements than any denormalization will provide.

I just evaluated some poorly performing stored procedures that use OpenXML to shred incoming XML data.  I rewrote the query using XQuery, and the same process went from 247 milliseconds to 15 milliseconds.  Denormalization will not help that process; understanding the tools at your disposal will.
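
For anyone who has not made that switch, here is a minimal sketch of the XQuery-style shredding described above, using the xml type's nodes() and value() methods instead of OpenXML (the element and column names are made up for illustration, not taken from the procedures in question):

    DECLARE @data xml = N'
    <orders>
      <order id="1" total="10.50" />
      <order id="2" total="99.00" />
    </orders>';

    -- nodes() turns each <order> element into a row; value() pulls typed
    -- values out of it, with no sp_xml_preparedocument call required.
    SELECT o.n.value('@id',    'int')            AS OrderId,
           o.n.value('@total', 'decimal(10, 2)') AS Total
    FROM   @data.nodes('/orders/order') AS o(n);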

October 14, 2009 10:44 AM

BobW said:

I'll trade performance for reliability.  Our ERP software runs on SQL Server.  The code looks clean logically, but occasionally weird stuff happens.

In one instance the ERP vendor issued a patch that stated "installing this patch will make it less likely that transactions are lost."

I started out 30+ years ago as a mainframe programmer and got out of IT for a while.

At what point did it become acceptable to "lose transactions"?

I don't care if you can make an ERP transaction post one ten-thousandth of a second faster.

Give me reliability so I don't end up losing 6 hours a month trying to figure out what was lost and why.

October 19, 2009 9:33 AM

drsql said:

>>"installing this patch will make it less likely that transactions are lost."<<

I have seen stuff like that out of dev types that work on in-house systems, and perhaps it is just a matter of the necessities of life, but on a product...that costs (almost certainly) thousands of dollars, no way. Forgetting about the reality of the statement, what manager type lets that get out the door, anyhow? Perhaps: "Fixed a situation where transactions are lost due to something that customers do unwittingly and we didn't foresee their ignorance", toned down a bit, of course :)

October 19, 2009 10:40 AM
