THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Louis Davidson

The second pillar - Normal

The first pillar was easy, since no reasonable person is going to argue that having a design that is not coherent is desirable. No matter what the type of system, any design that isn’t easy to understand is likely to be a bad design (obvious caveats are that it must be understandable to other people of a given level of intelligence in the given subject/tool/language).  The second one is where things get a bit more interesting, or even “religious.” 

In the initial post, I defined this pillar as: normalized as much as possible without harming usability/performance (based on testing).

Sounds easy enough, right? It is, but the problem with this is that there is a real lack of understanding what normalized really means, and to be honest, why you would want to do it in the first place.  To fully cover the question of what and why of normalization would take a lot more than one post ( I do an hour long presentation where I have to talk like a chipmunk at Starbucks to cover an overview, and in my book there are 60+ pages dedicated to the subject.)

Briefly, the basic gist of normalization is that you don’t duplicate data. Every single piece of data in a relational database has a very specific job and it is the ONLY point of data that does that job.  It is neither the source of another piece of data, nor sourced from some other piece (or pieces) of data.

This single point of data should also be as simple as possible, but not simpler (to totally steal an Einstein quote).  If you ever need to break down a value in a SQL statement to use it (like using substring) then you are probably not normalized enough. There are plenty of reasonable anti-examples where this breaks down, such as searching in a large string object on occasion, but even that can depend on the purpose of the large string.  Having data at this (so called) “atomic” (cannot be reasonable broken down further) level means less need for code to manage duplicate data whether copied or summarized for performance. (It will also be important to you when you want to index some data to make a search go faster.)

The reason that this is so important is that normalized data follows the same pattern of usage that the relational engine was built to handle.  Most of us would never use a hammer to screw in a nail, and it would be even less likely to drive a nail with a screwdriver. Why?  Because when we were starting to learn to use tools we learned that these were not good ideas. As young children it would not have been surprising to see a kid smacking a nail with the hard end of a hammer, but not an adult.) Sometimes you might do something less than perfect, (like using a wrench as a hammer) but as a rule, you know that the “best” practice is to have a bag full of tools that you carry around your house to get jobs done.

So why do people expect SQL Server to do a good job as a data manipulation tool when they won’t learn how it supposed to be used.  And ever wondered why functional developers and dbas and  data architects clash? Because the ratio of data architects and dba’s that understand how SQL Server works is inversely proportional to that of functional developers.  It is not a rare occasion for me to see a messy data solution to a problem and be able to formulate a SQL solution that works in a much easier fashion, and still have the ability to follow good relational data patterns.  To the person who doesn’t “get” relational, very iterative, one row = one object instance solutions are normal, leading to lots of singleton retrieves from the database to do data manipulation. But to the relational programmer, we know that taking the normalized structures and letting the SQL engine take our set based queries and do all of the ugly work for us (almost always) provides a solution that works and works fast.

But it all starts with shaping the data architecture in the correct manner (unlike the data architect, which can be almost any shape at all).

Published Sunday, March 01, 2009 3:14 PM by drsql

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Glenn Berry said:

One of the real problems is that SQL Server is almost too easy to get installed and running. There are far too many "accidental" DBAs and developers "playing DBAs on TV", who have never heard of normalization, much less know how to actually do it.

Then you have the exotic concept of indexes, which seem very mysterious to many developers, and unfortunately some DBAs. I have no qualms about bad-mouthing developers somewhat, since I used to be one.

This situation certainly provides more consulting work for people, but I would like to see more BPA and PBM support in the product to try and catch some of these issues.

March 2, 2009 12:30 AM
 

drsql said:

I am a little sad.  I expected at least one or two "your mother wears combat boots" or even "your father was a hamster, and your mother smelt of elderberries" attacks for this one.  I will take this point in time to say that normalization is now the standard that ALL people will follow.  All in favor, say aye (or eye, or even I, I will count them all.)  

All opposed, walk in front of a moving N scale train.  The eyes have it, they have seen the light.  Aye aye captain?  Perhaps I shouldn't blog comment after taking a mild (prescription) narcotic?

March 11, 2009 2:02 AM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Links to my other sites

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement