THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

James Luetkehoelter

Nearly any SQL topic presented at times in a slightly eclectic manner.

The "meaning" of NULL - a different approach

I know we've all beaten the discussion of NULL to death on this blog, but I had an epiphany of sorts when flying to Germany for the European PASS conference. Thus, I'm going to pick up the topic just one more time...

It occurred to me that the word "Null" has a very specific meaning depending on the language you are speaking - TSQL, "academic"-SQL, English or...German, where null literally means zero. Thus, I would naturally expect some confusion when seeing the word "NULL" come up in a result if it is shown to your average German users.

That brings me to what I think is the fundamental problem with the term NULL. In a very practical sense in SQL Server NULL is:

  • A recording in the NULL Bitmap for a specific row that a specific field (a tuple in the academic lingo) has the property of NULL, i.e., the tuple has a property, it has nothing to do with the value or lack of value in the column

The problem is that language comes in to play. No one likes trying to explain NULL to users in an academic sense, so we come up with definitions. Here are some of the more popular ones (correct or incorrect):

  • "The absence of value" - I think Codd would have approved of this, as I think C.J. Date and others would endorse. The problem I have with this definition is that there is a connotation with this phrase. To me (and others that I've experienced), the implication is "Why is there the absence of value? Did someone ignore the field? Did the clear it out deliberately? Is it not applicable? Did the consumer not supply an answer to the question?" Thus, NULL chaos ensues, where people take it to mean whatever the conceive of (regardless of the intent of the DBA or developer). Here we get reports supplying questionable information.
  • "Empty Set" - I wouldn't agree that this is an accurate definition, but again it implies - "Why is the set empty?" There must be some reason why there is a lack of data within the "tuple". Thus we get the practical affect of converting NULL to an empty string ('') or a zero (0). Was the logic of the NULL transferred properly from the DBAs and Developers to the end user - no. Again, what I call NULL chaos ensues.
  • "Nothing" - Again I would disagree a bit with the definition, but the connotations remain the same - "Why is it nothing? Did no one enter data? Did they remove data? Is the "tuple" not applicable?" All of these things come to mind naturally to me (althought one could question whether I'm an accurate representation of a "normal" person).
  • "Unknown" - As far as SQL Server goes, this is the best definition. The connotations are almost child like - "It is unknown. - Why is it unknown? - I don't know why....Why doing you know why?...ad infinitum". "Unknown" is also a very practical definition when it comes to understanding things like mathematics where NULLs are involved (6+NULL=NULL -->6 + "unknown" must equal "unknown"

In any event, we need to keep in mind as database professionals that the actual description used for NULL will naturally carry with it specific connotations, for good or for bad. Language plays a great role in the understanding and ultimate usage of NULL - let's remain aware of that.

Published Thursday, April 17, 2008 2:28 PM by James Luetkehoelter

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Denis Gobo said:

The same happens in Holland, take this sentence

ze hebben vijf nul verloren (they lost 5 nothing or they lost 5 0)

when you look at some documentation in Dutch (which I never do) you will see nul(0) used to differentiate between NULL and 0

April 17, 2008 4:42 PM
 

James Luetkehoelter said:

Great comment, thanks Denis. I really think this is the fundamental problem with the concept of NULL within both database theory and practical implementation,.

April 17, 2008 5:46 PM
 

AaronBertrand said:

I thought the common European term for zero was "NIL"?  Personally I think we should learn from tennis and use "love" when we are talking about zero.  :-)

April 17, 2008 5:56 PM
 

James Luetkehoelter said:

Ooo, that's good - what's in that field -- " Love "..considrer the connetations of that! It's a complex problem,isn't it?

April 17, 2008 6:06 PM
 

Denis Gobo said:

Aaron,

Europe has many languages and dialects

in Holland you can say niets, nul

In Croatia it would be nula or nista but in Istria (nortwest part bordering Slovenia) it would nula,nis or nic (depending which town)

April 17, 2008 8:00 PM
 

AaronBertrand said:

Thanks Denis, I knew that all languages would not use the exact same word, but I thought I had detected a difference between "NULL" and "NIL" in several different languages.  However, I am not a linguist in any way, shape or form.  :-)

April 17, 2008 8:04 PM
 

Denis Gobo said:

What if they named it VOID or Unknown instead of NULL?

April 18, 2008 7:22 AM
 

AaronBertrand said:

Void would also be commonly misunderstood and fall into the same problems where people want to pass 'VOID' as a string as opposed to VOID as a keyword.  You void a check (sorry, cheque), and things become "null and void."  I don't think there is a word that you can come up with that clearly means absence of value without also having some other connotations... and other connotations inevitably lead to misunderstanding and misuse.  I just think people need to learn and understand the concept, which many are not doing now (and perhaps that is just as much our fault as it is theirs).

April 18, 2008 11:32 AM
 

Hugo Kornelis said:

I have to disagree with your post, James.

While it is true that on the physical layer in a SQL Server database, a NULL is just a bit in a bitmap, this is nothing but implementation details. A different DBMS might implement it in a different way. And in fact, SQL Server 2008 will have a different way to implement NULL for so called "sparse" columns. However, all this is (or rather: should be) irrelevant, since one of the fundamental principles of the relational model is the seperation of the model from the implementation. From a theoretical POV, we should not need to concern ourselves withhow stuff is implemented. (From a practical POV, we should, since it affects things like database size, performance, etc).

As you already indicated, this discussion is not new here. I've done a four-part series on NULL. The most relevant here would be this one: http://sqlblog.com/blogs/hugo_kornelis/archive/2007/07/06/null-ndash-the-database-rsquo-s-black-hole.aspx.

Now to pick on some of your points:

>> To me (and others that I've experienced), the implication is "Why is there the absence of value? <<

Well, if you want to know that, you have found a new attribute. And any time you find a new attribute, you add it to the model. NULL is not unique in this way. An value of 21 in the "Age" column of a table that records members of a club for children will result in similar eyebrow raising. So if there are legitimate situations where this should be allowed, and if there is a need to record the reason, an attrbitue for this will be added to the model. And nobody would ever consider overloading some value to represent both the age of 21 years and the reason why this adult can be a member of a children's club. So why treat NULL different?

>> "Unknown" - As far as SQL Server goes, this is the best definition. <<

Absolutely not. If someone's current employment is "unemployed", then the NULL in "Employer" does not mean that the employer is unknown. If someone's age is "male", then the NULL in "Number of pregnancies" does not mean unknown either. Etc. Sure, these are situations where a data model with seperate tables foor the subtypes could have been used, but that choice has some performance ramifications that might not always be acceptable.

Also, a NULL in the result of an OUTER JOIN does not at all mean "Unknown". And neither does the NULL in the "Valid-to" column of a history table on the "current" row (though some might prefer to use a magic value of 9999-12-31 - I guess we'll have a huge Y10K problem someday...)

April 22, 2008 3:45 AM
 

James Luetkehoelter said:

Ah, Hugo, I was waiting for a response from you :)

I'm aware of the sparse column technique in 2008 (although I think I have a problem with that practical implementation - still mulling over it though).

What I was going for with this little blurb was just how differently people interpret the word NULL, and implications of language when using and describing it. As for your last two points, I agree with the first one. Any time a description prompts a "why", you have a new attribute (or we're dealing with three or four part logic - Null and N/A, etc). The second one, what I read from you is an interpretation (baring know the exact intent of the metadata) of Null. For example, the "Number of pregnancies" field having a Null in it could practically mean "zero" or "the data entry person never asked the question" or "not known at this time". You're absolutely right, it is a weekness, for practical purposes or because of ignorance, of the data design. But I was try to look at it from an end user perspective - they'll impose whatever they think if they see Null (and if you're German and and aware of database theory, that means "zero").

Good points though, thank you. I don't disagree with any of the entirely, I think you and I have have to sit down for a coffee (or a beer) and figure out where we don't connect on Null :)

For anyone else reading these, Hugo's series on Nulls is excellent.

Hugo, what do you think of the idea that Null is really more syntax and less semantics? (Sorry, Philosophy of Language course kicking in...)

April 22, 2008 1:38 PM

Leave a Comment

(required) 
(required) 
Submit

About James Luetkehoelter

I am passionate about what I do - which is DBA, development, IT and IT business consulting. If you don't know me, haven't met me or have never heard me speak, I'm a little on the eccentric side. One attendee recently described me as being "over the top". Yup, that about says it - because I only speak on topics that I'm passionate about.
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement