Strange title, yes. After reading Hugo Kornelis' excellent post on NULLs, I found myself getting very little sleep as I pondered the points he made. If you haven't already read it, do so, now prefereably. The two main points he brought out were (correct me if I'm wrong Hugo):
1) Unknown (among others) is not the best working academic definition of NULL - "Is Missing" fits how we would naturally deal with it conceptual, as does SQL Server. There are not different meanings or "flavors" of NULL - it just means data is missing - no reason given or necessary.
2) NULL actually implies two separate attributes - A true implementation of NULL (Unknown, Is Missing) implies another describing the reason for NULL. He used the idea of having a column "Birthday" that would be set to NULL at times. We really need to know why it was set to NULL - it was ignored, someone was offended at having to reveal their age, it wasn't known at the time of entry, etc. All are legitimate reasons why the column is left NULL, and it may be information that is important to capture. However if we did want to capture "reasons for no birthday", we'd have to create a separate column hold that information. If we do this for every NULLable field, it becomes very impractable (especially when performing queries that include a WHERE statement).
3) If we use "Is Missing" as our practical definition of NULL, it is eliminates the implication of why something "Is Missing" - we're making no claim other than that. And since we avoid the need for an explanation of why something is missing, it makes queries involving these "NULL" predicates much easier to deal with practically and intuitively.
I have a slightly different, odd take on NULL after reading this. Unfortunately this will be propeller-head academia discussion, but I will jump to the practical in part II, I promise :). Hugo and I are basically saying the same thing, I'm just putting a different spin on it that "tastes" better to me than using the definition of "Is Missing".
Hugo also made a comparison between a NULL in a database and black hole (not a real comparison, but more as a literary device within the article, but I think this brings out an important point).
A black hole, in theoretical terms, is not even similar to NULL. A black hole has theoretical attributes. It is the greatest source of gravity conceivable since nothing can escape once it gets too close to the black hole. The concept of an "event horizon", the point at which light can no longer escape the gravitational pull. At the center of the black hole is a "singularity", matter packed to maximum density. There are even real, measurable properties of a black hole. One is usually detected by astrophysicists by an unknown gravitational impact within a region of stars. There are even more properties of black holes, theoritical and practical, that are so complicated they make my brain hurt.
The problem why we have two implied attributes with a Birthday column when the column is NULL has nothing to do with the definition of NULL; the problem is that we insist on trying to associate a definition to NULL in the first place. As soon as we use the phrases "Unknown" or "IsMissing", our minds naturally impose the question: Why is this column "unknown" or "missing"? Once we do that, there is a laundry list of boundary conditions that could apply - it could have been ignored, simply missed, deliberately not entered, N/A. As Hugo discusses, if we need to capture those boundary conditions we have a real problem when it comes to database design.
For me a more accurate, though difficult to comprehend definition of NULL is - "no semantic meaning", or "no attribute". NULL doesn't even have a data type. It isn't even strictly associated with a column at a practical level (it is recorded in a NULL bitmap near the header of the row). The problem is the minute we try to describe NULL, we automatically imply other attributes due to our need to ask "why?"
Personally I like to think of NULL in terms of three-value logic (my background is in Philosophy and Mathematics, can you tell?). Your options are yes/no/unknown. That's it. When it comes to usage, unknowns combined with another value is still an unknown. And this is how the SQL engine works (with some tweaking exceptions).
NULL - no attribute. Nothing else.
Which brings us to the crux of the problem. In everyday usage, the definition of and usage of NULL is so varied, having any level of consistency is extremely difficult. However, here's where I through in my conversially claim:
In practical usage, NULL can be used to mean whatever the business needs it to mean, as long as that meaning is consistent.
I have seen the strangest behaviours when it comes to NULL that it almost the database purist in me cry. But, after thinking about it for over a decade, I'm conceding the point to the practical usage. More on this in part II of this post.
Hugo definitely brings out a great point about the separation of attributes when it comes to NULL and "reasons why it is NULL". My take is that the latter is imposed after the fact when we try to interpret what NULL represents. It isn't another attribute of NULL - it is our attempt at understanding why something is NULL (and recording that information). In academic terms, I see NULL as nothing, no attribute. In practical terms, it's fair game as to what NULL represents.
More to come on the practical (but not necessarily proper) usage of NULL.