In my previous post, I explained what NULL does and does not mean, how and why the rule of NULL propagation forces any expression involving NULL to result in NULL. I also mentioned that comparisons involving NULL do not result in NULL but, rather, in a new “truth value” called “Unknown”. But there’s a lot more to be written about Unknown and it’s effect on logical expressions, and that’s exactly what this blog post (and the next one as well) will be about.
Most software developers will be intimately familiar with what is popularly known as Boolean logic but should actually be called a twovalued Boolean algebra – a Boolean algebra that is defined on the two truth values True and False, and the three operators AND, OR, and NOT. However, with the introduction of Unknown as a third truth value, some rules of Boolean algebra are violated, so the original twovalued Boolean algebra has to be replaced by a new, nonBoolean threevalued algebra for logical operations. I’ll assume that most readers already know the basics of this threevalued logic, but I’ll briefly cover them anyway (in this post) before moving on to the more interesting side effects (mostly in a future post).
Truth tables for AND, OR, and NOT
Since we have already established that a proposition such as WHERE Age > 35 can result in Unknown, the logical next step is to define how this affects the result of propositions such as WHERE Age > 35 AND Country = 'USA', WHERE Age > 35 OR Country = 'USA', and WHERE NOT (Age > 35). In other words, we have to determine how the new truth value of Unknown affects the truth tables for the AND, OR, and NOT operators. (Note that some sources, including Books Online, label these operators as “Boolean” operators. This is not correct, since the TSQL versions of these operators operate on threevalued logic rather than the Boolean twovalued logic.)
I believe that simple, easytofollow examples are the best way to illustrate a complex subject, so that is exactly how I will explain the logic behind the truth tables for AND, OR, and NOT – by using a simple example. I’ll just stick with my previous example that revolves around my undisclosed age, since that saves me the hassle of not disclosing other personal information.
Let’s look at the AND operator first. In twovalued logic, True AND True evaluates to True, whereas True AND False, False AND True, and False AND False all evaluate to False. The additional truth value of Unknown means that we have to fill in five new cells, for True AND Unknown, False AND Unknown, Unknown AND True, Unknown AND False, and Unknown AND Unknown.
WHERE Age > 35 AND Country = 'USA'
Many readers will probably already know that I’m Dutch (and those that didn’t, do now), so my country of residence is definitely not the USA. If someone points a gun to your head and demands that you tell him or her whether or not it is true that Hugo Kornelis is over 35 years of age and living in the USA what will your answer be? You’ll probably answer that this proposition is false – since you now know that I’m not living in the USA, you don’t need to know my age anymore; regardless of my age, the proposition as a whole can only be false.
Congratulations – you have just filled in two of the missing cells in the truth table for AND: both Unknown AND False and False AND Unknown evaluate to False.
WHERE Age > 35 AND Country = 'Nederland'
Assuming you know that “Nederland” is Dutch for “the Netherlands”, you’ll find this proposition harder to assess. Okay, so the second part is true. But the proposition as a whole is only true if I’m also over 35, which you obviously don’t know. So if once again someone puts that proverbial gun against your head, you can only reply that you really don’t know, and hope that this is enough to convince the shooter not to pull the trigger.
For the truth table, we can now fill in that Unknown AND True (and True AND Unknown) evaluate to Unknown.
WHERE Age > 35 AND Age < 45
This leaves us with just one missing cell in the truth table for AND, for Unknown AND Unknown. The proposition above illustrates this one. Since you don’t know my age, you don’t know if I’m over 35, nor if I’m under 45 – so you have obviously no way of telling whether I’m both over 35 and under 45.
The only logical outcome of Unknown AND Unknown is Unknown.
The truth table of OR also starts with the four cells know from twovalued logic: True OR True, True OR False, and False or True evaluate to True, and False OR False evaluates to False. The same five new cells that were added to the table for AND are added here as well, and we’ll use similar examples to populate them.
WHERE Age > 35 OR Country = 'USA'
Since I still haven’t moved to the USA, the result of this proposition is True if I’m over 35, or False if I’m not. Since you don’t know my age, you don’t know the result of this proposition.
Unknown OR False and False OR Unknown both evaluate to Unknown.
WHERE Age > 35 OR Country = 'Nederland'
Here’s an example where you don’t have to fear the wrath of the shooter. No matter what my age is, you can safely answer that, since I’m living in the Netherlands, it is also true that I am over 35 years of age or living in the Netherlands.
Both Unknown OR True and True OR Unknown evaluate to True.
WHERE Age > 35 OR Age < 30
(Those who have ever seen me will have to stretch their imagination a bit. Just pretend that you really can’t tell for sure that I’m definitely not under 30, to make this a proper example of Unknown OR Unknown, ’kay?)
Not much to be said about this example  Unknown OR Unknown is still Unknown.
Now that the truth tables for AND and OR are complete, we only have the table for NOT left. This one grows from two cells (NOT True = False, and NOT False = True) to three, the third cell being for NOT Unknown. So, here’s the example:
WHERE NOT Age > 35
You don’t know my age. That’s why you don’t know whether I’m over 35. So, how can you possibly know whether I am not over 35? Duh!
NOT Unknown is, of course, Unknown.
So, to wrap things up, here are the completed truth tables for the operators AND, OR, and NOT in threevalued logic
The truth table for AND:

True 
Unknown 
False 
True 
True 
Unknown 
False 
Unknown 
Unknown 
Unknown 
False 
False 
False 
False 
false 
The truth table for OR:

True 
Unknown 
False 
True 
True 
True 
True 
Unknown 
True 
Unknown 
Unknown 
False 
True 
Unknown 
false 
The truth table for NOT:
True 
False 
Unknown 
Unknown 
False 
True 
Some oddness
If you ever forget the logic tables, you can use examples similar to these to easily reconstruct them on the spot. However, do take care that you use independent propositions. If you don’t, you can easily get at incorrect conclusions, as illustrated by these propositions:
WHERE Age > 35 AND Age < 30
WHERE Age > 35 OR Age < 45
This is a fine example of the difference between humans and computers. Humans think; computers only do as told. You do not need to know my age in order to tell everyone who asks you that Hugo Kornelis is definitely not both over 35 and under 30, or that he definitely is either over 35 or under 45. A computer lacks this logic insight – it just does as told. The instructions for the computer are clear: first evaluate the individual propositions (all resulting in Unknown, since each of the propositions by itself is indeed Unknown), then apply the AND or OR operator to the two truth values – so the computer will return Unknown for the two propositions above, even though every sane human knows that they should evaluate to False and True respectively.
CHECK, the odd one out
Once a proposition has been evaluated to a single truth value, the computer has to determine what to do with the result. In most all places where the SQL language rules allow a logical expression, the defined behavior is to treat True one way, and both False and Unknown in another way. For instance, a row is included in the result set of a query if the WHERE clause evaluates to True, but excluded if it evaluates to either False or Unknown. Same goes for the decision to include or exclude a group of rows based on the HAVING clause – it has to evaluate to True; both Unknown and False means that the group is out. The query processor prefers to be safe rather than sorry – if it can’t tell for sure that you want the row included, it won’t be.
For logical expressions in IF and WHILE statements and in CASE expressions, the same logic holds. A condition statement introduced with IF will be executed if the condition is True whereas the ELSE part (if any) is executed on both False and Unknown; a WHILE block will be repeated as long as the expression evaluates to True and execution stops on False and Unknown; and in a CASE expression, the WHEN that evaluates to True determines the result.
There is one notable exception to this – the CHECK constraint. If a row is inserted or updated that causes the CHECK constraint to evaluate to True, the modification is accepted; if it evaluates to False, it is rejected – but if it evaluates to Unknown, the modification is accepted! This is an unexpected change from the accustomed, but not one that has been made without reason – since Unknown usually results from a NULL value in the constrained column, rejecting rows if the CHECK constraint if the proposition evaluates to Unknown would merely duplicate the effect of a NOT NULL constraint. With the definition of a CHECK constraint as it is, illogical as it might seem, it’s very easy to define both an optional age column that will only accept ages over 35 or no age at all, and equally easy to define a mandatory age column that forces you to enter an age of over 35:
CREATE TABLE DemoPersons
(PersonID int NOT NULL
,AgeOptional tinyint NULL CHECK (AgeOptional > 35)
,AgeMandatory tinyint NOT NULL CHECK (AgeMandatory > 35)
,CHECK (AgeOptional = AgeMandatory)
);
Note the third check constraint, which ensures that both age columns are populated with the same age – but only if both age columns are populated at all, since this expression also results in Unknown if either of the age columns is NULL.
More oddness
The different treatment of Unknown in a CHECK constraint versus in a WHERE clause (or anywhere else, for that matter) can result in some interesting behavior. Let’s add some rows to illustrate this.
INSERT INTO DemoPersons (PersonID, AgeOptional, AgeMandatory)
SELECT 1, 36, 36
UNION ALL
SELECT 2, NULL, 37
UNION ALL
SELECT 3, 38, 38
UNION ALL
SELECT 4, 39, 39
UNION ALL
SELECT 5, NULL, 40;
Now, since the check constraint excludes row with an age of 35 or less, the query below should return 0, right? Well, no. It returns 2.
SELECT (SELECT COUNT(*)
FROM DemoPersons)
 (SELECT COUNT(*)
FROM DemoPersons
WHERE AgeOptional > 35);
It appears odd, on first sight, that a table with a CHECK constraint that only allows ages of over 35 still allows for rows that do not have an age of over 35 – but it does, in fact, make perfect sense. Remember that rows for which AgeOptional > 35 is Unknown are admitted by the CHECK constraint, but rejected by the WHERE clause!
In the next post on NULL, I’ll show more seemingly odd behavior of NULL, and explain why this behavior is, in fact, not odd at all. I’ll also answer the question if null if null is null null null is null – I bet you can’t wait!