THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

SELECT Hints, Tips, Tricks FROM Hugo Kornelis WHERE RDBMS = 'SQL Server'

What if null if null is null null null is null?

In this fourth and final part in my series about NULL, I’ll discuss some well-known and some less well-known functions and keywords that are specifically created to deal with NULL values. And I will, of course, explain why null if null is null null null is null. In case you have not yet read them, you can click these links to read the first, second, and third part.

 

IS NULL is not = NULL

 

I have already explained why tests for equality with NULL will always return Unknown instead of True or False. This holds true in all circumstances – even when both sides of the equation are NULL. That’s why the query below will not help you find the people for which no birthday is on file:

 

SELECT FirstName, LastName

FROM   dbo.Persons

WHERE  Birthday = NULL;

 

Even though no birthday is on file for Hugo Kornelis, my name will not be returned – because when evaluating “my” row, SQL Server will see a comparison between a missing value (my birthday) and a missing value (the NULL) – and when asked whether two unspecified values are equal, the only guaranteed correct answer is “I ain’t the faintest, dude” (for the British readers, that would be “I haven’t got the foggiest idea, old chap”).

 

Since there are numerous occasions where you want to find rows where values are missing, a special operator was introduced for this: IS NULL. Rewriting the query above as follows will return my name, because my Birthday is indeed NULL in this database.

 

SELECT FirstName, LastName

FROM   dbo.Persons

WHERE  Birthday IS NULL;

 

Note that the operator IS NULL will return True if the operand (Birthday in this case) is NULL, or False if it is any other value – an IS NULL test can therefore never result in Unknown. So to find values that are not missing, the expression can simply be negated to

WHERE  NOT(Birthday IS NULL);

but there is also a shorthand form available that is much more common – in fact so much more that I don’t think I’ve ever seen the above version used!

WHERE  Birthday IS NOT NULL;

 

Do not forget that the NULL keyword is an integral part of the IS [NOT] NULL operator. You can’t use IS or IS NOT as a replacement for = and <> and expect the same special treatment of NULL values that IS [NOT] NULL provides. That would require the use of the IS [NOT] DISTINCT FROM operator that is specified in SQL:1999 but not yet implemented in SQL Server (vote here if you’d like to see this changed!)

 

SET ANSI_NULLS { ON | OFF }

 

Some of you may have seen existing code where a predicate such as Column = NULL is used, and does actually return rows. This is probably legacy code that has not been maintained in a long time (and if it isn’t, it has to be written by a legacy developer who has not maintained his knowledge in a long time).

 

Very early versions of SQL Server were released before agreement was reached in the ANSI committee on all features. As a result, some features were implemented in a different way than what the ANSI committee turned out to describe, facing Microsoft developers with the challenge to upgrade to ANSI compliant behaviour without breaking existing code.

 

Behaviour of NULL comparisons in predicates is one such example where the original version of SQL Server “got it wrong”. So when SQL Server was changed to return Unknown on a comparison with NULL, the SET ANSI_NULL OFF option was introduced to force the “old” behaviour that would return False on NULL vs. non-NULL comparison, and True on NULL vs. NULL comparison.

 

The SET ANSI_NULL option has been marked as deprecated in SQL Server 2005, so if you find code that still relies on this setting, you’d better change it ASAP. This would of course be very easy if you only had to look for “= NULL” and “<> NULL”, and change them to “IS NULL” and “IS NOT NULL” – but unfortunately, expressions of the form “= expression” where expression can be NULL are also affected by the ANSI_NULL setting; these are much harder to identify or fix!

 

ISNULL is not COALESCE

 

Despite the name similarity, and despite the fact that in Access, ISNULL(xxx) is equivalent to xxx IS NULL, the T-SQL ISNULL function is completely different from the IS NULL predicate. ISNULL is used to return the first non-NULL from its two inputs. However, ISNULL is also a leftover from the days before the ANSI standard was finalised – the ANSI standard function that should be used to replace ISNULL is called COALESCE. Despite the similarities, there are also a couple of differences that you should be aware of before replacing all your occurrences of ISNULL to COALESCE.

 

One reason to prefer COALESCE of ISNULL, in addition to its adherence to standards, is that COALESCE can take any number of arguments whereas ISNULL only takes two. Both will return the first non-NULL argument, or NULL if all arguments are NULL. The unlimited number of arguments makes COALESCE a far better option when tasked to find the first non-NULL value from more than two inputs – for example, to find the first non-NULL of Arg1, Arg2, Arg3, and Arg4, I’d rather use

COALESCE(Arg1,Arg2,Arg3,Arg4)

than

ISNULL(Arg1,ISNULL(Arg2,ISNULL(Arg3,Arg4)))

 

Another major problem with ISNULL is how it deals with implicit conversions. COALESCE will, like any other T-SQL operation, use the rules of data type precedence to find the data type of its result – but ISNULL will always return a value of the data type of its first argument, making this the only T-SQL keyword that does not respect the precedence rules.

 

Unfortunately, there is one situation where you can not simply rip out ISNULL and replace it with COALESCE – and that is in the case of computed columns. If you attempt to create the tables below, you will see that Test_ISNULL is created without problems, whereas the attempt to add the index to Test_COALESCE fails.

 

CREATE TABLE Test_ISNULL

     (Col1 int NOT NULL,

      Col2 int NULL,

      Col3 AS ISNULL(Col2, Col1) PRIMARY KEY);

go

CREATE TABLE Test_COALESCE

     (Col1 int NOT NULL,

      Col2 int NULL,

      Col3 AS COALESCE(Col2, Col1) PRIMARY KEY);

go

 

I consider this to be a bug. Clearly, a computed column using COALESCE should inherit its nullability from the last argument, just as is the case with ISNULL. Please vote on Connect if you agree with me that this bug should be fixed, as a first step towards deprecating and removing the superfluous and confusing ISNULL function.

 

NULLIF, the forgotten one

 

The last NULL related keyword that I will cover here is also the least well-known (although some strange coincidence cause Jeff Smith to blog about it earlier this week). That may be due to the fact that it is only useful in a limited number of cases, and also not as easy to understand as the other ones.

 

The NULLIF function takes two arguments. It returns NULL if the first argument is equal to the second one – in all other cases, the first argument is returned unchanged. According to the SQL standards, NULLIF is actually a shorthand form for a CASE expression:

NULLIF(Arg1, Arg2)

is defined as equivalent to

CASE WHEN Arg1 = Arg2 THEN NULL ELSE Arg1 END

 

The most common use of NULLIF is to prevent runtime errors such as division by zero. For instance, the query below will fail if there are any rows with (Col1 + Col2) equal to zero:

SELECT IdCol, Col1 / (Col1 + Col2) AS Col1Ratio

FROM   dbo.SampleTable;

 

One way to attempt to prevent this error is to exclude these rows:

SELECT IdCol, Col1 / (Col1 + Col2) AS Col1Ratio

FROM   dbo.SampleTable

WHERE  Col1 + Col2 <> 0;

However, this might fail because SQL Server might use an execution plan that calculates Col1Ratio before applying the WHERE clause – that is not a bug, the SQL standards explicitly allow the vendors such freedoms in the implementation of their products. And if other columns are displayed as well, completely omitting the rows is not a good way to tackle this problem.

 

This is where NULLIF comes to the rescue. In the query below, the divisor will be changed to NULL if it was 0, setting the result of the entire calculation to NULL instead of causing a runtime error.

SELECT IdCol, Col1 / NULLIF(Col1 + Col2, 0) AS Col1Ratio

FROM   dbo.SampleTable;

If so desired, a COALESCE function can be added to change this NULL result back to a chosen numeric value – or it can be left as NULL to make it instantly visible that for this row, a value of Col1Ratio could not be computed.

 

Another problem area where NULLIF can be used is to check if values are distinct. I have already explained the difference between unequal and distinct in the third part of this series, and I have also presented this method for testing for distinctness:

WHERE OldValue <> NewValue

OR   (OldValue IS NULL AND NewValue IS NOT NULL)

OR   (OldValue IS NOT NULL AND NewValue IS NULL);

 

This is of course rather clumsy; many developers prefer to replace this with a shorter expression. That is very easy to do if there is some magic value that OldValue and NewValue can never be equal to – for instance, if both are always above 0, you could replace the three lines above with this single line:

WHERE COALESCE(OldValue, -1) <> COALESCE(NewValue, -1)

 

But what if any value can occur in the domain? In that case, you can use NULLIF to shorten the expression a bit:

WHERE NULLIF(OldValue, NewValue) IS NOT NULL

OR    NULLIF(NewValue, OldValue) IS NOT NULL

Remember that NULLIF returns NULL if both arguments are equal, but also if the first argument is NULL. So for the expression above, both NULLIF expressions will be NULL if either NewValue is equal to OldValue, or both are NULL. If one of the two is NULL and the other is not, only one of the NULLIF expressions will be NULL. If neither is NULL and they are not equal, both NULLIF expression will be NOT NULL.

 

Do remember that neither of the three distinction tests presented above will allow the use of an index seek strategy. Until Microsoft implement IS [NOT] DISTINCT FROM, we’ll have to live with not being able to use an index seek for this.

 

What if null if null is null null null is null?

 

Now, I can finally answer this question – I only have to remove some spaces and add some commas and parentheses to change it to valid SQL Server syntax:

IF NULLIF(NULL, ISNULL(NULL, NULL)) IS NULL

 

Note that this is not a valid statement, as the IF statement misses its statement block. Don’t add it yet – first try to predict the results without running the code.

 

The result of ISNULL(NULL, NULL) should of course be NULL, since both arguments are NULL. That simplifies the IF statement to:

IF NULLIF(NULL, NULL) IS NULL

 

The result of NULLIF(NULL, NULL) should be NULL. Not because the two operands are equal (they are not – remember, NULL = NULL evaluates to Unknown!), but because in case of unequal operands, the first argument is returned. So now we have:

IF NULL IS NULL

which will of course evaluate to True.

 

Now complete the IF statement in the original query and run it to check our prediction:

IF NULLIF(NULL, ISNULL(NULL, NULL)) IS NULL

  PRINT 'Prediction was correct';

ELSE

  PRINT 'I goofed...';

 

Msg 8133, Level 16, State 1, Line 1

None of the result expressions in a CASE specification can be NULL.

 

So I didn’t predict the result correctly, nor did I goof … instead, I discovered an interesting caveat with NULLIF – or rather, with the CASE expression that this NULLIF expression expands to:

 

CASE WHEN NULL = NULL THEN NULL ELSE NULL END

 

The problem here, is that both the THEN (or rather, each THEN, as a CASE expression allows as many WHEN … THEN clauses as you wish) and the ELSE clause return the constant expression NULL. That this is the problem is easily verified:

 

SELECT CASE WHEN 1 = 2 THEN NULL ELSE NULL END

Msg 8133, Level 16, State 1, Line 1

None of the result expressions in a CASE specification can be NULL.

 

This error message is misleading. Result expressions of a CASE specification are allowed to be NULL; I’ve used that many times. The real reason that this particular query errors is related to how SQL Server determines the data type for a constant in the query. A numeric constant, like 42 or 3.14159265, is considered to be integer or numeric(9,8); a constant enclosed in quotes like ‘this’ is considered to be varchar(4). But since any data type allows the NULL “value”, the constant NULL can be of any data type and SQL Server has no way of guessing what you mean. This is usually solved by checking what data types are used around it. So in the expression

CASE WHEN 3 = NULL THEN NULL ELSE '3' END

the first NULL is considered to be of data type integer (as it is compared to the constant 3, which is integer as well), and the second NULL is considered to be varchar(1) (as both this NULL and the constant ‘3’ are possible results of the CASE expression).

 

Back to the error query – if all possible result expressions of the CASE expression are the constant NULL, SQL Server has no way of working out the data type of the result expression, and that is indeed an error – so SQL Server did the right thing in returning an error, it just chose a bad message! Interestingly, even though COALESCE is also defined as shorthand for a CASE expression, there is a specific error message for COALESCE(NULL,NULL) in SQL Server 2005 (maybe as a result of this connect entry?) – but instead of fixing the real problem, Microsoft chose to fix only the specific COALESCE case. I have of course filed a bug report on Connect for this misleading error message.

 

Anyway, now that I know the cause of the error message, I can also fix it, and run the code below to finally check if null if null is null null null is null:

 

-- Please, NEVER use this as a variable name in real code....

DECLARE @NULL int;

-- This superfluous assignment makes the code self-documenting.

SET @NULL = NULL;

-- Now check if null if null is null null null is null:

IF NULLIF(@NULL, ISNULL(@NULL, @NULL)) IS NULL

  PRINT 'Prediction was correct';

ELSE

  PRINT 'I goofed...';

 

Prediction was correct

Published Sunday, September 30, 2007 11:19 PM by Hugo Kornelis

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Denis Gobo said:

October 1, 2007 12:30 PM
 

Alejandro Mesa said:

Hi Hugo,

I really liked the whole serie.

The final question reminded me about a NULL Puzzle, posted by Steve Kass on Jan 2003. I still use it to show others the implication of using "SET ANSI_NULLS OFF". It good that Microsoft is deprecating it.

http://groups.google.com/group/microsoft.public.sqlserver.programming/browse_thread/thread/4693f68a7140bb80/daca91a7912bdc76?q=steve+kass+and+null+puzzle&rnum=2&hl=en

Cheers,

AMB

October 1, 2007 3:25 PM
 

Hugo Kornelis said:

It's great to hear that you enjoyed the serie, Alejandro.

I really had to convince myself to write the final two parts, since my mind was already busy preparing psots on much more interesting subjects (no I won't tell - just wait and see <g>), so it's good to hear that I did not do that in vain!

And thanks for the link to Steve's puzzle. I'd seen it before, but had forgotten about it. There's some *REALLY* weird stuff going on with ANSI_NULLS OFF indeed!!

October 1, 2007 4:51 PM
 

James Luetkehoelter said:

I really like this post - I hope everyone reading it appreciates how complicated NULL can actually be when implemented correctly. Imagine what happens when someone literally types in the word 'Null' when they mean NULL....sigh...

October 2, 2007 11:20 AM
 

ScottPletcher said:

Another great article.  I'm glad you were willing to finish writing the series.  NULL is tricky enough that more info is always needed.

October 3, 2007 10:27 AM
 

Denis Gobo said:

This is part two of a three part series. Part one was about the phone interview , this part is about

December 10, 2007 3:02 PM
 

Rajni Padhiyar said:

Thanks Friend,

Nice article

Regards,

Rajni Padhiyar

Software Engineer

Techsture Technologies

August 13, 2009 2:32 AM
 

Yogesh Chandra Upreti said:

Thanks for this information described in such a nice way.

Thanks a lot.

Regards,

Yogesh Chandra Upreti

Sr. Associate Technology

Magnon Solutions, Noida

May 23, 2012 1:17 AM
 

Goutam said:

Thanks for the article, its really great.

July 2, 2012 6:53 AM
 

SQL Server Helper said:

August 11, 2012 1:06 AM
 

aa said:

aaa

February 9, 2013 5:34 AM
 

o.O said:

o.O  

O.o

December 10, 2013 2:48 AM

Leave a Comment

(required) 
(required) 
Submit

About Hugo Kornelis

Hugo is co-founder and R&D lead of perFact BV, a Dutch company that strives to improve analysis methods and to develop computer-aided tools that will generate completely functional applications from the analysis deliverable. The chosen platform for this development is SQL Server. In his spare time, Hugo likes to visit the SQL Server newsgroups, in order to share and enhance his knowledge of SQL Server.
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement