THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

John Paul Cook

When the obvious answer is obviously wrong

This post is about how simple math in T-SQL can produce undesirable results, but first we begin with a math quiz. Answer the following as quickly as possible: You just read pages 100-300 of a book. How many pages did you read? QUICKLY NOW! For those of you who answered 200 pages, I have a new question: Which page did you not read? There were 201 pages to read. If you read 200 pages, you skipped a page!

What your answer be if I asked you how many pages did you read if you read pages 1-3? Three pages! I think everyone answers that one correctly.

Now that you know you should be cautious, let’s consider an actual T-SQL problem I looked at today.

create table #profit (

    employeeId int

  , period int

  , amount float

);

 

insert into #profit (employeeId, period, amount) values (101,1,1000);

insert into #profit (employeeId, period, amount) values (101,2,2000);


The percentage profit increase from period 1 to period 2 (e.g., from January to February) is:

                      ( (2000-1000) / 1000 ) * 100 = 100%

That’s pretty simple. Here it is coded:

select n.employeeId

     , nMinus1.period as period_from

     , n.period       as period_to

     , ( (n.amount - nMinus1.amount) / nMinus1.amount ) * 100 as pct_change

from #profit n inner join #profit nMinus1

on n.employeeId = nMinus1.employeeId

where n.period - 1 = nMinus1.period;

employeeId  period_from period_to   pct_change
----------- ----------- ----------- ----------
101         1           2           100

Let’s add some more data and run the query again.

insert into #profit (employeeId, period, amount) values (201,1,-1000);

insert into #profit (employeeId, period, amount) values (201,2, 1000);


employeeId  period_from period_to   pct_change

----------- ----------- ----------- ----------

101         1           2           100

201         1           2           -200

Perhaps the results can be justified technically, but from a business perspective, the results are incorrect. What happens if a high level manager reads the summary report? The evidence shows a negative change in the profitability for employeeId 201. But employee 201 actually doubled the increase in profits as measured in dollars when compared to employee 101. Regardless of the calculated percentage change, employee 201 moved profits in the positive direction. Management wants to see 200%, not –200% for employee 201.

In other words, we want both values to be positive. The absolute value function can do that, but should it be used?

select n.employeeId

     , nMinus1.period as period_from

     , n.period       as period_to

     , ABS( ( (n.amount - nMinus1.amount) / nMinus1.amount ) * 100 )

from #profit n inner join #profit nMinus1

on n.employeeId = nMinus1.employeeId

where n.period - 1 = nMinus1.period;

employeeId  period_from period_to   pct_change
----------- ----------- ----------- ----------
101         1           2           100
201         1           2           200

Let’s enter some additional test data and execute the query again:

insert into #profit (employeeId, period, amount) values (301,1,2000);

insert into #profit (employeeId, period, amount) values (301,2,1000);


employeeId  period_from period_to   pct_change
----------- ----------- ----------- ----------
101         1           2           100
201         1           2           200
301         1           2           50

A loss in profits occurred on employee 301’s watch. We want the percentage change to be negative in that case. There are several different solutions that I can think of and many more I can’t think of. The SIGN function is an obscure and rarely used function that can save our query.

select n.employeeId

     , nMinus1.period as period_from

     , n.period       as period_to

     , ABS( ( (n.amount - nMinus1.amount) / nMinus1.amount ) * 100 )
      
* SIGN(n.amount - nMinus1.amount) as pct_change

from #profit n inner join #profit nMinus1

on n.employeeId = nMinus1.employeeId

where n.period - 1 = nMinus1.period;

employeeId  period_from period_to   pct_change
----------- ----------- ----------- ----------
101         1           2           100
201         1           2           200
301         1           2           -50

Sqlblog reader Marc Shapiro simplified the query as shown below. It's definitely better. Now I'll have to think of a better example for using the SIGN function!

select n.employeeId

     , nMinus1.period as period_from

     , n.period       as period_to

     , ( (n.amount - nMinus1.amount) / ABS(nMinus1.amount) ) * 100 as pct_change

from #profit n inner join #profit nMinus1

on n.employeeId = nMinus1.employeeId

where n.period - 1 = nMinus1.period;

Another reader, Leonid, asked if nMinus1.amount could equal zero. It can if the following inserts are executed.

insert into #profit (employeeId, period, amount) values (401,1,   0);

insert into #profit (employeeId, period, amount) values (401,2,1000);

We all know that division by zero is mathematically undefined, but it demands a business definition in the form of complete requirements. What is the business rule for going from zero profit in one period to the next period? Until that question is answered, the solution cannot be coded. If the rule is to treat the percentage increase as 100%, then the following code would satisfy the requirements:

select n.employeeId

     , nMinus1.period as period_from

     , n.period       as period_to

     , case nMinus1.amount

           when 0 then 100

           else ( (n.amount - nMinus1.amount) / ABS(nMinus1.amount) ) * 100

       end as pct_change

from #profit n inner join #profit nMinus1

on n.employeeId = nMinus1.employeeId

where n.period - 1 = nMinus1.period;

Satisfying business requirements may necessitate implementing something other than the obvious solution. It is absolutely essential to understand the business requirements before implementing a solution. Now I will directly state what I was implying by example: Business requirements need to be turned into test cases. The test cases should drive the development process.

 

Published Wednesday, March 09, 2011 11:52 PM by John Paul Cook

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

MarcShapiro said:

I think the last computation is simpler as

( (n.amount - nMinus1.amount) / ABS(nMinus1.amount) ) * 100

March 10, 2011 1:52 PM
 

Leonid said:

Is there a chance that nMinus1.amount=0?

March 15, 2011 5:21 PM

Leave a Comment

(required) 
(required) 
Submit

About John Paul Cook

John Paul Cook is a Technology Solution Professional for Microsoft's data platform and works out of Microsoft's Houston office. Prior to joining Microsoft, he was a Microsoft SQL Server MVP. He is experienced in Microsoft SQL Server and Oracle database application design, development, and implementation. He has spoken at many conferences including Microsoft TechEd and the SQL PASS Summit. He has worked in oil and gas, financial, manufacturing, and healthcare industries. John is also a Registered Nurse who graduated from Vanderbilt University with a Master of Science in Nursing Informatics and is an active member of the Sigma Theta Tau nursing honor society. He volunteers as a nurse at safety net clinics. Contributing author to SQL Server MVP Deep Dives and SQL Server MVP Deep Dives Volume 2.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement