THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
 in Alexander Kuznetsov (Entire Site) Search

Set based calculation of products of several numbers.

You can use SUM to calculate the sum of several numbers, but you cannot directly use set-based logic to calculate a product. Yet there is a very simple trick – you can use EXP(SUM(LOG(…))) and get a product of several numbers without a loop. The trick was originally posted on newsgroups by Tom Cooper.

For example, suppose that you have to allow for multiple discounts at a time. Consider the following UDF that mocks some complex logic that determines price discounts based on customer and date:

CREATE FUNCTION dbo.GetDiscounts(@CustomerID INT, @AsOfDate DATETIME)

RETURNS TABLE

AS

RETURN(

SELECT 0.1 AS Discount, 'Easter Sale' AS Reason

UNION ALL

SELECT 0.1 AS Discount, 'Preferred Customer' AS Reason

)

GO

Currently any customer is eligible for two discounts, 0.1 each. The total discount after applying them both is 0.19. Of course you can write a cursor or a loop to calculate it. However, the following script calculates total discount using set-based logic, without writing a loop:

SELECT 1-EXP(SUM(LOG(1-Discount))) AS TotalDiscount FROM dbo.GetDiscounts(1, '20090320');

TotalDiscount

----------------------

0.19

(1 row(s) affected)

Published Tuesday, March 17, 2009 6:01 PM by Alexander Kuznetsov

Eric J. Peters said:

Beautiful!  I have a very high appreciation for the old-school/slide-rule mathematical tricks that should be well understood by programmers.

However, it takes a pretty niche problem to need a product of more than a very small number of elements.  But in the virtually all cases, isn't the code-maintainability of a simple loop going to outweigh the costs of actually running the loop?

What's the real gain here?  Considering the cost of the EXP/LOG functions, is there a real gain here?

March 17, 2009 6:57 PM

Alexander Kuznetsov said:

Eric,

I am considering to have a persisted computed column for LOG(1-Discount) to speed it up. I need to frequently run up to 50 multiplications, will do benchmarks soon. I really like the simplicity of the code - no loop can be shorter or simpler than one line expression:

1-EXP(SUM(LOG(1-Discount)))

March 17, 2009 9:54 PM

Hugo Kornelis said:

Hi Alex,

The trick is nice, but you need to be sure that the input will not contain 0 (zero) or negative numbers, otherwise you'll get an error.

Itzik Ben-Gan presents two alternative versions in "Inside Microsoft SQL Server 2005: T-SQL Querying". One is readable, the other one is short.

The readable one (though the formatting will probably be lost):

CASE

WHEN MAX(CASE WHEN val = 0 THEN 1 END) = 1 THEN 0

ELSE

CASE WHEN COUNT(CASE WHEN val < 0 THEN 1 END) % 2 = 0

THEN 1 ELSE -1

END * EXP(SUM(LOG(NULLIF(ABS(val), 0))))

END

The short one:

EXP(SUM(LOG(ABS(NULLIF(val,0)))))*(1-SUM(1-SIGN(val))%4)*(1-SUM(1-SQUARE(SIGN(val))))

Note that "val" is the value to be multiplied, and that any errors are probably my fault (manually copying lots of nested parentheses from a book is hardly ever a good idea).

Best, Hugo

March 18, 2009 5:00 AM

Alexander Kuznetsov said:

Hi Hugo,

Thank you for sharing a general approach.

For a more specific case of discounts, all numbers are positive and not null, which is enforced by constraints. This allows for much simpler code.

March 18, 2009 8:54 AM

Joe Celko said:

Just for documentation, this method first appeared in SQL FOR SMARTIES Section 21.6 PRD() Function.

March 22, 2009 9:26 AM

Archivist said:

Joe, not sure about it. The first time this trick appeared in your books was SQL For Smarties Second Edition which was published around 1999. I can still find references to a newgroup post from Isaac Blank in 1997 and I'm pretty sure someone came up with this simple equivalence before.

March 22, 2009 2:16 PM

March 26, 2009 7:31 AM

:::IMPORTANT NOTE:::

There is a bug in the "short one" posted by Hugo. If your data set is all 0s, it returns NULL rather than 0. If there are any non-0 values and more than one 0, you'll get strange results. These issues do not happen with the longer version, so use that.

---

SELECT

EXP(SUM(LOG(ABS(NULLIF(val,0)))))*(1-SUM(1-SIGN(val))%4)*(1-SUM(1-SQUARE(SIGN(val))))

FROM

(

values (0),(0),(0)

) x (val)

SELECT

EXP(SUM(LOG(ABS(NULLIF(val,0)))))*(1-SUM(1-SIGN(val))%4)*(1-SUM(1-SQUARE(SIGN(val))))

FROM

(

values (1),(0),(0),(0)

) x (val)

---

September 7, 2009 4:33 PM

And in case anyone cares, Itzik has a fix for the second issue on his corrections page, and I've come up with the following version, a modification of his, that also fixes the first issue:

SELECT

COALESCE(EXP(SUM(LOG(ABS(NULLIF(val,0)))))*(1-SUM(1-SIGN(val))%4)*MIN(ABS(SIGN(val))), ABS(SIGN(SUM(val))))

FROM

(

values (0), (0)

) x (val)

September 7, 2009 5:21 PM

Alexander Kuznetsov said:

This is a nice correction, thanks!

September 7, 2009 8:56 PM

(required)
(required)
Submit