Originally posted
here.
It's been longer than I hoped since my last installment on
bitmask / big number handling.
Life caught up with me and I've had many thankless tasks to catch up
on. But that's over now and I'm back to the general slacking that
typifies my days, so welcome to Part 3, handling logical operators.
I'll be discussing four operators in this post. AND, OR, XOR,
and NOT. The first three are extremely easy given the framework already
built out in the previous two posts. The last one has some problems --
so I'll discuss those first.
I haven't been able to find any mathematical or computer
science texts that discuss how to deal with variable-length bitmasks,
which is what I'm attempting to define here. Most texts keep the
discussion to, at most, two or four bytes, and even then those two or
four bytes are stable. But the problem with using a variable number is
that it creates some inconsistencies with regards to signing. Take this
example:
SELECT 1 AS Oops
WHERE
-1 <>
CONVERT(INT, CONVERT(VARBINARY, CONVERT(SMALLINT, -1)))
Oops
----
1
So what's happening here? The -1 on the right side of the comparison
is being cast into a 2-byte SMALLINT, then converted to VARBINARY
(which produces 0xFFFF). Then it's being converted back to INT. But
guess what..?
SELECT CONVERT(INT, 0xFFFF) AS Arrgh
Arrgh
-----
65535
So what is SQL Server telling us? That a small -1 is not the same as a bigger -1. -1 <> -1!!!
You're probably asking yourself, "what is this guy talking
about, and what does any of this have to do with the topic at hand,
implementing a binary NOT operation?" And if that's what you were
asking yourself, then good job, because you've asked the correct
question. So what do they have to do with each other?
The truth-table for NOT is quite simple:
But let's delve a little deeper. The SMALLINT representation of
the decimal number 1 in hex is 0x0001. In binary, that's
0000000000000001. If you run that through a logical NOT, the result is
1111111111111110, or 0xFFFE. That's -2. But remember, that's only -2 if you're a SMALLINT.
If you're either an INT or a BIGINT, it's 65534. And that's just not
consistent. I want to know that any equivalent number into my function
yeilds an equivalent number on the way out. So NOT(0x000001) should
yeild the same result as NOT(0x000000000001).
... and that result, in the function I provide, will be: 0xFE.
One byte in, one byte out. Similar to some prison gang mottos, but
that's a topic for a later post.
You'll notice that this is similar to the way I handled the bitmask re-constitution in the last post. So I feel pretty good about this. Sparsity is a good thing.
But this has a very important side-effect. These
numbers are now officially and permanently un-signed. We can't deal
with sign because we can't know which byte corresponds to the highest
byte -- that's variable. And since we can't determine the highest byte,
we also can't determine the highest bit in that byte, and so can't know
whether or not our number is negative.
But I can live with that. And I hope you can, too. And if you can't, write a solution and send it to me and I'll post it.
So on that note, and without further ado, here's how you figure out which bit positions should be output by a NOT operation:
DECLARE @Bitmask VARBINARY(4096)
SET @Bitmask = 0x01
SELECT x.Number
FROM BitmaskNumbers x
LEFT JOIN dbo.SplitBitmask(@Bitmask) y ON y.Number = x.Number
WHERE y.Number IS NULL
AND x.Byte <=
(SELECT MAX(Byte)
FROM BitmaskNumbers z
WHERE z.Number =
(SELECT MAX(Number)
FROM dbo.SplitBitmask(@Bitmask)))
Number
-------
2
3
4
5
6
7
8
Pretty simple, really: Split the bitmask and take any numbers within
the same byte range that aren't in the bitmask. To reconstitute it,
simply modify the reconstitution pattern a bit, stuff it all into a
function, and you get:
CREATE FUNCTION bitwiseNot
(
@Bitmask VARBINARY(4096)
)
RETURNS VARBINARY(4096)
AS
BEGIN
DECLARE @BitsInBitmask TABLE(Number SMALLINT)
INSERT @BitsInBitmask
SELECT Number
FROM dbo.splitBitmask(@Bitmask)
SET @Bitmask = 0x
SELECT @Bitmask = @Bitmask +
CONVERT(VARBINARY(1),
SUM(CASE
WHEN x.Number IS NOT NULL THEN 0
ELSE BitmaskNumbers.BitValue
END)
)
FROM @BitsInBitmask x
RIGHT JOIN BitmaskNumbers ON BitmaskNumbers.Number = x.Number
WHERE BitmaskNumbers.Byte <=
(SELECT
CASE MAX(Number) % 8
WHEN 0 THEN (MAX(Number) - 1) / 8
ELSE MAX(Number) / 8
END + 1
FROM @BitsInBitmask)
GROUP BY BitmaskNumbers.Byte
ORDER BY BitmaskNumbers.Byte DESC
RETURN(@Bitmask)
END
GO
SELECT dbo.BitwiseNot(0x01) AS Not01
Not01
-----
0xFE
And of course, given the properties I described above:
SELECT dbo.BitwiseNot(0x0000000001) AS Not0000000001
Not0000000001
-------------
0xFE
And now on to the other three logical operations, which are much simpler...
The easiest is OR, which has the following truth table:
And what is that similar to, in relational parlance..? A UNION, perhaps?
SELECT Number
FROM
(
SELECT Number
FROM dbo.splitBitmask(0x01)
UNION
SELECT Number
FROM dbo.splitBitmask(0x03)
) x
... and how about exclusive OR (XOR)?
Similar to the UNION, but we only want intersections with exactly one bitmask position... Luckily, SQL is equipped for that:
SELECT Number
FROM
(
SELECT Number FROM dbo.SplitBitmask(0x01)
UNION ALL
SELECT Number FROM dbo.SplitBitmask(0x02)
) x
GROUP BY Number
HAVING COUNT(*) = 1
Finally, the AND operation:
Just like XOR, but you need exactly two bit positions in each intersection:
SELECT Number
FROM
(
SELECT Number FROM dbo.SplitBitmask(0x01)
UNION ALL
SELECT Number FROM dbo.SplitBitmask(0x02)
) x
GROUP BY Number
HAVING COUNT(*) = 2
Putting it all together, I present the following OR, XOR, and AND UDFs:
OR
CREATE FUNCTION bitwiseOr
(
@Bitmask1 VARBINARY(4096),
@Bitmask2 VARBINARY(4096)
)
RETURNS VARBINARY(4096)
AS
BEGIN
DECLARE @BitsInBitmask TABLE(Number SMALLINT)
INSERT @BitsInBitmask
SELECT Number
FROM
(
SELECT Number
FROM dbo.splitBitmask(@Bitmask1)
UNION
SELECT Number
FROM dbo.splitBitmask(@Bitmask2)
) x
SET @Bitmask1 = 0x
SELECT @Bitmask1 = @Bitmask1 +
CONVERT(VARBINARY(1),
SUM(CASE
WHEN x.Number IS NULL THEN 0
ELSE BitmaskNumbers.BitValue
END)
)
FROM @BitsInBitmask x
RIGHT JOIN BitmaskNumbers ON BitmaskNumbers.Number = x.Number
WHERE BitmaskNumbers.Byte <=
(SELECT
CASE MAX(Number) % 8
WHEN 0 THEN (MAX(Number) - 1) / 8
ELSE MAX(Number) / 8
END + 1
FROM @BitsInBitmask)
GROUP BY BitmaskNumbers.Byte
ORDER BY BitmaskNumbers.Byte DESC
RETURN(@Bitmask1)
END
SELECT dbo.bitwiseOr(0x01, 0x03) AS Or_01_03
Or_01_03
--------
0x03
XOR
CREATE FUNCTION bitwiseXOr
(
@Bitmask1 VARBINARY(4096),
@Bitmask2 VARBINARY(4096)
)
RETURNS VARBINARY(4096)
AS
BEGIN
DECLARE @BitsInBitmask TABLE(Number SMALLINT)
INSERT @BitsInBitmask
SELECT Number
FROM
(
SELECT Number
FROM dbo.splitBitmask(@Bitmask1)
UNION ALL
SELECT Number
FROM dbo.splitBitmask(@Bitmask2)
) x
GROUP BY Number
HAVING COUNT(*) = 1
SET @Bitmask1 = 0x
SELECT @Bitmask1 = @Bitmask1 +
CONVERT(VARBINARY(1),
SUM(CASE
WHEN x.Number IS NULL THEN 0
ELSE BitmaskNumbers.BitValue
END)
)
FROM @BitsInBitmask x
RIGHT JOIN BitmaskNumbers ON BitmaskNumbers.Number = x.Number
WHERE BitmaskNumbers.Byte <=
(SELECT
CASE MAX(Number) % 8
WHEN 0 THEN (MAX(Number) - 1) / 8
ELSE MAX(Number) / 8
END + 1
FROM @BitsInBitmask)
GROUP BY BitmaskNumbers.Byte
ORDER BY BitmaskNumbers.Byte DESC
RETURN(@Bitmask1)
END
SELECT dbo.bitwiseXOr(0x01, 0x03) AS XOr_01_03
XOr_01_03
--------
0x02
... And finally, AND
CREATE FUNCTION bitwiseAnd
(
@Bitmask1 VARBINARY(4096),
@Bitmask2 VARBINARY(4096)
)
RETURNS VARBINARY(4096)
AS
BEGIN
DECLARE @BitsInBitmask TABLE(Number SMALLINT)
INSERT @BitsInBitmask
SELECT Number
FROM
(
SELECT Number
FROM dbo.splitBitmask(@Bitmask1)
UNION ALL
SELECT Number
FROM dbo.splitBitmask(@Bitmask2)
) x
GROUP BY Number
HAVING COUNT(*) = 2
SET @Bitmask1 = 0x
SELECT @Bitmask1 = @Bitmask1 +
CONVERT(VARBINARY(1),
SUM(CASE
WHEN x.Number IS NULL THEN 0
ELSE BitmaskNumbers.BitValue
END)
)
FROM @BitsInBitmask x
RIGHT JOIN BitmaskNumbers ON BitmaskNumbers.Number = x.Number
WHERE BitmaskNumbers.Byte <=
(SELECT
CASE MAX(Number) % 8
WHEN 0 THEN (MAX(Number) - 1) / 8
ELSE MAX(Number) / 8
END + 1
FROM @BitsInBitmask)
GROUP BY BitmaskNumbers.Byte
ORDER BY BitmaskNumbers.Byte DESC
RETURN(@Bitmask1)
END
SELECT dbo.bitwiseAnd(0x01, 0x03) AS And_01_03
And_01_03
--------
0x01
... And that's enough for today's installment. Enjoy..!