Originally posted
here.
Continuing in my series of things you should probably not do in SQL Server but sometimes have to, I'm going to do a few posts on dealing with very large bitmasks.
Let me first state my utter hatered of bitmasks in databases. I think
they're annoying, make the system difficult to understand, and whether
or not they violate the First Normal Form (that's up for discussion),
using them is just a sign of bad design.
But as I've said in other posts, I realize that short deadlines
and tiny budgets are an issue at most shops, and sometimes we just need
to shoehorn in a solution real quick (yeah, as if it's not going to
last for the next 5+ years?)
In one case in the past, bitmasks were a very convenient
solution to a problem I faced with an access control system. But alas,
I only had 8 bytes available to me. Only enough for 64 values. And so
that solution failed. And the company failed. And many, many tears were
shed... If only I'd been able to figure out how to manipulate a bigger
bitmask, I might have saved the little children...
I won't go into any more detail on that particular issue since
there are still a few pending lawsuits, but suffice it to say that if
that situation were happening today, I probably wouldn't use a bitmask
anyway. But you might need one -- so here's how you do it:
First we're going to modify the table of numbers that I'm always telling you that you absolutely must have in every single database.
SELECT (a.number * 256 + b.number) AS Number,
CASE (a.number * 256 + b.number) % 8
WHEN 0 THEN ((a.number * 256 + b.number) - 1) / 8
ELSE (a.number * 256 + b.number) / 8
END + 1 AS Byte,
POWER(2, CASE (a.number * 256 + b.number) % 8
WHEN 0 THEN 8
ELSE (a.number * 256 + b.number) % 8
END-1) AS BitValue
INTO BitmaskNumbers
FROM
master..spt_values a,
master..spt_values b
WHERE
a.type = 'p'
AND b.type = 'p'
AND (a.number * 256 + b.number) BETWEEN 1 AND 32767
GO
CREATE CLUSTERED INDEX IX_Byte ON BitmaskNumbers (Byte)
GO
This will produce a table with 32767 rows. Each row has a Number,
which will represent a bit position in our bitmask, a Byte, which will
help to parse the bitmask, and a BitValue, which is the value that the
individual bits within each byte represent. Feel my 1960-esque skill!
The brighter bulbs in my audience have now figured out that I'm
going to show you how to handle a 4096-byte bitmask -- capable of
handling up to 32767 values. Not bad. But if you need more, just put
more rows in the BitmaskNumbers table.
So what do you want to do with bitmasks? Most of the queries
I've seen involve access control And for those queries, you want to use
a logical and
and see if it evaluates to a number other than 0. That is, we want to
see if both bitmasks we're comparing have any of the same bits set.
Using what little math knowledge I have managed to retain, I
conjured up the following, which indicates which bit positions, based
on the "number", are filled in a bitmask. For instance:
DECLARE@x VARBINARY(4096)
SET @Bitmask = 0x1F0000000000000000000000000000000000000000000000000000000000000000000000123000000000000000000000000001
SELECT Number
FROM BitmaskNumbers
WHERE (SUBSTRING(@Bitmask, Byte, 1) & BitValue) = BitValue
AND Byte <= DATALENGTH(@Bitmask)
Number
----------
1
2
3
4
5
290
293
301
302
401
Fun stuff, no?
The Byte <= DATALENGTH(@x) allows SQL Server to
utilize the clustered index on Byte, so that a full table scan doesn't
have to happen every single time. Small optimization. I couldn't think
of any others. If you can, drop me a line...
Those of you who've read this far are probably yawning and
wondering where the access control stuff is... Who cares about chunking
up the bitmask into its bit positions? Well, it's simply the first
step. Bear with me.
What we need to do is wrap this in a UDF. Then if we had two
bitmasks, we could join the bit positions to eliminate those that
aren't in common...
CREATE FUNCTION dbo.splitBitmask
(
@Bitmask VARBINARY(4096)
)
RETURNS TABLE
AS
RETURN
(
SELECT Number
FROM BitmaskNumbers
WHERE (SUBSTRING(@Bitmask, Byte, 1) & BitValue) = BitValue
AND Byte <= DATALENGTH(@Bitmask)
)
GO
DECLARE @Bitmask1 VARBINARY(4096)
SET @Bitmask1 = 0x1F0000000000000000000000000000000000000000000000000000000000000000000000123000000000000000000000000001
DECLARE @Bitmask2 VARBINARY(4096)
SET @Bitmask2 = 0x0E0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
SELECT x.Number
FROM dbo.splitBitmask(@Bitmask1) x
JOIN dbo.splitBitmask(@Bitmask2) y ON x.Number = y.Number
GO
Number
----------
2
3
4
We now know that these two bitmasks share bits 2, 3, and 4 in
common. But for most access control situations, we don't care what bits
they share in common -- just that they share some.
CREATE FUNCTION dbo.HasAccess
(
@Bitmask1 VARBINARY(4096),
@Bitmask2 VARBINARY(4096)
)
RETURNS BIT
AS
BEGIN
DECLARE @Result BIT
SELECT @Result =
CASE COUNT(*)
WHEN 0 THEN 0
ELSE 1
END
FROM dbo.splitBitmask(@Bitmask1) x
JOIN dbo.splitBitmask(@Bitmask2) y ON x.Number = y.Number
RETURN (@Result)
END
GO
DECLARE @Bitmask1 VARBINARY(4096)
SET @Bitmask1 = 0x1F0000000000000000000000000000000000000000000000000000000000000000000000123000000000000000000000000001
DECLARE @Bitmask2 VARBINARY(4096)
SET @Bitmask2 = 0x0E0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
SELECT dbo.HasAccess(@Bitmask1, @Bitmask2) AS HasAccess
GO
HasAccess
-------------
1
... And that is pretty much it for this installment. We can now
determine whether or not two bitmasks have bits in common, and if
necessary which bits they share.
Future installments will cover how to manipulate large bitmasks
in other ways -- flip specific bits, perform a logical and that
produces a bitmask instead of a result set, and perform a logical or
that produces a bitmask. All very useful stuff if you need to work with
these bitmasks. But now I just need to figure out how to do all of that
stuff.
So until next time.... Don't use this technique.