THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
 in Adam Machanic (Entire Site) Search

Adam Machanic, Boston-based SQL Server developer, shares his experiences with programming, monitoring, and performance tuning SQL Server. And the occasional battle with the query optimizer.

# Texas Hold 'em Hint #1

The other day I annouced the Texas Hold 'em SQL Challenge. I haven't gotten any feedback on it yet, so I have no idea if anyone is working on it, but I thought I'd get the ball rolling and come up with my own solution...

The first step I've decided to take in solving this problem is to figure out how, given two or more 5-card hands, to figure out a winner. What's necessary is a way of weighting each hand so that better hands will have a higher value. Or at least, that's the tactic I took.

I first checked out this list of poker hands from strongest to weakest. After staring at it for a while, I realized that some patterns can be derived from the list:

• A two-pair is two pairs (wow, getting deep here, aren't we!)
• A full house is a pair plus a three of a kind
• A straight flush is a straight plus a flush
• A royal flush is a type of straight flush with the highest cards in the deck

Obvious, yes; but the final point is important -- is there a way of weighting the cards such that we can determine, using only a single number, the weight of a given hand, independent of whether or not it falls into one of these categories? That is, how can we know that a straight flush with a high king is worth less than a straight flush with a high ace? And how can we know that a hand with a pair of 10s and a high jack beats a hand with a pair of 10s and a high 9?

The answer is to use our old friend, binary:

CardValue
22
34
48
516
632
764
8128
9256
10512
Jack1024
Queen2048
King4096
Ace8192 (1)

Note that the ace will be weighted as 1, should it happen to be part of a straight with a high 5.

So what does this binary weighting give us? The property that's especially useful in this case is that for every power of 2, it's impossible to reach that number by adding up any unique combination of powers of 2 less than that power. Restated in terms of this project, if a hand has only a high card, say a king, then that hand will be weighted higher than any other high card hand that doesn't have either a king or an ace.

But weighting alone isn't enough. This can be shown using the following two hands:

2 3 Q K A
2 3 Q K K

The weighted values of these hands are 14342 and 10246, respectively (2 + 4 + 2048 + 4096 + 8192 and 2 + 4 + 2048 + 4096 + 4096). However, the second is a better hand. How to solve this issue? Bonuses.

After conducting a completely non-scientific trial-and-error analysis, I arrived at the following bonus structure, which seems to work:

To weight a hand, take the base weight (i.e., the sum of the weights of each card in the hand) and add the following bonuses per condition:

• Pair: 2000000000 + (8192 * pair card weight)
• Three of a kind: 6000000000 + (131072 * three of a kind card weight)
• Straight: 5500000000
• Flush: 5800000000
• Four of a kind: 9500000000 + (131072 * four of a kind card weight)

Note that "pair card weight," for example, refers to the weight of the card type that makes up the pair. So for a pair of twos, the calculation would be (8192 * 2).

This formula can be expanded for each type of hand. Two pair? Add 2000000000 + (8192 * the weight of the card for the first pair) + (8192 * the weight of the card for the second pair). Full house? Combine the pair with the three of a kind. Same with straight flush. And since the base weights are being added, the hands will naturally be weighted heavier for higher cards.

There may have been a more elegant solution, but this one does the trick for now.

But enough math, on to the SQL! What I wanted was a table containing every possible hand and its weights. But first I needed a table with all of the possible cards:

`CREATE TABLE dbo.Cards(	CardID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,	Suit CHAR(1) NOT NULL,	Card VARCHAR(2) NOT NULL,	Weight INT NOT NULL)INSERT dbo.Cards (Suit, Card, Weight)SELECT S.Suit, C.Card, C.WeightFROM(	SELECT 'H'	UNION ALL	SELECT 'D'	UNION ALL	SELECT 'S'	UNION ALL	SELECT 'C') S (Suit)CROSS JOIN(	SELECT CONVERT(VARCHAR(2), number), POWER(2, number-1)	FROM master..spt_values	WHERE type='p'		AND number BETWEEN 2 AND 10	UNION ALL	SELECT 'J', 1024	UNION ALL	SELECT 'Q', 2048	UNION ALL	SELECT 'K', 4096	UNION ALL	SELECT 'A', 8192) C (Card, Weight)ORDER BY 	Suit, 	WeightGO`

Yes, I know that (Suit, Card) could have formed a natural primary key, but the table of hands is going ta have 2598960 rows -- which you can prove using the formulas from this website:

`52 cards, 5 cards per hand:            n!n_C_k = ----------        k!(n - k)!      52!= -----------  5!(52 - 5)!      52!= -----------    5!(47)!  52 * 51 * 50 * 49 * 48= ----------------------     5 * 4 * 3 * 2 * 1   311875200= -----------      120= 2598960`

...And to believe that my high school algebra teacher had the gall to flunk me!

Anyway, with 2.59 million rows, I'd rather just use a single-column surrogate. It makes like a lot easier. So how do we generate all of these rows?

First, we need every possible hand:

`SELECT	C1.CardId AS Card1,	C2.CardId AS Card2,	C3.CardId AS Card3,	C4.CardId AS Card4,	C5.CardId AS Card5FROM dbo.Cards C1JOIN dbo.Cards C2 ON 	C2.CardId > C1.CardIdJOIN dbo.Cards C3 ON 	C3.CardId > C2.CardIdJOIN dbo.Cards C4 ON	C4.CardId > C3.CardIDJOIN dbo.Cards C5 ON	C5.CardId > C4.CardId`

Easy enough. And notice how nice and simple that surrogate makes life... Laziness is a virtue!

Next step, apply the weighting logic. This is quite a bit more complex, and I'm not going to spend a lot of time going into detail since it's currently quite late here and I need to work in the morning. But remember the following rules: If you have two or more of the same index in a hand (pair, three of a kind, four of a kind), that hand can't be a flush. And if you have only two of a given card, that hand can't have a three of a kind of that card. And if you have only three of a card, that hand can't have a four of a kind for that card!

Given these rules, I realized that you can test for all of these conditions and add up the results, without doing any kind of backtracking. This made the job quite a bit simpler. I did the following: First, calculate the base weight and figure out whether we have a straight, for any given row. Add up the results. Then "unpivot" (using a cross-join to five rows of numbers) and use aggregates to figure out what pairs, threes of a kind, fours of a kind, and flushes exist in each hand... Re-pivot, and stuff everything into a table. Oh, and I implemented some rudimentary batching so that 2.5 million rows wouldn't be inserted in a single transaction.

The complete SQL is here:

`CREATE TABLE dbo.PokerHands(	Card1 INT NOT NULL,	Card2 INT NOT NULL,	Card3 INT NOT NULL,	Card4 INT NOT NULL,	Card5 INT NOT NULL,	Weight BIGINT NOT NULL,	CONSTRAINT PK_PokerHands PRIMARY KEY CLUSTERED		(Card1, Card2, Card3, Card4, Card5))GODECLARE @CardId INTSET @CardId = 1WHILE @CardId <= 52BEGIN	INSERT dbo.PokerHands		(Card1, Card2, Card3, Card4, Card5, Weight)	SELECT 		y.Card1,		y.Card2,		y.Card3,		y.Card4,		y.Card5,		-- 2		CASE 			SUM(				CASE y.CardWeight					WHEN 2 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 2)			WHEN 3 THEN 6000000000 + (131072 * 2)			WHEN 2 THEN 2000000000 + (8192 * 2)			ELSE 0		END +		-- 3		CASE 			SUM(				CASE y.CardWeight					WHEN 4 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 4)			WHEN 3 THEN 6000000000 + (131072 * 4)			WHEN 2 THEN 2000000000 + (8192 * 4)			ELSE 0		END +		-- 4		CASE 			SUM(				CASE y.CardWeight					WHEN 8 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 8)			WHEN 3 THEN 6000000000 + (131072 * 8)			WHEN 2 THEN 2000000000 + (8192 * 8)			ELSE 0		END +		-- 5		CASE 			SUM(				CASE y.CardWeight					WHEN 16 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 16)			WHEN 3 THEN 6000000000 + (131072 * 16)			WHEN 2 THEN 2000000000 + (8192 * 16)			ELSE 0		END +		-- 6		CASE 			SUM(				CASE y.CardWeight					WHEN 32 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 32)			WHEN 3 THEN 6000000000 + (131072 * 32)			WHEN 2 THEN 2000000000 + (8192 * 32)			ELSE 0		END +		-- 7		CASE 			SUM(				CASE y.CardWeight					WHEN 64 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 64)			WHEN 3 THEN 6000000000 + (131072 * 64)			WHEN 2 THEN 2000000000 + (8192 * 64)			ELSE 0		END +		-- 8		CASE 			SUM(				CASE y.CardWeight					WHEN 2 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 128)			WHEN 3 THEN 6000000000 + (131072 * 128)			WHEN 2 THEN 2000000000 + (8192 * 128)			ELSE 0		END +		-- 9		CASE 			SUM(				CASE y.CardWeight					WHEN 256 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 256)			WHEN 3 THEN 6000000000 + (131072 * 256)			WHEN 2 THEN 2000000000 + (8192 * 256)			ELSE 0		END +		-- 10		CASE 			SUM(				CASE y.CardWeight					WHEN 512 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 512)			WHEN 3 THEN 6000000000 + (131072 * 512)			WHEN 2 THEN 2000000000 + (8192 * 512)			ELSE 0		END +		-- J		CASE 			SUM(				CASE y.CardWeight					WHEN 1024 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 1024)			WHEN 3 THEN 6000000000 + (131072 * 1024)			WHEN 2 THEN 2000000000 + (8192 * 1024)			ELSE 0		END +		-- Q		CASE 			SUM(				CASE y.CardWeight					WHEN 2048 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 2048)			WHEN 3 THEN 6000000000 + (131072 * 2048)			WHEN 2 THEN 2000000000 + (8192 * 2048)			ELSE 0		END +		-- K		CASE 			SUM(				CASE y.CardWeight					WHEN 4096 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 4096)			WHEN 3 THEN 6000000000 + (131072 * 4096)			WHEN 2 THEN 2000000000 + (8192 * 4096)			ELSE 0		END +		-- A		CASE 			SUM(				CASE y.CardWeight					WHEN 8192 THEN 1					ELSE 0				END			)			WHEN 4 THEN 9500000000 + (131072 * 8192)			WHEN 3 THEN 6000000000 + (131072 * 8192)			WHEN 2 THEN 2000000000 + (8192 * 8192)			ELSE 0		END +		--Flush		CASE COUNT(DISTINCT y.CardSuit)			WHEN 1 THEN 5800000000			ELSE 0		END +		--Base Weight + Straight		MIN(y.TotalWeight) AS Weight	FROM	(		SELECT TOP 100 PERCENT			C1.CardId AS Card1,			C2.CardId AS Card2,			C3.CardId AS Card3,			C4.CardId AS Card4,			C5.CardId AS Card5,			--Straight			CASE WHEN 				(C2.Weight = 2 * C1.Weight				AND C3.Weight = 2 * C2.Weight				AND C4.Weight = 2 * C3.Weight				AND (					(C5.Weight = 2 * C4.Weight)					OR 					(C2.Card = '2' AND C5.Card = 'A')))				THEN 					CASE 						WHEN (C2.Card = '2' AND C5.Card = 'A')							THEN 5500000000 - 8191 --Make A = weight 1						ELSE 5500000000					END				ELSE 0			END +			--Base weight			C1.Weight + C2.Weight + C3.Weight + C4.Weight + C5.Weight				AS TotalWeight,			CASE x.number				WHEN 1 THEN C1.Weight				WHEN 2 THEN C2.Weight				WHEN 3 THEN C3.Weight				WHEN 4 THEN C4.Weight				WHEN 5 THEN C5.Weight			END AS CardWeight,			CASE x.number				WHEN 1 THEN C1.Suit				WHEN 2 THEN C2.Suit				WHEN 3 THEN C3.Suit				WHEN 4 THEN C4.Suit				WHEN 5 THEN C5.Suit			END AS CardSuit		FROM dbo.Cards C1		JOIN dbo.Cards C2 ON 			C2.CardId > C1.CardId		JOIN dbo.Cards C3 ON 			C3.CardId > C2.CardId		JOIN dbo.Cards C4 ON			C4.CardId > C3.CardID		JOIN dbo.Cards C5 ON			C5.CardId > C4.CardId		CROSS JOIN		(			SELECT number			FROM master..spt_values			WHERE type='p'				AND number BETWEEN 1 and 5		) x		WHERE C1.CardId = @CardId		ORDER BY			C1.CardId,			C2.CardId,			C3.CardId,			C4.CardId,			C5.CardId	) y	GROUP BY		y.Card1,		y.Card2,		y.Card3,		y.Card4,		y.Card5	SET @CardId = @CardId + 1END`

... And that's about it! Assuming I made no logic mistakes -- which of course is a very safe assumption because it's me and I never do that -- this is a relatively good start to the Challenge! So take it from here... Or at least until I post the next piece of the puzzle.

In retrospect, I think this would be a good use for a CLR UDT. If I have some time, I'll post that version in the coming days. Enjoy!

Published Wednesday, July 12, 2006 10:46 PM by Adam Machanic
Filed under:

(required)
(required)
Submit