THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Michael Coles: Sergeant SQL

SQL Server development, news and information from the front lines

Here Comes the FY11 Earmarks Database

RepublicratI'm really interested in politics (don't worry, I'm not going to start bashing politicians and hammering you with political rage).  The point is when the U.S. FY11 Omnibus Spending Bill (the bill to fund the U.S. Government for another year) was announced it piqued my interest.  I'm fascinated by "earmarks" (also affectionally known as "pork").  For those who aren't familiar with U.S. politics, "earmark" is a slang term for "Congressionally Directed Spending".  It's basically the set of provisions added to bills by Senators and Representatives in the U.S. Congress, directing money to be spent on certain programs or projects without all the hassle of introducing a separate bill, sending it through umpteen committees, and getting enough supporters in both chambers to vote it up.  Most of the time Senators and Representatives introduce earmarks that send money back to their home states (see this guy for details).

Now it looks like the Omnibus Bill is dead for now, but the record of the earmarks is still available at http://appropriations.senate.gov/news.cfm in PDF format (no, you don't have to visit Wikileaks).  Just look for the "Earmarks" links at the bottom of the page under "All News Items".  So what I have been doing the past few days, since the bill was announced, is loading the earmarks into a SQL database.  It turned out to be a little more complex than it should have been.  There was the issue of extracting the data out of the PDF files, normalizing it, ETLing it into SQL Server, cleaning it up (a *lot* of manual cleanup required), adding and cleaning up reference data, and finally linking the earmarks to the reference data.

You can download the resulting SQL 2008 database backup file from this post and restore it to your test server to play around with.  Here are some sample queries I ran:

SELECT SUM(e.Amount)
FROM dbo.Earmark e;

Result:
8313820025

(That's $8.3 Billion dollars in earmarks -- note that some earmarks did not specify $ amounts, so this number is low)

Want to see total earmarks requested by party affiliation?

WITH CTE
AS
(
  SELECT DISTINCT e.EarmarkID, e.Amount, c.Party
  FROM dbo.Earmark e
  INNER JOIN dbo.EarmarkCongress ec
    ON e.EarmarkID = ec.EarmarkID
  INNER JOIN dbo.Congress c
    ON ec.CongressID = c.CongressID
)
SELECT Party, SUM(Amount)
FROM CTE
GROUP BY Party;

Results:
D, 6779032457
R, 2933829318
I, 216149500

This one actually double-counts some items since the same earmark can be sponsored by a Democrat (D), a Republican (R), and/or an Independent (I) all at the same time.

And if you want to see how much was requested by your Senators and Representatives:

SELECT c.Chamber,
 
c.LastName,
  c.FirstName,
 
c.Party,
 
SUM(e.Amount)
FROM dbo.Earmark e
INNER JOIN dbo.EarmarkCongress ec
  ON e.EarmarkID = ec.EarmarkID
INNER JOIN dbo.Congress c
  ON ec.CongressID = c.CongressID
WHERE c.State = 'NJ'
GROUP BY c.Chamber,
  c.LastName,
  c.FirstName,
  c.Party;

Results:
House, Adler, John D, 23848125
House, Andrews, Rob D, 27710000
House, Holt, Rush D, 103953000
House, Pallone, Frank D, 21810000
House, Pascrell, Bill D, 15143000
House, Payne, Donald D, 52908000
House, Rothman, Steven D, 107153000
House, Sires, Albio D, 120141000
Senate, Lautenberg,  Frank D, 239061125
Senate, Menendez,  Bob D, 243982125

Again there can be double-counting since an earmark can have multiple sponsors (very likely when you're talking about Senators and Representatives from the same state).

Also note that I had to add in the non-voting Congressional delegates from U.S. territories (Puerto Rico, U.S. Virgin Islands, Guam, etc.) and Washington, D.C. to the reference data.  They don't vote, but they're allowed to attach earmarks to bills to bring some cash back home.

I'd love to see what others are able to do with this data.  Some ideas -- use Reporting Services to map earmarks across the country; come up with clever queries to allocate earmarks equally across sponsors to avoid the double-counting issues; some Top n style queries by amounts, counts, etc.; combine it with spatial data and other reference data out there; compare things like # of sponsors for an earmark vs. the amount of the earmark, or seniority of sponsors vs. the amount of the earmark; maybe do something clever with it in PowerPivot.  If you find this type of data interesting and you download the database to play with, let us know the results and what you come up with.

Democracy - It's what's for dinner

Published Sunday, December 19, 2010 12:51 AM by Mike C

Attachment(s): Earmarks.zip

Comments

No Comments
New Comments to this post are disabled

This Blog

Syndication

News

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement