THE SQL Server Blog Spot on the Web


Buck Woody

Carpe Datum!

The Top 20 Questions in Database Design

I'm still re-reading the "Fourth Paradigm" book by Microsoft Research, and one section continues to intrigue me. There's a part where the book explains database design, and puts forth that the most important thing when you're designing large data sets is to find out the "Top Twenty Questions" the database has to answer. The quote is this:

 "Most selections involving human choices follow a 'long tail,' or so-called 1/f distribution, it is clear that the relative information in the queries ranked by importance is logarithmic, so the gain realized by going from approximately 20 (2^4.5) to 100 (2^6.5) is quite modest."

I find this fascinating - it just doesn't seem to make "common" sense. Surely you have to ask a lot more questions than that to "get" the shape of the data? I researched the mathematical concept he's describing, and I'll try some experiments here. I'll let you know what I uncover!
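For anyone who wants to poke at the arithmetic behind the quote, here is a minimal sketch. It rests on an assumption that is not spelled out in the book: that the question ranked k is asked with a weight proportional to 1/k. Under that assumption, the total weight covered by the top N questions is the sum 1 + 1/2 + ... + 1/N, which grows roughly like the logarithm of N. The function name `cumulative_weight` is made up for this illustration.

```python
def cumulative_weight(n: int) -> float:
    """Total weight of the top n questions, assuming the question
    ranked k carries weight 1/k (a simplified 1/f model)."""
    return sum(1.0 / rank for rank in range(1, n + 1))


# Compare covering the top 20 questions against the top 100.
top_20 = cumulative_weight(20)
top_100 = cumulative_weight(100)

print(f"top 20 questions : {top_20:.2f}")
print(f"top 100 questions: {top_100:.2f}")
print(f"ratio            : {top_100 / top_20:.2f}")
```

With this toy model the top 20 questions come to about 3.60 and the top 100 to about 5.19, so five times as many questions buys only around 44% more coverage. That is one way to read "quite modest," though real query workloads may not follow a clean 1/k curve.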

 Here's the link for the book if you want to read it:

Published Thursday, December 17, 2009 7:18 AM by BuckWoody



PercyReyes said:

Thank you!

December 17, 2009 11:21 AM

Wesley Brown said:

Love the 4th paradigm. One of the best free reads of all time. Some of it seems very counterintuitive but the math works out. Fun stuff!

December 17, 2009 11:40 AM