THE SQL Server Blog Spot on the Web

Buck Woody

Carpe Datum!

The Top 20 Questions in Database Design

I'm still re-reading the "Fourth Paradigm" book by Microsoft Research, and one section continues to intrigue me. There's a part where the book explains database design and puts forth that the most important thing when you're designing for large data sets is to find out the "Top Twenty Questions" the database has to answer. The quote is this:

 "Most selections involving human choices follow a 'long tail,' or so-called 1/f distribution, it is clear that the relative information in the queries ranked by importance is logarithmic, so the gain realized by going from approximately 20 (2^4.5) to 100 (2^6.5) is quite modest."

I find this fascinating - it just doesn't seem to make "common" sense. Surely you have to ask a lot more questions than that to "get" the shape of the data? I researched the mathematical concept he's describing (http://www.scholarpedia.org/article/1/f_noise), and I'll try some experiments here. I'll let you know what I uncover!
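As a quick back-of-the-envelope check on the arithmetic in that quote, here's a minimal Python sketch. It rests on an assumption that is mine, not the book's: that the question ranked k-th in importance carries a weight of 1/k, which is one simple reading of a "1/f distribution". The function names are made up for illustration.

```python
import math


def rank_bits(n):
    """Base-2 logarithm of n: the exponent in the quote's 2^4.5 and 2^6.5."""
    return math.log2(n)


def cumulative_weight(n):
    """Total weight of the top n questions, ASSUMING question k weighs 1/k."""
    return sum(1.0 / k for k in range(1, n + 1))


for n in (20, 100):
    print(f"top {n:>3}: log2 = {rank_bits(n):.2f}, "
          f"cumulative 1/k weight = {cumulative_weight(n):.2f}")

# How much extra weight do five times as many questions buy?
gain = cumulative_weight(100) / cumulative_weight(20)
print(f"gain from 20 to 100 questions: {gain:.2f}x")
```

Under that assumption, asking five times as many questions adds well under half again as much total weight (about 3.6 versus about 5.2), which lines up with the "quite modest" gain in the quote. A different weighting of the questions would give different numbers, so treat this as a sanity check rather than a result.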

 Here's the link for the book if you want to read it:

http://research.microsoft.com/en-us/collaboration/fourthparadigm/4th_paradigm_book_complete_lr.pdf

Published Thursday, December 17, 2009 7:18 AM by BuckWoody

Comments

 

PercyReyes said:

Thank you!

December 17, 2009 11:21 AM
 

Wesley Brown said:

Love the 4th paradigm. One of the best free reads of all time. Some of it seems very counterintuitive, but the math works out. Fun stuff!

December 17, 2009 11:40 AM
