THE SQL Server Blog Spot on the Web

Lara Rubbelke

Interesting Things in the World of SQL Server

Big Data Learning Resources

I have recently had several requests from people asking for resources to learn about Big Data and Hadoop.  Below is a list of resources that I typically recommend.  I'll update this list as I find more resources.  Let's crowdsource this... Tell me your favorite resources and I'll get them on the list!

 

Books and Whitepapers

Planning for Big Data (free e-book)

Great primer on the general Big Data space.  This is always my recommendation for people who are new to Big Data and are trying to understand it.

Hadoop: The Definitive Guide by Tom White 

This dives deep under the hood of Hadoop.  It should not be a first book for someone who is just starting with Hadoop, MapReduce, or Big Data.  Make sure you don’t get the first edition.  The third edition is the best, as it also dedicates a chapter to HBase, Hive, and other tools in the ecosystem that are important to understand.

Programming Pig by Alan Gates

Great (and entertaining) book about Pig.  The first chapter is a really good primer on Hadoop.

Programming Hive by Edward Capriolo, Dean Wampler, Jason Rutherglen (estimated publication date 10/9/2012)

Nothing to say about this book yet, since it has not been released.  I will add a quick blurb when I have a chance to read it.

“If You Have Too Much Data, then ‘Good Enough’ Is Good Enough” by Pat Helland

Great whitepaper discussing the tenets behind distributed systems.

 

Websites

Apache Hadoop: http://hadoop.apache.org/

Microsoft Big Data Solution: www.microsoft.com/bigdata

Windows Azure: www.windowsazure.com/en-us/home/scenarios/big-data

 

Webcasts

Hadoop Videos on Microsoft TechNet: http://social.technet.microsoft.com/wiki/contents/articles/6204.hadoop-based-services-for-windows-en-us.aspx#videos

Hortonworks Video Series: http://hortonworks.com/videos/

Cloudera Video Series: http://www.cloudera.com/resource-types/video/

 

 

Tim O'Reilly and Dave Campbell Explore How to Accelerate Insights from Data

Denny Lee talks about Big Data

 

Blogs

Andrew Brust on ZDNet: http://www.zdnet.com/blog/big-data/

Denny Lee: http://dennyglee.com/

Carl Nolan: http://blogs.msdn.com/b/carlnol/archive/tags/hadoop+streaming/

 

Cindy Gross: http://blogs.msdn.com/b/cindygross/ 

Oakleaf Blogs (good for Hadoop on Azure): http://oakleafblog.blogspot.com/

Buck Woody: Big Data: A Microsoft Tools Approach http://sqlblog.com/blogs/buck_woody/archive/2012/02/20/big-data-a-microsoft-tools-approach.aspx

Forrester Blogs: http://blogs.forrester.com/category/big_data

 

Try Now

Preview of the Hadoop-based service for Windows Azure: https://www.hadooponazure.com  

Published Monday, September 10, 2012 5:38 PM by Lara Rubbelke

Comments

 

RichB said:

Surely before you start on that path you need to have at least brushed through Data and Reality (William Kent) to remind you of the pratfalls that await even an experienced operator...

Just because you have more power, doesn't mean the fundamental problems of data disappear!

September 11, 2012 6:04 AM
 

AndrewC said:

Thanks Laura, this is exactly the list I've been looking for!

September 11, 2012 12:31 PM
 

Ross McNeely said:

Lara,

Great presentation at the Minnesota SQL Saturday.  Thanks for the resource post.

October 1, 2012 4:07 PM
 

Alex Popescu said:

Here's another resource that has been around since 2009 and is focused only on NoSQL databases, data processing, and Big Data in general: http://mynosql.org.

(I'm the creator and main maintainer).

April 30, 2013 2:07 AM
 

Lynn Langit said:

I have a channel on YouTube with over 100 screencasts on BigData topics -- http://www.youtube.com/user/SoCalDevGal

September 3, 2013 6:24 PM
