THE SQL Server Blog Spot on the Web

Greg Low (The Bit Bucket: IDisposable)

Ramblings of Greg Low (SQL Server MVP, MCM and Microsoft RD) - SQL Down Under

Has the term "big data" completely lost meaning yet?

This blog has moved! You can find this content at the following new location:

http://greglow.com/index.php/2013/02/11/has-the-term-big-data-completely-lost-meaning-yet/

Published Sunday, February 10, 2013 2:29 PM by Greg Low

Comments

 

cdp said:

Agree completely. Good post!

Chris.

February 10, 2013 11:03 AM
 

WantToRemainNameless said:

Absolutely agree 100%.  I've just started a position that claimed 'big data'.  I took the position.  Um, yeah, 200GB is not big data.  Needless to say, I will not be staying...

The data that I used to work with at my previous position was around 6 TB of data.  That, I would say, is big data.

Big data, IMHO is being thrown in with the likes of Agile, Scrum and the Cloud.  Mere marketing terms these days.  I miss Bill Hicks.

February 10, 2013 12:21 PM
 

John Donnelly said:

Sounds very similar to the UK. I've been meeting with a big data user group for the last 6 months but finding very few people who actually have an appropriate data set - frequently their big data would fit in main memory on a reasonable laptop. I think for many this is more an aspiration than a reality.

Part of the issue appears to be in the common definition. Volume, Velocity, Variability may cause you to have a big data problem, but very few are ready to stick their neck out and quantify what counts. A year ago I'd have loosely said it was any data analysis task where it was necessary or more economic to handle through scale out database systems rather than scale up, but the market place is now too polluted with v.small big data solutions for this to stick.

February 10, 2013 1:19 PM
 

jchang said:

Amateurs talk strategy and tactics

veterans talk logistics

My view: Big is not about how big your data is, and whose "data" is bigger. It is about moving data from storage to CPU so you can do something with the data.

So perhaps the proper term should be: Big Data Movement?

February 11, 2013 4:35 PM
 

Bart Czernicki said:

I like the big data "three vectors" definition...volume (huge amount of data), variety (lots of data with different schemas in a single context DB) and velocity (dramatic growth of data).  If your data has one of these vectors, then you have a potential "big data" problem.

For example, you could have 0 gig of data initially...however, if you plan on storing every stock transaction going forward you will have a "big data" problem because of data velocity.

February 13, 2013 1:07 AM
 

BuggyFunBunny said:

"Big Data" is well defined, although few are willing to openly admit what that definition is.  To wit:  Big Data is the excuse to dump standard RDBMS/SQL datastores with their (nearly) transparent, and client language agnostic, syntax in favour of bespoke file storage tied to a specific client language.  The amount of data needed to meet the "Big" threshold moves down as the Kiddie Koders flummox yet more Suits.  Yet another attempt to get Back to the Future of COBOL/VSAM applications.

Ironically, those systems are finding that writing a TPM for each and every application is a pain, so some are setting out to reinvent CICS.  Such folks are blind to the irony.  But that shouldn't be surprising, they've already demonstrated their blindness to data management.

February 16, 2013 9:59 AM
 

DBAdmin said:

Great comments, BuggyFunBunny!

February 16, 2013 6:36 PM
 

CodePro said:

Oh good grief, "big data" is a set of techniques for analyzing data, not a quantity.

February 18, 2013 11:01 AM
 

Greg Low said:

Good comment, CodePro. I agree that it's more of a philosophy for how data is analyzed and for the use of newer techniques, but that makes the name itself even more inappropriate.

February 18, 2013 5:31 PM
 

Larry Den said:

Big data is mostly big JUNK data when we look into what's really being stored in these BD solutions. Most of the "big data" platforms are used as containers for social network blogs, comments, ratings and so on. Such data is not ideal for RDBMSs, so BD comes in to help. So far so good.

Problems come up when we try to make use of such data. It is not really useful data to begin with (how much value is there when an anonymous poster gives some article 4 stars anyway). It's hard to efficiently analyse data that's stored in an unstructured way. There's no short-cut here. If we don't have a data structure when storing it, we pay the price later. Low efficiency + vast volume of data = analysis headache. To make matters worse, such "social" data decays. If we don't analyse it fast enough its value rots away, so we end up with a big pile of worthless data (junk) wasting hard drives.

That "big data big deal" guy is right. Lots of the big data advocates are merely selling the perception of value to gullible CIOs / CEOs. Big data is hype.

February 18, 2013 11:11 PM
 

Chuck said:

Like codepro said, what we're really talking about is storing and analyzing semi-structured data, in ways that can scale to many terabytes, but can also be applied to much smaller data sets.  It's a question of the right tool for the job.  Log files, social graph data etc don't fit well into an RDBMS, regardless of size.

February 20, 2013 2:37 PM
 

Leisure Suit Larry said:

"Big Data" is a marketing term, just like "In-Memory"

EVERYTHING runs in memory.  If it cannot, it is moved into memory beforehand!!

February 21, 2013 3:12 AM
 

Warwick Leitch said:

Hi Greg,  Thanks for the mention. I had quite a lot of response to my blog post you mentioned.  I had to write a follow-up! http://www.calumo.com/blog/big-data-hit-a-nerve-ouch/

March 2, 2013 5:47 AM
 

DBAsa said:

I am working for a large company that handles VLDBs with tables around 4 TB in size, regardless of the database size totalling 8 TB.

This is big - as everything takes a long time to maintain on these databases.

March 12, 2013 9:04 AM
 

mymmb said:

I have questions for DBAsa.

What is your database engine?

What kind of reporting tools are used?

August 10, 2013 7:02 AM
