THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web

John Paul Cook

Coping with Little Data

With all of the hype about Big Data, Little Data is being overlooked. Not every business has zettabytes of data. Businesses that have a few bits here and maybe a few bytes there are being neglected, but there is hope on the horizon. The most fundamental part of the Little Data ecosystem is Gnorm. Gnorm is named after the gnat in Jim Davis’s comic strip Gnorm Gnat. Jim wasn’t happy with a small success, so he abandoned Gnorm and created Garfield. But enough about Jim.

Apache Gnorm is a set of algorithms for undistributed storage and undistributed processing of very small data sets on a single desktop computer. It was ported from an abacus. MapExpand is used to process the data into something large enough to see. Apache Hive is overkill for processing Little Data, so the developers created Apache Cell after extracting a single cell from Apache Hive to use as the data warehouse. Version 1 was a worker bee cell, but in version 2 it was adapted from a queen bee cell. Similarly, Apache Zookeeper is too large for coordination of Little Data tasks, so Apache Petri was created. Real-time analysis is done with Breeze and machine learning is done with 65.

I spoke with Liz N. Knot of the IT recruiting firm Loosey and Guice. She said it is very difficult to find IT professionals for Little Data projects. Her clients want to solve simple problems, like whether the business brought in enough money to cover expenses today, but so many applicants only want to do correlations using R or Python. She just can’t get them to switch over to Worm.

Published Wednesday, April 1, 2015 7:02 PM by John Paul Cook

About John Paul Cook

John Paul Cook is a database and Azure specialist in Houston. He previously worked as a Data Platform Solution Architect in Microsoft's Houston office. Prior to joining Microsoft, he was a SQL Server MVP. He is experienced in SQL Server and Oracle database application design, development, and implementation. He has spoken at many conferences including Microsoft TechEd and the SQL PASS Summit. He has worked in oil and gas, financial, manufacturing, and healthcare industries. John is also a Registered Nurse currently studying to be a psychiatric nurse practitioner. He is a contributing author to SQL Server MVP Deep Dives and SQL Server MVP Deep Dives Volume 2.
