THE SQL Server Blog Spot on the Web


Jamie Thomson

This is the blog of Jamie Thomson, a data mangler in London working for Dunnhumby

Thoughts on Data Explorer

To my mind the most interesting piece of news to come out of the recent PASS conference was the unveiling of a new SQL Azure Labs project from the SQL Server organisation that has the codename "Data Explorer" (not a very imaginative codename, I'm sure you'll agree) and for which there is information available at (in case you've surfed on here a few months on from when I originally wrote this blog post, you should expect that that URI will have become a dead link).

My good buddy Chris Webb (blog | twitter) has already blogged about Data Explorer at Pass Summit 2011 - Day 1 Keynote and Self-Service ETL with Data Explorer, in which he made a very telling observation:

It allows you to mash up data from various different sources then publish the result as an OData feed – very similar to Yahoo Pipes

I couldn't agree more with that assertion. I blogged about Yahoo Pipes over four years ago at Taking Yahoo Pipes for a test drive and I referred to it then as "ETL for RSS feeds". It interested me greatly because here was a tool that enabled non-developers to pull data from multiple sources and make it available as a single data source that could be easily consumed; moreover, it ran as a cloud service, which has also long been an interest of mine. Granted, it only did this for RSS feeds, but the premise was still really interesting to me. I believe that making data easily consumable is far more important than the tool chosen to consume it, which is why I'm such a massive advocate of iCalendar for BI and why you'll rarely find me talking about the likes of Business Objects, Cognos, Qlikview, Tableau and Power View on this blog (no disrespect intended to those tools or the people that use them - they're just not what floats my boat).

Where Yahoo Pipes consumes RSS feeds and provides RSS feeds, Data Explorer consumes from loads of different places and provides OData feeds (something I've been banging on about for a while now), and if you're in the Microsoft ecosystem OData is increasingly looking like the lingua franca for platform- and device-independent data integration. Moreover, according to the recent blog post Creating a custom RSS reader in Montego (cloud) by project lead Tim Mallalieu, Data Explorer will also be able to pull data directly out of web pages, and that is stepping firmly into the territory of Kapow which, again, is a tool that Chris and I have blogged about before at Kapow – ETL for HTML and Kapow Technologies. Chris referred to Kapow as:

a cross between a screenscraper and an ETL tool

and again I wouldn't disagree. Data Explorer looks like filling the missing link that I was alluding to in the final paragraphs of my June 2009 blog post Enterprise Mashups.

Are you spotting a common theme here? Data Explorer is an ETL tool and given my obvious SSIS affiliations that makes it very interesting to me. That it runs as a cloud service and will be available to non-developers only makes it more intriguing and I can't wait until Data Explorer becomes available for us to tinker with later this year. No doubt Chris will be keeping a watching brief too.


UPDATE: Some further thoughts...

It would be interesting to see what else could be done with this data once it's exposed as a feed. I'll wager that in the not too distant future you'll be able to (for example) sell the output from your Data Explorer mashup on Azure Datamarket or view geocoded feeds on Bing Maps (note that geospatial support is coming to OData in the very near future). There are lots of possibilities, I'm sure, and I'm looking forward to seeing what ideas others have for using and sharing this data.

I'm also wondering whether there will be an option to host Data Explorer (and hence Data Explorer mashups) inside the enterprise. Today most enterprise data is contained within the corporate firewall and thus will not be accessible from a Data Explorer service provided via SQL Azure; it would be a shame if such data could not be accessed by Data Explorer, which is why I hope there will be an on-premise version available. I can think of many scenarios at my past clients where the ability to easily make data consumable over HTTP and behind the firewall would have been invaluable.

Published Monday, October 24, 2011 11:55 AM by jamiet

Chris Webb said:

It also supports ODBC endpoints as well as OData, as Tim mentions here:

And what about the scripting language? There's so much to this tool... Following on from that, I could imagine it going in a SAS-like direction as well, so we could do complex calculations (scaled out in the cloud, using Hadoop or Dryad?) as well as just ETL.

October 24, 2011 7:24 AM

Matt Hall said:

Definitely interested to know more on how it could be used within the Enterprise.

October 25, 2011 6:41 AM

SSIS Junkie said:

Yesterday the public availability of Data Explorer, a new data mashup tool from the SQL Server team,

December 7, 2011 8:47 AM

SSIS Junkie said:

Earlier today I posted Data Explorer walkthrough – Parsing a Twitter list in which I explained that I

December 7, 2011 3:57 PM

SSIS Junkie said:

A short recap At the PASS Summit 2011 a project that existed as part of the now-defunct SQL Azure Labs

February 27, 2013 12:46 PM
