THE SQL Server Blog Spot on the Web


Jamie Thomson

This is the blog of Jamie Thomson, a freelance data mangler in London

Backing up Azure Table Storage

I read with interest today a blog post from Yaron Goland entitled "Do I need to backup/journal my Windows Azure Table Store?", in which he spoke about the need to do exactly that, i.e. back up any data that you put into Windows Azure Table storage.

Yaron's assertion was not that we need to protect ourselves from Azure failures; instead, he asserted that we need to protect ourselves from ourselves. He outlines three classes of mistakes that developers of apps that utilise Azure Table storage may (nay, will) make.

Yaron makes an important point: data backups don't just protect against hardware failure, they protect against application failure too. This has been true for years, and there is nothing special about this new class of distributed, geo-redundant, super-resilient data storage services that makes our tried-and-tested procedures any less important. Hence the message is simple:

make sure your data is recoverable no matter what storage mechanism you are using

 

Having said all that, cloud storage services such as Windows Azure Table storage do present somewhat unique opportunities for protecting us against failures. Yaron proposed what he termed a "journal" system, which captures all PUT, POST and DELETE operations against a service:

I would like to have a journal that records every command issued against the system ... then when I find out about a data logic corruption bug caused by my front end I could at least try to figure out which of my users was likely to be affected by reviewing the journal

The very nature of Windows Azure Table storage (i.e. CRUD operations against a simple partition key/row key scheme) means that such a journaling system could (and arguably should) be provided by Windows Azure itself, rather than being something that application developers have to build themselves. The service could also provide the ability to "replay" all operations against a given partition key/row key since a specified point in time.
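To make the idea a little more concrete, here is a minimal sketch in Python of what journaling and replay against a partition key/row key scheme might look like. To be clear, nothing here is a Windows Azure API: JournaledTable, JournalEntry, operations_since and state_as_of are names I have made up for illustration, the store is a plain in-memory dictionary, and I am assuming that every write (PUT or DELETE) is appended to the journal before it is applied.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class JournalEntry:
    """One recorded write. All names here are hypothetical, not an Azure API."""
    timestamp: datetime
    operation: str                 # "PUT" (insert or update) or "DELETE"
    partition_key: str
    row_key: str
    entity: Optional[dict] = None  # payload for PUT, None for DELETE


class JournaledTable:
    """In-memory stand-in for a table keyed by (partition key, row key)."""

    def __init__(self):
        self.rows = {}     # current state
        self.journal = []  # append-only list of JournalEntry

    def _record(self, operation, partition_key, row_key, entity, when):
        when = when or datetime.now(timezone.utc)
        self.journal.append(
            JournalEntry(when, operation, partition_key, row_key, entity))

    def put(self, partition_key, row_key, entity, when=None):
        # Journal first, then apply the write to the current state.
        self._record("PUT", partition_key, row_key, dict(entity), when)
        self.rows[(partition_key, row_key)] = dict(entity)

    def delete(self, partition_key, row_key, when=None):
        self._record("DELETE", partition_key, row_key, None, when)
        self.rows.pop((partition_key, row_key), None)

    def operations_since(self, partition_key, row_key, since):
        """All journaled operations for one key at or after `since`."""
        return [e for e in self.journal
                if (e.partition_key, e.row_key) == (partition_key, row_key)
                and e.timestamp >= since]

    def state_as_of(self, partition_key, row_key, as_of):
        """Rebuild one entity by replaying its journal up to `as_of`."""
        entity = None
        for e in sorted(self.journal, key=lambda e: e.timestamp):
            if (e.partition_key, e.row_key) != (partition_key, row_key):
                continue
            if e.timestamp > as_of:
                break
            entity = dict(e.entity) if e.operation == "PUT" else None
        return entity


# Example: two writes and a delete, then a look back in time.
t0 = datetime(2009, 12, 23, 14, 0, tzinfo=timezone.utc)
table = JournaledTable()
table.put("customers", "42", {"name": "Ann"}, when=t0)
table.put("customers", "42", {"name": "Ann", "tier": "gold"},
          when=t0 + timedelta(minutes=5))
table.delete("customers", "42", when=t0 + timedelta(minutes=10))

print(table.state_as_of("customers", "42", t0 + timedelta(minutes=6)))
print([e.operation
       for e in table.operations_since("customers", "42", t0)])
```

Even though the entity has been deleted from the current state, the journal still lets us see what it looked like before the delete and which operations touched it. A real service would of course have to worry about durable storage, ordering across nodes and the cost of every extra write, none of which this sketch attempts.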

I have no idea whether the Windows Azure team, or indeed any other cloud storage provider, has contemplated offering such a service, but I for one believe that it would be very valuable. What say you?

@Jamiet

Published Wednesday, December 23, 2009 2:34 PM by jamiet

Comments

 

Simon Munro said:

While I agree that not-for-backup backups are needed in some cases, I'd be very careful about what to wish for. Creating a built-in journaling system into Azure storage starts making it look, feel and smell more like an RDBMS, which is not what we want. Once you start journaling, you question when it should be done, land up putting it before the commit and before you know it you have transaction logs, I/O bottlenecks and all of the baggage associated with SQL databases. While some suggestions may be good (e.g. being able to undelete an entire table), I would suggest that Azure storage should stay as NoSQL as possible, so that we can take advantage of what the architecture offers. If out-of-the-box functionality is needed, then use SQL Azure, which does a really good job and is tried, tested and familiar.

December 24, 2009 9:25 AM
 

Yaron Y Goland said:

A journal of table commands would be replayable. But when I used the term command journal I meant a journal of end user commands which I argue on http://www.goland.org/thelimitsofcommandjournals/ are probably not replayable.

December 24, 2009 12:51 PM
