THE SQL Server Blog Spot on the Web


Jamie Thomson

This is the blog of Jamie Thomson, a data mangler in London working for Dunnhumby

The One Billion Node machine plus one

I have been reading Jas Dhalliwal’s blog post series on the One Billion Node Machine (Part 1, Part 2) with interest; in it he talks about the notion of a network of machines acting in unison and the opportunities that presents. In his own words:

I describe a possible evolution of Public Clouds into Planetary IT complexes, where billions of nodes are available planet-wide to work on some of the most challenging problems of our times – what I affectionately called the Billion-Node Machine

None of this is particularly ground-breaking thinking (the idea of the internet as a single computing resource is nothing new); nonetheless I enjoyed Jas’ take on it, in particular the tagline “The Billion Node Machine”, and that prompted some thoughts of my own that I’d like to present here.

Regular readers will know that I have an interest in the emergent world of cloud computing and, given my Microsoft affiliations, I have a particular interest in Windows Azure and SQL Azure. The recent announcement that Microsoft will soon be selling the Windows Azure Platform Appliance (literally a data centre in a box) is a very interesting move and could have significant repercussions for the industry. The Azure appliance is characterised by:

  • The physical hardware residing at a client site
  • The software platform still being managed by Microsoft

That is unique; I know of no-one else that really has a comparable cloud offering (if you would like to put me straight on that statement then please feel free in the comments).

The benefits to Microsoft of selling the Azure Appliance are clear:

  1. They can get the Azure platform to proliferate in countries where it wouldn’t have made economic sense for them to build their own data centre
  2. They can sell Azure to customers that wish to keep their data in-house
  3. They can leverage their immense partner ecosystem to help Azure proliferate

All fascinating stuff, but what interests me most about this announcement is the simple fact that Microsoft will demonstrate an ability to manage a software infrastructure that doesn’t reside in their own data centres, and that too is a new paradigm. The question I then ask is:

If that software infrastructure can live on servers in big data centres why can it not also live on my laptop?

You can see where I’m going with this, right? Why shouldn’t I be able to install Azure onto my own home computer and let that machine/virtual machine/whatever live as a node in the Azure cloud? Why can’t I “rent” usage of my home computer to anyone that is willing to pay for it? Can the cloud act as a Mechanical Turk for computing power?

There are loads of advantages in this scenario for the cloud provider, the customer and the home computer owner alike:

  • It allows the cloud provider to call on additional computing power should it be needed on hardware that they don’t have to manage
  • It allows the cloud platform to proliferate further into corners of the world where the cloud provider is not going to build a data centre and (in the case of Microsoft) where partners are not going to stick an Azure appliance in their car park.
  • It provides the customer with a variable cost model. “Want compute power on the cheap? Why not run it on home computers around the world?”
  • It provides an income stream to the man on the street. “Want to do something with those spare CPU cycles in between bouts of playing Farmville and watching YouTube videos? Why not rent them out to invisible customers around the world?”
  • For Microsoft specifically this could provide a differentiator that could prop up sales of its aging Windows product or perhaps even introduce a new business model where Windows is subsidised by “loaning” it to Microsoft for use with Azure.

No longer is this Jas’ One Billion Node Machine; I plug my laptop in and it’s The One Billion Node Machine plus one!!!

Does this sound like a realistic future scenario? Perhaps this is why celebrated Windows Technical Fellow Mark Russinovich has recently moved into the Windows Azure team. There are obvious security implications in play with the model I describe here but that’s a discussion for another blog post; for now let’s just consider the opportunities of enabling such an infrastructure.

Jas points out in Planetary Cloud: The Rise of the Billion-node machine that this crowdsourcing of IT resources is already up and running and he cites SETI@Home as one such example. I doubt it will be long before the big cloud providers jump onto this model.

Let me know your thoughts.


Published Sunday, August 1, 2010 10:34 PM by jamiet



David Chou said:

Hi Jamiet, interesting thoughts! While I agree that the One Billion Node Machine is interesting and does have many practical uses, it follows more of a grid computing model (as does SETI@Home) as opposed to cloud computing, and the two are still quite different models.

Specifically for Windows Azure, the Appliance strategy will work much more effectively than a heterogeneous and geographically dispersed grid of unused CPU cycles in personal computers. Some more detailed thoughts below.

Windows Azure doesn’t just run on any hardware. It runs on specialized hardware (as specified by Microsoft), uses a specialized hypervisor for virtualization and a specialized version of Windows Server, and is centrally managed by the Azure fabric controller. These specializations are necessary to support the heuristics required to allocate resources according to customer demand, and to ensure performance, reliability and scalability, by assuming a homogeneous infrastructure.

Comparatively, the cost savings from using a massive grid of under-utilized, heterogeneous PCs likely won’t be significant enough to outweigh the gains from the higher level of control and utilization on homogeneous commodity servers. In addition, we would be further compromising other aspects such as raw compute performance, network latency (over residential ISP networks?), reliability, security, etc.

Also, SETI@Home works because of its simple application model: loosely coupled, distributed and parallel batch job processing (jobs that are logically and contextually independent, have no strict SLAs to meet, follow basic input/output patterns by downloading data to execute on common runtimes, and involve minimal concerns with network latency and minimal collaboration between distributed nodes or with the central repository). That model can’t really be applied to the full range of application models that exist today.
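To make that concrete, the loosely coupled batch model can be sketched in a few lines of Python (the work units and function here are purely illustrative, and worker threads stand in for remote volunteer machines):

```python
# Illustrative sketch of loosely coupled, embarrassingly parallel batch
# processing in the SETI@Home style: independent work units, no
# communication between nodes, results returned to a central collector.
from concurrent.futures import ThreadPoolExecutor

def process_unit(unit):
    # Each work unit is logically independent: no shared state and no
    # coordination with other nodes while it runs.
    return sum(x * x for x in unit)

def run_batch(work_units, node_count=4):
    # The central collector hands out units and gathers results; the
    # workers need only a common runtime capable of running process_unit,
    # and the order in which units complete does not matter.
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        return list(pool.map(process_unit, work_units))

print(run_batch([range(0, 10), range(10, 20), range(20, 30)]))
# -> [285, 2185, 6085]
```

Anything that fits this shape parallelizes trivially; anything needing chatter between nodes, or tight latency guarantees, does not.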

Jas Dhalliwal assumed that a 1000-core machine is an advantage. In reality, 1000 single-core machines will outperform one 1000-core machine, be more reliable and more scalable, and cost far less. And in my opinion (respectfully) his thesis may still be realized one day, but there are a lot more challenges we need to solve in order to get there.

If we lived in a world where all software could be boiled down to streams of bits that can be chopped up and processed separately and independently by the same runtime, and where network latency were negligible, then this model might accommodate more types of work. But while that would be fantastically interesting to engineer, would it be cost-effective and compare favorably (when it won’t run as fast, as reliably, etc.) with the current class of cloud computing providers, which are already close to being free?

There are a lot more details around this, but the fundamental point is that the grid computing model doesn’t provide a feasible approach for the Billion Node Machine (or one that can support more types of compute workloads). On the other hand, the cloud computing model Azure implemented actually strikes a good balance between a planetary grid of 1000-core machines and a grid of billions of personal computers, and is on its path to becoming a real, usable “billion node machine”. :)

Just my thoughts. Best, -David Chou (Microsoft)

August 12, 2010 7:18 AM

jamiet said:


I have nothing to add to that other than to thank you for providing such a well-thought-out response. It really is much appreciated, especially as you are clearly much closer to the subject matter than I am; it’s definitely given me some food for thought!

Thanks again


August 12, 2010 10:38 AM

stefan said:

I'd be happy to have .NET threads (.NET 4.0 Tasks) capable of running on many remote boxes, with the .NET framework handling network deployment, workload balancing, task queueing, RAS and so on, on a grid-like system that could use all networked power (PCs, servers, even mobile) to run all sorts of parallel tasks ...

It's not difficult to do at all.

So one could hook up his (private or company) hardware, run more stuff much faster, and use up all the untapped power of the idle PCs.
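Roughly the shape I mean, sketched in Python rather than .NET (the names are made up for illustration, and worker threads stand in for remote machines pulling from a shared task queue):

```python
# A toy "grid" scheduler: a shared queue of independent tasks that any
# number of nodes pull from; idle nodes simply take more work.
import queue
import threading

def run_grid(tasks, node_count=4):
    # tasks: a list of (fn, arg) pairs, assumed independent of each other.
    task_queue = queue.Queue()
    results = {}
    lock = threading.Lock()

    def node():
        # Each node (a thread standing in for a remote machine) loops,
        # pulling whatever task is next until the queue is drained.
        while True:
            try:
                task_id, fn, arg = task_queue.get_nowait()
            except queue.Empty:
                return
            value = fn(arg)
            with lock:
                results[task_id] = value

    for i, (fn, arg) in enumerate(tasks):
        task_queue.put((i, fn, arg))
    nodes = [threading.Thread(target=node) for _ in range(node_count)]
    for t in nodes:
        t.start()
    for t in nodes:
        t.join()
    # Reassemble results in submission order.
    return [results[i] for i in range(len(tasks))]

print(run_grid([(lambda x: x + 1, 1), (lambda x: x * 2, 3), (lambda x: x ** 2, 4)]))
# -> [2, 6, 16]
```

A real cross-machine version would of course need the network deployment, serialization and fault handling mentioned above; that is where all the hard work lives.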

August 12, 2010 7:49 PM
