Owen on software

400 cores later – tales from clusters on AWS and Rackspace

10 January 2014

Pulled across from my old blog.

Earlier this month I was working on a project where we needed to launch a modest-sized cluster to process a CPU-bound compute task. As we didn’t have the computing power in-house to process the workload in a timely manner, we went to the public cloud. This demonstrated several key points that should always be considered when building clusters in the public cloud:

Cloud computing is an incredibly flexible model

Processing our workload in the cloud enabled us to complete a job in less than two hours which would otherwise have taken us around four days with our in-house compute capability.

The 400-core cluster we launched in the cloud cost us just over $15/hr to run. This demonstrates the incredible power of cloud computing that is driving the rapid adoption of AWS and other providers.

Not all clouds are created equal

There is a plethora of providers in both the IaaS and PaaS spaces. Within the IaaS space, Amazon Web Services (AWS), Rackspace Public Cloud and Digital Ocean are leading the charge.

The growth of AWS is simply staggering, and it has stolen a massive lead on all its competitors in terms of its feature-set and overall capability. I’m a huge fan of AWS.

That said, selecting a cloud platform is not a foregone conclusion. Cloud providers differ in the performance characteristics of the VMs and infrastructure they provide, and in how they price them.

Benchmark your tasks

As a result, it is always worth benchmarking any cluster-based compute task on a few providers to determine which is most cost-effective. These were the results of our benchmarking:

  • Amazon – c3.2xlarge cluster – just under $28/hour
  • Digital Ocean – 8 vCPU instance cluster – just over $16/hour
  • Rackspace – Performance 1 8 vCPU cluster – just under $15/hour

For this particular job Rackspace was almost half the price of Amazon, and just pipped Digital Ocean. The saving would only have been a matter of tens of dollars, as our job was relatively short-lived. However, the price differential can quickly stack up for longer-running jobs and bigger clusters.
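To make the "stacks up" point concrete, here is a rough sketch of the arithmetic in Python. The hourly rates are rounded from the "just under" and "just over" figures in the list above, so treat them as approximations. It also assumes every provider finishes the job in the same wall-clock time, which the benchmark does not claim. The function names are purely illustrative.

```python
# Approximate hourly cluster prices (USD), rounded from the benchmark list.
# Assumption: the job takes the same number of hours on every provider.
HOURLY_RATE_USD = {
    "Amazon (c3.2xlarge)": 28.0,
    "Digital Ocean (8 vCPU)": 16.0,
    "Rackspace (Performance 1, 8 vCPU)": 15.0,
}


def job_cost(hourly_rate, hours):
    """Cost of running the whole cluster for `hours` at `hourly_rate`."""
    return hourly_rate * hours


def cost_table(hours, rates=HOURLY_RATE_USD):
    """Return (provider, cost) pairs for a job, sorted cheapest first."""
    costs = [(name, job_cost(rate, hours)) for name, rate in rates.items()]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Our two-hour job, then a day-long and a week-long job for comparison.
    for hours in (2, 24, 24 * 7):
        print(f"--- {hours} hour job ---")
        for name, cost in cost_table(hours):
            print(f"{name:<36} ${cost:>9,.2f}")
```

With these rounded rates, the gap between the cheapest and most expensive cluster is roughly $26 for a two-hour job, but it grows to roughly $2,184 if the same cluster runs for a week.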

400 cores and $30 later

We were able to spin up our 400-core cluster in less than 10 minutes, flatline the CPUs for two hours and then spin it down. All for around $30. Brilliant.

Obviously, each job is different in terms of its performance characteristics, longevity, and data security and compliance requirements. So it really is a case of picking the appropriate tool, or cloud, for each job.

Tags: AWS

