HPC Clouds: A Bad Idea?

I learned a valuable lesson this week when my flight was cancelled and I was unable to deliver my talk “HPC Cloud: A Bad Idea” at the HPC Advisory Council Workshop in Hamburg.

The lesson? While provocative presentation titles may generate interest and curiosity among attendees, a title needs to stand on its own, as in this case, when the attendees were not able to hear my message. To help me clarify that message as best I can, the Council has offered to put a video on their site.

In the meantime, let me briefly explain my stance on HPC Clouds. First, use of cloud technologies for HPC is a great idea. In fact, I believe cloud computing represents a large and important shift within the IT universe and that the HPC community can and should benefit from this emerging capability.

However, it would be a bad idea for the HPC community to continue to plow forward, extending its existing cluster management stacks to create HPC-specific cloud deployments. Why? Because we (HPC) should be leveraging the vastly larger investments being made by the vendor community to define and deploy enterprise cloud infrastructure. Many of the infrastructure issues that confront Enterprise and HPC customers are shared concerns. And this set of shared concerns is increasing as Enterprise infrastructure and workload characteristics continue to evolve.

Consider, for example, the massively scaled infrastructures deployed by Google, Amazon, Yahoo, and the other large web properties. While their workload characteristics are different from those of HPC in many ways, there is much commonality in the problems of provisioning, managing, and monitoring these horizontally scaled resources. Similarly, with the adoption of parallel computing frameworks like Hadoop for Big Data analytics and the proliferation of scale-out data-oriented frameworks (e.g., GemFire Data Grid, Cassandra, MongoDB, memcached, etc.), we see interconnect becoming more important within the Enterprise as a component of application performance. Though interconnect in the Enterprise does not (yet) rise to the level of importance we see in HPC, we have seen evidence that RDMA could add significant value in scale-out Enterprise environments.

There is more to the argument, but let me leave it there for now. I will post a link to my video when available.

In the meantime: HPC Cloud — a Bad Idea. But HPC IN the Cloud — Absolutely!


