I am very excited that VMware had two papers accepted to the 5th Workshop on System-Level Virtualization for High Performance Computing (HPCvirt 2011), which will be held as part of Euro-Par 2011 in Bordeaux on August 29th. My only regret is that I won’t be able to attend, because it overlaps with VMworld 2011, where I am scheduled to give two presentations. But VMware engineers will be there to present their work — very cool!

Here are the abstracts for the papers accepted to HPCvirt 2011:

Virtualizing Performance Counters by Benjamin Serebrin and Daniel Hecht

Abstract: Virtual machines are becoming commonplace as a stable and flexible platform to run many workloads. As developers continue to move more workloads into virtual environments, they need ways to analyze the performance characteristics of those workloads. However, performance efforts can be hindered because standard profiling tools like VTune and the Linux Performance Counter Subsystem do not work in most modern hypervisors. These tools rely on the CPU’s hardware performance counters, which are not currently exposed to guests by most hypervisors. This work discusses the challenges of virtualizing performance counters, which arise from the trap-and-emulate method of virtualization and the time-sharing of physical CPUs among multiple virtual CPUs. We propose an approach to address these issues and provide useful, intuitive information about guest performance and the relative costs of virtualization overheads.
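The two challenges the abstract names — trapping guest counter accesses and time-sharing physical CPUs among vCPUs — can be illustrated with a toy model. This is a hypothetical sketch, not VMware's implementation: the idea is that the hypervisor saves and restores per-vCPU counter state at each context switch, so a guest profiler reading a counter sees only events attributable to its own vCPU.

```python
# Toy model (hypothetical, not VMware's design) of virtualizing one
# hardware performance counter across time-shared vCPUs.

class PhysicalCounter:
    """A single hardware counter that counts whatever runs on the pCPU."""
    def __init__(self):
        self.value = 0

    def tick(self, events):
        self.value += events


class VCPU:
    """Guest-visible counter state, saved/restored by the hypervisor."""
    def __init__(self, name):
        self.name = name
        self.saved = 0      # counter value accumulated while descheduled
        self.start = None   # physical counter snapshot at schedule-in


class Hypervisor:
    def __init__(self):
        self.pmc = PhysicalCounter()
        self.running = None

    def schedule(self, vcpu):
        # Save the outgoing vCPU's accumulated events...
        if self.running is not None:
            self.running.saved += self.pmc.value - self.running.start
        # ...and snapshot the physical counter for the incoming one.
        vcpu.start = self.pmc.value
        self.running = vcpu

    def rdpmc(self, vcpu):
        """Trap-and-emulate a guest counter read."""
        if vcpu is self.running:
            return vcpu.saved + (self.pmc.value - vcpu.start)
        return vcpu.saved


if __name__ == "__main__":
    hv = Hypervisor()
    a, b = VCPU("a"), VCPU("b")
    hv.schedule(a); hv.pmc.tick(100)   # 100 events while A runs
    hv.schedule(b); hv.pmc.tick(40)    # 40 events while B runs
    hv.schedule(a); hv.pmc.tick(10)    # 10 more events while A runs
    print(hv.rdpmc(a))  # 110: A's events only, despite B running in between
    print(hv.rdpmc(b))  # 40: B's events only
```

Without the save/restore at `schedule()`, vCPU A would read 150 here and attribute B's events to itself — the kind of misleading result the paper's approach is designed to avoid.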

Performance Evaluation of HPC Benchmarks on VMware’s ESX Server by Qasim Ali, Vladimir Kiriansky, Josh Simons, and Puneet Zaroo

Abstract: A major obstacle to virtualizing HPC workloads is concern about the performance loss due to virtualization. We will demonstrate that new features significantly enhance the performance and scalability of virtualized HPC workloads on VMware’s virtualization platform. Specifically, we will discuss VMware’s ESX Server performance for virtual machines with up to 64 virtual CPUs, as well as support for exposing a virtual NUMA topology to guest operating systems, enabling the operating system and applications to make intelligent NUMA-aware decisions about memory allocation and process/thread placement. NUMA support is especially important for large VMs, which necessarily span host NUMA nodes on all modern hardware. We will show how the virtual NUMA topology is chosen to closely match the physical host topology, while preserving the now-expected virtualization benefits of portability and load balancing. We show that exposing the virtual NUMA topology can lead to performance gains of up to 167%. Overall, we will show close-to-native performance on applications from the SPEC MPI V2.0 and SPEC OMP V3.2 benchmarks virtualized on a prototype of VMware’s ESX Server.
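The point of exposing a virtual NUMA topology is that the guest OS and applications can then discover it through the same interfaces they use on bare metal. As a minimal sketch, assuming a Linux guest, the standard sysfs node directories show which CPUs belong to each (virtual) NUMA node; everything here is generic Linux, nothing ESX-specific:

```python
# Minimal sketch (assuming a Linux guest): discover the NUMA topology
# the platform exposes by reading sysfs. Returns {} when no NUMA
# information is visible, e.g. on a VM without virtual NUMA.
import glob
import os

def numa_topology():
    """Map each visible NUMA node number to its CPU list string."""
    topology = {}
    for node_dir in glob.glob("/sys/devices/system/node/node[0-9]*"):
        node = int(os.path.basename(node_dir)[len("node"):])
        try:
            with open(os.path.join(node_dir, "cpulist")) as f:
                topology[node] = f.read().strip()
        except OSError:
            topology[node] = ""
    return topology

if __name__ == "__main__":
    topo = numa_topology()
    if topo:
        for node, cpus in sorted(topo.items()):
            print(f"node {node}: cpus {cpus}")
    else:
        print("no NUMA topology exposed")
```

A NUMA-aware runtime would use exactly this kind of information (typically via libnuma rather than raw sysfs) to pin threads and allocate memory on the local node — the placement decisions the abstract credits for the performance gains.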