vNUMA: What it is and why it matters

In vSphere 5, we introduced vNUMA, which allows interested guest operating systems to see that they are running on a NUMA (Non-Uniform Memory Access) topology. For those not familiar, here is a one-diagram NUMA explanation.

mem.png

As you can see, in the UMA case, the cost of accessing a particular memory address is the same regardless of which socket your program is running on. In the NUMA case, however, it does matter: with memory attached directly to each socket, there can be significant performance penalties if an application generates large numbers of non-local memory accesses.
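To make the penalty concrete, here is a small illustrative sketch (not taken from the article) that uses Linux's libnuma to place one buffer on the running thread's local node and one on a remote node, then times a pass over each. On a multi-socket host the remote pass will typically run noticeably slower; the buffer size, stride, and file name are arbitrary choices for illustration.

/*
 * numa_demo.c -- a minimal, illustrative sketch of the local vs. remote
 * memory access penalty on a NUMA machine, using libnuma.
 * Build with:  gcc -O2 numa_demo.c -lnuma -o numa_demo
 * Requires a Linux system (or guest) with at least two NUMA nodes.
 */
#include <numa.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE (256UL * 1024 * 1024)   /* 256 MB working set */

/* Touch one byte per cache line across the buffer; return elapsed seconds. */
static double touch_buffer(char *buf, size_t size)
{
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (size_t i = 0; i < size; i += 64)
        buf[i]++;
    clock_gettime(CLOCK_MONOTONIC, &end);
    return (end.tv_sec - start.tv_sec) +
           (end.tv_nsec - start.tv_nsec) / 1e9;
}

int main(void)
{
    if (numa_available() < 0 || numa_max_node() < 1) {
        fprintf(stderr, "Need a NUMA system with at least two nodes\n");
        return 1;
    }

    /* Run on node 0, then compare memory placed on node 0 vs. node 1. */
    numa_run_on_node(0);

    char *local  = numa_alloc_onnode(BUF_SIZE, 0);
    char *remote = numa_alloc_onnode(BUF_SIZE, 1);
    if (!local || !remote) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }
    memset(local, 0, BUF_SIZE);    /* fault pages in on their home nodes */
    memset(remote, 0, BUF_SIZE);

    printf("local  node access: %.3f s\n", touch_buffer(local,  BUF_SIZE));
    printf("remote node access: %.3f s\n", touch_buffer(remote, BUF_SIZE));

    numa_free(local,  BUF_SIZE);
    numa_free(remote, BUF_SIZE);
    return 0;
}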

The ESX hypervisor has been NUMA-aware for quite some time, making memory and CPU allocation decisions based on its full understanding of the topology of the system’s physical hardware. However, prior to vSphere 5, a VM whose vCPUs spanned multiple physical sockets would believe it was running on a UMA system, and therefore its own NUMA-aware resource management features would not function correctly. Similarly, NUMA-aware application runtimes were also unable to optimize correctly in such circumstances.
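One easy way to see what topology a guest operating system has been given is to ask the guest kernel itself. The sketch below is a hedged example rather than anything from the article: it uses libnuma inside a Linux guest to list the nodes, memory sizes, and CPUs the kernel reports. With vNUMA, a VM spanning sockets shows several virtual nodes; without it, the same VM appears as a single flat node.

/*
 * show_topology.c -- illustrative sketch: print the NUMA topology that the
 * (guest) operating system sees, using libnuma.
 * Build with:  gcc -O2 show_topology.c -lnuma -o show_topology
 */
#include <numa.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0) {
        printf("Kernel reports no NUMA support (flat/UMA view)\n");
        return 0;
    }

    int max_node = numa_max_node();
    int max_cpu  = numa_num_configured_cpus();
    struct bitmask *cpus = numa_allocate_cpumask();

    for (int node = 0; node <= max_node; node++) {
        long long free_bytes;
        long long size = numa_node_size64(node, &free_bytes);

        printf("node %d: %lld MB total, %lld MB free, CPUs:",
               node, size >> 20, free_bytes >> 20);

        /* List the CPUs that belong to this node. */
        if (numa_node_to_cpus(node, cpus) == 0) {
            for (int cpu = 0; cpu < max_cpu; cpu++)
                if (numa_bitmask_isbitset(cpus, cpu))
                    printf(" %d", cpu);
        }
        printf("\n");
    }

    numa_free_cpumask(cpus);
    return 0;
}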

Now that vSphere supports so-called Monster VMs with up to 32 vCPUs and 1 TB of memory, it is more likely that VMs will span physical sockets and that NUMA will be a factor in application performance.

The recent VMware paper, Performance Evaluation of HPC Benchmarks on VMware’s ESX Server, presents a detailed analysis of the benefits of vNUMA using SPEC OMP and SPEC MPI as representative benchmarks. The figure below is taken from the paper and shows SPEC OMP results comparing virtual and native performance.

numa.png

The left-hand side of the diagram shows that the performance of the benchmarks within the SPEC OMP suite when using vNUMA is generally close to native performance for VMs having from four to 64 vCPUs. The right-hand graph illustrates the effect of exposing NUMA to the guest (vNUMA) versus not exposing it (Default) over a range of VM sizes. In all cases, significant performance improvements are achieved.

For more information about using vNUMA in vSphere 5, see the vSphere Resource Management Guide [PDF] and Performance Best Practices for VMware vSphere 5.0 [PDF]. We hope to have the full text of the vNUMA paper available on the external website soon.
