In the blog post "vNUMA: What it is and why it matters," I promised to share the full research paper with you once it became available. It can now be downloaded here.
If you find the paper useful, please click to give it a star rating on that same page.
As a reminder, here is the paper abstract:

A major obstacle to virtualizing HPC workloads is a concern about the performance loss due to virtualization. We will demonstrate that new features significantly enhance the performance and scalability of virtualized HPC workloads on VMware’s virtualization platform. Specifically, we will discuss VMware’s ESXi Server performance for virtual machines with up to 64 virtual CPUs as well as support for exposing virtual NUMA topology to guest operating systems, enabling the operating system and applications to make intelligent NUMA aware decisions about memory allocation and process/thread placement. NUMA support is especially important for large VMs which necessarily span host NUMA nodes on all modern hardware. We will show how the virtual NUMA topology is chosen to closely match physical host topology, while preserving the now expected virtualization benefits of portability and load balancing. We show that the benefit of exposing the virtual NUMA topology can lead to performance gains of up to 167%. Overall, we will show close to native performance on applications from SPEC MPI V2.0 and SPEC OMP V3.2 benchmarks virtualized on our prototype VMware’s ESXi Server.