High Performance Computing Conference: VMware at SC’17 in Denver next week

VMware will once again have a demo kiosk inside the Dell EMC booth at Supercomputing, which is being held next week in Denver. We will showcase the benefits of virtualizing on-premises HPC environments while maintaining good performance across an array of workloads.

We have a lot to talk about! Here is a preview…

New Pricing for HPC & Big Data

We have created vSphere Scale-out, a new edition of vSphere that offers the same high performance as our other vSphere editions, with a feature set chosen to address common HPC requirements and at a price point quite different from that of our Enterprise editions.

GPGPU Computing with Shared NVIDIA GPUs

NVIDIA and VMware have long partnered to deliver Virtual Desktop Infrastructure (VDI) solutions that include hardware-accelerated graphics capabilities for each virtual desktop. At the heart of this approach is NVIDIA GRID vGPU technology, which allows a single physical GPU to be shared among multiple virtual machines for graphics acceleration.

With NVIDIA GRID 5.0, NVIDIA has expanded the vGPU concept to include GPGPU, meaning that it is now possible to share a single physical GPU among multiple VMs for running CUDA (compute) applications. We've talked with a number of customers who are quite interested in this new capability for supporting a range of users, many of whom do not need an entire high-end GPU for their applications. Of course, it is still possible to use passthrough (VMDirectPath I/O) to assign an entire GPU, or multiple GPUs, to a single VM for high-end users who require large amounts of GPGPU compute power.

We have tested this new capability in the configuration shown below, in which a single P100 GPU is shared between two virtual machines running TensorFlow machine learning applications.
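For those curious about what the attachment looks like under the covers: a vGPU appears to the VM as a shared PCI device, which shows up as a short stanza in the VM's .vmx file. The sketch below is illustrative only; treat the exact keys and the grid_p100-8q profile name (one of the P100 profiles, which divides the 16 GB card into two 8 GB vGPUs) as assumptions to verify against NVIDIA's vGPU documentation for vSphere.

```
# Illustrative .vmx fragment for a vGPU-enabled VM (verify against NVIDIA docs)
pciPassthru0.present = "TRUE"
pciPassthru0.vgpu = "grid_p100-8q"
```

With a profile like this assigned to each of two VMs, both can run CUDA workloads concurrently on the same physical P100.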

Beyond the release of NVIDIA GRID 5.0, VMware and NVIDIA have been jointly developing new capabilities that were recently previewed at VMworld in Las Vegas and Barcelona. Specifically, we are working together to enable both save/restore of vGPU-enabled VMs and live migration (vMotion) of such VMs. These two new capabilities will allow GPGPU to be integrated more seamlessly into the virtualized data center, letting customers take full advantage of the flexibility afforded by virtualization.

Latest MPI Performance Results

Over the last several years we have been measuring MPI performance on vSphere as VMware R&D has continued to reduce latency overheads for InfiniBand devices. Until fairly recently our scale testing had topped out at eight nodes, but since beginning our very fruitful collaboration with the Dell EMC HPC Innovation Lab in Austin, we have been able to extend our testing to 32 nodes with EDR InfiniBand, running up to 640 MPI ranks across a variety of open-source and commercial HPC applications. A summary of some of our 16-node testing is shown in the diagram below. We will have more detailed test results to share at SC'17, including strong-scaling tests run at up to 32 nodes that stress the interconnect and expose any remaining inefficiencies as we continue to drive overhead out of the platform.
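For readers less familiar with strong-scaling methodology: because the problem size is held fixed as nodes are added, results are usually summarized as speedup and parallel efficiency, E(N) = T(1) / (N · T(N)), and any virtualization overhead shows up as efficiency lost at higher node counts. The sketch below illustrates the arithmetic with purely hypothetical wall-clock times, not our measured results.

```python
def speedup(t_base, t_n):
    """Speedup of an N-node run relative to the baseline run."""
    return t_base / t_n

def parallel_efficiency(t_base, n_base, t_n, n_nodes):
    """Strong-scaling efficiency: achieved speedup divided by ideal speedup."""
    ideal = n_nodes / n_base
    return speedup(t_base, t_n) / ideal

# Hypothetical wall-clock times (seconds) for a fixed-size problem.
timings = {1: 1000.0, 8: 140.0, 16: 80.0, 32: 50.0}

for n in sorted(timings):
    s = speedup(timings[1], timings[n])
    e = parallel_efficiency(timings[1], 1, timings[n], n)
    print(f"{n:>2} nodes: speedup {s:5.1f}x, efficiency {e:6.1%}")
```

Efficiency below 100% at large node counts is expected even on bare metal; the interesting question for virtualized HPC is how closely the virtual efficiency curve tracks the bare-metal one.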

Self-Provisioning of Machine Learning & HPC Clusters

In addition to working with traditional HPC applications, we've been spending more time on machine learning workloads and on how best to enable easy access to GPGPU-accelerated virtual machines for data scientists and others interested in running ML in the vSphere environment. The architecture of the solution we've pursued is shown in the figure below.

At SC'17, we'll be showing attendees the steps needed to add an MLaaS (Machine Learning as a Service) capability to vRealize Automation, our private cloud layer. Specifically, we will show how to create templates and convert them into private cloud blueprints that researchers can use to self-provision single or multiple VMs with TensorFlow and other ML components pre-installed, and with GPGPU acceleration pre-enabled. The workflow we are demonstrating is general and can easily be adapted to create other types of blueprints – for example, blueprints that support other ML frameworks, or blueprints that create more generic virtualized HPC clusters.


We will be showcasing a variety of topics and technologies in Denver next week, and we hope you will stop by the Dell EMC booth to learn more about how virtualization can improve the flexibility and agility of your on-premises HPC environment while offering performance comparable to bare metal for a wide range of HPC applications.