Leveraging Kubernetes for Enterprise HPC
High-performance computing (HPC) is the use of parallel-processing techniques to solve advanced scientific and engineering problems. Examples of HPC applications are financial risk simulation, life-science modeling, weather forecasting, electronic design automation, digital-movie rendering, and machine learning/deep learning.
In a typical distributed HPC environment, workload managers or job schedulers are leveraged to schedule users’ jobs by allocating CPU, memory, or other resources, such as accelerators, on compute nodes. They also provide queuing, reservation, topology-aware scheduling, and monitoring features. Slurm Workload Manager, IBM Platform LSF, Univa Grid Engine, and Altair PBS Professional are a few popular HPC workload managers.
As microservices and containers become prevalent in enterprises, they are also making inroads into machine learning and similar HPC jobs. This trend blurs the line between traditional HPC and container technologies and has prompted exploration of Kubernetes, the de facto standard container management framework, for HPC.
Benefits and challenges
There are many benefits of using containers and Kubernetes for managing and running HPC applications. First, containerization is a powerful tool for packaging complex dependencies and enhancing reproducibility. Second, as enterprises are embracing Kubernetes as a secure and multi-cloud platform for application modernization, hosting both HPC and enterprise container workloads in a shared environment will simplify operations and reduce costs. Third, while HPC is not its key design target, Kubernetes essentially offers a shared resource pool with CPU, memory, and accelerators, just like workload managers.
That said, integrating Kubernetes with HPC workflows poses challenges. First, the base scheduling unit in traditional HPC workload managers is a batch simulation job, which runs to completion within a bounded period of time. The base scheduling unit in Kubernetes, however, is a pod (one or more containers), which normally runs continuously. Traditional HPC users normally submit a job as a batch script from a command-line interface via a workload manager. For Kubernetes users, a “job” is packaged into a container, published in a registry, described as a desired state in a YAML file, and scheduled as one or more pods on a Kubernetes cluster. This Kubernetes process is foreign to the majority of HPC users. Second, HPC workloads differ from typical enterprise container workloads, such as web applications: HPC applications are often compute-intensive and I/O-intensive, and they run in parallel on clusters. Because such jobs can consume a large share of a shared cluster, additional resource policies may need to be enforced to ensure all projects are adequately resourced.
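To make the contrast concrete, here is a minimal sketch of what "describing a job as a desired state" can look like when a run-to-completion workload is expressed as a Kubernetes `batch/v1` Job. It is an illustration only, not how JARVICE XE builds its workloads: the function name `build_batch_job`, the image path, and the command are hypothetical, and the `nvidia.com/gpu` resource assumes the NVIDIA device plugin is installed on the cluster. The manifest is emitted as JSON, which Kubernetes tooling accepts alongside YAML.

```python
import json


def build_batch_job(name, image, command, cpus, memory, gpus=0):
    """Return a Kubernetes batch/v1 Job manifest as a plain dict.

    All argument values are supplied by the caller; nothing here talks
    to a cluster. The result can be serialized and applied separately.
    """
    limits = {"cpu": str(cpus), "memory": memory}
    if gpus:
        # Extended resource name exposed by the NVIDIA device plugin
        # (assumption: that plugin is deployed on the target cluster).
        limits["nvidia.com/gpu"] = str(gpus)
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Do not automatically retry a failed simulation.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    # Run to completion instead of restarting forever.
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "command": command,
                            # Requests equal limits so the pod gets
                            # exactly the resources it asks for.
                            "resources": {
                                "requests": dict(limits),
                                "limits": limits,
                            },
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical image and command, for illustration only.
    job = build_batch_job(
        name="cfd-run",
        image="registry.example.com/hpc/solver:1.0",
        command=["./run_simulation.sh"],
        cpus=8,
        memory="32Gi",
        gpus=1,
    )
    print(json.dumps(job, indent=2))
```

Compared with a one-line batch submission to a workload manager, the user has to supply an image, a resource specification, and a restart policy, which is the extra burden the paragraph above refers to.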
Motivated by increasing customer interest in supporting HPC workloads with Kubernetes in their enterprise environment, we are working with Nimbix to demonstrate a solution for Enterprise HPC. This joint solution is based on the VMware Tanzu portfolio and Nimbix JARVICE XE. It offers the benefits of enterprise technologies and overcomes the challenges of integrating HPC workflows with enterprise-grade Kubernetes. It enables a seamless multi-cloud environment for managing HPC workflows and running HPC workloads, with minimal effort required from HPC end users to adapt to Kubernetes. This multi-cloud solution helps enterprise IT manage HPC in a modern way across diverse infrastructures, such as a mix of on-premises and off-premises environments.
VMware Tanzu + Nimbix HPC Kubernetes solution
Figure 1: Overview of Multi-Cloud Kubernetes Solution for Enterprise HPC, with VMware Tanzu portfolio and Nimbix JARVICE XE
Figure 1 shows a high-level overview of the different layers of the technology stack in the multi-cloud solution. In the solution, there are three main personas who manage and use this platform:
- Infrastructure administrator
- Nimbix system administrator
- HPC/ML application developers and users
The infrastructure administrator manages Kubernetes infrastructures, including privately owned or cloud resources, and maintains the lifecycle of different Kubernetes clusters. The Nimbix system administrator manages the JARVICE XE system running on top of Kubernetes clusters, grants accounts and resources to end users, and sets their resource limits. Application developers build HPC containers, create workflows, and share applications with other team members. Users can easily launch jobs and visualize job outputs using the JARVICE XE portal.
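As a rough illustration of per-team resource limits, the sketch below builds a Kubernetes `v1` ResourceQuota for a namespace. This is one Kubernetes-native way to cap resources and is shown only as an assumption for illustration; the mechanism JARVICE XE itself uses for user limits is not described here. The function name `build_resource_quota`, the namespace, and the numbers are hypothetical, and the GPU key assumes the NVIDIA device plugin's `nvidia.com/gpu` resource.

```python
import json


def build_resource_quota(namespace, cpu, memory, gpus, max_pods):
    """Return a Kubernetes v1 ResourceQuota manifest for one namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {
            "hard": {
                # Caps on the sum of resource requests across all pods
                # in the namespace.
                "requests.cpu": str(cpu),
                "requests.memory": memory,
                # Quota key for an extended resource (assumes the
                # NVIDIA device plugin is installed).
                "requests.nvidia.com/gpu": str(gpus),
                # Upper bound on the number of pods in the namespace.
                "pods": str(max_pods),
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical team namespace and limits, for illustration only.
    quota = build_resource_quota("team-cfd", cpu=64, memory="256Gi", gpus=4, max_pods=20)
    print(json.dumps(quota, indent=2))
```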
Now, let’s take a deeper look into each part of the solution from the perspective of each persona.
Multi-cloud environment managed by a single pane of glass
We can broadly group enterprise resources (which may or may not be dedicated to HPC usage) into two categories: on premises and off premises.
Typical on-premises resources include:
- Bare-metal clusters (while it’s uncommon in the enterprise, a large portion of HPC still remains unvirtualized)
- vSphere clusters in privately owned datacenters
Typical off-premises resources include:
- Public clouds, such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP)
- Hybrid cloud solutions, such as VMware Cloud on AWS (VMC), Azure VMware Solution, and Google Cloud VMware Engine, which extend on-premises vSphere environments to run natively on AWS, Azure, or GCP
- Certified VMware Cloud providers. The VMware Cloud Provider Program (VCPP) enables partners to leverage VMware products to offer services as public or hybrid clouds.
- HPC shared resources, such as XSEDE
In large organizations, resources can be globally distributed as a federation of clusters spanning different regions and zones. Tanzu Kubernetes Grid (TKG), aligned with open-source Kubernetes, is an enterprise-supported Kubernetes runtime. It can reliably deploy and run Kubernetes across your VMware private cloud and extend the same consistent Kubernetes runtime across public cloud and edge environments.
Tanzu Mission Control (TMC) is a centralized management platform for consistently operating and securing Kubernetes clusters at scale. It works with TKG or with other Kubernetes runtimes. It can provision clusters across environments such as vSphere with Tanzu and AWS, and it supports attaching any conformant Kubernetes cluster for centralized operations and global visibility. Tanzu Observability by Wavefront, integrated into TMC, provides monitoring as a service with metrics, traces, span logs, and analytics, along with granular customization and controls.
From the infrastructure admin’s perspective, TMC and Tanzu Observability offer a unified and scalable approach to manage all resources (either on premises or off premises) across the different Kubernetes runtimes (either TKG or any other Kubernetes) using a single pane of glass.
HPC application workflow using Kubernetes powered by Nimbix JARVICE XE and HyperHub
Nimbix JARVICE XE wraps advanced computing workflows to work with Kubernetes. It supports accelerated applications that take advantage of specialized hardware, including InfiniBand, GPGPU, and FPGA, on Kubernetes clusters. From the Nimbix system admin’s perspective, Nimbix JARVICE XE can be deployed on Kubernetes clusters spanning multiple regions, datacenters, and private or public clouds. HyperHub enables developers and HPC ISVs to onboard their own custom applications and workflows. For HPC users, those open-source or commercial applications are ready to run from the HyperHub catalog.
In this video, we demonstrate how the above solution works. We created a proof of concept using three endpoints: two on-premises clusters (a vSphere cluster and a bare-metal cluster) and one cluster in the cloud (VMware Cloud on AWS). SaaS-based Tanzu Mission Control managed all three clusters, and SaaS-based Tanzu Observability, integrated with Tanzu Mission Control, provided granular monitoring metrics. Nimbix JARVICE XE components were deployed on all three clusters.
Three user-level use cases are showcased from the Nimbix HyperHub service catalog:
- An Ansys Fluent batch job running on two compute nodes using MPI (Message Passing Interface)
- An interactive job in a visualization desktop running Ansys Icepak with a GPU
- A GPU instance running Jupyter Notebooks with TensorFlow for deep learning
All use cases ran as pods on a Kubernetes cluster specified by the Nimbix administrator.
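For readers less familiar with multi-node MPI runs like the first use case, the sketch below shows the general idea of spreading ranks across two nodes. It assumes an Open MPI style launcher (`mpirun` with `-np` and `--hostfile`, and hostfile lines of the form `host slots=N`); it is not how Ansys Fluent or JARVICE XE launches jobs, which the platform handles for the user. The host names, slot count, and `./solver` program are hypothetical.

```python
def build_hostfile(hosts, slots_per_host):
    """Return Open MPI hostfile text: one 'host slots=N' line per node."""
    return "".join(f"{host} slots={slots_per_host}\n" for host in hosts)


def build_mpirun_command(hosts, slots_per_host, program, hostfile_path="hostfile"):
    """Return an mpirun argument list that starts one rank per slot."""
    total_ranks = len(hosts) * slots_per_host
    return ["mpirun", "-np", str(total_ranks), "--hostfile", hostfile_path, *program]


if __name__ == "__main__":
    # Two hypothetical compute nodes (pods, in the Kubernetes case),
    # each contributing four MPI ranks.
    nodes = ["node-0", "node-1"]
    print(build_hostfile(nodes, 4), end="")
    print(" ".join(build_mpirun_command(nodes, 4, ["./solver"])))
```

On Kubernetes, each "node" in the hostfile would typically correspond to a pod, so a two-node MPI job implies that both pods are scheduled and running before the launcher starts.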
In this work, we focus on re-architecting HPC using Kubernetes by leveraging the VMware Tanzu portfolio and Nimbix JARVICE XE. This is a radically different approach from traditional HPC job scheduling and workload management. In conjunction with our virtualized HPC reference architecture, this approach offers an alternative way for enterprise IT and HPC teams to support HPC in a modernized multi-cloud Kubernetes environment.