
Near-Cloud VMware Cloud Stack: the Middle Ground for the Hyperscaler Journey

Looking to expand your on-premises compute environment to the cloud without giving away control to a cloud hyperscaler? Consider a near-cloud metal provider. Not only will you benefit from the added flexibility of a metal environment at your fingertips, but you will also be able to expand and contract your environment without giving up control of your most important asset: your proprietary data. Because you own and control your data, the near-cloud placement lets you run some compute applications in a hyperscaler (with very low latency) without ever having to move your data back on premises at some later point. This post explores the mechanics and benefits of this type of environment, as well as the optimal use cases.

A near-cloud provider is a global, interconnected, multi-tenant networking/compute/storage bare-metal environment. Since a metal environment usually adopts an “API-first” strategy, it is relatively easy for an enterprise to connect with, especially for on-the-fly set-up/configuration/lifecycle-type events. In addition, a near-cloud metal environment has superior network connectivity, not only across the provider’s own regions and metros, but also to all of the major hyperscalers (or other areas of interest). Figure 1 depicts, at a high level, the connectivity from an enterprise to Platform Equinix (a provider of basic metal services), then to a fault-tolerant environment hosted via the Equinix Fabric Cloud exchange, and on to a redundant set of Kubernetes environments running on any combination of the hyperscalers.

Figure 1
Source: https://metal.equinix.com/media/pages/images/9fd81843ad7f202f26c1a174c7357585/PhOL-pure.storage.on.platform.equinix.jpg

What do these environments look like?

In Figure 1, PaaS (Pure as-a-Service) is shown as the storage provider, as an alternative to a native VMware solution (such as vSAN). The figure depicts how an enterprise could expand its local compute/storage/networking environment (hosted either in a private on-premises datacenter or via colocation at Equinix) by extending it to Platform Equinix running in its local region/metro. In this newly created metal environment (which could be hosting VMware’s Cloud Stack), an enterprise could further enhance its environment, for example by hosting data-service backends that are fully replicated with the on-premises databases. Thanks to the low-latency connectivity of the IBX network from Platform Equinix, data could flow easily between Equinix Metro instances within Equinix regions.

Expanding the Equinix Metal environments further and utilizing the fabric-manager portion of the Equinix solution, the Metal environment can be opened up to one or more environments that are running at any of the hyperscalers. In this example, one can easily imagine fully redundant Tanzu Kubernetes environments running customer applications across multiple cloud providers. The near-cloud advantage gained with the Equinix Metal environment is that all of the customer data consumed by the applications running on the Tanzu Kubernetes environments exists only in the on-premises data center or within the Equinix Metal environment. If it becomes necessary to evacuate a hyperscaler in the future, reconciling golden data will not be a problem: the source was never split, and it never fully existed within the hyperscaler. Furthermore, because the Equinix Fabric provides very low-latency connectivity, an enterprise gets connectivity to all of the major hyperscalers in one place (as opposed to having to build connectivity to each of them separately).

What to expect from a near-cloud metal provider

Automation is the key to making the metal environment work. You can build automation to plan, apply, and destroy the environment (using Terraform terminology). Since metal environments are completely API-driven, they allow more dynamic and on-demand-driven environments. For most enterprises, getting additional capacity “on the floor” is a lengthy exercise. Capacity is usually forecasted six-plus months ahead of time and the equipment cannot be stacked and racked until it arrives from the vendor. Testing and verification can be lengthy and cumbersome. Even though an enterprise can negotiate to have hardware and software preconfigured, it is still a slow process compared to pure metal environments. With metal providers, capacity can be available in hours, as opposed to months.
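To make the plan/apply/destroy cycle concrete, here is a minimal Python sketch of the idea. It is not Terraform and not any provider's real API: `FakeMetalAPI`, `Host`, and the host names and sizes are hypothetical stand-ins, and the "provider" is just an in-memory dictionary. The point is only to show how a desired state is compared against what exists and then reconciled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Host:
    """A desired bare-metal host. Both fields are illustrative assumptions."""
    name: str
    size: str  # hypothetical server size label, e.g. "large"


@dataclass
class FakeMetalAPI:
    """In-memory stand-in for a metal provider's API (hypothetical)."""
    hosts: dict = field(default_factory=dict)

    def create(self, host: Host) -> None:
        self.hosts[host.name] = host

    def delete(self, name: str) -> None:
        del self.hosts[name]


def plan(api: FakeMetalAPI, desired: list) -> dict:
    """Compare the desired hosts with what exists and list the changes."""
    want = {h.name: h for h in desired}
    # Create anything missing or whose spec differs from what exists.
    create = [h for n, h in want.items() if api.hosts.get(n) != h]
    # Delete anything unwanted or about to be replaced with a new spec.
    delete = [n for n, h in api.hosts.items() if want.get(n) != h]
    return {"create": create, "delete": delete}


def apply(api: FakeMetalAPI, desired: list) -> dict:
    """Carry out the plan: remove stale hosts first, then create new ones."""
    changes = plan(api, desired)
    for name in changes["delete"]:
        api.delete(name)
    for host in changes["create"]:
        api.create(host)
    return changes


def destroy(api: FakeMetalAPI) -> dict:
    """Tear everything down by applying an empty desired state."""
    return apply(api, [])


if __name__ == "__main__":
    api = FakeMetalAPI()
    cluster = [Host("esx-01", "large"), Host("esx-02", "large")]
    apply(api, cluster)      # capacity comes online
    print(sorted(api.hosts))
    destroy(api)             # capacity is handed back
    print(api.hosts)
```

Running `plan` again right after `apply` returns no changes, which is the property that makes this style of automation safe to re-run.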

Most metal equipment providers follow the hyperscalers’ pricing models, offering equipment that is reserved, on-demand, or available via “spot pricing.” Since the customer is responsible for operations from the OS level up, all of the same policies and configurations used on premises can be applied in the metal environment. Additionally, once the environment is handed over to a customer, the metal provider no longer has access to the software or the applications running in it. Throughout this entire process, the metal provider remains responsible for coordinating firmware upgrades on the hardware and constantly monitors for anomalies at the hardware level.

Metal use cases

Your imagination is the only limit when it comes to the utility of your metal environment. Consider these ideas:

  • Set up a remote datacenter in a near-cloud metal environment. All of the controls can reside within your environment, while the remote compute capacity would reside at the metal provider.
  • Set up a stretched vSAN to the local metro datacenter to ensure fault tolerance.
  • Use a metal provider to expand your capacity.
  • Create a disaster-recovery site, in case something bad happens on the primary datacenter side. When one site fails, automation (Terraform, for example) could bring a new environment online. If the application data was already available (via a previously set up vSAN), it would not take an excessive amount of time to fail over the environment.
  • Add swing capacity. If you don’t have enough spare capacity to take a host down, you could bring additional capacity online to facilitate a cluster upgrade. This is normally needed when N+1 does not leave enough slack should a host go offline due to a hardware failure.
  • Explore and investigate new hardware capabilities. Setting up environments to test new storage or networking options or processor types is costly and time-consuming. Metal environments tend to evolve fast and have a good mix of new technology. It’s mighty convenient to be able to test new technology (such as storage or smart NICs) before committing to a purchase.
  • Use a metal provider for “build, move, and burn.” In this use case, instead of updating software in a cluster in place, you’d mimic what a lot of the hyperscalers do: build a new environment with the latest patches, versions, etc., and move the workloads onto it. Once the workloads have successfully been moved, you’d “burn” (destroy) the old environment and return its capacity to the metal provider.
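The swing-capacity use case above comes down to simple arithmetic, which the following Python sketch illustrates. The function name, its parameters, and the example cluster sizes are assumptions made for illustration; demand is expressed in "hosts' worth" of load, which is a simplification of real capacity planning.

```python
import math


def swing_hosts_needed(total_hosts, demand_in_hosts,
                       hosts_in_upgrade=1, failures_to_tolerate=1):
    """Return how many temporary hosts are needed to upgrade safely.

    total_hosts          -- hosts currently in the cluster
    demand_in_hosts      -- workload demand, measured in hosts' worth of capacity
    hosts_in_upgrade     -- hosts taken down at once during the upgrade
    failures_to_tolerate -- hardware failures to survive while upgrading
    """
    # Capacity left if the upgrade and the tolerated failures coincide.
    usable = total_hosts - hosts_in_upgrade - failures_to_tolerate
    shortfall = demand_in_hosts - usable
    # Hosts come in whole units, so round any shortfall up.
    return max(0, math.ceil(shortfall))


if __name__ == "__main__":
    # A 4-host cluster sized N+1 for 3 hosts' worth of load.
    print(swing_hosts_needed(total_hosts=4, demand_in_hosts=3))
    # The same load on a 5-host cluster.
    print(swing_hosts_needed(total_hosts=5, demand_in_hosts=3))
```

In the first case, taking one host down for the upgrade uses up the only spare, so a hardware failure at that moment would leave the cluster one host short; that gap is what the swing capacity from the metal provider covers.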

What’s next?

Stay tuned for my next post, where I will demonstrate how to go from zero to vSphere in 45 minutes using a vRealize Metal Template for automation.
