This week at VMworld 2015, we announced Virtual SAN 6.1, the third generation of VMware’s hyper-converged infrastructure for VMs. With this release, we are introducing several major enhancements and new capabilities, including support for the latest generation of storage devices, new availability and disaster recovery features, and advanced management and serviceability tools. These capabilities build on already proven, best-of-breed performance, scalability and reliability to provide enterprise-class storage for all virtualized workloads, including Tier-1 production, mission-critical applications, and use cases demanding high availability.
In this article, I want to take a step back and reflect on where we stand with the product today and explore our vision for the future. Caveat: the forward-looking statements in this article do not reflect committed VMware products or features.
As we have discussed in the past, the main goal of VMware Virtual SAN, as it ships today, is to provide a cost-effective, enterprise-grade storage solution for typical virtualized environments. (The success of the product so far, with more than 2,000 customers running production deployments, is a testament to its strengths.)
What do those “typical” environments look like? They are designed around traditional “monolithic” applications: single-binary apps built to perform specific tasks, self-contained in their use of libraries and OS features for data access and networking. They are typically designed to run on stand-alone server platforms and, with the exception of a few clustered applications from the likes of Microsoft and Oracle, they are not designed as distributed, fault-tolerant software services.
VMware’s success is owed in large part to vSphere features such as vMotion, DRS, HA, FT and various data protection solutions, which are used to meet the business continuity needs of IT organizations running traditional applications that lack such features natively. All these vSphere features are designed around the notion of a manually managed pool of compute resources: the vSphere cluster. A shared storage backend is a prerequisite for these features and management workflows. That is exactly what VMware Virtual SAN provides using a hyper-converged architecture: it aggregates the local storage devices of all the hosts in a cluster and makes them appear as a single shared datastore, accessible by every host in the cluster.
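To make the idea concrete, here is a toy sketch (not VMware code; all host and device names are invented) of what hyper-convergence amounts to: each host contributes its local devices, and the cluster presents their union as one logical datastore.

```python
# Illustrative only: per-host local devices pooled into one shared datastore.
hosts = {
    "esx-01": ["ssd0", "hdd0", "hdd1"],
    "esx-02": ["ssd0", "hdd0", "hdd1"],
    "esx-03": ["ssd0", "hdd0", "hdd1"],
}

# One logical datastore, visible identically from every host in the cluster.
vsan_datastore = [f"{h}/{dev}" for h, devs in hosts.items() for dev in devs]
print(f"vsanDatastore backed by {len(vsan_datastore)} devices "
      f"across {len(hosts)} hosts")
```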
Storage Infrastructure Management (Today)
Given the primary use cases for VMware Virtual SAN today, we made some packaging decisions for the current product. First and foremost, we decided to make the Virtual SAN cluster identical to the vSphere cluster. This is not an inherent property of the technology, but it makes management and integration with vSphere much easier.
- The user does not have to configure and manage storage clusters separately from compute clusters, and then deal with all the complexity of which host accesses which Virtual SAN datastore. After all, it is exactly this kind of SAN management complexity, such as zoning and fencing, that we are tackling with Virtual SAN!
- It facilitates seamless integration with key management workflows such as upgrades, Maintenance Mode, HA, and over-provisioning for emergencies. Even basic tasks such as automatically claiming disks have simpler semantics.
- All existing vSphere APIs and management workflows just work! We just extend existing APIs or add a few new ones for storage purposes.
- The consistency of compute and storage cluster membership simplifies the monitoring and troubleshooting of one’s infrastructure. It facilitates re-use of existing mechanisms such as vCenter alarms and tasks.
Storage Consumption Model (Today)
The predominant way for VMs to consume storage today is in the form of Virtual SCSI (VSCSI) disks. This is what ensures that any legacy application can run on any storage backend without compatibility concerns. In fact, VMware and all other virtualization vendors have intellectual property for efficiently emulating SCSI controllers and devices in their hypervisors. Again, the Virtual SAN product has been packaged to support VMs and VSCSI disks, but with a very important new twist: fine-grained, policy-based storage provisioning and management (SPBM).
In Virtual SAN, every VM, and even every individual VMDK (VSCSI disk), is provisioned with its own individualized QoS properties. The user specifies, in the form of a policy, what they want, and Virtual SAN automatically decides how to distribute each VMDK across the cluster and what resources to assign to meet the user’s requirements: capacity, flash space for caching and performance, number of replicas for availability, number of stripes, and so on.
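As a minimal illustration, here is roughly what such a per-VMDK policy amounts to. The capability names below approximate those used by Virtual SAN storage policies (failures to tolerate, stripe width, and so on), but treat the exact spelling and the helper function as hypothetical:

```python
# A minimal, illustrative sketch of a per-VMDK storage policy.
# Capability names approximate Virtual SAN policy capabilities;
# the exact spelling and the provisioning helper are hypothetical.
gold_policy = {
    "hostFailuresToTolerate": 1,   # number of replicas = FTT + 1
    "stripeWidth": 2,              # stripes per replica across devices
    "proportionalCapacity": 20,    # % of capacity reserved up front
    "cacheReservation": 0,         # % of flash read cache reserved
}

def provision_vmdk(name, size_gb, policy):
    """Hypothetical helper: the caller states *what* it needs; Virtual SAN
    decides placement and resource assignment to satisfy the policy."""
    print(f"Provisioning {name} ({size_gb} GB) with policy: {policy}")

provision_vmdk("db-disk-01.vmdk", 200, gold_policy)
```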
There are many benefits to this approach, which we have talked about extensively in the past. The main point I want to make here is how Virtual SAN implements this fine-grained provisioning. Unlike VMFS and other similar products in the industry, Virtual SAN is not a clustered file system; it is an object-based storage system. A VM consists of a number of objects. Think of an object as a self-contained unit of data plus metadata, which may contain, for example, part or all of a file system, the contents of a VSCSI disk, and so on. In that sense, Virtual SAN is roughly similar to RADOS, the Ceph object backend.
OK, why is this important? Because Virtual SAN is a generic object-based storage platform. It is not built exclusively for VSCSI disks and today’s virtualization use cases. Instead, the ESXi software modules for VSCSI disks and VM metadata (the VMFS file system) are layered on top of the generic Virtual SAN interface. It is worth noting that this object interface and control plane are what we opened up and turned into the VASA Virtual Volumes specification.
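To make the object model concrete, here is a purely conceptual sketch, with invented class and field names, of how a VM decomposes into self-contained objects, each carrying its own policy:

```python
# Conceptual sketch only: how a VM decomposes into self-contained objects
# in an object store such as Virtual SAN or Ceph's RADOS. The class and
# field names are illustrative, not a real API.
from dataclasses import dataclass, field

@dataclass
class StorageObject:
    uuid: str
    kind: str                      # e.g. "vm-namespace", "vmdk", "swap"
    policy: dict                   # per-object QoS policy (see above)
    metadata: dict = field(default_factory=dict)

@dataclass
class VirtualMachine:
    name: str
    objects: list[StorageObject] = field(default_factory=list)

vm = VirtualMachine("web-01", [
    StorageObject("uuid-a1", "vm-namespace", {"hostFailuresToTolerate": 1}),
    StorageObject("uuid-b2", "vmdk", {"hostFailuresToTolerate": 1,
                                      "stripeWidth": 2}),
])
```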
All of this is by design. The plan is to use Virtual SAN to serve more use cases down the road, and this is where things get really interesting.
The Road Ahead
The IT world is in the midst of a transition driven by software. We are witnessing dramatic changes in the way applications are developed, deployed and managed.
For one, we are moving from a model with a large number of smallish applications towards cloud-scale applications that span hundreds or thousands of nodes and sometimes even multiple geographic locations. Next-gen apps (also called Cloud-Native Applications or 3rd-Platform Applications) are not monolithic; they are structured out of many instances of fine-grained microservices. Distribution, scaling and resource control are done at the granularity of the microservice. Fault tolerance and availability are often implemented by the application itself.
The type of resource pooling and DRS/HA services built around vSphere clusters is not applicable to these applications. In fact, vSphere clusters are not even relevant as management abstractions anymore. We are talking about a completely different management model, where the physical infrastructure is visible to and managed by the application itself, an approach that fits well with the DevOps model that comes hand-in-hand with this new generation of software.
These new use cases require fundamentally new data abstractions and storage management models.
When I think about these challenges, it helps me organize the problem space along two dimensions: Storage Infrastructure Management and Storage Consumption Model.
Let’s look at the requirements of each of these areas in turn.
Storage Infrastructure Management at Scale
Dealing with storage infrastructures that consist of tens of thousands of servers and hundreds of thousands of storage devices is not science fiction anymore. How do we manage such massive infrastructures in a scalable yet effective way?
The following are the key principles for management at scale:
- Tools that provide a bird’s-eye view of the infrastructure’s configuration and health, and allow the user to quickly and effectively “zoom in” on any problem areas and issues.
- Use big-data analysis (yes, the same tools that some of the applications running on those infrastructures utilize) to provide ANSWERS to the users, not just piles of data.
- Support dual interfaces:
  - A single-pane-of-glass UI and visualization tools to assist IT personnel with physical infrastructure troubleshooting and remediation.
  - Programmatic interfaces (APIs) for integration with automation code and application logic (the DevOps model).
The architecture of traditional infrastructure management services needs to be rethought drastically to accommodate these new storage requirements. To give you an idea of what I mean, consider the following simple arithmetic:
I assume any one of us would be more than willing to dedicate a tenth of a core on each host to running data analytics for infrastructure health monitoring, automated troubleshooting, trending, and so on. It is a very reasonable “tax” to pay for the benefits of an automated service. But for a “modest” infrastructure of 2,000 hosts, that adds up to 200 CPU cores! As a result, storage infrastructure management at scale requires an architecture where data collection and analysis are done in a distributed manner, with a centralized management service remaining the central point of control and the interface for data aggregation and presentation.
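The snippet below simply spells out that arithmetic: a fixed per-host analytics “tax” grows linearly with the size of the cluster, which is why the work cannot all land on one management server.

```python
# The back-of-the-envelope math from the paragraph above: a fixed
# per-host analytics "tax" grows linearly with cluster size.
PER_HOST_CORES = 0.1   # a tenth of a core per host

for hosts in (64, 500, 2_000, 10_000):
    print(f"{hosts:>6} hosts -> {hosts * PER_HOST_CORES:>7.1f} cores "
          "for infrastructure analytics")
# 2,000 hosts -> 200 cores: far too much to centralize, hence the
# distributed collection/analysis design with a central control point.
```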
Distributed, scalable storage management architectures are the way of the future. The very same principles of Cloud-Native Applications are at work here.
At VMware, we are busy designing a new generation of storage management services and tools that follow these principles. You will see some of those ideas applied for the first time this week at VMworld.
New Storage Consumption Models
Virtual SCSI emulation has served us well for many years of running legacy applications in VMs. With containers, however, whether they run in VMs or natively, one utilizes an OS image that is specially curated for the application’s needs. A vendor like CoreOS, Docker or VMware can include any driver they wish in the OS image, whether a lightweight block driver or a file system driver/client. Developers use the abstractions that make the most sense for their applications and package them together with the application in the container. There is no need for backward compatibility and legacy support.
It is in this context that the generic nature of Virtual SAN comes in handy. Virtual SAN is being extended as a platform to serve data through abstractions other than just VSCSI disks. And it can do so for traditional VMs or for containerized applications running on vSphere (look for more exciting announcements on this topic at VMworld). Virtual SAN may serve lightweight block drivers (perhaps using the NVMe protocol), native files, or even objects through a REST API. Different abstractions and protocols, all supported by a single platform with a single management experience and set of tools: a converged storage platform.
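As a purely hypothetical illustration (the endpoint, paths and headers below are invented for this sketch, not a shipping API), an object interface on such a platform might be consumed as simply as:

```python
# Hypothetical sketch of a REST object interface on a converged storage
# platform. The endpoint and header names are invented for illustration.
import requests

BASE = "https://vsan.example.com/objects"   # hypothetical endpoint

# Store an object; its storage policy rides along as metadata.
requests.put(f"{BASE}/images/app-layer-01",
             data=b"...container image layer bytes...",
             headers={"x-policy-failures-to-tolerate": "1"})

# Any host in the cluster can fetch the same object.
layer = requests.get(f"{BASE}/images/app-layer-01").content
```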
Files are especially important for container image management. The main requirement is scalable, fast creation and deployment of near-identical images. The lack of open-source file systems with robust cloning features has led the community to solutions such as “union” and “overlay” file systems, which have performance and manageability limitations. File system sharing is constrained to a single host, so deploying containers involves shipping individual image copies to every single host, an inefficient and time-consuming process.
Well then, what about a distributed file system designed to scale to infrastructures of thousands of hosts? A file system that can provide a practically unlimited number of clones (at file or volume granularity), created in O(1) time and accessible by any container and any VM on each of those hosts. If you think this is too good to be true, you should attend breakout session STO6050 – Virtual SAN: The Software-Defined Platform of the Future at VMworld, or visit the VMware Office of the CTO Lounge to find out how such a file system can be designed by taking advantage of Virtual SAN’s object architecture and its infrastructure management services.
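To see why clones can be created in O(1) time, consider this toy copy-on-write sketch (illustrative only, not Virtual SAN internals): cloning duplicates a small piece of metadata that references shared, immutable data blocks, and the data diverges lazily on first write.

```python
# Toy copy-on-write file: cloning shares immutable data blocks, so no
# data is copied at clone time. Illustrative only, not VSAN internals.
class CowFile:
    def __init__(self, blocks):
        self.blocks = blocks          # references to shared blocks

    def clone(self):
        # Only the reference list is duplicated; zero data is copied,
        # which is why even thousands of clones are cheap to create.
        return CowFile(list(self.blocks))

    def write(self, index, data):
        # Divergence happens lazily, one block at a time, on first write.
        self.blocks[index] = bytes(data)

base = CowFile([b"boot", b"libs", b"app"])
clones = [base.clone() for _ in range(10_000)]   # cheap: metadata only
clones[0].write(2, b"app-v2")                    # only clone 0 diverges
assert clones[1].blocks[2] == b"app"
```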
Looking forward to seeing you at VMworld!
Article originally posted on Virtual Blocks.