A year ago I predicted that 2013 would be the year when Software-Defined Storage (SDS) would take off, with credible products either being released for the first time or reaching a level of maturity at which they could be taken seriously as an alternative to traditional storage. Hype aside, SDS does reflect a major shift in how storage is managed and consumed in the data center. See my 2013 predictions blog for a definition of SDS. Also, look for a forthcoming blog by my colleague Richard McDougall that discusses the disruption of converged storage infrastructure and the industry trends that are causing it.
Indeed, a number of pioneers in the field, mostly products by startups, have begun gaining good traction with customers. Moreover, VMware released Virtual SAN (VSAN), currently in public Beta with GA planned for H1 2014. All of these products use some form of scale-out software architecture running on commodity hardware. They pool and abstract storage hardware, often including solid-state storage devices as well as magnetic disks. Some of them are designed for virtualized environments, with different degrees of integration. Some target enterprise use cases, while others target the needs of service providers with very large farms of white-box hardware.
Arguing about the pros and cons of each product’s architecture makes for a great discussion topic over a beer and perhaps warrants a separate blog article. However, what will make or break these products in the market is management. By management, I refer to the administrative actions required to a) deploy, configure, troubleshoot and maintain physical storage assets and b) provision, monitor and reconfigure the abstract storage entities consumed by applications.
Traditionally, a storage system instance involved 10s to 100s of logical units (LUs) accessed by a handful of servers. Add to the mix data management services, such as snapshots, backup and remote replication for disaster recovery, and they make for a full-time job for a storage administrator. Such administrators are IT professionals with deep storage expertise, often focused on a specific vendor’s products, who handcraft storage objects (e.g., volumes, file systems) to meet the quality of service requirements of different applications and workloads.
With the advent of SDS products, we now have storage systems with 100s to 1000s of nodes and up to millions of individual storage objects, whether virtual machine disks, blob data objects, or even individually managed files. This is not only the case in the data centers of the Googles and Amazons of the world, but it is becoming the status quo in mainstream enterprise data centers, especially in combination with virtualization. It is not uncommon to see SMBs with 100s of VMs deployed in production, with 1000s of virtual disks, 10s of file systems and a multitude of data protection solutions including archival and disaster recovery for compliance purposes.
In this brave new world, traditional manual storage management is infeasible. Moreover, sophisticated, scalable storage and data management solutions are a requirement beyond the few enterprises that deploy high-end disk arrays. I predict that 2014 will be the year of the democratization of storage management.
Among the many SDS platforms being offered in the industry, only those that offer a compelling management story will survive. As such, management will be the focus of the storage industry in 2014 and for the next few years. Following are what I consider to be the main properties of the new storage management paradigm:
- Automated, policy-based provisioning. Storage entities that are consumable by applications and users, whether virtual disks, objects or files, will be provisioned automatically according to policies specified by an IT professional. Those policies will evolve from low-level storage properties (e.g., RAID configurations, cache sizes) to Service Level Objectives (SLOs) capturing application availability and performance goals. The administrator will specify what they need, not how to achieve that. The latter will be done automatically by the storage platform based on a) the physical capabilities of the potentially heterogeneous hardware, and b) data and workload characteristics. For example, in one deployment, the performance requirements of an application may be met by utilizing a cache in a Flash tier, while in another instance striping may be a more cost-effective solution.
- Predictable quality of service. Today, storage experts try hard to ensure that storage meets the application needs in shared storage systems. Following the same theme of automation, algorithms will do the job in the future. In some cases, statistical (best-effort) approaches may suffice, especially when predictability is required only in aggregate (as in a large analytics farm). However, many users will still want the confidence that, for example, a critical database will achieve the required transactions per second despite “noisy neighbors” that try to use as much bandwidth from the storage as possible. SDS platforms need robust mechanisms for admission control, performance differentiation and security in multi-tenant environments.
- End-to-end monitoring. Administrators need scalable and intuitive tools to continuously monitor their storage infrastructure. Such visibility goes beyond predictable quality of service and reporting of compliance with policies. It also involves a single-pane-of-glass view of how the system is doing, including resource utilization and imbalances in resource usage. Such tools allow the administrators to understand the behavior of the system, recognize trends, plan their physical resource inventory and, when worse comes to worst, do troubleshooting and root-cause analysis. They are critical, especially in so-called hyper-converged platforms where SDS shares compute resources with other workloads.
- Multi-site management. Increasingly, enterprises have a physical presence in many geographic locations. Many remote locations have limited IT facilities and lack skilled IT personnel. Deploying and managing (remotely) enterprise-grade storage in those locations has been very challenging in the past. SDS with commodity hardware offers a major advantage for those use cases. Moreover, with the right tools, storage management is reduced to mundane tasks such as replacing faulty hardware components (servers, disks, controllers) that unskilled personnel can perform without service interruption, which is a huge advantage in terms of operational expenses.
- Converged hardware platform. The advantage of cookie-cutter commodity hardware components goes beyond the Remote Office and Branch Office (ROBO) use cases. It radically changes the way hardware is procured and used in the data center at large. IT departments can standardize across a few hardware configurations that can be used to run applications as well as infrastructure services such as storage and networking. This is a fundamental aspect of the Software-Defined Data Center (SDDC) vision. However, without the right tools for sandboxing the resources used for storage and for end-to-end monitoring and control of the converged platform, SDS can quickly turn from an IT dream to a nightmare.
- Converged administrative model. Even before the advent of SDS, management of storage has been moving beyond the boundaries of expert IT storage teams. As recent surveys of VMware customers show, an increasing number of vSphere/Cloud administrators also have some form of storage management responsibilities. In some sectors, this is the common case. SDS management approaches should be tailored for use by generalist IT professionals who are not experts in storage. At the end of the day, the success of the entire SDS trend will be determined by its adoption by the masses of IT professionals, not by the few storage experts.
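
To make the first two properties a little more concrete, here is a minimal Python sketch of what "specify what you need, not how to achieve it" could look like. It is purely illustrative and does not reflect any particular product: the names `SLO`, `Layout`, `Pool`, `choose_layout` and `provision` are hypothetical, the IOPS and cost figures are made up, and real platforms would weigh far more than a single performance number.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SLO:
    """What the administrator asks for (hypothetical, one goal only)."""
    min_iops: int


@dataclass(frozen=True)
class Layout:
    """One way a given deployment could serve an object (hypothetical)."""
    name: str
    iops: int            # IOPS this layout is assumed to sustain
    cost_per_gb: float   # relative cost in arbitrary units


@dataclass
class Pool:
    """Shared storage pool with a fixed, made-up IOPS budget."""
    total_iops: int
    reserved_iops: int = 0

    def admit(self, slo: SLO) -> bool:
        """Admission control: reserve IOPS only if the budget allows it."""
        if self.reserved_iops + slo.min_iops > self.total_iops:
            return False
        self.reserved_iops += slo.min_iops
        return True


def choose_layout(slo: SLO, layouts: List[Layout]) -> Optional[Layout]:
    """Return the cheapest layout that meets the SLO, or None."""
    feasible = [l for l in layouts if l.iops >= slo.min_iops]
    return min(feasible, key=lambda l: l.cost_per_gb, default=None)


def provision(slo: SLO, pool: Pool, layouts: List[Layout]) -> Optional[Layout]:
    """Map an SLO to a layout, refusing requests the pool cannot honor."""
    layout = choose_layout(slo, layouts)
    if layout is None or not pool.admit(slo):
        return None
    return layout


# The same SLO lands on different layouts in two deployments.
database = SLO(min_iops=5000)
site_a = [Layout("flash-cache", 20000, 0.9), Layout("striped-hdd", 6000, 1.2)]
site_b = [Layout("flash-cache", 20000, 1.5), Layout("striped-hdd", 6000, 0.4)]
choice_a = provision(database, Pool(total_iops=10000), site_a)
choice_b = provision(database, Pool(total_iops=10000), site_b)
```

In this toy setup the same policy resolves to the Flash cache at one site and to striping at the other, simply because the assumed costs differ. A second request that would overcommit the pool's IOPS budget is rejected rather than being allowed to become a noisy neighbor.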
In conclusion, I claim that the advent of Software-Defined Storage is an irreversible trend. I expect 2014 to be the year when the winning products in this space start to emerge. The management experience they expose will be what differentiates the winners from the losers. What is your opinion on SDS and how storage management is (or should be) evolving?