Why is everyone talking about IO filters?
At this year’s VMworld, a relatively little-known feature of vSphere is getting a surprising amount of attention. The VMware vSphere APIs for IO Filtering (VAIO) aren’t as well-known as other storage features like Virtual Volumes or Virtual SAN, but they’re very important for VMware vSphere as a hybrid cloud platform, especially with respect to cloud-based services for disaster recovery (DR) and business continuity (BC).
The new attention to VAIO is due to a number of well-known data protection vendors announcing that their solutions will begin to use the API. This article doesn’t take a deep dive into the API itself (there’s an excellent technical introduction in this blog post). Rather, the point of this discussion is to explain why IO filters are increasingly important to enabling DR/BC services and why the vendors in this space are updating their products to use the IO filters API.
First, a little background on us. The engineering team at JetStream Software has a long history with VAIO. When VMware began development of the IO filter APIs, we collaborated with VMware as the design partner, developing the first software solution to use VAIO for IO acceleration using solid state memory in the host. As VMware continues to improve the vSphere platform, we continue to work closely with the VAIO engineering team at VMware.
What Are IO Filters, and How Do They Work?
In a nutshell, VAIO provides a safe way for third-party software to be integrated into a VMware environment: it can intercept data as it moves between virtual machines and their virtual disks and perform a service on that data, such as IO acceleration, data encryption, or replication.
Additionally, IO filters enable the third-party software to be notified of any significant events, such as vMotion, snapshot creation/deletion, etc. Products that use IO filters are tested, certified and supported by VMware and listed in the VMware Compatibility Guide.
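To make the pattern concrete, here is a minimal sketch in Python of what an IO filter conceptually does. This is a hypothetical illustration, not the actual VAIO API (which is a C API running inside ESXi); the class names and the caching behavior are assumptions chosen to show one of the services mentioned above, IO acceleration.

```python
# Hypothetical sketch of the IO-filtering pattern (NOT the real VAIO C API):
# a filter sits in the IO path between a VM and its virtual disk, sees every
# request, and can act on it -- here as a simple host-side read cache.

class VirtualDisk:
    def __init__(self):
        self.blocks = {}
        self.reads = 0  # count of reads that actually reached the disk

    def write(self, lba, data):
        self.blocks[lba] = data

    def read(self, lba):
        self.reads += 1
        return self.blocks.get(lba)

class CachingFilter:
    """Serves repeat reads from a host-side cache; writes pass through."""
    def __init__(self, disk):
        self.disk = disk
        self.cache = {}

    def write(self, lba, data):
        self.cache[lba] = data      # keep the cache coherent with the disk
        self.disk.write(lba, data)  # pass the write through unchanged

    def read(self, lba):
        if lba not in self.cache:   # cache miss: fall through to the disk
            self.cache[lba] = self.disk.read(lba)
        return self.cache[lba]

    def on_event(self, event):
        # VAIO also notifies filters of events (vMotion, snapshot
        # create/delete); a real cache might flush or migrate state here.
        pass

disk = VirtualDisk()
filtered = CachingFilter(disk)
filtered.write(0, b"hello")
assert filtered.read(0) == b"hello"
assert filtered.read(0) == b"hello"
assert disk.reads == 0  # both reads were served from the host-side cache
```

The key point the sketch illustrates: the guest VM and the backing disk are unchanged; the filter interposes transparently, which is why VMware can certify filter-based products for any underlying storage.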
This assurance of compatibility has always been reassuring for customers deploying software in their on-premises VMware environments, but it’s even more important for hybrid cloud services, especially DR/BC. Let’s consider why.
Virtualization Enabling Better Data Protection
As organizations increasingly virtualized their compute and storage infrastructures, data protection and application recovery became much more closely connected to the virtualization platform than to the storage infrastructure. This shift began when data protection was still primarily an on-premises function, and it made sense.
An organization could deploy a number of different storage systems -- including increasingly popular flash arrays -- but because they had standardized their runtime architecture on a common platform (the "software-defined data center"), they could manage data protection consistently at the VM level.
VMware provided its own vSphere Replication and Site Recovery Manager, and it also introduced APIs (VADP) that enabled third-party software solutions for snapshot-based backups. Over time, these solutions have become easier to manage and more powerful functionally.
Consuming Data Protection as a Service
Data protection and application recovery services have been available for a long time. Keeping copies of data in a second site has always been a best practice for many organizations.
What’s changed recently is that cloud-based data protection services are increasingly replacing on-premises backup systems entirely. The market for hosted data protection services, which had once been considered only for the most critical workloads, now features simpler and less expensive options for all types of workloads.
This is a huge shift in the data protection market, as organizations look to software vendors and cloud services instead of traditional storage vendors to address their data protection requirements. Because these services use software to intercept data in vSphere rather than the customer’s storage infrastructure, service providers can operate large, scale-out, multi-tenant infrastructures to provide data protection for customers running a variety of different compute and storage systems.
In a sense, the "software-defined data center" has actually been a stepping stone toward the hybrid cloud.
Moving Beyond Backup to Continuous Data Protection
Taking snapshots for backups is now a fairly standard practice for any decently managed VMware environment. In truly well-managed VMware environments, administrators also test their backups to make sure they will be able to recover if disaster strikes. And as we’ve seen, it’s increasingly easy to create backups on-premises, but store them in the cloud, either partially or entirely.
But snapshot-based backups have two key drawbacks. First, creating snapshots can degrade application performance. Second, snapshot-based backup doesn't protect data continuously: the time interval between snapshots defines the protected workload's recovery point objective, or RPO, and today's IT managers want RPOs as close to zero as possible.
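The interval-to-RPO relationship is simple arithmetic, sketched here with hypothetical numbers for illustration:

```python
# Illustrative arithmetic (hypothetical numbers): with snapshot-based
# backup, the worst-case data loss equals the snapshot interval.

snapshot_interval_hours = 4
worst_case_rpo_minutes = snapshot_interval_hours * 60
print(worst_case_rpo_minutes)  # 240 -- up to four hours of writes lost
```

Shrinking the interval shrinks the RPO, but each additional snapshot adds performance overhead, which is exactly the tension that continuous data protection resolves.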
The alternative to snapshot-based backup is continuous data protection, or CDP. The core idea of CDP is that data is captured and replicated at the moment it’s committed to the primary storage on-premises. At the storage level, this is what many DR appliances do.
In the world of VMware vSAN and most third-party hyperconverged infrastructure (HCI) platforms, stretched clusters allow data written in one cluster to be replicated to a recovery cluster, as long as the recovery cluster employs the same HCI platform. But continuously capturing data in vSphere hosts and replicating it to a service provider's environment has remained a challenge.
IO filters give a service provider a mechanism to provide CDP services to any VMware environment with a near-zero RPO, regardless of the customer’s infrastructure. Additionally, the service provider’s customers won’t suffer the performance impact of snapshots. Finally, the service provider isn’t asking the customer to run software in their data center that hasn’t been certified by VMware.
As the VMware platform extends beyond the software-defined data center to become the platform for the hybrid cloud, IO filters are key to the software that service providers can use to deliver DR/BC capabilities with no strings attached, including:
- RPOs of seconds, instead of backup intervals of hours
- Better application performance
- Certification and support by VMware
- No agent software in the customer’s VMs
- Support for all types of VMware environments
With all those advantages, it’s easy to see why the leading vendors in the market for data protection for VMware are announcing that they’re adopting IO filters to capture data. They’re a key technology in realizing the full potential of VMware as a hybrid cloud platform.
Rich Petersen is co-founder and president of JetStream Software and has more than 20 years of experience in enterprise technology. Previously vice president of product management at FlashSoft, which was acquired by SanDisk in 2012, Petersen served at LogicBlaze/Red Hat, Infravio and Interwoven. He earned his MBA at the Haas School of Business at the University of California Berkeley. For more information, please visit www.jetstreamsoft.com, www.linkedin.com/company/jetstream-software-inc/ and @JetStreamSoft.