Storage Can Be the Key to Aggressively Priced Cloud Offerings

One of the key reasons for moving to a cloud-based infrastructure is to lower overall infrastructure costs. This is true whether you are building an in-house (private) cloud or are a public cloud provider looking to price your offerings competitively. Because storage comprises such a large part of the outlay for any cloud-based infrastructure, it's an obvious place to look for optimizations that can lower overall costs. A lower-cost virtual infrastructure gives cloud providers pricing leeway, which can be used either to undercut competitors or to increase margins. Storage technologies introduced within the last five years provide significant opportunities to lower storage costs while still delivering the high-performance, scalable, and highly available infrastructure that cloud providers need to meet their customers' business requirements.

Candidate Storage Technologies
The critical technologies in building a cost-effective storage infrastructure include the following:

Scalable, resilient networked storage subsystems. Ensure that the storage you choose is modularly expandable and will scale to meet your business requirements. Networked storage architectures offer better opportunities not only for expansion but also for redundancy and storage sharing - which is critical to support the live migration of virtual machines (VMs) needed to meet uptime requirements. Storage layouts should use RAID for redundancy, provide multiple paths to each storage device for high availability, and support online expansion and maintenance.
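The RAID layout you choose also has a direct cost impact. As a rough illustration - a back-of-the-envelope sketch with hypothetical drive counts and sizes, not a sizing tool - the Python snippet below compares usable capacity under RAID 10 and RAID 6:

    # Back-of-the-envelope usable-capacity comparison for two common RAID layouts.
    # Drive size and counts are illustrative assumptions, not recommendations.

    DRIVE_TB = 2.0          # raw capacity per drive, in TB (hypothetical)
    DRIVES = 24             # drives in the shelf (hypothetical)

    def usable_raid10(drives, drive_tb):
        """RAID 10 mirrors every drive: roughly 50% of raw capacity is usable."""
        return drives * drive_tb * 0.5

    def usable_raid6(drives, drive_tb, group_size=12):
        """RAID 6 gives up two drives' worth of capacity per parity group."""
        groups = drives // group_size
        return groups * (group_size - 2) * drive_tb

    raw = DRIVES * DRIVE_TB
    print(f"raw capacity:    {raw:.0f} TB")
    print(f"RAID 10 usable:  {usable_raid10(DRIVES, DRIVE_TB):.0f} TB")
    print(f"RAID 6 usable:   {usable_raid6(DRIVES, DRIVE_TB):.0f} TB")

The point is simply that redundancy is paid for in raw capacity, so the layout decision belongs in the cost model from the start.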

Thin provisioning. Historically, storage has been significantly over-provisioned to accommodate growth. Allocated but unused storage is an expensive waste of space, and thin provisioning is a storage technology that effectively addresses this. By transparently allocating physical storage on demand as environments grow, it frees administrators from having to over-provision up front. When thin provisioning is first deployed in an environment, it's not uncommon for it to reduce the physical capacity consumed by 70% or more. It allows higher utilization of existing storage assets, reducing not only hardware infrastructure costs but also energy and floor space costs.
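To make the mechanism concrete, here is a minimal sketch, assuming a simple block-mapped design with hypothetical class names, of how a thin pool hands out physical blocks only on first write:

    # Conceptual model of thin provisioning: each volume advertises its full
    # logical size, but physical blocks are only taken from the shared pool on
    # first write. Names are illustrative, not any vendor's API.

    class ThinPool:
        def __init__(self, physical_blocks):
            self.free = physical_blocks            # blocks actually purchased
            self.mapped = {}                       # (volume, lba) -> pool block

        def write(self, volume, lba):
            if (volume, lba) not in self.mapped:   # allocate only on first write
                if self.free == 0:
                    raise RuntimeError("pool exhausted")  # the risk discussed below
                self.free -= 1
                self.mapped[(volume, lba)] = object()

    pool = ThinPool(physical_blocks=1000)
    # Ten volumes, each presented as 1,000 blocks: 10,000 logical vs 1,000 physical.
    for vol in range(10):
        for lba in range(30):                      # each volume only writes 30 blocks
            pool.write(vol, lba)
    print("physical blocks consumed:", 1000 - pool.free)   # 300, not 10,000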

However, thin provisioning must be watched carefully when it's deployed in virtual environments. It can be difficult to stay on top of the capacity planning required to ensure that you don't unexpectedly run out of storage. Running out of capacity shuts down VMs, so thin provisioning must be managed to ensure that this never occurs. The savings, however, are significant enough to justify the extra management effort.
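As a practical illustration of that management burden, a minimal capacity watchdog might look like the sketch below; the thresholds and the alerting approach are assumptions for illustration, not a vendor recommendation:

    # Minimal capacity watchdog for a thin pool: alert well before the pool is
    # exhausted so new capacity can be added (or VMs migrated) in time.

    WARN_PCT = 70    # start planning an expansion (assumed threshold)
    CRIT_PCT = 85    # act immediately; running out stops VMs (assumed threshold)

    def check_pool(used_blocks, total_blocks):
        pct = 100.0 * used_blocks / total_blocks
        if pct >= CRIT_PCT:
            return f"CRITICAL: pool {pct:.0f}% full - expand now or migrate VMs"
        if pct >= WARN_PCT:
            return f"WARNING: pool {pct:.0f}% full - schedule capacity expansion"
        return f"OK: pool {pct:.0f}% full"

    print(check_pool(used_blocks=880, total_blocks=1000))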

On a related topic, pay attention to how storage space reclamation occurs in your virtual infrastructure. When files are deleted, is the freed space returned to the storage pool immediately, or only when the VMs that owned that data are rebooted? Both space reclamation and thin provisioning pose additional management challenges when multiple layers of virtualization exist, as is the case in hypervisor-based environments built on arrays that themselves virtualize storage.
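The sketch below illustrates, in deliberately simplified form, why a guest-level delete does not by itself return capacity: blocks typically come back to the pool only when a reclaim command such as SCSI UNMAP (or ATA TRIM) reaches the array. The class and method names are hypothetical:

    # A guest-level delete only updates the filesystem's metadata; the array has
    # no idea those blocks are free until a reclaim command is passed down
    # through every layer. Conceptual sketch, not any specific product's API.

    class ThinVolume:
        def __init__(self):
            self.pool_blocks = set()

        def write(self, lba):
            self.pool_blocks.add(lba)              # consumes pool capacity

        def guest_delete(self, lbas):
            # The filesystem marks blocks free, but nothing is reclaimed yet.
            return set(lbas)                       # candidate blocks, still allocated

        def unmap(self, lbas):
            self.pool_blocks -= set(lbas)          # only now is capacity reclaimed

    vol = ThinVolume()
    for lba in range(100):
        vol.write(lba)
    freed = vol.guest_delete(range(50))
    print("allocated after delete:", len(vol.pool_blocks))   # still 100
    vol.unmap(freed)
    print("allocated after UNMAP: ", len(vol.pool_blocks))   # 50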

Scalable snapshot technologies. Snapshots have all sorts of uses - working from VM templates, providing a safety net during software updates, creating copies for test/dev environments, cloning desktops in VDI environments - all of which have significant operational value. If you've worked with snapshots in the past, you probably already know that they can impose a real performance penalty. In fact, the impact can be severe enough that administrators consciously limit their use of snapshots in some situations. In others, the value snapshots provide has helped drive the purchase of very high-end, very expensive storage arrays that overcome snapshot performance issues. In virtual computing environments, hypervisor-based snapshots generally impose the same kinds of performance penalties.
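To see where the penalty comes from, consider a minimal copy-on-write model: the first write to any block covered by a snapshot forces an extra read and write to preserve the original data. The sketch below (illustrative structures only, not any vendor's implementation) counts those extra I/Os:

    # Why snapshots can slow writes down: under copy-on-write, the first write
    # to a snapshotted block triggers an extra read and an extra write to save
    # the original data before the new data lands.

    class CowVolume:
        def __init__(self, blocks):
            self.data = {lba: f"orig-{lba}" for lba in range(blocks)}
            self.snapshot = None
            self.io_count = 0                       # every block read or write

        def take_snapshot(self):
            self.snapshot = {}                      # empty; filled on demand

        def write(self, lba, value):
            if self.snapshot is not None and lba not in self.snapshot:
                self.snapshot[lba] = self.data[lba] # extra read + write (copy-on-write)
                self.io_count += 2
            self.data[lba] = value
            self.io_count += 1

    vol = CowVolume(blocks=1000)
    vol.take_snapshot()
    for lba in range(1000):
        vol.write(lba, "new")
    print("I/Os with an active snapshot:", vol.io_count)    # 3000 instead of 1000

High-end arrays use various techniques (redirect-on-write, large write caches) to hide this overhead, which is part of what drives their cost.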

Snapshots are also very valuable when used for disk-based backup. Your customers will expect you to protect their data and provide fast recovery with minimal data loss. To provide the best service, data protection operations should be as transparent as possible. The best way to meet these requirements is to use snapshot-based backups, working with well-defined APIs such as the Windows Volume Shadow Copy Service (VSS) to ensure that you can create application-consistent backups for fast, reliable recovery.
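The sequence such frameworks coordinate is straightforward: quiesce the application, snapshot while I/O is frozen, resume, then back up from the snapshot outside the freeze window. The sketch below outlines that flow; the helper functions are hypothetical placeholders, not the actual VSS API:

    # Outline of an application-consistent snapshot backup: quiesce the
    # application (the step a VSS requestor/writer exchange coordinates on
    # Windows), snapshot while I/O is frozen, then resume immediately.
    # Helper functions are hypothetical placeholders, not the VSS API.

    import time

    def quiesce_application():
        print("flush buffers, hold new writes")     # application freezes its state

    def take_array_snapshot():
        print("create snapshot on the array")       # seconds, not minutes
        return f"snap-{int(time.time())}"

    def resume_application():
        print("release held writes")                # application resumes

    def backup_from_snapshot(snap_id):
        print(f"copy {snap_id} to backup target")   # runs outside the freeze window

    def application_consistent_backup():
        quiesce_application()
        try:
            snap = take_array_snapshot()
        finally:
            resume_application()                    # keep the freeze window short
        backup_from_snapshot(snap)

    application_consistent_backup()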

The use of disk for backups also allows you to leverage storage capacity optimization (SCO) technologies like data deduplication to minimize the secondary storage capacity needed for data protection operations. Disk also makes it easier to use replication when building disaster recovery (DR) plans for those customers that need them, and asynchronous replication products that use IP networks and support heterogeneous storage offer cost-effective DR options for virtual environments.
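Deduplication works by storing each unique chunk of data once and referencing it by a content hash, which is why backups of many near-identical VMs shrink so dramatically. The sketch below is a toy illustration with an assumed chunk size and synthetic data:

    # Sketch of content-addressed deduplication: identical chunks are stored
    # once and referenced by their hash. Chunk size and data are toy assumptions.

    import hashlib, os

    class DedupStore:
        def __init__(self):
            self.chunks = {}                         # sha256 digest -> chunk bytes

        def put(self, data, chunk_size=4096):
            refs = []
            for i in range(0, len(data), chunk_size):
                chunk = data[i:i + chunk_size]
                digest = hashlib.sha256(chunk).hexdigest()
                self.chunks.setdefault(digest, chunk)   # stored only once
                refs.append(digest)
            return refs

    store = DedupStore()
    base_image = os.urandom(4_000_000)    # ~4 MB standing in for a shared OS image
    logical = 0
    for vm in range(20):                  # 20 backups of nearly identical VMs
        data = base_image + f"vm{vm}".encode()
        store.put(data)
        logical += len(data)
    physical = sum(len(c) for c in store.chunks.values())
    print(f"logical {logical/1e6:.1f} MB -> physical {physical/1e6:.1f} MB")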

For cloud computing environments, the ability to use high performance, scalable snapshot technology has real operational value. Each cloud provider will need to evaluate how best to meet this need while still staying within budgetary constraints.

Primary storage optimization. SCO technology is not limited to secondary storage, and a number of large storage vendors now offer what is called "primary storage optimization" in their product portfolios. Similar in concept to deduplication (though not in implementation), these products reduce the amount of primary storage capacity required to store a given amount of information. Because of their high performance requirements, primary data stores pose an additional challenge that does not exist for secondary storage: whatever optimization work is done must not impact production performance. Describing the different approaches to primary storage optimization is beyond the scope of this article, but they can generally reduce the primary storage required in many environments by 70% or more, cutting not only primary storage costs but also secondary storage costs (since less primary storage is being backed up).

Note that certain primary and secondary SCO technologies can be used simultaneously against the same data stores, but take care to ensure that they are complementary. Because there is so much duplicate data in virtualized environments (many VMs run the same operating systems and applications, for example), SCO technologies are an excellent fit and can generate significant savings.

Storage Cost Challenges in Cloud Computing Environments
To meet performance, scalability, and availability requirements, cloud providers often invest in high-end, enterprise-class storage to support their virtual infrastructure. Higher per-terabyte costs can be understood up front, but there is another issue that can hit cloud providers unexpectedly. Server virtualization is critical to cloud computing, but it poses cost challenges for legacy storage architectures. Because many VMs, each with its own independent workload, are placed on each host, the I/O patterns those hosts generate are far more random and far more write-intensive than those generated by physical servers running dedicated applications. This randomness lowers storage performance, driving the purchase of more spindles or of premium technologies like solid-state disk (SSD) to meet performance requirements. That poses a conundrum for those building cloud infrastructures: how do you create a cost-effective platform when server virtualization itself requires more (and more expensive) storage, driving storage costs up?
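Some rough arithmetic shows how quickly spindle counts grow; the per-drive IOPS figure, write mix, and RAID 6 write penalty used below are common rules of thumb rather than measurements of any particular product:

    # Rough arithmetic behind "randomness drives spindle count". Figures are
    # rules of thumb, chosen only to illustrate the shape of the problem.

    REQUIRED_IOPS = 20_000          # aggregate load from consolidated VMs (assumed)
    RANDOM_IOPS_PER_15K_DRIVE = 180 # rule-of-thumb random IOPS for a 15K RPM drive
    WRITE_SHARE = 0.6               # write-heavy mix typical of consolidated hosts
    RAID6_WRITE_PENALTY = 6         # back-end I/Os per random host write under RAID 6

    backend_iops = (REQUIRED_IOPS * (1 - WRITE_SHARE)
                    + REQUIRED_IOPS * WRITE_SHARE * RAID6_WRITE_PENALTY)
    drives = backend_iops / RANDOM_IOPS_PER_15K_DRIVE
    print(f"back-end IOPS: {backend_iops:,.0f}")
    print(f"drives needed for performance alone: {drives:.0f}")

Under these assumptions, performance alone calls for hundreds of drives, far more than capacity would require - which is exactly the cost trap described above.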

Cloud providers often consider the judicious use of SSD to reduce spindle count while maintaining high performance. SSD offers excellent read performance and good sequential write performance, but comparatively poor random write performance. The challenge in virtual computing environments is managing random, write-intensive workloads, so SSD by itself is only a partial solution.

Interestingly, if there were a way to turn all those random writes into sequential writes, it would deliver a significant performance improvement without requiring any other infrastructure changes. Enterprise databases have used a logging architecture to do just that for decades. All writes are sent to a persistent log, and the write acknowledgement is returned to the database as soon as the log entry is durable, which takes the randomness out of the I/O stream. The performance of the environment is therefore determined by the sequential, not random, write characteristics of the device hosting the log. The logged writes are later de-staged asynchronously to primary storage, a background operation that does not delay foreground database work. This technique increases the IOPS per spindle any given storage technology can sustain, a speedup that typically varies between 3x and 10x, depending on the storage technology in use.
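The sketch below illustrates the idea in deliberately simplified form: random writes are appended to a sequential log and acknowledged immediately, while a background thread de-stages them to their home locations. It is a conceptual model, not a description of any vendor's implementation:

    # Random writes are appended sequentially to a log and acknowledged at once;
    # a background task later de-stages them to their (randomly placed) home
    # locations on primary storage. Structures and timing are illustrative only.

    import threading, queue

    log_device = []                  # stands in for a small, fast, sequential log
    destage_queue = queue.Queue()
    primary_storage = {}             # lba -> data, the randomly organized back end

    def write(lba, data):
        log_device.append((lba, data))   # sequential append: fast on disk or SSD
        destage_queue.put((lba, data))   # remember that it still needs de-staging
        return "ack"                     # acknowledged as soon as the log write lands

    def destager():
        while True:
            lba, data = destage_queue.get()
            if lba is None:
                break
            primary_storage[lba] = data  # random placement happens in the background

    worker = threading.Thread(target=destager)
    worker.start()
    for lba in (907, 12, 544, 3, 871):   # a random-looking write burst
        write(lba, b"x")
    destage_queue.put((None, None))      # shut the background worker down
    worker.join()
    print("log entries (sequential):", len(log_device))
    print("blocks de-staged to primary:", len(primary_storage))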

What makes this particularly relevant for virtual computing environments is that several vendors have implemented it in software at the storage layer, not the application layer. Implemented there, the speedups it produces are available to all applications, not just a given database application. For any given storage configuration, it reduces the number of spindles required to meet a performance target, regardless of the type of storage technology in use - generally by at least 30%. It even speeds up SSD, since it allows SSD to operate at sequential rather than random write speeds.

This capability goes by the name of virtual storage optimization technology. It can be used in a complementary manner with the other storage technologies mentioned, is transparent to applications, and can be used with any heterogeneous, block-based storage. Much like the way server virtualization technology allowed organizations to get higher utilization out of their existing server hardware, virtual storage optimization technology does the same thing for storage hardware.

For Cloud Providers, Cost Is Critical
When selling cloud-based services, the performance, scalability, and availability requirements are relatively clear, and building the storage infrastructure to meet them will likely account for at least 40% of the overall cost of your virtual infrastructure. But there is a big difference in how each cloud provider chooses to get there, and in how each leverages available storage technologies to meet those requirements. The functionality of cloud service offerings for a given market may look the same across the providers that address it, but the provider that meets those requirements with the most cost-effective virtual infrastructure has a significant leg up on the competition. The storage technologies available in today's market give the savvy cloud provider the tools to achieve that advantage.