Tintri blog

How Flash Changes Enterprise Storage

Mon, 02/06/2012 - 18:44

Flash-based solid-state drives (SSDs) have no moving parts, are more reliable, have faster read times, and offer more consistent IO performance (latency) under high utilization than traditional hard drives. Most of us have seen the most immediate impact of SSD in the laptops and tablets we use every day. In those systems, SSD is used either to make the device fit the form factor we desire (like an iPad) or to give our laptop greater performance and reliability. When we can choose SSD, the tradeoff is higher cost for greater performance, versus the lower cost and abundant space of a traditional hard drive.

What does SSD do for Enterprise Storage?

In the enterprise storage landscape, you may think SSD is just a high-end cache for the largest enterprise storage arrays. You would be wrong.

Yes, SSD made its first inroads through the end-user (or consumer) side, but it is quickly taking over enterprise storage. In a very small form factor, SSD can provide solid IO performance. For example, in the past, to achieve a certain level of IOPS, storage vendors would build arrays with a large amount of cache memory. That cache memory was expensive and required redundant storage controllers and built-in backup power supplies (because the RAM cache would be lost when power was removed). Finally, that RAM cache was small, in comparison to solid state disks.

Flash-based SSD drives

In summary, SSD will:

  • Improve reliability: SSDs fail much less often than traditional hard drives
  • Improve performance under load: SSD offers predictably low latency even when pushed to its maximum

How can SSD Hybrid Storage Help?

As we know from buying laptops today, the downside to SSD is that, compared to traditional hard drives, it costs much more and offers much less capacity. If you replaced every HD in your storage array with SSD, you would spend much more and receive significantly less capacity. However, that doesn’t make SSD impractical.

The best use of SSD today, in enterprise storage arrays, is as the primary, first-class storage in front of a larger array of traditional HD. Software in the array then intelligently places the most active files on SSD and the less active files on HD. This works particularly well for snapshot data and older versions of files, which are infrequently accessed and can sit on HD. When you combine this intelligence with features like compression and deduplication, you have today’s ideal storage cocktail for the datacenter.

This approach is in stark contrast to traditional hard tiering levels, or to using SSD only as a cache.
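
The placement idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's algorithm: the `place_files` function, the per-file access counts, and the greedy hottest-first policy are all hypothetical choices made for the example.

```python
def place_files(files, ssd_capacity_gb):
    """Greedy sketch: put the most frequently accessed files on SSD until it is full.

    `files` is a list of (name, size_gb, accesses_per_day) tuples. All names
    and numbers are hypothetical; a real array would track activity at a
    finer granularity and with a more sophisticated policy.
    """
    placement = {}
    free_gb = ssd_capacity_gb
    # Hottest first: sort by access count, descending.
    for name, size_gb, accesses in sorted(files, key=lambda f: f[2], reverse=True):
        if size_gb <= free_gb:
            placement[name] = "SSD"
            free_gb -= size_gb
        else:
            placement[name] = "HD"
    return placement


files = [
    ("active-db", 40, 90_000),
    ("vm-boot-image", 20, 15_000),
    ("nightly-snapshot", 200, 12),
    ("old-file-versions", 500, 3),
]
placement = place_files(files, ssd_capacity_gb=100)
for name, tier in placement.items():
    print(f"{name}: {tier}")
```

With these made-up numbers, the two small, busy files land on SSD, while the snapshot and the old file versions stay on HD.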

The Bottom Line

What should storage and virtualization admins do? In this quickly changing market, here are three recommendations:

  1. Ask your storage vendor how SSD benefits their storage arrays
  2. Investigate how SSD is used in these arrays
  3. Consider storage alternatives that capitalize on the power SSD brings to the datacenter

SSD is the single greatest factor changing enterprise storage today. It’s important to stay current on this rapidly changing storage topic.

Categories: Companies

Storage Virtualization: An Overview

Fri, 01/27/2012 - 19:22

The concept of virtualization has been around for some time. Virtualization is simply the abstraction of a physical entity into one or more logical representations of it. Most of the time, the term “virtualization” is tied to server virtualization, a technology made popular by VMware, Microsoft, Xen, and others.

However, while server virtualization is the hot trend in enterprise IT, storage virtualization is making significant strides in functionality; people just do not realize it yet.

Storage virtualization involves abstracting the physical data storage process to more logical constructs inside of the storage device. Let’s take a quick look at how storage virtualization is taking shape:

Traditional storage: Single disk
  • A data consumer issues read/write requests. The disk controller either reads or writes to specific locations on disk.
RAID: Multiple disk
  • This is one of the most widely used implementations for storage virtualization. While it may not seem like it, the data storage environment is indeed virtualized.
  • Multiple disks are aggregated into a storage structure to increase storage, increase resiliency, or both.
  • A data consumer issues read/write requests. The storage controller determines which storage devices contain the data, assembles the full response from (potentially) multiple devices, and returns it to the consumer. The data is no longer on a single device.
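
To make the RAID point concrete, here is a small sketch of how a controller might map a logical block number to a physical disk under plain striping (RAID 0). The function name and parameters are hypothetical, and real controllers layer parity or mirroring on top of a mapping like this.

```python
def locate_block(logical_block, num_disks, blocks_per_stripe_unit=1):
    """Map a logical block number to (disk index, block offset on that disk).

    Sketch of plain striping (RAID 0), with no parity or mirroring.
    """
    stripe_unit = logical_block // blocks_per_stripe_unit  # which stripe unit overall
    offset_in_unit = logical_block % blocks_per_stripe_unit
    disk = stripe_unit % num_disks                         # units rotate across the disks
    unit_on_disk = stripe_unit // num_disks                # units that precede it on that disk
    return disk, unit_on_disk * blocks_per_stripe_unit + offset_in_unit


# The consumer asks for logical blocks 0..7; the controller spreads them over 4 disks.
layout = [locate_block(b, num_disks=4) for b in range(8)]
print(layout)
```

The consumer only ever sees logical block numbers; which disk holds which block is the controller's business.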
LUN: Multiple logical storage devices
  • This takes RAID to the next level.
  • A group of disks is placed into an array structure. The disks are aggregated in some fashion (typically in RAID levels). A subset of the aggregated capacity is then carved out and presented to a data consumer as a LUN. The LUN is a logical storage device for a consumer.
Storage pooling: Spanning multiple drive array types
  • Multiple tiers of storage are created based on storage device profile (capacity and performance), typically a RAID group or other physical storage enclosures.
  • The storage device creates a higher-level structure, called a pool, of which the various performance tiers are members. The pool structure is presented to the data consumer at the LUN level.
  • The storage controller stores metadata about which data blocks reside in which tier, and their location inside the tier.
Data migration: Moving data around
  • Building on top of storage pools, storage controllers (via metadata) are able to determine the data access patterns for individual blocks of data.
  • Frequently used data is moved to the highest performing tier of disk while less frequently accessed data is moved to the lower performing tier of disk.
  • This migration occurs without the knowledge of the data consumer. The consumer sees the storage as a LUN and does not know (or care) about what happens as long as the data is available.
Deduplication: Sharing common data
  • Many data structures share the same data patterns. Microsoft Word files share the same framework across all files, regardless of content. Microsoft Windows servers all have common files. Conceptually, deduplication addresses the idea of “Why store multiple copies of the same data over and over again?”
  • Based on the type of algorithm, the storage device processes existing data to determine if any duplicate data exists.
  • In the event of duplicate data, the storage controller creates pointers to the common data. Common blocks are replaced by a pointer, and the overall storage footprint is reduced.
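
The pointer idea can be sketched as follows. This is a minimal illustration that assumes fixed blocks identified by a SHA-256 content hash. The article does not say which algorithm any particular array uses, and the function names here are hypothetical.

```python
import hashlib


def dedupe(blocks):
    """Store each distinct block once and keep a list of pointers to it.

    `blocks` is a list of bytes objects. Returns (store, pointers), where
    `store` maps a SHA-256 digest to the block contents and `pointers`
    holds one digest per logical block.
    """
    store = {}
    pointers = []
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # the first copy is kept, repeats are skipped
        pointers.append(digest)
    return store, pointers


def read_block(store, pointers, index):
    """Follow the pointer for logical block `index` back to the shared data."""
    return store[pointers[index]]


blocks = [
    b"windows-system-file",
    b"user-data-1",
    b"windows-system-file",
    b"windows-system-file",
]
store, pointers = dedupe(blocks)
print(len(blocks), "logical blocks,", len(store), "stored blocks")
```

Reads still return the original contents for every logical block, but only the distinct blocks take up space.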
Thin provisioning: Not allocating storage at creation time
  • This functionality operates under the theory that space may be allocated but never fully used, resulting in unused space that cannot be used by anyone else.
  • The storage controller receives a request to allocate space for a data consumer. The controller creates the basic framework that represents a LUN. However, internal to the storage device, the space is not allocated. Rather, the LUN is basically authorized to consume a specific amount of disk space.
  • As the data consumer continues to use storage space, the LUN grows on the storage controller until the LUN size is completely allocated. Until the LUN is fully utilized, the unused space can be used for other purposes.
  • This may result in over-allocation of storage, though, and needs monitoring.
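
The thin provisioning behavior, including the over-allocation risk, can be modeled with a toy class. This is a sketch under stated assumptions: `ThinPool` and its methods are hypothetical names, and a real controller allocates in blocks or extents rather than whole gigabytes.

```python
class ThinPool:
    """Toy model of a thin-provisioned pool. All names are hypothetical."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.provisioned = {}  # LUN name -> size promised to the consumer (GB)
        self.used = {}         # LUN name -> space actually written (GB)

    def create_lun(self, name, size_gb):
        # Creating the LUN only records the promise; no space is consumed yet.
        self.provisioned[name] = size_gb
        self.used[name] = 0

    def write(self, name, gb):
        if self.used[name] + gb > self.provisioned[name]:
            raise ValueError(f"{name} would exceed its provisioned size")
        if self.total_used() + gb > self.physical_gb:
            raise RuntimeError("pool is out of physical space")
        self.used[name] += gb

    def total_used(self):
        return sum(self.used.values())

    def overcommit_ratio(self):
        # Above 1.0 means more space has been promised than physically exists.
        return sum(self.provisioned.values()) / self.physical_gb


pool = ThinPool(physical_gb=1000)
pool.create_lun("lun-a", 800)
pool.create_lun("lun-b", 800)
pool.write("lun-a", 300)
print(pool.total_used(), pool.overcommit_ratio())
```

Both LUNs are created successfully even though together they promise more than the pool holds, which is why the overcommit ratio is the number to monitor.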

Storage Virtualization Continues to Advance

As you can see, from single-disk storage to thin provisioning, storage virtualization has played a major role in advancing how we use our storage infrastructure and reap the benefits from our investments.

Storage virtualization techniques and technology continue to advance. Object storage, pNFS, and server virtualization functional offload will become more commonplace as new storage device models and feature sets are developed and introduced.

Storage virtualization is not the process of storing virtual machine disks. Rather, it is a beast of its own, and continues to provide valuable benefits.


Choosing which applications to run in flash

Tue, 01/24/2012 - 20:03

As flash storage has become cheaper, it’s becoming increasingly practical to move application storage from hard-disk drive (HDD) to solid-state drive (SSD). Where are we today and how far have we come?

In my previous blog post, “When is it cheaper to use SSD vs. HDD?”, I described how the IO density of an application, or IOPS generated per GB of data, can be useful for determining when to run applications from SSD vs. HDD. Flash is most cost-effective for applications with high IO density, that is, applications that generate many IOPS per GB of data. In this post, I’ll discuss how to characterize applications and determine if it’s cost-effective to run them on SSD.

The cost of flash in 2006 made it too expensive for any typical application (see Figure 1, below). However, the cost of flash has declined faster than the cost of disk, making it feasible to run an increasing number of applications in flash (see Figure 2, below).

Figure 1 graphs data from a publication, “Migrating Server Storage to SSDs: Analysis of Tradeoffs.” Each data point is a server application workload, with the y-axis showing the random read IOPS and the x-axis showing the capacity needed to run the application at peak performance. In all, there are 49 workloads representing applications that include corporate Exchange servers, commercial Web services, file servers, database servers, and Web caches/proxies. The applications vary greatly in terms of both IOPS and capacity.

Figure 1: SSD IO Density Threshold, 2006



The diagonal line in Figure 1 corresponds to an IO density of 50 IOPS per GB. In 2006, workloads above this line were best run from SSD, while workloads below it were best run from HDD. As can be seen, none of these workloads fell above the line, so in 2006 it was not economical to run any of them from SSD.

Today, given the large drop in SSD prices as well as the advent of MLC SSDs, the break-even IO density threshold is about 1 IOPS per GB, as illustrated in Figure 2. At this threshold, approximately half of the workloads can be run economically from SSD. What a difference a few years makes!
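
The comparison against the two thresholds is simple arithmetic, sketched here in Python. The function names are hypothetical, and the three workloads are made-up numbers for illustration, not data points from the figures.

```python
def io_density(iops, capacity_gb):
    """IO density as defined above: IOPS generated per GB of data."""
    return iops / capacity_gb


def best_medium(iops, capacity_gb, threshold_iops_per_gb):
    """Return "SSD" if the IO density is above the break-even threshold, else "HDD"."""
    return "SSD" if io_density(iops, capacity_gb) > threshold_iops_per_gb else "HDD"


THRESHOLD_2006 = 50.0  # IOPS per GB, the diagonal line in Figure 1
THRESHOLD_TODAY = 1.0  # IOPS per GB, the approximate threshold in Figure 2

# Hypothetical workloads: (name, peak random-read IOPS, capacity in GB).
workloads = [
    ("small-active-db", 5000, 200),
    ("mail-server", 3000, 2000),
    ("home-directories", 200, 4000),
]

for name, iops, gb in workloads:
    print(
        name,
        io_density(iops, gb),
        best_medium(iops, gb, THRESHOLD_2006),
        best_medium(iops, gb, THRESHOLD_TODAY),
    )
```

With these example numbers, none of the workloads clears the 2006 threshold, while two of the three clear today's.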

However, raw SSD capacity remains 10 times more expensive than disk. To make SSD cost-effective, you want to avoid storing entire applications on SSD. Only relatively “hot” components should live in flash. By applying optimization techniques such as deduplication, compression, and working set analysis, the threshold IO density can be significantly lowered. I’ll discuss some of these techniques in a future post.

Figure 2: SSD IO Density Threshold, Today

In Figure 2, we can roughly group the workloads into three classes. The first contains the hot, small applications that generate high IOPS and use little capacity, such as small active databases that easily fit in SSD today. The second contains cold, midsize applications such as home directories and static web pages, which can grow large but are easily cached. The last consists of warm, large applications such as email and large databases, which are actively accessed and difficult to cache. In this sample, the third class accounts for the largest share of total IOPS and capacity, and it represents the biggest challenge and opportunity for SSD storage vendors today.

The affordability of SSD for common applications has come a long way in just the past six years. In fact, we are just now entering the phase where larger mainstream applications can be run cost-effectively from SSD. Given the continued drop in SSD prices, it seems only a matter of several years before the majority of applications can benefit from SSD. Given that technologies such as MLC flash, inline dedupe and compression, and working set analysis can greatly accelerate this trend, the future may be closer than many people believe.


Storage Admins Evolve

Wed, 01/11/2012 - 00:44

Storage administrators are changing and evolving. Will you be left behind? Find out what’s changing and what you can do to avoid becoming obsolete.

Storage Grows Up

Storage systems can still be complex, but they are getting smarter and more intuitive. Features like SSD, autotiering, massive caching, and simple NFS setup have made even high-performance storage something that an intermediate-level IT admin can set up and administer. Features previously reserved for high-end SAN arrays, like replication, can now be done in software. With many storage arrays (Tintri, for example), you don’t need a team of dedicated storage admins, or perhaps even a single storage admin, to take care of the SAN. Storage can now be just one of the pieces that make up the infrastructure. The one or more full-time employees who used to take care of storage can now take care of the entire infrastructure, including servers, network, hypervisor, and storage.

In many cases, storage and servers are working together to be smarter and more efficient. VMware’s vStorage APIs for Array Integration (VAAI) allow the vSphere hypervisor to talk to the storage array to detect when a LUN is running out of space or performing poorly, and to take action without direction from the admin. This is just one example of how servers and storage continue to work more closely together, reducing or eliminating the need for a dedicated storage person (or team) to coordinate SAN changes with server and hypervisor changes.

Cloud Computing Is Real & Changes Your Job

In addition to advancements in storage, changes in cloud computing are also pushing storage admins to change. Infrastructure-as-a-service (IaaS) offerings have become more mainstream and acceptable solutions. Moving servers to the cloud and connecting local end-users to the cloud via hybrid VPN can be done in self-service portals. While local storage will be needed at most companies for the foreseeable future, we have to take these new remote virtual datacenters into account in infrastructure design.

What Can You Do?

While many people don’t like change, if you are a storage admin, it’s time to change or become extinct. Companies will need more infrastructure engineers and fewer SAN admins. What can SAN admins do?

  • Today’s private and public cloud infrastructure solutions make common storage tasks, such as provisioning a LUN, a self-service capability. To stay current, storage admins need to take a project-oriented approach rather than a task-oriented one. Take time to understand what the business needs, what the big picture is, and how you can make a difference. For example, focus on larger projects like data protection and DR instead of smaller tasks like nightly backup processing.
  • Work with other members of your team to cross-train on all the pieces that make up the datacenter. Show you can do more than just provision a LUN—you can be the superhero of the datacenter, solving problems no matter where they pop up.
  • Begin training on all the components that make up the infrastructure. I suggest doing this by building your own virtualization lab that includes NFS storage, a server, and a hypervisor, simulating the current or future datacenter infrastructure.

Technology is always changing. Don’t let it pass you by!
