NERSC Accelerates Data Access and Exceeds Reliability Standards with Tape-Based Active Archive
Active Archive Supports Massive Data Volumes and High Access Rates for Worldwide Research Community
The Active Archive Alliance announced today that the National Energy Research Scientific Computing Center (NERSC) has implemented a tape-based active archive to support its growing data storage needs. NERSC’s active archive combines High Performance Storage System (HPSS) software with disk and tape hardware, giving users access to all of their data while storing it with high reliability.
NERSC is the primary scientific computing facility for the Office of Science in the U.S. Department of Energy and supports more than 400 different projects at any given time. It maintains a growing archive of more than 140 million files, and its facility is connected to a network that facilitates the transfer of large scientific data sets between NERSC and other supercomputing centers and experimental facilities around the world. With data growth that typically runs between 50 and 70 percent each year, the active archive takes in approximately 50 TB of data each day, and data is retained indefinitely.
“We provide some of the largest open computing and storage systems available to the global scientific community,” said Jason Hick, storage system group lead at NERSC. “At any given moment, there are about 35 people logged into the archive system, including users from as far away as Europe or Asia, to researchers at various universities across the United States. Our active archive system allows us to support the high read rates that our users demand while retaining data efficiently, reliably and cost effectively.”
The new active archive gives NERSC’s users instant access to all of their data and simplifies data storage and management, at low cost, for both researchers and storage administrators.
Exceeding Reliability Standards
NERSC’s tape-based active archive provides extremely reliable data storage. NERSC recently replaced its existing tape infrastructure with newer versions of tape in its active archive, and migrated 40,489 tape cartridges, which involved reading 22,065,763 meters of tape – the same distance as flying from San Francisco to Tokyo to Paris to Nova Scotia. The tapes ranged in age from two to 12 years.
During this massive migration, NERSC tracked tape data reliability within its active archive, and the findings flew in the face of conventional wisdom: 99.9991 percent of tapes were 100 percent readable, representing a 0.0009 percent error rate and exceeding the industry’s high availability (HA) measure of “five 9’s” reliability.
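The comparison against “five 9’s” can be checked with a few lines of arithmetic. The sketch below takes the 99.9991 percent readability figure as given and assumes that “five 9’s” means a threshold of 99.999 percent; the function names are illustrative only and are not part of any NERSC tooling.

```python
from decimal import Decimal


def error_rate_percent(readable_percent: Decimal) -> Decimal:
    """Return the complement of a readability percentage, in percent."""
    return Decimal(100) - readable_percent


def count_nines(readable_percent: Decimal) -> int:
    """Count the leading 9s in a percentage, e.g. 99.999 -> 5."""
    digits = str(readable_percent).replace(".", "")
    nines = 0
    for ch in digits:
        if ch != "9":
            break
        nines += 1
    return nines


readable = Decimal("99.9991")   # readability figure reported in the release
five_nines = Decimal("99.999")  # assumed meaning of "five 9's"

print(error_rate_percent(readable))  # complement of the readable share, in percent
print(count_nines(readable))         # number of leading 9s
print(readable > five_nines)         # whether the figure exceeds the threshold
```

Decimal arithmetic is used instead of floating point so that the subtraction of two nearly equal percentages is exact.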
“Our recent migration further validated our belief that tape is reliable, and supported our long-standing practice of keeping a single copy of data,” added Hick. “This is particularly beneficial for NERSC given our significant storage capacity and reliability requirements.”
“As NERSC’s experience shows, active archives provide an extremely reliable and efficient infrastructure that keeps large data volumes online and accessible for users,” said Peter Faulhaber, senior vice president at Fujifilm and an Active Archive Alliance board member. “What’s more, the ongoing advancements in data tape technology will allow organizations such as NERSC to support exponential data growth well into the future, in a reliable and cost effective manner.”
A full case study of this project, “NERSC Exceeds Reliability Standards With Tape-Based Active Archive,” is available online.