Ocarina Partners with Cornell and DDN on Storage Optimization

Ocarina Networks, the leader in content-aware compression and dedupe for online storage, has partnered with the Cornell Center for Advanced Computing (CAC) and DataDirect Networks (DDN), the data infrastructure provider for the most extreme, content-intensive environments in the world, to advance the state of lossless data reduction and storage optimization.
 
“As scientific researchers acquire data at faster and faster rates, optimizing the analysis of that data with scalable storage solutions is essential,” said Dr. David Lifka, Cornell CAC director. “Despite advances in disk technology, storing research data remains an expensive proposition,” he explained. “We are working with Ocarina and DDN to effectively maximize storage capacity without sacrificing performance.”
 
Data production rates from telescopes, satellites, surveys, and other scientific instruments are exploding. “In the life sciences, next-generation sequencing techniques are producing vast quantities of data that must be quickly processed and stored online for short periods of time,” said Dr. Jaroslaw Pillardy, a senior researcher at Cornell’s Computational Biology Service Unit. “For example,” noted Pillardy, “one Solexa sequencing run produces 0.05 terabytes of raw data, and a single sequencer may be used multiple times per week.” Pillardy expects that new sequencing techniques will soon generate data at a rate of 0.1 terabytes per hour.
 
To prepare for these data demands, Cornell is performing extensive data compression testing across a wide range of research applications using the Ocarina ECOSystem. The ECOSystem reads stored files and uses content-aware compression and deduplication to reduce the amount of space those files take. It includes multiple data compressors for the types of files commonly found in research computing environments, with over 100 algorithms that support 600 file types. Testing is taking place on DDN’s S2A9700 high-performance storage platform deployed at Cornell. The S2A™ (Silicon Storage Architecture™) technology provides extreme performance and storage capacity, with the ability to manage up to 1.2 petabytes in only two floor tiles and deliver sustained throughput of up to 6 gigabytes per second for both writes and reads, per appliance.
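
The general idea of content-aware compression combined with deduplication can be illustrated with a minimal sketch. This is not Ocarina's implementation, which the release does not describe; it assumes whole-file deduplication by SHA-256 digest and a hypothetical extension-to-codec table built from Python's standard-library compressors. The names `CODECS`, `CODEC_BY_EXT`, `reduce_files`, and `restore` are invented for illustration.

```python
import bz2
import hashlib
import lzma
import os
import zlib

# Standard-library codecs, each as a (compress, decompress) pair.
CODECS = {
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
    "zlib": (zlib.compress, zlib.decompress),
}

# Hypothetical "content-aware" table: pick a codec by file extension.
CODEC_BY_EXT = {".txt": "bz2", ".csv": "lzma"}


def reduce_files(files):
    """Deduplicate and compress files given as a dict of name -> bytes.

    Returns (store, index): store maps a content digest to
    (codec name, compressed bytes); index maps each file name to its digest.
    """
    store = {}
    index = {}
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        index[name] = digest
        if digest in store:
            continue  # identical content is already stored once
        ext = os.path.splitext(name)[1].lower()
        codec = CODEC_BY_EXT.get(ext, "zlib")  # fall back to zlib
        compress, _ = CODECS[codec]
        store[digest] = (codec, compress(data))
    return store, index


def restore(name, store, index):
    """Return the original bytes of a file (the reduction is lossless)."""
    codec, blob = store[index[name]]
    _, decompress = CODECS[codec]
    return decompress(blob)
```

In this sketch, two files with identical contents occupy a single compressed entry in `store`, and `restore` returns each file byte for byte.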
 
“Ocarina is very pleased to be collaborating with Cornell and DDN,” said Goutham Rao, Ocarina’s CTO. “Cornell’s breadth of science and engineering applications, their focus on utility, and expertise in data analysis make them an ideal partner for this project.”
 
“New breakthroughs in content-aware compression and deduplication are making it possible for data sets to be reduced soon after they come off scientific instruments and have been analyzed,” explained Lifka. “Compression technologies with efficient algorithms are becoming an essential component in data-intensive computing system deployments,” he added. “Space savings of 50% and above are common.”
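
To put the figures quoted in this release side by side, the following back-of-the-envelope sketch estimates how long a 1.2-petabyte system would take to fill at 0.1 terabytes per hour, with and without 50% space savings. It rests on simplifying assumptions that the release does not make: a single instrument writing continuously, decimal units (1 petabyte = 1,000 terabytes), and uniform savings across all data. The function name `days_to_fill` is hypothetical.

```python
def days_to_fill(capacity_tb, rate_tb_per_hour, savings=0.0):
    """Days until capacity is reached at a constant ingest rate.

    savings is the fraction of space removed by data reduction
    (0.5 means the data occupies half its original size).
    """
    if not 0.0 <= savings < 1.0:
        raise ValueError("savings must be in [0, 1)")
    stored_per_hour = rate_tb_per_hour * (1.0 - savings)
    return capacity_tb / stored_per_hour / 24.0


CAPACITY_TB = 1200.0   # 1.2 petabytes, decimal units (assumption)
RATE_TB_PER_HOUR = 0.1  # projected sequencing rate quoted above

for savings in (0.0, 0.5):
    days = days_to_fill(CAPACITY_TB, RATE_TB_PER_HOUR, savings)
    print(f"savings {savings:.0%}: about {days:.0f} days to fill")
```

Under these assumptions, halving the stored size doubles the time to fill the system, from roughly 500 days to roughly 1,000 days.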
 
“New data reduction technologies hold great promise for helping to get the most out of storage systems and reducing overall operating expenses,” said Dave Fellinger, CTO, DDN. “DDN’s S2A storage systems have always provided significant manageability, rack space, power consumption, and capacity benefits and now, in combination with Ocarina Networks, we are able to bring a new level of efficiency to the data center.”
 
The Cornell Center for Advanced Computing receives support from Cornell University, the National Science Foundation, DOD, USDA, and members of its corporate program. For more information, visit www.cac.cornell.edu.