Team Led by Argonne, Virginia Tech Wins Storage Challenge Competition

A team of researchers led by Pavan Balaji of Argonne National Laboratory and Wu Feng of Virginia Tech won an international competition for the most effective approach to using large-scale storage for high-performance computing. The award was presented November 15 at SC|07, the world's premier conference on high-performance computing and networking.

Using a novel software framework for distributed I/O called ParaMEDIC, the team of researchers from Argonne National Laboratory, Virginia Tech, and North Carolina State University searched the sequences of all completed microbial genomes against each other. The aim was twofold: to discover missing genes and to speed future searches by generating a complete genome similarity tree.

The ParaMEDIC software framework used a semantics-based approach to create a metadata representation that was four orders of magnitude smaller than the actual output data. "Using ParaMEDIC, the entire genome similarity tree, corresponding to a petabyte of data, can fit into a 4-gigabyte iPod nano," said Balaji.

The entire task required many millions of CPU-hours of computational capability and generated a petabyte of uncompressed output. Because few supercomputer centers provide both the computational and the storage resources required for this task simultaneously, the research team relied on a worldwide supercomputer that aggregated compute resources from various locations within the U.S. and the TSUBAME storage resources at the Tokyo Institute of Technology in Japan, with technical support from Sun Microsystems. The largest portion of the compute cycles was provided by Virginia Tech's System X supercomputer.

"In total, we relied on six U.S. supercomputing institutions and accessed over 12,000 processors across eight supercomputers. The ParaMEDIC framework then improved compute utilization from 10 percent to nearly 100 percent for the compute resources and storage bandwidth utilization from 0.04 percent to 90 percent for the storage resources," said Feng.

The ParaMEDIC team is indebted to the following people, whose support made the impossible possible:

Virginia Tech (System X): J. Setubal, A. Warren, K. Shinpaugh, L. Scharf, G. Zelenka, T. Herdman
Argonne National Laboratory (Jazz, SiCortex, BlueGene/L and Breadboard): R. Stevens, E. Lusk, S. Coghlan
Tokyo Institute of Technology (TSUBAME): S. Matsuoka, T. Yamanashi, S. Ono, R. Fukushima
U. Chicago (TeraGrid): I. Foster, M. Papka
Center for Computation & Technology at Louisiana State University (Oliver): D. Katz, S. Jha, H. Liu
Renaissance Computing Institute (Open Science Grid): D. Reed, J. McGee, M. Rynge
Sun Microsystems: T. Kujiraoka, S. Ihara, S. Vail, S. Cochrane, C. Kingwood, S. See, A. Katz