IU Data Capacitor Achieves 977 MB/Second Over TeraGrid

Indiana University's Data Capacitor, a 535 TB Lustre filesystem designed to store and manipulate large data sets, has demonstrated in its opening weeks of production a single-client transfer rate of 977 MB per second across the TeraGrid network. Data was copied from a single computer at Oak Ridge National Laboratory, equipped with a 10 Gigabit Ethernet card, to the Data Capacitor at IU's Bloomington campus. The outstanding transfer rate, which represents nearly 80 percent of the 10 Gigabit network's theoretical capacity, was reported by Data Capacitor project lead Stephen Simms during a talk titled “Wide Area Filesystem Performance Using Lustre on the TeraGrid.” The talk was given by Simms and collaborators from Oak Ridge National Laboratory and Pittsburgh Supercomputing Center at the TeraGrid ’07 conference, being held this week in Madison, Wis.

“These numbers illustrate how the Data Capacitor and the high-speed TeraGrid network could help distributed resources feel less distributed to the user,” said Simms. “Imagine being able to move 12 DVDs' worth of data from your desktop machine onto a filesystem two states away in a single minute. This technology has the potential to significantly change how scientists collaborate across distance.”

Since entering production in April, the Data Capacitor has supported several high-profile projects. These include the Linked Environment for Atmospheric Discovery (LEAD) Science Gateway, which provides meteorological data, forecast models and analysis tools for the interactive exploration, simulation and prediction of weather, and the WxChallenge, a collegiate weather forecasting competition. The Data Capacitor is also a key cyberinfrastructure component in an international federation of crystallography labs under the Common Instrument Middleware Project (CIMA).

“The Data Capacitor has been exceptionally valuable to the CIMA project,” said principal investigator Donald F. McMullen, of the Pervasive Technology Labs at IU.
“Its capacity and throughput allowed us to design and implement a system that supports data sharing and maintains workflows involving massive amounts of instrument data for about a dozen labs in the U.S. and around the world.”

Data Capacitor principal investigator Craig Stewart, associate dean of Research Technologies and chief operating officer of the Pervasive Technology Labs at IU, stated, “The wide-area capabilities we have demonstrated for the Data Capacitor and the TeraGrid will enable IU to better support scientific workflows: the end-to-end transformation of data into knowledge through use of advanced cyberinfrastructure.”

The Data Capacitor was developed by a team from University Information Technology Services, the IU School of Informatics and Pervasive Technology Labs at Indiana University, with financial support from the National Science Foundation. Project co-PIs include Randall Bramley, Catherine Pilachowski and Beth Plale. Its architecture uses components manufactured by Data Direct Networks, Myricom and Dell. The Data Capacitor's Lustre filesystem is supported by Cluster File Systems.

This material is based upon work supported by the National Science Foundation under NSF Award Numbers CNS-0521433, ACI-0338618l, OCI-0451237, OCI-0535258, and OCI-0504075. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
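
As a rough check on the throughput figures quoted in this release (977 MB per second, nearly 80 percent of a 10 Gigabit link, and 12 DVDs' worth of data per minute), the arithmetic can be sketched in a few lines of Python. The release does not say how these figures were derived, so the sketch makes its own assumptions: decimal units (1 MB = 10^6 bytes), a raw line rate of 10 gigabits per second with no protocol overhead, and a 4.7 GB single-layer DVD. The function names are illustrative only.

```python
# Back-of-the-envelope check of the figures quoted in the release.
# Assumptions (not stated in the release): decimal units, a raw
# 10 Gb/s line rate with no protocol overhead, 4.7 GB per DVD.

LINK_GBPS = 10            # nominal 10 Gigabit Ethernet line rate
TRANSFER_MB_PER_S = 977   # single-client rate reported in the release
DVD_GB = 4.7              # assumed single-layer DVD capacity


def link_capacity_mb_per_s(gbps):
    """Convert a line rate in gigabits/s to megabytes/s (decimal units)."""
    return gbps * 1e9 / 8 / 1e6


def utilization(rate_mb_per_s, gbps):
    """Fraction of the link's theoretical capacity used by the transfer."""
    return rate_mb_per_s / link_capacity_mb_per_s(gbps)


def dvds_per_minute(rate_mb_per_s, dvd_gb):
    """Number of DVDs' worth of data moved in 60 seconds at the given rate."""
    return rate_mb_per_s * 60 / (dvd_gb * 1000)


if __name__ == "__main__":
    print(f"Link capacity: {link_capacity_mb_per_s(LINK_GBPS):.0f} MB/s")
    print(f"Utilization:   {utilization(TRANSFER_MB_PER_S, LINK_GBPS):.1%}")
    print(f"DVDs/minute:   {dvds_per_minute(TRANSFER_MB_PER_S, DVD_GB):.1f}")
```

Under these assumptions the link tops out at 1250 MB per second, so 977 MB per second is about 78 percent of capacity, and one minute of transfer moves roughly 12.5 DVDs' worth of data. Both results are consistent with the rounded figures in the release.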