Indiana University AVIDD Linux Clusters Achieve Milestone

BLOOMINGTON, IND. -- Indiana University's AVIDD (Analysis and Visualization of Instrument-Driven Data) facility has achieved a calculation rate of 1.02 Teraflops, or just over one trillion mathematical operations per second. This performance milestone was achieved using the Linpack Benchmark on the two largest of AVIDD's four geographically distributed Linux clusters. These two clusters are located in Bloomington (IUB) and Indianapolis (IUPUI), and the I-Light network connects them across the 50 miles separating the two cities.

The combined system is ranked 51st on the 21st Top500 list of the world's fastest supercomputers, released today at the International Supercomputer Conference in Heidelberg, Germany. The system is the 12th Linux cluster on the list, and the fastest geographically distributed Linux cluster on the list.

"To the best of my knowledge, this is the first time a computational rate of more than one trillion mathematical operations per second has been achieved within the State of Indiana," said Craig Stewart, Director of the Research and Academic Computing division of University Information Technology Services. "What made this achievement possible were extensive tuning work by IU's expert system and network administrators, collaboration with IU's Open Systems Laboratory and industrial partners, and use of a portion of the I-Light network dedicated to the AVIDD system."

The AVIDD clusters are from IBM, and Force10 networking equipment was used for this benchmark. I-Light, an optical fiber network funded by the State of Indiana and jointly owned by Indiana and Purdue Universities, allows IU's advanced information technology resources like AVIDD to be distributed between Indianapolis and Bloomington. Brian D. Voss, IU Associate Vice President for Telecommunications, stated, "This is exactly the sort of innovative accomplishment that I-Light was designed to enable. I-Light is permitting our State's universities to achieve a leadership position in many areas of networking and computing, and it provides a key building block in the developing national cyberinfrastructure."

Computer capacity is measured and described in a number of ways. 'Aggregate peak theoretical capacity' refers to the maximum number of calculations a computer is theoretically capable of performing; in practice, all computers operate at far less than their peak theoretical capacity. The aggregate peak theoretical capacity of the entire AVIDD system, including the two IBM Pentium 4 clusters used in this benchmark, a separate IBM Itanium cluster at IUPUI, and an IBM Pentium III cluster at IU Northwest in Gary, is 2.176 Teraflops.

The Linpack Benchmark is commonly used to determine how a high-performance computer actually performs on real problems. It also demonstrates that although there are clear benefits to using a distributed system, there are also costs in terms of performance.
As shown below (a short worked example of the arithmetic follows the footnotes), tests on a distributed system yield a slightly lower percentage of its peak capacity than a cluster confined to a single location:

* Using the Linpack Benchmark on a cluster in one machine room, connected with Myrinet networking from Myricom, Inc., technicians achieved 66.7% of the peak theoretical capacity of that single cluster.[1]

* Using the Linpack Benchmark on a cluster in one machine room, connected with Ethernet networking provided by Force10, Inc., technicians achieved 62.4% of the peak theoretical capacity of that single cluster.[2]

* With the Linpack Benchmark running simultaneously on the IUB and IUPUI clusters, IU technicians achieved 57.4% of the peak theoretical capacity of those two clusters.[3] The clusters were connected via Force10 networking gear over I-Light; processors communicated with each other using a pre-release of LAM/MPI version 7.0, developed in the Open Systems Lab at IU.

[1] The peak theoretical capacity of the cluster used for this benchmark was 0.921 TFLOPS. Benchmark performed using the MPICH-GM 1.2.4..8 implementation of MPI over Myrinet GM 1.6.3.

[2] The peak theoretical capacity of the cluster used for this benchmark was 0.921 TFLOPS. Benchmark performed using the LAM/MPI 6.6b2 implementation of MPI over Force10 FTOS v4.3.2.0b.

[3] The combined peak theoretical capacity of the two clusters used for this benchmark was 1.862 TFLOPS.
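For readers who want to check the arithmetic behind these percentages, the short sketch below relates the footnoted peak figures to the sustained rates they imply. It is a minimal illustration only, written in Python; the function and variable names are ours, not part of the benchmark software, and the peak value and percentages are taken from footnotes [1] and [2] above.

    # Minimal sketch of the arithmetic behind the percent-of-peak figures above.
    # The peak value (0.921 TFLOPS) and the percentages come from the footnotes;
    # the helper name sustained_tflops is illustrative only.

    def sustained_tflops(peak_tflops, percent_of_peak):
        """Sustained Linpack rate implied by a peak capacity and an efficiency."""
        return peak_tflops * percent_of_peak / 100.0

    peak_single = 0.921  # TFLOPS, one cluster in one machine room ([1], [2])

    print(sustained_tflops(peak_single, 66.7))  # ~0.614 TFLOPS over Myrinet
    print(sustained_tflops(peak_single, 62.4))  # ~0.575 TFLOPS over Force10 Ethernet

The same relation, applied to the combined 1.862 TFLOPS peak of the two clusters in footnote [3], is what the 57.4% figure for the distributed run refers to.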