Virginia Tech Earns Honors for Making Supercomputing Available to All
Virginia Tech is the recipient of this year's Computerworld Honors 21st Century Achievement Award in Science. The award was presented Monday evening at a black-tie event at the National Building Museum in Washington, D.C.
Apple nominated Virginia Tech for its development of a 2,200-processor supercomputer built from a cluster of 1,100 Power Mac G5 computers. Called System X, the supercomputer is currently the world's third fastest, with the best price/performance among top supercomputing facilities.
Virginia Tech faced stiff competition: more than 250 of this year's most innovative applications of technology were submitted for consideration, with entries from 33 states and 26 countries. In April, Virginia Tech was named one of five finalists for the award, along with Rice University, Pittsburgh Supercomputing Center, CoreTek, and the Columbus Zoo.
"Virginia Tech is using information technology to make great strides toward remarkable social achievement in science," said Daniel Morrow, executive director, Computerworld Honors Program. "The materials submitted on behalf of Virginia Tech will enrich the program's growing collection on the Information Age, and help build an accurate record of the truly outstanding achievements being made in these remarkable times."
Virginia Tech built "a world-class supercomputer to tackle fundamental, grand challenge problems in computational science and engineering. While supercomputers have been invaluable, their high cost, often tens to hundreds of millions of dollars, has limited their deployment to a few national facilities. The goal of the Virginia Tech project was to develop novel computing architectures that reduce their cost, time to build, and maintenance complexity. As a result, institutions with relatively modest budgets can now afford to build a premier supercomputer," said Hassan Aref, dean of Virginia Tech's College of Engineering.
The engineering college collaborated with the University's Information Technology group to build the supercomputer. As the cluster was being built, the University named Srinidhi Varadarajan, an assistant professor of computer science in the College of Engineering, director of Virginia Tech's new Terascale Computing Facility. Jason Lockhart, also of the College of Engineering, and Kevin Shinpaugh of Information Technology were named associate directors. Pat Arvin, associate vice president for information technology, and Glenda Scales, associate dean for research computing and distance learning in the College of Engineering, provided the overall direction for the project, which included some 160 student volunteers.
Varadarajan started the Virginia Tech initiative with a National Science Foundation grant to expand and upgrade a small supercomputer he was directing on the University campus. Conversations with Lockhart and others led to the grander goal.
"In the 1970s, a paradigm shift occurred when large expensive mainframes gave way to increasingly powerful minicomputers, which were affordable to academic and industrial research labs. The resulting spurt in research -- both into computing and the use of computing -- led to the computing landscape today.
"We believe System X is the first step in a similar paradigm shift in supercomputer architectures from expensive custom supercomputers to inexpensive cluster-based supercomputers to solve the largest research problems -- an area called capability computing. At a price of $5.2 million, world-class supercomputing is now within the reach of academic research budgets, enabling the larger community of academic researchers to tackle fundamental problems with easily available supercomputing resources," Varadarajan said.
Varadarajan is also the developer of "Deja vu," a software package that brings stability to large clusters.
"We developed the first comprehensive solution -- called Deja vu -- to the problem of transparent parallel checkpointing and recovery, which enables large-scale supercomputers to mask hardware, operating system and software failures -- a decades-old problem," he said.
System X is currently being upgraded: the nodes that make up the supercomputer are being replaced with the Xserve G5 server platform. The upgrade has several advantages. It reduces the size of the supercomputer by a factor of three and consumes significantly less power than its predecessor. It generates less heat, thereby reducing the cooling requirements. It adds automatic error-correcting memory that can recover from transient bit errors. Finally, it has significant hardware monitoring capabilities -- line voltages, fan speeds, communications -- that allow real-time analysis of the health of the system.
Virginia Tech's partners for building System X were Apple, Mellanox, Cisco, and Liebert. Mellanox, the leading provider of InfiniBand semiconductor technology, supplied the primary communications fabric, drivers, cards, and switches for the project. Cisco's Gigabit Ethernet switches were chosen for the secondary communications fabric that interconnects the cluster, and Cisco provided a significant educational discount to support the project. Liebert, a division of Emerson Network Power known for its comprehensive range of protection systems for sensitive electronics, provided the cooling system.