SDSC Evaluates Prototype of Next-Generation Sun Fire 15K System

LA JOLLA, CA -- Over the past 10 months, SDSC has put an alpha prototype of Sun's new high-end Sun Fire server, based on Sun's next-generation processor and system architecture, through its paces. The 32-processor system, code-named "Starcat" and the only such prototype system outside of Sun, was used for development and evaluation in a collaboration between SDSC and Sun's San Diego-based Data Center and High Performance Computing Products Group.

"The Starcat marked an exciting new phase of our partnership with Sun Microsystems," said Fran Berman, director of SDSC and NPACI. "We were excited to be evaluating the capabilities of this system for large-scale scientific computing and analysis of the vast quantities of data being created by the scientific community and stored at SDSC."

The Starcat was evaluated to assess the system's unique architecture and its scaling behavior with regard to terascale computing and petascale data management.

A powerful Sun Fire 15K server will be installed at SDSC as part of the data-handling infrastructure for the recently announced TeraGrid. The server, to be configured with 72 processors and 288 GB of shared memory, will provide support for databases, data management and data mining for the national TeraGrid facility. The system will be connected to a Storage Area Network (SAN) that provides approximately 250 TB of disk storage. The system will also run SDSC's Storage Resource Broker software to manage the metadata for over one petabyte of nationally distributed TeraGrid data.

The Starcat prototype has 32 UltraSPARC III processors and 32 GB of memory and was loaned to SDSC for the duration of the evaluation process. The evaluation included porting and testing several applications of interest to the biomedical and chemistry communities.
"As one of the country's leading research and supercomputing facilities, the SDSC works on some of the world's most challenging biological, environmental and computing issues," said Mike Vildibill, SDSC deputy director of resources. "The Sun Fire 15K server offers the performance needed to support our data-intensive requirements ranging from storage management, relational databases, data mining, and data-intensive scientific applications such as those in bioinformatics. The solid reliability and manageability of the system allows us to deploy our most critical services. The combination of reliability and performance makes the Sun Fire a critical component to our IT infrastructure." Many scientific applications circumvent the limited shared memories of parallel systems by recomputing data, transferring data among memory modules, or storing data to disk. The size of the shared memory on the Sun Fire 15K server and other Sun systems makes it possible to execute in-core methods, which compute data once and store all data structures as single images, eliminating the need for recomputation, communication, and data movement. "The Sun Fire 15K server is the only system available that supports some of our demanding scientific applications," Vildibill said. "Without the Starcat some scientific calculations and simulations simply would not be done." "Applications with large memory requirements deliver superior performance on large shared memory SMP architectures, such as Sun's powerful HPC systems," said Steve Campbell, senior director of marketing, Enterprise System Products Group, Sun Microsystems, Inc. "For many codes, new in-core methods allow them to run significantly faster on the Sun Fire 15K server than on any other system, allowing scientists to conduct larger and more complex scientific studies than previously possible." The applications evaluated include several with extensive user communities. 
For example, GAMESS, the General Atomic and Molecular Electronic Structure System, is an ab initio quantum chemistry program used worldwide to make direct quantitative predictions of chemical phenomena. SDSC's Kim Baldridge and Sun's John Feo took advantage of the large shared memory of the Starcat system to run an in-core version of GAMESS that computes all program constants once. In contrast, the version of GAMESS that runs on distributed memory systems recomputes the values when needed. Distributed memory machines cannot run the in-core version because there is not enough memory on each processor to store all the values. For large runs, recomputation of constant values can take up to 70% of the execution time, which means the in-core GAMESS on the Sun Fire is up to four times faster than the code on distributed memory systems.

On the Starcat, "we were able to apply the new GAMESS code to applications involving mechanistic studies of bacteriochlorophyll systems being evaluated for photodynamic therapies, and a new illudin derivative with superior antitumor properties," Baldridge said.

MOLPRO, which has an extensive worldwide user community, is a very large program for highly accurate molecular electronic structure calculations. The parallel version of MOLPRO has been ported to the Starcat and evaluated by SDSC chemists Joakim Persson and Peter Taylor. Performance and scaling are both better than with the E10000 version, and Persson and Taylor plan to apply the code to hitherto impossible calculations of accurate energetics in bioinorganic molecules like porphyrins.

NPEM takes data from drug test subjects and constructs a model of the statistical variability of the drug's movement throughout the subjects’ bodies and of the therapeutic effects. SDSC applied mathematician Robert Leary has ported NPEM to the Starcat and other parallel machines.
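
As a rough check on the GAMESS figures quoted above, the short Python sketch below works out how much faster a run becomes when a given share of its execution time is eliminated outright. It assumes the rest of the run is unchanged and ignores every other difference between machines; the function names are illustrative and do not come from GAMESS or from the SDSC evaluation.

```python
def speedup_from_removed_fraction(fraction_removed: float) -> float:
    """Speedup when `fraction_removed` of the run time is eliminated entirely.

    Assumes the remaining (1 - fraction_removed) of the run is unaffected.
    """
    if not 0.0 <= fraction_removed < 1.0:
        raise ValueError("fraction_removed must be in [0, 1)")
    return 1.0 / (1.0 - fraction_removed)


def fraction_needed_for(speedup: float) -> float:
    """Share of run time that must be eliminated to reach a given speedup."""
    if speedup < 1.0:
        raise ValueError("speedup must be at least 1")
    return 1.0 - 1.0 / speedup


if __name__ == "__main__":
    for fraction in (0.50, 0.70, 0.75):
        factor = speedup_from_removed_fraction(fraction)
        print(f"{fraction:.0%} of run time removed -> {factor:.2f}x faster")
```

Under these assumptions, removing 70% of the run time gives a speedup of about 3.3, and a full factor of four corresponds to removing 75%, so the "up to four times" figure is best read as an approximate upper bound.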
The parallel version of NPEM has allowed models to be run with a complexity and resolution that were previously unachievable on serial machines. The SDSC parallel version has been used by a wide variety of researchers to model drugs that have a narrow margin of safety and/or require serum level monitoring to achieve the best clinical results. On a per-processor basis, the Starcat version is about twice as fast as the version on the Sun E10000.

The Starcat joined the arsenal of high-end Sun systems, including two Sun Enterprise 10000 servers and many smaller Sun Enterprise servers. These systems provide the computing power for a number of computational biology and bioinformatics projects at SDSC, including the Protein Data Bank, the National Biomedical Computation Resource, the Alliance for Cellular Signaling, the Joint Center for Structural Genomics, and the Biology WorkBench.

--David Hart

Reprinted with permission of Online: News about the NPACI and SDSC Community