Pittsburgh Scientists Measure Productivity in Petascale Supercomputing

IBM has provided $900,000 to the University of Pittsburgh Computer Science Department as part of a three-year effort called PERCS (Productive, Easy-to-use, Reliable Computing System) to rethink the design of supercomputers; a major portion of this funding supports the collaboration between Pitt and the Pittsburgh Supercomputing Center (PSC). The PERCS program is funded by the Defense Advanced Research Projects Agency under its multimillion-dollar HPCS (High-Productivity Computing Systems) initiative, which aims to create, by the end of the decade, supercomputers that deliver computing power at the scale of petaflops (quadrillions of operations per second), a thousand times more powerful than current systems.
The Pitt-PSC team is addressing a problem that arises from growing awareness among scientists that software performance alone is an inadequate measure of the ability to accomplish scientific work. Performance by itself fails to account for other important factors, such as system reliability and the amount of human time that goes into software development, which can vary significantly depending on the hardware. "If a programmer spends a year tuning the software to double its performance," says Melhem, "that human time has to be part of the equation when we talk about productivity of the system. It's not just the productivity of the machine, but also productivity of the people -- the scientific research group -- who use the machine to solve problems."
As an important step toward more realistic productivity assessment, PSC scientists have announced plans to develop a software tool called SUMS (Standardized User Monitoring Suite). SUMS will run in the background as a programmer creates a program and will non-intrusively record data on the full cycle of the code development process, explain PSC scientists Nick Nystrom and John Urbanic, who lead the SUMS effort for PSC.
"SUMS measures the time you spend typing in code and editing," says Urbanic, "and it monitors your runs to fix syntax bugs, then more subtle performance tuning, through the other stages of optimizing and eliminating bottlenecks. It's an expandable collection of tools to monitor the entire process and collect data." Along with gathering data, SUMS will provide the ability to correlate and display the data for analysis. "This will provide a holistic picture," says Nystrom, "and provide the productivity analyst with a quantifiable metric. We'll do experiments with SUMS in PSC workshops and in computer science classes at Pitt. We have a flexible framework to adapt as we iteratively define what's important."
The goal, says Melhem, is to evaluate different approaches by weighing the cost of production, including human time, against the ultimate performance of a system, and to do so rationally before committing to a particular system architecture.
The Pittsburgh Supercomputing Center is a joint effort of Carnegie Mellon University and the University of Pittsburgh together with Westinghouse Electric Company. It was established in 1986 and is supported by several federal agencies, the Commonwealth of Pennsylvania and private industry.