GOVERNMENT
Red Sky at night, Sandia's new supercomputing might
The top-10 ranking was achieved by temporarily aggregating Red Sky, Sandia’s newest institutional machine, with a second system being built with the same architecture and components on behalf of the National Renewable Energy Laboratory (NREL).
The second system is sited adjacent to Red Sky and operated by Sandia to support work sponsored by the DOE’s Office of Energy Efficiency and Renewable Energy. The flexibility of the Red Sky architecture enabled this super-configuration, which achieved a peak performance of more than 500 teraflops (500 trillion floating-point operations per second) and an impressive 433.5 teraflops on the Linpack benchmark commonly used to rank supercomputer speed.
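The two figures above can be related by a simple ratio, often called Linpack efficiency. The following is a minimal sketch, not a calculation from Sandia: the function and constant names are hypothetical, and because the article gives peak performance only as "more than 500" teraflops, using 500 as the denominator yields an upper bound on the efficiency rather than an exact value.

```python
def linpack_efficiency(rmax_tflops: float, rpeak_tflops: float) -> float:
    """Return the fraction of theoretical peak sustained on Linpack."""
    if rpeak_tflops <= 0:
        raise ValueError("peak performance must be positive")
    return rmax_tflops / rpeak_tflops


# Figures quoted in the article (teraflops).
LINPACK_TFLOPS = 433.5
# Assumption: the article says "more than 500", so this is a lower bound on peak.
PEAK_TFLOPS_LOWER_BOUND = 500.0

# Dividing by a lower bound on peak gives an upper bound on efficiency.
efficiency_upper_bound = linpack_efficiency(LINPACK_TFLOPS, PEAK_TFLOPS_LOWER_BOUND)
print(f"Linpack efficiency is at most about {efficiency_upper_bound:.1%}")
```

Under that assumption, the combined system sustained at most roughly 87 percent of its theoretical peak on the benchmark.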
“One thing that’s really exciting to me about this project,” said Rob Leland, director of Sandia’s Computing and Network Services Center, “is that we’re taking the architectural philosophy and design principles that we pioneered in systems for the weapons program, such as ASCI Red and Red Storm, and building a machine that will be broadly available.”
The use of “red” in the name evokes the legacy of Sandia’s earlier supercomputing systems and programs. The designers and builders of Red Sky drew on the design principles and successes of machines such as ASCI Red from the mid-1990s and Red Storm from the early 2000s. “Red” also signals to the broader computing community that this machine is consistent in approach and philosophy with those earlier Sandia machines, which were highly regarded in industry.
Red Sky is intended to be a capacity machine. “It’s not designed with the full-system job as its target,” said Dino Pavlakos, Sandia manager of scientific computing support. “It’s designed to run lots and lots of jobs. While we were at it, we wanted to do a design that could accommodate a higher degree of scalability than you would ordinarily see in a commodity-based system.” The trick is to leverage the economics of commodity parts and yet incorporate the design principles learned from previous generations of specialized high-performance computing (HPC) systems.
“It’s a scalable design,” Leland said, “but it’s also an extensible design, meaning we can physically build it out.” One key feature that enables this is the simple “topology,” which uses a three-dimensional mesh-type grid. “It turns out,” Leland said, “that structure is a good choice for mapping physical codes onto the machine because physics is typically expressed mathematically in a three-dimensional grid that matches well to the machine.”
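Leland’s point about mapping physics onto a three-dimensional mesh can be illustrated with a small sketch. This is not Red Sky’s actual software or node layout: the function names, the block decomposition, the grid and mesh sizes, and the assumption of a non-wrapping mesh are all illustrative. The sketch splits a 3D simulation grid into blocks, assigns each block to a node in a 3D mesh, and checks that cells adjacent in the physical grid always land on the same node or on directly connected nodes.

```python
from itertools import product


def block_owner(cell, grid_shape, mesh_shape):
    """Return the mesh coordinates of the node that owns a grid cell
    under a simple block decomposition along each axis."""
    return tuple(c * m // g for c, g, m in zip(cell, grid_shape, mesh_shape))


def mesh_neighbors(node, mesh_shape):
    """Return the nodes one hop away along each axis.
    Assumption: a plain mesh with no wraparound links."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] += step
            if 0 <= coord[axis] < mesh_shape[axis]:
                neighbors.append(tuple(coord))
    return neighbors


def max_hops_between_adjacent_cells(grid_shape, mesh_shape):
    """Return the largest number of mesh hops separating the owners of
    any two grid cells that are adjacent along one axis."""
    worst = 0
    for cell in product(*(range(g) for g in grid_shape)):
        for axis in range(3):
            if cell[axis] + 1 >= grid_shape[axis]:
                continue
            nxt = list(cell)
            nxt[axis] += 1
            a = block_owner(cell, grid_shape, mesh_shape)
            b = block_owner(tuple(nxt), grid_shape, mesh_shape)
            hops = sum(abs(x - y) for x, y in zip(a, b))
            worst = max(worst, hops)
    return worst


# Hypothetical sizes: an 8x8x8 simulation grid on a 2x2x2 mesh of nodes.
GRID_SHAPE = (8, 8, 8)
MESH_SHAPE = (2, 2, 2)

print(block_owner((3, 4, 7), GRID_SHAPE, MESH_SHAPE))
print(mesh_neighbors((0, 0, 0), MESH_SHAPE))
print(max_hops_between_adjacent_cells(GRID_SHAPE, MESH_SHAPE))
```

Because neighboring cells never end up more than one hop apart in this layout, the data exchanged between adjacent regions of the simulation travels only over direct links, which is the kind of match between physics and machine that the quote describes.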
The overall attributes of Red Sky, combined with promising early application results, made for an easy choice when faced with the opportunity to stand up additional capability for the Sandia/NREL HPC collaboration. The collaboration leverages Sandia’s experience in HPC to provide a computational resource for applications in the energy sector, as NREL grows its computational science capabilities and expertise.
Steve Hammond, director of the computational science center at NREL, said he is excited about this collaboration. “The staff at Sandia are outstanding and the collaboration allows NREL scientists access to vital HPC capabilities to meet our needs for modeling and simulation in support of energy efficiency and renewable energy technologies critical to the nation. Red Sky fills a key capability gap for NREL until our new facility is completed in 2012,” Hammond said.
A project of this complexity and ambition requires a close partnership with leading-edge vendors. In this case, Sandia worked with Sun Microsystems and Intel.
“Sun was willing to take substantial risks and create and invest in technology for the partnership,” Leland said, “so it was a very good fit for our needs and goals.” Sun was also willing to work with Sandia to innovate in several key dimensions, he said.
Red Sky is a commodity machine, but off-the-shelf is not the complete story. “Many of the system components are leading-edge, but they are commodity parts,” Pavlakos said.
“Intel gave us early access to their latest processing technology and very competitive pricing for that new technology,” Leland said. Intel was a natural choice, he said, because it has been actively reestablishing itself in the scientific high-performance computing market in recent years. Leland said Intel’s processor technology is moving intentionally toward incorporating certain key technologies and design features that support Sandia’s goals for the machine.
The Red Sky project, Leland said, required a commitment to both technical innovation and strong value because the machine must provide the highest-quality service at the lowest possible price. The Labs also wanted the project to continue Sandia’s legacy of innovation, excellence, and leadership in high-performance computing.
Another area of innovation in Red Sky is in its energy efficiency. “Red Sky should really be called Green Sky,” said John Zepper, senior manager of computing systems. “This machine is the most energy efficient HPC system we have deployed to date.”