Supercomputing Success Story

Appro helps LLNL successfully deploy 620 TF of supercomputing clusters across three national labs

The challenge

The workhorse of modern-day High Performance Computing (HPC) is the Linux cluster. Almost all areas of science and engineering depend on the power and performance offered by Linux clusters. Today's top computational problems require performance in the TFLOP/s (10^12 floating point operations per second) range, and in the very near future the delivery of PFLOP/s (10^15) cluster systems will be a reality as well.

One of the leading-edge practitioners of Linux HPC is Lawrence Livermore National Laboratory (LLNL), located in Livermore, California. One of LLNL's primary missions for the National Nuclear Security Administration (NNSA) is ensuring the safety, security and reliability of the nation's nuclear deterrent through a program called stockpile stewardship. A cornerstone of this effort is NNSA's Advanced Simulation and Computing (ASC) Program, which provides the integrating simulation and modeling capabilities and technologies needed to combine new and old experimental data, past nuclear test data, and past design and engineering experience into a powerful tool for assessment and certification of nuclear weapons and their components. In addition, LLNL computing supports research in many other areas, including molecular dynamics, turbulence, petascale atomistic simulations with quantum accuracy, simulation of protein membranes, nanotechnology, ultrahigh-resolution global climate models, fundamental materials research, and laser-plasma interactions for the National Ignition Facility (NIF).

Currently, LLNL provides 495 TFLOP/s of computing power, spread over eighteen x86 Linux clusters, to its user base. Of these clusters, 480 TFLOP/s are derived from the eleven systems supplied by Appro International. These clusters are available for both classified and unclassified work, depending on the project. In addition, the clusters are broken into two types: Capability Clusters, which are designed to handle unusually large computing jobs, and Capacity Clusters, which are designed to handle a large number of different computing jobs at the same time.

Fielding and maintaining these clusters is no small feat. As a result, LLNL has a strong incentive to reduce the Total Cost of Ownership (TCO) of these systems. The TCO challenge is not unfamiliar to other government labs, notably Los Alamos National Lab (LANL) and Sandia National Labs (SNL). Each has similar issues, and working together on a common hardware and software infrastructure would be advantageous, but it would also present a management challenge. In order to reduce TCO by over 50%, the Tri-Labs decided to purchase identical Scalable Units (SUs) and build multiple clusters of various sizes over a two-year cycle for all three labs. All-in-one procurement was a novel, yet obvious, approach. In theory, providing all three labs with the same hardware and software environment would allow applications to be moved easily from one cluster or site to another, as well as provide a common software development environment. In addition, a common computing environment reduces manpower and cost when the same software runs across all three sites.

In the past, each lab created its own Linux-based distribution, often with operating system patches to better use the underlying hardware. The Tri-Lab procurement allowed the creation of TOSS (Tri-Lab Operating System Stack). TOSS is similar to the CHAOS environment developed by LLNL. Currently, the Tri-Labs are working on the TOSS project; when complete, it will mark the first time all three labs share a common hardware environment and cluster operating system infrastructure. Most of the TOSS components come from Red Hat Enterprise Linux 5, which is available under a DOE site license.

In an attempt to leverage this common hardware platform, LLNL, LANL, and SNL joined together in the first multi-lab Linux cluster procurement, aptly called the Tri-Lab Capacity Cluster 2007 (TLCC07) procurement. In the HPC world, headlines often cite the achievements of capability clusters that scale up and push the limits of computing. In the Tri-Lab procurement, the challenge was not so much one of scaling up, but rather of scaling out. According to Mark Seager of LLNL, "Even though these are capacity clusters, we feel we are pushing the HPC state of the art because we bought 3,744 compute nodes to be spread across three Government Labs. That was a huge logistical challenge."

The solution

LLNL, working with two other NNSA national laboratories, Sandia and Los Alamos, developed the concept of the Scalable Unit (SU) cluster building block in order to build multiple commodity Linux clusters of different sizes from the same SU. With previous Linux cluster acquisitions, each cluster was designed, acquired and integrated individually. In a sense, this method treated every cluster as a unique creation with little carry-over from previous clusters. First, with the SU concept, large and small clusters can be constructed with identical hardware and software. Second, by building multiple clusters from the same SU, LLNL gained experience along the way and successive deployments went more smoothly. Third, with a common hardware and software environment spread across multiple clusters at multiple locations, application developer costs for porting and supporting these clusters were also significantly reduced. These are the factors that led to the 50% reduction in total cost of ownership. Examples of Scalable Units used in the Tri-Lab procurement are listed below (a short sketch of the node and core arithmetic follows the list):
  • 1 SU = 144 nodes/ 2,304 processor cores
  • 2 SU = 288 nodes/ 4,608 processor cores
  • 4 SU = 576 nodes / 9,216 processor cores
  • 6 SU = 864 nodes / 13,824 processor cores
  • 8 SU = 1152 nodes / 18,432 processor cores
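As a minimal illustration of the SU arithmetic above (not part of the original procurement tooling), the following Python sketch reproduces the list from two figures quoted in this article: 144 nodes per SU and 16 cores per node (four quad-core Opteron sockets, described in the hardware section below).

```python
# Back-of-envelope reproduction of the Scalable Unit (SU) sizing list above.
# Assumes the figures quoted in the article: 144 nodes per SU and
# 16 cores per node (4 sockets x 4 cores).

NODES_PER_SU = 144
CORES_PER_NODE = 16

def su_size(num_su: int) -> tuple[int, int]:
    """Return (nodes, cores) for a cluster built from num_su Scalable Units."""
    nodes = num_su * NODES_PER_SU
    return nodes, nodes * CORES_PER_NODE

for n in (1, 2, 4, 6, 8):
    nodes, cores = su_size(n)
    print(f"{n} SU = {nodes} nodes / {cores:,} processor cores")
```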

Part of the solution was a careful specification that placed clear boundaries between various aspects of the cluster solution (e.g., storage area networking and parallel file systems) and required only the delivery of a large quantity of 144-node SUs and second-level switches from the cluster provider. The Tri-Lab procurement consisted of 21 Scalable Units, for an aggregate performance of 438 TFLOP/s, broken into eight clusters plus two options. Lawrence Livermore would create the Juno cluster from 8 SUs, the Hype cluster from 1 SU, and the Eos cluster from 2 SUs. Los Alamos would create Lobo from 2 SUs and Hurricane from 2 SUs. Sandia would create Unity from 2 SUs, Whitney from 2 SUs, and Glory from 2 SUs. As part of the Tri-Lab procurement, LLNL also had the option to purchase an additional two clusters, Hera from 6 SUs and Nyx from 2 SUs.

In the fall of 2007, Appro International was selected under the TLCC procurement as the solution provider to deliver the SUs to the three labs. Appro was chosen based on how well it addressed the requirements specified in the Request for Proposal, its proven HPC track record, system cost, and project management skills. In addition, Appro was noted for working well with component suppliers and solving problems as they arose.

In terms of hardware, the Appro 1U-1143H quad-socket server based on Quad-Core AMD Opteron processors was specified instead of bladed packaging. While blade-based servers add some convenience and redundancy, the most economical choice was still rack-mount 1U servers. The Tri-Lab SU employed Quad-Core AMD Opteron (Barcelona) processors connected by DDR (4x) InfiniBand. Each SU was designed for high compute density and therefore used quad-socket motherboards with four Quad-Core AMD Opteron Socket F processors running at 2.2 GHz (Model 8354), for a total of 16 cores per node. Each node was equipped with sixteen 2 GB DDR2-667 DIMMs for 32 GBytes of memory and a 4x DDR InfiniBand ConnectX Host Channel Adapter (HCA) card. To achieve a balanced computational system, each node was required to deliver an optimum bytes/FLOP ratio for both the memory interface and the interconnect. The Opteron nodes achieved 20 GB/s of memory bandwidth per node, which fit nicely within the specification.
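To make the balance argument concrete, the rough calculation below combines the per-node figures quoted above (16 cores at 2.2 GHz, 32 GB of memory, 20 GB/s of memory bandwidth) into peak throughput and bytes-per-FLOP numbers. The value of 4 double-precision floating point operations per core per clock is an assumption introduced here for illustration; it is not stated in the article.

```python
# Rough balance figures for one TLCC compute node, using the specs above.
# The 4 FLOP/cycle/core value is an assumed figure for the quad-core
# Opteron (Barcelona); the article quotes only the clock rate, core count,
# memory size, and per-node memory bandwidth.

CORES_PER_NODE = 16        # 4 sockets x 4 cores
CLOCK_GHZ = 2.2            # AMD Opteron Model 8354
FLOP_PER_CYCLE = 4         # assumed double-precision FLOP per core per cycle
MEMORY_GB = 32             # 16 x 2 GB DDR2-667 DIMMs
MEM_BW_GB_PER_S = 20       # quoted per-node memory bandwidth

peak_gflops = CORES_PER_NODE * CLOCK_GHZ * FLOP_PER_CYCLE   # ~140.8 GFLOP/s
bytes_per_flop = MEM_BW_GB_PER_S / peak_gflops              # ~0.14 bytes/FLOP
gb_per_core = MEMORY_GB / CORES_PER_NODE                    # 2 GB per core

print(f"Peak per node : {peak_gflops:.1f} GFLOP/s")
print(f"Memory balance: {bytes_per_flop:.2f} bytes/FLOP")
print(f"Memory/core   : {gb_per_core:.1f} GB")
```

Under these assumptions, a single 144-node SU peaks at roughly 20 TFLOP/s.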
The results

Measuring success in HPC at LLNL is not so much a matter of up-times and peak performance, but rather how much faster the science and engineering get done. In the case of LLNL, immediately after installation of the first SUs, the NIF team requested time to refine the optics needed for the NIF lasers. There was a pressing purchasing decision that needed to be completed as soon as possible. The NIF team was given 1.5 months of time on one of the LLNL TLCC clusters to complete the calculations and make the right decisions. The NIF is slated to fire all 192 lasers in 2010. This historic event will be the culmination of many person-years of work backed by many compute-years of CPU time made possible by the Tri-Lab cluster acquisition.

According to Mark Seager, "The users are very excited about the TLCC Linux clusters. Appro and AMD have been great partners to work with. They have exceeded our expectations in terms of working with us to resolve problems as they came up and to get these systems fielded quickly." Seager adds, "It is amazing how much leverage we are getting out of the standard configuration at all three Labs. Because we work on the same problems when we are trying to field these clusters, each lab brings its own unique approach to the problem and we tend to solve these problems more quickly."

From a performance standpoint, the new system is not only a success but also an improvement over the previous clusters in use at Lawrence Livermore, which had only dual-core processors. With twice the number of cores and twice the memory capability, says Seager, "we're seeing a performance boost of anywhere from 1.3 to 1.8 times compared to the previous system. The users are very excited about getting this kind of capability."

One group of users from the LLNL National Ignition Facility (NIF) focuses on both high-energy-density physics research and new kinds of energy sources utilizing photon science. NIF requested three months of dedicated time on the clusters but was assigned half that because of the pressing demand for computing resources from other LLNL programs. "Still, they were able to do the research they needed on both the ignition research and optics," says Seager.

Standing behind this success is Appro International. Their ability to deliver high-quality hardware and work closely with component vendors proved to be a key success factor in this project. Appro maintained good lines of communication with all vendors and customers and adapted well to other aspects of the procurement. Some modifications to the standard SU design were required because not all sites had the same power and cooling capabilities; the solution was to reduce the density of systems in the SU while still maintaining the SU concept.

The summary

Overall, the Tri-Lab project has been a huge success, and its challenges have proved beneficial to everyone involved. Mark Seager summarizes his thoughts on the entire project: "Can we put in all the information we need to put in, and when we do, what kind of science comes out in the results? Ultimately, that's the most important attribute of the Tri-Labs challenge, and by that metric, the new clusters have been a big success." The Tri-Lab SU concept reduced costs across the board; reductions were noted in procurement costs and time frames as well as in ongoing maintenance costs. A key component of the success was the project management and problem solving that Appro International brought to the table. Based on the experience of the Tri-Lab project, the future success of grand-challenge computing depends on teamwork, planning, and amortizing costs across multiple government labs. As LLNL has shown, Linux HPC clusters are igniting our nation's future.

"To fulfill time-urgent national security missions, LLNL needs rapidly deployable high performance computing systems. This requires partners who understand HPC ecosystems as well as possess the strong technical and management skills necessary to work with a large number of component vendors. Fielding 21 Scalable Units (3,744 compute nodes with an aggregate performance of 438 TFLOPS) at three Tri-Lab sites (LLNL, LANL, SNL) was a daunting project challenge that, in this case, was very well executed."

Mark Seager
Head of Advanced Computing Systems
Lawrence Livermore National Laboratory