Berkeley Lab and NCSA team to build Geophysics Cluster
Members of NCSA’s expert cluster computing staff recently traveled to Lawrence Berkeley National Laboratory (LBNL) to help the Department of Energy center install a new high-performance computing cluster. They returned to Illinois with Warewulf, cluster management software developed at LBNL, which they plan to use to improve the performance of their own clusters.
The NCSA/LBNL team installed a Dell system – a Linux cluster of 256 Intel Xeon EM64T 3.6 GHz processors with 512 GB of memory, a Topspin InfiniBand fabric, and a Panasas parallel filesystem – in a six-day marathon at LBNL. Named Geophys, the system has a peak performance of 1.8 teraflops and will be applied to solving large-scale geophysical problems in computational seismology and electromagnetic imaging.
“It’s a smoker,” said Greg Newman, LBNL researcher in Earth Sciences. “The cluster is now solving large-scale geophysical imaging problems, providing critical information on subsurface geological processes with implications for energy and the environment, from characterizing hazardous waste sites to exploring for hydrocarbons and geothermal resources.”
In 2004, NCSA added a similar system dubbed T2 – a Linux cluster of 1,024 Intel Xeon EM64T processors with a peak performance of 7 teraflops. Earlier this year, Newman had the opportunity to run his geophysics code on NCSA’s T2 and was so impressed with the performance that he wanted to acquire a similar system for the Earth Sciences Division at LBNL.
“To do the type of multi-scale processes that we’re doing in geophysics, we need a machine that’s tailor-made and dedicated to running our code exclusively,” said Ernest Majer, principal investigator of the cluster and deputy director of LBNL’s Earth Sciences Division. “This new machine has already gone beyond our expectations.”
NCSA helped speed the process for the IT Division at LBNL, leveraging its strong relationship with Dell, an NCSA Private Sector Partner, to work out many of the implementation details so that LBNL could essentially duplicate the T2 system on a smaller scale and in less time.
“This shows the value that NCSA’s Private Sector Program creates by bringing industry, academia, and national research centers together,” said NCSA Director Thom Dunning. “It also shows how supercomputing centers leverage the advancements that each of them have made.”
In addition to lending their expertise and assistance to LBNL, the NCSA members of the team – Brian Kucic, a program manager for NCSA’s Private Sector Program, and Jim Long, a systems engineer in the High-Performance Computing Systems Group – were interested in learning more about Warewulf, a cluster implementation toolkit developed by LBNL’s Greg Kurtzer. Warewulf features a lightweight RAM-disk-based filesystem for the compute nodes that simplifies cluster installation and maintenance. It is the standard cluster implementation tool for the IT Division’s Scientific Cluster Support program at LBNL and is in use at several other organizations.
NCSA had been running Warewulf on a test cluster, but the team was eager to get a first-hand look at how it worked on a large production cluster. The results were exceptional: NCSA saw a significant increase in performance with Warewulf on LBNL’s cluster. A final Linpack Benchmark run on the cluster produced a result of 1.5 teraflops, indicating an efficiency of 83.43 percent, which would place the system at number 324 on the current TOP500 list. NCSA is now planning to use Warewulf to administer its T2 cluster and is hoping to see a similar improvement in performance.
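For readers who want to check the arithmetic behind the peak and efficiency figures quoted above, the following is a minimal Python sketch. It assumes that each Xeon EM64T processor completes two double-precision floating-point operations per clock cycle, which the article does not state, and that T2’s processors also run at 3.6 GHz, which the article does not confirm. The function names are hypothetical and chosen only for illustration.

```python
# Rough check of the cluster figures quoted in the article.
# ASSUMPTION (not from the article): each processor completes
# 2 double-precision floating-point operations per clock cycle.
FLOPS_PER_CYCLE = 2


def peak_teraflops(processors: int, clock_ghz: float,
                   flops_per_cycle: int = FLOPS_PER_CYCLE) -> float:
    """Theoretical peak: processors x clock rate x flops per cycle."""
    gigaflops = processors * clock_ghz * flops_per_cycle
    return gigaflops / 1000.0  # convert gigaflops to teraflops


def linpack_efficiency(measured_tflops: float, peak_tflops: float) -> float:
    """Fraction of theoretical peak achieved by a Linpack run."""
    return measured_tflops / peak_tflops


# Geophys: 256 processors at 3.6 GHz (figures from the article).
geophys_peak = peak_teraflops(256, 3.6)

# T2: 1,024 processors; the 3.6 GHz clock is an ASSUMPTION here.
t2_peak = peak_teraflops(1024, 3.6)

# Efficiency computed from the rounded figures the article quotes.
rounded_efficiency = linpack_efficiency(1.5, 1.8)

# Linpack result implied by the quoted 83.43 percent efficiency
# when applied to the unrounded peak.
implied_linpack = 0.8343 * geophys_peak

print(f"Geophys peak: {geophys_peak:.2f} teraflops")
print(f"T2 peak: {t2_peak:.2f} teraflops")
print(f"Efficiency from rounded figures: {rounded_efficiency:.2%}")
print(f"Linpack result implied by 83.43%: {implied_linpack:.2f} teraflops")
```

Under these assumptions the computed peaks are consistent with the rounded 1.8 and 7 teraflops in the article. The rounded figures alone give an efficiency slightly different from 83.43 percent, which suggests the published percentage was derived from unrounded measurements.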
“It’s been exciting working through all of the details in order to bring the project to a successful completion,” said Gary Jung, who leads LBNL’s Scientific Cluster Support program. “We put a terrific system together, in terms of price, performance, and time to installation, by optimally leveraging a lot of expert resources.”