ACADEMIA
Unleashing a Realistic Magnitude 7.7 Virtual Earthquake
SDSC Enables PetaShake Simulation on 40,960-Processor IBM Blue Gene Watson System
By Paul Tooby, SDSC Senior Science Writer

Progress usually happens in small increments. Not this time. Computational scientists at the San Diego Supercomputer Center (SDSC) at UC San Diego have collaborated with Southern California Earthquake Center (SCEC) scientists to perform a simulation of a massive magnitude 7.7 earthquake on the southern part of California's San Andreas Fault that is 20 times larger than their previous largest. The run takes “virtual earthquakes” to a whole new level of realism.
Following extensive development and testing work on SDSC's Blue Gene Data machine, the computation was run on 40,960 processors of the IBM Blue Gene Watson system. The code achieved a remarkable 96 percent parallel efficiency relative to its performance on 4,096 processors, paving the way for far more realistic earthquake simulations that can give scientists and engineers a better understanding of the large earthquakes that threaten California.

Successor to the pioneering TeraShake simulations run at SDSC on DataStar in collaboration with scientists from the SCEC Community Modeling Environment (SCEC/CME), the ambitious PetaShake simulations target the next generation of petascale supercomputers, which will enable scientists to compute at the previously unimaginable speed of a petaflop, or 10^15 calculations per second, hundreds of thousands of times faster than today's personal computers.

“This is a significant milestone in enabling improved ground motion prediction because it achieved 18 times the throughput of the previous TeraShake simulations,” said Kim Olsen, Associate Professor of Geological Sciences at San Diego State University (SDSU), who developed the sophisticated Anelastic Wave Model (AWM), the fourth-order finite difference code used in the simulations. “Petascale computers will enable earthquake simulations to enter a new level of accuracy, capturing greater detail in space and time.”

SDSC computational scientist Yifeng Cui, who led the scaling effort, added, “This achievement is especially rewarding because until recently we weren't even sure we would be able to scale up the code this much; two years ago the code would only run on 240 processors on SDSC's DataStar system.”

For months the TeraShake and PetaShake collaborators worked to prepare for the larger run. SDSC staff, including Cui, Yuanfang Hu, and Jing Zhu, improved the code, testing it on SDSC's DataStar and Blue Gene Data systems, finding ingenious ways to scale it to 2,048 processors and beyond, and implementing a number of changes and optimizations, in particular improved I/O, to make the run possible.

When they realized they could indeed make the code run on larger machines, the researchers applied for time on the largest unclassified supercomputer in the world, IBM's Blue Gene Watson system, with 40,960 processors and a peak speed of 114 teraflops (100 teraflops is 100 trillion calculations per second, roughly the combined power of about 20,000 personal computers). The run was part of the Blue Gene Consortium Days program, in which IBM offers researchers the opportunity to run applications on the 20-rack Blue Gene Watson system, creating the potential to do breakthrough science and improve scaling, performance, and software.

SDSC has also helped other users achieve successful large runs in the same program. In an SDSC Strategic Applications Collaboration (SAC), SDSC computational scientist Ross Walker helped Professor David Baker of the University of Washington improve his Rosetta protein structure prediction code to complete a Critical Assessment of Structure Prediction (CASP) run in three hours on the Watson machine, a computation that previously took weeks on Baker's own resources. Through another SDSC SAC and development work on SDSC's Blue Gene machine, computational scientist P.K. Yeung of Georgia Tech also achieved unprecedented scaling of his code for Direct Numerical Simulation (DNS) of turbulence on the large Blue Gene system.
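As a rough illustration of the 96 percent parallel efficiency figure quoted above, the sketch below shows one common way such a number is computed: the speedup actually achieved when moving from 4,096 to 40,960 processors, divided by the ideal ten-fold speedup. The runtimes in the example are made-up placeholders, not measurements from the PetaShake runs, and the article does not specify whether the figure refers to strong or weak scaling.

```python
# Hypothetical illustration of a relative parallel-efficiency calculation.
# The runtimes below are placeholders, NOT measurements from the PetaShake runs.

def relative_parallel_efficiency(t_base, p_base, t_large, p_large):
    """Efficiency of a large run relative to a baseline run:
    achieved speedup divided by the ideal speedup from the extra processors."""
    speedup = t_base / t_large
    ideal_speedup = p_large / p_base
    return speedup / ideal_speedup

# Placeholder example: a job taking 1000 time units on 4,096 processors and
# 104.2 time units on 40,960 processors shows roughly 96% efficiency.
efficiency = relative_parallel_efficiency(t_base=1000.0, p_base=4096,
                                          t_large=104.2, p_large=40960)
print(f"relative parallel efficiency: {efficiency:.0%}")  # ~96%
```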
Now that the PetaShake researchers have achieved a successful benchmarking run on Blue Gene Watson, they look forward to extending this work in further runs that will explore scientific questions in unprecedented detail on the large machine. Scientists believe that California is overdue for a large earthquake on the southern San Andreas Fault, and the motivation for the research is to better understand the basic science of major earthquakes and to apply this knowledge to prepare for them through such measures as improved seismic hazard estimates, better building codes in high-risk areas, and safer structural designs, potentially saving lives and property.

The initial TeraShake simulations involved a collaboration of SCEC scientists from eight institutions with more than 20 SDSC staff, and were run on SDSC's DataStar. These unprecedented data-intensive simulations, producing more than 40 terabytes of data, revealed new insights into large-scale patterns of earthquake ground motion, including where the most intense impacts may occur in Southern California's sediment-filled basins during a magnitude 7.7 southern San Andreas Fault earthquake, and how basins flanked by mountains can form a “waveguide” that channels unexpectedly large amounts of earthquake wave energy into the Los Angeles basin.

But the best previous TeraShake simulations could reach a frequency of just one-half Hertz, modeling only the lowest part of the frequency range of the ground motion. While these earlier simulations provided information engineers can use to explore earthquake impacts on larger multi-story structures, say those more than 20 floors high, the much larger number of smaller structures remained “invisible” because the simulations could not capture the higher frequencies that interact with smaller multi-story buildings. By reducing the computational grid spacing from 200 to 100 meters, the PetaShake simulations will capture frequencies up to one Hertz, providing information that can model earthquake impacts on this larger number of smaller multi-story structures. In addition to reaching higher frequencies, the researchers more than doubled the physical volume of the simulation, to 800 x 400 x 100 kilometers.

These improvements in realism are very computationally intensive, however. Each factor-of-two improvement in frequency resolution requires halving the grid spacing, which multiplies the number of spatial grid points by eight (a factor of two in each dimension) and doubles the number of timesteps, for a sixteen-fold increase in computation, strongly driving the need for the next generation of petascale computing resources.

“The large-scale ‘virtual earthquake' simulations give us a more detailed view of how the earthquake waves propagate across a vast area of California,” said Olsen. “This is invaluable because we can ‘connect the dots' and understand what's going on all over – and we just can't get this information any other way.”
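The back-of-the-envelope sketch below reproduces the cost arithmetic described above for halving the grid spacing from 200 to 100 meters. The sixteen-fold factor follows directly from the text; the additional factor for the enlarged simulation volume is an assumption for illustration, based only on the statement that the volume more than doubled.

```python
# Back-of-the-envelope sketch of the cost scaling described in the article.
# Halving the grid spacing gives 2**3 = 8x more grid points in three dimensions
# and requires roughly 2x more timesteps, for a 16x increase in computation.

refinement = 200 / 100           # grid spacing reduced from 200 m to 100 m
grid_factor = refinement ** 3    # 8x more grid points in 3-D
timestep_factor = refinement     # ~2x more timesteps at the finer resolution
resolution_cost = grid_factor * timestep_factor
print(f"cost increase from finer resolution alone: {resolution_cost:.0f}x")  # 16x

# The simulated volume also grew to roughly 800 x 400 x 100 km; assuming the
# previous domain was about half that size (an illustrative assumption, not a
# figure quoted by the researchers), total cost grows by roughly another 2x.
volume_factor = 2
print(f"combined cost increase: {resolution_cost * volume_factor:.0f}x")      # ~32x
```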