On the Pulse of Grid Computing

By J. William Bell, Public Affairs Director, NCSA -- In 1998, George Karniadakis and his team at Brown University were just getting their blood flowing. They completed a 3-D simulation of movement through a single arterial bifurcation. They were happy just to see circulation in that one small branch. It gave them insight into the impact arterial bypass grafts have on a patient.

At SC'05, the international supercomputing conference held in November 2005, they completed a simulation vastly larger than that work from just a few years ago. In fact, it was many times larger than any simulation of blood flow in the human body ever completed. Their simulation included 55 arteries and 27 bifurcations, accounting for every artery in the human body larger than about two to three millimeters in diameter. Seventeen bifurcations were modeled in 3-D, and another 10 were modeled in 1-D. Until then, no one had attempted to model more than two arteries at the same time.

"Our simulation included 100 million grid points," explained Karniadakis. "Before, the state-of-the-art was 5 million grid points. By next year, we want to be looking at 1 billion. This is beyond any single computer. The collective power of the TeraGrid is necessary."

The record-crushing calculation took place in concert on TeraGrid machines at NCSA, the San Diego Supercomputer Center, the Pittsburgh Supercomputing Center, and the Texas Advanced Computing Center, as well as a computer from the United Kingdom's Computer Services for Academic Research program. Visualizations were completed at The University of Chicago/Argonne National Laboratory.

The team hopes that the simulation will serve as the heart of an environment in which scientists and engineers can complete a wide array of work easily. "This is the beginning of a biomechanical environment. A platform for bioengineers to couple their particular simulations to the entire tree, not just do them in isolation. We imagine they'll be able to embed their carotid artery simulation into the tree and see the global interaction," said Karniadakis, who is working with researchers at Brown, Argonne, Ben-Gurion University in Israel, and Imperial College in London. A proposal for a front-end interface to such a system is planned for the coming year.

They also see the arterial tree as an important part of collaborations like the Digital Human Project, a fledgling effort led by the Federation of American Scientists to construct an ongoing simulation of the entire body.

A Global Problem

Cardiovascular disease is responsible for almost half of all deaths in the Western world, and the "formation of arterial disease [like atherosclerosis] is strongly correlated to blood flow pattern," according to Steve Dong, an assistant professor of applied mathematics who works on the project at Brown. "Disease is observed to form preferentially in separating or recirculating regions [where blood runs backward against the dominant flow]." In other words, arterial build-up tends to occur at the very bifurcations the team's simulations focus on.

Were the situation that simple, 3-D simulations of the bifurcations themselves might be a sufficient way to understand the development of heart disease. Blood flow is a complicated thing, though. Flows influence one another in different regions across the body. A change in the flow at one bifurcation -- either because of plaque build-up or because of a bypass graft to get around that build-up -- changes the flow at remote locations as well.
That, in turn, can lead to undesirable conditions at other sites and, eventually, another bout with heart disease or a heart attack.

1-D Feeds 3-D

To address conditions throughout the body and to chart their interactions, the team splits the model in two. A quasi-1-D simulation tracks blood flow through the straight sections of the 55 modeled arteries (MRI scans are used to develop computational representations of the arteries; the team builds these meshes of grid points in-house). The 1-D model calculates flow only along the streamwise axis. It ignores the flow's other dimensions because the complicated eddies that mark turbulence within the flow are less prominent in these sections and are thought to have less influence on cardiovascular problems. The typical mathematical form of such a 1-D model is sketched at the end of this article.

This model, which was run at NCSA during the SC'05 demonstration, provides information on features like flow rate and pressure to a set of 3-D models. Those models, which ran at the other sites and at NCSA, zoom in on the bifurcations. They give an incredibly detailed view of what the flow looks like, and that detail can be used to understand the implications flow has for heart disease. A description of a similar, earlier TeraGrid computation by the team was published in the September 2005 issue of Computing in Science & Engineering.

Parallel at Two Levels

Such a scheme -- with data passing from a model running in Urbana, Illinois, to several other models across the country and across the ocean -- presents a special challenge. "This is parallel at two levels," said Karniadakis. Dong continued: "This is large-scale coupled communication -- cross-site and intrasite. The machines are communicating with each other and, inside, each machine's processors are communicating."

Intrasite communication relies on an MPI implementation of NekTar, a fluid dynamics code that Karniadakis and his collaborators developed over many years on computers at NCSA and other systems supported by the National Science Foundation. "You really get the platinum service at NCSA," said Karniadakis. "The ability to talk to leaders like John Towns and Rob Pennington and to get their support is crucial; also the stability of the machines and tools at NCSA. We always debug our codes at NCSA because of the stability that we can count on."

"When there are questions, NCSA gives us answers," said Dong.

Cross-site communication is a more recent challenge, born of the distributed computing power that the TeraGrid offers. MPICH-G2 and the help of its developer, Nick Karonis of Northern Illinois University and Argonne National Laboratory, meet that challenge. At every time step, the 1-D simulation computes and provides the basic pressure and flow data. MPICH-G2 grabs the data from memory and ships it to the distributed sites. Then NekTar takes over, running the in-depth 3-D calculations.

To improve performance, the team uses a strategy that anticipates the needs of the processors doing the calculations. "We overlap intersite communication with intrasite communication, and within a site overlap communications with computations," said Dong. Before the data for the next time step is required, it has already been called for. "When we really need the data, it's waiting. No wasting time," he said.
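The overlap strategy Dong describes maps naturally onto non-blocking message passing. The following is a minimal sketch of the idea in C++ with plain MPI (not the team's NekTar or MPICH-G2 code), in which one rank stands in for the 1-D model and ships a small pressure/flow-rate payload to ranks standing in for the 3-D bifurcation solvers, which keep the next step's receive posted while they work on the current step. The rank roles, payload layout, and step counts are illustrative assumptions.

    // Sketch: per-time-step coupling of a "1-D" hub rank with "3-D" worker
    // ranks, overlapping communication with computation via non-blocking MPI.
    #include <mpi.h>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (size < 2) {
            if (rank == 0) std::fprintf(stderr, "Run with at least 2 ranks.\n");
            MPI_Finalize();
            return 1;
        }

        const int nsteps = 100;  // illustrative number of coupled time steps
        const int ndata  = 2;    // per-interface payload: pressure, flow rate

        if (rank == 0) {
            // "1-D" hub: compute interface data each step and send it to all
            // "3-D" ranks without blocking on the network.
            std::vector<double> payload(ndata);
            std::vector<MPI_Request> reqs(size - 1);
            for (int step = 0; step < nsteps; ++step) {
                // Stand-in for the 1-D solve: a fabricated pulsatile signal.
                payload[0] = 100.0 + 20.0 * std::sin(0.1 * step);  // "pressure"
                payload[1] = 5.0 + std::cos(0.1 * step);           // "flow rate"
                for (int dest = 1; dest < size; ++dest)
                    MPI_Isend(payload.data(), ndata, MPI_DOUBLE, dest, step,
                              MPI_COMM_WORLD, &reqs[dest - 1]);
                // The hub could advance its own solution here while the sends
                // drain; wait before reusing the buffer for the next step.
                MPI_Waitall(size - 1, reqs.data(), MPI_STATUSES_IGNORE);
            }
        } else {
            // "3-D" rank: keep a receive posted for the upcoming step's data
            // while computing the current step, so the data is waiting when
            // it is actually needed.
            std::vector<double> next(ndata), current(ndata);
            MPI_Request req;
            MPI_Irecv(next.data(), ndata, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);
            for (int step = 0; step < nsteps; ++step) {
                MPI_Wait(&req, MPI_STATUS_IGNORE);  // usually already complete
                current = next;
                if (step + 1 < nsteps)              // prefetch the next step
                    MPI_Irecv(next.data(), ndata, MPI_DOUBLE, 0, step + 1,
                              MPI_COMM_WORLD, &req);
                // Stand-in for the detailed 3-D bifurcation solve driven by
                // the boundary pressure and flow supplied by the 1-D model.
                double work = current[0] * current[1];
                (void)work;
            }
            if (rank == 1)
                std::printf("A 3-D rank finished %d coupled steps\n", nsteps);
        }

        MPI_Finalize();
        return 0;
    }

Compiled with an MPI wrapper such as mpicxx and launched across several nodes, the receive for the next step is in flight while the current step is being computed, which is the "no wasting time" behavior Dong describes; MPICH-G2 extends the same MPI programming model across TeraGrid sites.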
"Arden Bement [director of the National Science Foundation] asked us recently, 'Why not use Blue Gene [or another single, massive machine instead of relying on distributed computing through infrastructure like the TeraGrid]?'" said Karniadakis. "But the TeraGrid is potentially unlimited in power. We will always be looking to increase the sophistication of our models, and coupling computers is currently the best way to do that."
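For readers curious about the quasi-1-D model described in the "1-D Feeds 3-D" section above, 1-D arterial flow models of this kind are typically written in the area-averaged form below. The article does not give the team's exact formulation, so this should be read as representative background rather than as their implementation (in LaTeX notation):

    \[
    \frac{\partial A}{\partial t} + \frac{\partial (A U)}{\partial x} = 0,
    \qquad
    \frac{\partial U}{\partial t} + U\,\frac{\partial U}{\partial x}
      + \frac{1}{\rho}\,\frac{\partial p}{\partial x} = \frac{f}{\rho A}
    \]

In this notation, A(x,t) is the cross-sectional area of the artery, U(x,t) the streamwise velocity, p(x,t) the pressure, rho the blood density, and f a frictional force per unit length. The system is closed with an elastic tube law relating pressure to area, and the quantities exchanged with the 3-D models are the pressure and the flow rate Q = AU, matching the coupling described above.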