UCSD Researcher Speaks on Recent Cellular Mapping Breakthrough

By Steve Fisher, Editor In Chief -- UCSD/SDSC and the Howard Hughes Medical Institute recently announced that a team of researchers in San Diego has mapped key cellular structures using a new method that harnesses the power of supercomputing, specifically SDSC's Blue Horizon. Supercomputing Online went right to the source to learn more. The following is an interview with Nathan Baker, a UCSD post-doctoral researcher and one of the principal investigators.

Supercomputing: First off, congratulations on your recent breakthrough. Could you provide a little background on the Poisson-Boltzmann equation and your new method of solving it?

BAKER: Electrostatics play a vital role in determining the specificity, rate, and strength of interactions in a variety of biomolecular processes. Accurately modeling the contributions of solvent, counterions, and protein charges to the electrostatic field can be very difficult, and it typically acts as the rate-limiting step in a variety of numerical simulations. Rather than treating solvent and counterion effects explicitly in atomic detail, continuum methods such as the Poisson-Boltzmann equation (PBE) are often used to represent the effects of solvation on the electrostatic properties of the biomolecule. Despite this simplification, current methods for calculating electrostatic properties from the PBE still require significant computational effort and typically do not scale well with increasing problem size.

Our recent paper describes new parallel methods that overcome these scaling difficulties. These methods build on recent work by Randy Bank and Mike Holst (UCSD Math), who showed that a very large problem, like the PBE for a million-atom structure, can be broken into many smaller problems. Previous parallel methods required these smaller problems to communicate with one another during the solution. Our alternative is to take care of communication ahead of time by performing a much coarser global solution, which is then used to establish boundary conditions for each of the smaller problems. By calculating this initial coarse solution, and by introducing a bit of redundancy through overlapping subdomains for each of the smaller calculations, we can perform parallel PBE calculations with no communication -- i.e., in a trivially parallel fashion.

Supercomputing: The recent release mentions that you were able to go from modeling fewer than 50,000 atoms to modeling over one million. What are the research benefits of this massive increase in modeling power? Is there any particular research area that will benefit the most from this breakthrough? If not, can you provide an example of an area of research that will benefit greatly from it?

BAKER: Because electrostatic analyses of individual proteins and nucleic acids have proved important both in explaining the activity of these molecules and in guiding engineering and synthetic efforts, it seems reasonable to expect the new methods to be similarly important at larger scales. For example, in work recently submitted for publication, we have been able to establish excellent agreement between calculated and experimental binding affinities of aminoglycoside antibiotics (neomycin, streptomycin, etc.) with a ribosome target. A key feature of the new methods, though, is that they scale very well with the number of processors in a parallel computer, owing to the low communication costs.
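For reference, the equation Baker describes -- a continuum treatment of solvent and mobile counterions around a fixed set of atomic charges -- is commonly written in dimensionless form as follows. (The notation below is a standard textbook convention, not necessarily that of the team's paper.)

$$ -\nabla \cdot \big( \epsilon(\mathbf{x}) \, \nabla \phi(\mathbf{x}) \big) + \bar{\kappa}^2(\mathbf{x}) \, \sinh \phi(\mathbf{x}) = \frac{4 \pi e^2}{k_B T} \sum_{i=1}^{N} z_i \, \delta(\mathbf{x} - \mathbf{x}_i) $$

Here \phi is the electrostatic potential in units of k_B T / e, \epsilon(\mathbf{x}) is the position-dependent dielectric coefficient, the \bar{\kappa}^2 term models screening by mobile counterions, and the right-hand side carries the N fixed partial charges z_i of the biomolecule.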
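To make the decomposition Baker outlines concrete, here is a minimal sketch of the idea on a one-dimensional model problem: a cheap coarse global solve supplies boundary values for overlapping subdomains, each of which is then refined with no communication. This sketch is ours, not the team's code; it uses the linearized equation, dense linear algebra in place of multigrid, and a serial loop standing in for the parallel processors.

# Minimal 1-D illustration of the coarse-solve/overlapping-subdomain idea.
# Model problem: linearized PBE, -u'' + kappa^2 u = f, u(0) = u(1) = 0,
# discretized with second-order finite differences. Names are illustrative.
import numpy as np

KAPPA = 2.0

def solve_helmholtz(f, u_left, u_right, h):
    """Direct solve of -u'' + KAPPA^2 u = f on len(f) interior points,
    given Dirichlet boundary values u_left and u_right."""
    n = len(f)
    A = (np.diag(np.full(n, 2.0 / h**2 + KAPPA**2))
         + np.diag(np.full(n - 1, -1.0 / h**2), 1)
         + np.diag(np.full(n - 1, -1.0 / h**2), -1))
    rhs = f.copy()
    rhs[0] += u_left / h**2      # known boundary values move to the RHS
    rhs[-1] += u_right / h**2
    return np.linalg.solve(A, rhs)

# Fine and coarse grids on [0, 1], and a smooth "charge" source term.
n_fine, n_coarse = 513, 33
x_fine = np.linspace(0.0, 1.0, n_fine)
x_coarse = np.linspace(0.0, 1.0, n_coarse)
f = lambda x: np.exp(-100.0 * (x - 0.5) ** 2)

# Step 1: cheap global solve on the coarse grid (the only global step).
h_c = x_coarse[1] - x_coarse[0]
u_coarse = np.zeros(n_coarse)
u_coarse[1:-1] = solve_helmholtz(f(x_coarse[1:-1]), 0.0, 0.0, h_c)

# Step 2: split the fine grid into overlapping subdomains. The coarse
# solution, interpolated to each subdomain edge, supplies the boundary
# values, so the refinements below are independent of one another.
n_sub, overlap = 4, 24          # overlap measured in fine-grid cells
h_f = x_fine[1] - x_fine[0]
chunk = n_fine // n_sub
u_fine = np.zeros(n_fine)
for k in range(n_sub):          # embarrassingly parallel in principle
    lo = max(0, k * chunk - overlap)
    hi = min(n_fine - 1, (k + 1) * chunk + overlap)
    bc_lo = np.interp(x_fine[lo], x_coarse, u_coarse)
    bc_hi = np.interp(x_fine[hi], x_coarse, u_coarse)
    u_sub = solve_helmholtz(f(x_fine[lo + 1:hi]), bc_lo, bc_hi, h_f)
    # Step 3: keep only each subdomain's own slice; the redundant overlap
    # keeps the approximate boundaries away from the region we trust.
    keep_lo, keep_hi = k * chunk, min(n_fine - 1, (k + 1) * chunk)
    full = np.concatenate(([bc_lo], u_sub, [bc_hi]))
    u_fine[keep_lo:keep_hi + 1] = full[keep_lo - lo:keep_hi - lo + 1]

# Reference: one monolithic fine-grid solve, for comparison.
u_ref = np.zeros(n_fine)
u_ref[1:-1] = solve_helmholtz(f(x_fine[1:-1]), 0.0, 0.0, h_f)
print("max deviation from global fine solve:", np.abs(u_fine - u_ref).max())

The deviation reported at the end reflects only the accuracy of the coarse boundary data; refining the coarse grid or widening the overlap shrinks it, which is the trade-off the method exploits.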
BAKER (continued): In the Distributed Terascale Facility that the NSF just announced (a partnership of SDSC/UCSD, NCSA/UIUC, Caltech, and Argonne), calculations involving tens of millions of atoms will likely be possible. Together with fast methods that have been developed for calculating forces as well as electrostatic energies, these advances will open the way to simulations of the dynamic properties of such cellular components as DNA replication and transcription complexes. In addition to making the code freely available, we're working with folks at NBCR (http://nbcr.sdsc.edu/) to turn it into a portal-based application (like BLAST, etc.), which should further increase the software's accessibility.

Supercomputing: Would you explain to the readers the role SDSC's Blue Horizon played in your work, citing a specific example or two?

BAKER: This work would not have been possible without the NPACI Blue Horizon supercomputer and the support of SDSC personnel. Blue Horizon provided the resources for the roughly 700-processor microtubule calculation -- resources that would have been difficult to come by on other academic supercomputers. These calculations required a very large amount of memory and CPU cycles, albeit in a distributed fashion, and the massively parallel nature of Blue Horizon filled both of those needs.

Supercomputing: Please tell us about the algorithms and codes the team wrote to solve the equations that describe the electrostatic contributions of individual atoms within a system. Is there any other software, proprietary or otherwise, you'd like to mention as an asset in this work?

BAKER: For this application, APBS (the software designed to solve the PBE in a parallel fashion) relies heavily on MALOC, a hardware abstraction library developed by Mike Holst's research group (UCSD Math), and on PMG, a fast multigrid partial differential equation solver written by Mike Holst. Communication is currently performed with MPI, and the visualization has been done with OpenDX and QMView. Most of the code development was done on Linux platforms using GNU tools.

----------
Supercomputing Online wishes to thank Nathan Baker for his time and insights. It would also like to thank SDSC's Cassie Ferguson for her assistance.
----------
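Addendum: for readers curious what "no communication during solution" looks like on a massively parallel machine like Blue Horizon, the schematic below maps one overlapping subdomain to each MPI rank. It is an illustrative toy in Python with mpi4py, under our own naming; it is not APBS itself, and the local "solve" is a trivial stand-in.

# Run with, e.g.:  mpirun -n 4 python subdomain_sketch.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# The precomputed coarse global solution is shared once, up front; after
# this broadcast, no rank needs to talk to any other during the solve.
coarse = np.linspace(0.0, 1.0, 33) ** 2 if rank == 0 else None
coarse = comm.bcast(coarse, root=0)

# Each rank derives its own overlapping subdomain from its rank id alone
# (here: an equal slice of [0, 1] padded by 5% of overlap on each side).
lo = max(0.0, rank / size - 0.05)
hi = min(1.0, (rank + 1) / size + 0.05)

# Stand-in for the expensive local refinement. No messages are exchanged
# in this step, which is why the approach scales to hundreds of processors.
x = np.linspace(lo, hi, 1024)
local_potential = np.interp(x, np.linspace(0.0, 1.0, coarse.size), coarse)

# Results are collected (or simply written to per-rank files) only at the
# very end, so communication cost stays negligible.
gathered = comm.gather((lo, hi, local_potential), root=0)
if rank == 0:
    print(f"collected {len(gathered)} independent subdomain solutions")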