SDSC Researchers Accurately Predict Protein Docking

Computational biologist Lynn Ten Eyck and colleagues at the San Diego Supercomputer Center (SDSC) at UC San Diego have used software known as DOT to produce accurate predictions of protein-protein interaction as part of the Critical Assessment of PRedicted Interactions (CAPRI), an ongoing evaluation of docking algorithms. CAPRI is a community-organized experiment hosted at the European Bioinformatics Institute. The SDSC group's entries were among the most accurate submitted in the seventh round of CAPRI.

Accurately Predicting Protein Docking - This protein-protein complex was judged best in round seven of the CAPRI experiment. The structure shown is a blind test prediction using the DOT program developed at SDSC. The small dots show the degree of surface complementarity at the interface of the proteins as measured by the Fast Atomic Density Evaluator (FADE), another SDSC software product. Warmer colors are better. L. Ten Eyck, M. Hotchko, D. Law, E. Thompson, SDSC; M. Pique, V. Roberts, TSRI. Graphics rendered by PyMOL (DeLano Scientific, 2002).

"The strength of DOT is that we approach the problem in stages," said Ten Eyck, Associate Director for Science Research and Development at SDSC. "First, DOT finds fast, approximate answers using a scalable algorithm that allows us to take advantage of modern parallel computing to carry out a comprehensive search."

DOT's speed also comes from using an algorithm that computes estimated interaction energy for all possible relative positions in a single step for each orientation. After using DOT to quickly screen the billions of possibilities to find a small number of promising cases, more computationally demanding methods and visual inspection can then be applied to find the correct protein docking configuration.
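The idea of scoring every relative position at once for a fixed orientation can be illustrated with a small sketch. The article does not name the technique DOT uses, so the example below assumes a common approach for this class of problem: both molecules are sampled onto a 3D grid and a fast Fourier transform (FFT) correlation evaluates all translations together. The function names, the toy grids, and the "higher score is better" convention are hypothetical and are not taken from DOT itself.

```python
import numpy as np


def score_all_translations(receptor, ligand):
    """Score every cyclic translation of `ligand` against `receptor` at once.

    Returns an array where scores[t] = sum over x of receptor[x] * ligand[x - t].
    Assumption: both inputs are real-valued 3D grids of the same shape.
    """
    # Correlation theorem: a product in Fourier space replaces the
    # explicit loop over every possible translation.
    spectrum = np.fft.fftn(receptor) * np.conj(np.fft.fftn(ligand))
    return np.fft.ifftn(spectrum).real


def best_translations(scores, k=3):
    """Return the k highest-scoring translations as (i, j, k) index tuples."""
    flat = np.argsort(scores, axis=None)[::-1][:k]
    return [tuple(int(i) for i in np.unravel_index(f, scores.shape)) for f in flat]


# Toy data (hypothetical): a cube-shaped "pocket" on the receptor grid and a
# matching cube-shaped ligand placed at the grid origin.
n = 16
receptor = np.zeros((n, n, n))
receptor[5:8, 9:12, 2:5] = 1.0
ligand = np.zeros((n, n, n))
ligand[0:3, 0:3, 0:3] = 1.0

scores = score_all_translations(receptor, ligand)
top = best_translations(scores, k=1)[0]
print("best translation:", top)
```

In a staged search of the kind Ten Eyck describes, a loop over ligand orientations would wrap a call like this, and only the few best-scoring placements would be passed on to slower, more detailed methods.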

The motivation for the NIH-funded research is the ongoing search by biologists to better predict the interactions among proteins, the molecules of life. Problems that depend on such predictions include examining cellular metabolism, finding the most stable relative orientations between two proteins, studying protein subunit aggregation, performing computer-aided drug design, and solving problems of cellular signaling and expression. The benefits of this research include both greater scientific understanding and advances in efficient drug discovery.

"There is an amazing diversity of conditions in which proteins are found, from cell membranes to being loose in the blood, or buried in cells with little free water," said Ten Eyck. "So there is no 'one-size-fits-all' solution for protein interactions."

Using DOT as an initial screening method allows the researchers to efficiently zero in on a solution for each different case. In the four-year-old CAPRI evaluation, which encourages the development of improved protein docking algorithms, organizers solicit from crystallographers protein structures that have been solved but not published. Then they invite scientists to use their best docking algorithms to predict how the protein pairs will fit together. The problem is presented as two isolated protein molecules, so that if there is conformational change, that is, the molecules change shape as they dock, the researchers have to account for this in their solution. Participants can submit up to 10 predictions of how the proteins will interact in the double-blind evaluation.

While modern, sophisticated methods that allow flexibility in binding would be expected to perform best, the SDSC DOT entries have done surprisingly well over the course of the CAPRI experiment, getting acceptable predictions for around one-third of the targets overall. This is considered quite good performance, according to Ten Eyck, especially in light of DOT's relatively modest resource requirements.

In a remarkable story of software longevity, the DOT software was originally developed in 1994 by Ten Eyck and then-graduate student Jeffrey Mandell, now at The Scripps Research Institute (TSRI), as well as Victoria Roberts and Mike Pique of TSRI. The software was christened DOT for "Daughter of Turnip" since it was based on the earlier docking program, TURNIP, developed by Roberts.

Although DOT then moved to the back burner, it is finding new use in the CAPRI evaluation. While the DOT algorithm is efficient in comparison with other docking algorithms, this class of problems is still computationally intensive. Running on 64 processors of Blue Gene, for example, DOT can produce an answer in about an hour; on several Linux workstations, the same run takes about a day.

The SDSC DOT software is open source and available on the CCMS website, and Ten Eyck notes that an updated version will be released soon. For the future, Ten Eyck's group is working to further improve their predictions. One way they are doing this is by collecting information on the location where binding occurs and then using this with methods that take protein flexibility into account.
