Interactive Supercomputing Helps NCI Accelerate Genetic Research

Parallel Computing Software Speeds Cancer Research 200-Fold

Interactive Supercomputing Inc.'s (ISC's) Star-P software is being used by scientists at the National Cancer Institute's (NCI's) Pediatric Oncology Branch to mine vast public databases of genomic information for potential new medical discoveries. With Star-P, scientists can now tap powerful high-performance computers (HPCs) to dramatically accelerate genomic profiling, which could yield new insights into the genetic risk factors for cancer, foster new procedures for testing tumors, or identify genetic changes that may result from treatments and therapies.

Using a specialized software application called CORR4DB, researchers correlate one genomic array against a database of 100,000 probes (pieces of a gene) in search of specific DNA components or attributes. The correlations help them understand the relationships among genes, and their conclusions can provide the basis for additional genomic research.

CORR4DB was developed in MATLAB, a highly productive desktop tool favored by scientists. But sample sizes were growing into the tens of thousands of genomic arrays, overwhelming the capabilities of the researchers' desktop MATLAB environment and hindering interactivity with the data. Scientists knew that larger correlations could be completed faster if the calculation could be parallelized to run on an HPC.

"Running a single correlation on a desktop computer could take a week or more to complete," said Bill Strecker, chief technical officer at ISC. "An explosion in the amount of genomic data available to researchers has made their work increasingly difficult. Their tasks require more computing power, more system memory, and, all too often, more time. And in the race to understand how genetics and cancer are linked, time is precious."

Star-P is an interactive parallel computing platform that lets NCI scientists continue to work with CORR4DB on their desktops while running the correlations interactively on SGI Altix servers. This eliminates the need to re-program the CORR4DB models in C or FORTRAN with MPI to run on the parallel computer. As a result, the answers to some researchers' questions are arriving up to 200 times faster than before.

The Star-P approach has yielded significant advantages, said Dr. Mark Potts, president of HPC Applications, Inc., a consulting firm contracted to get NCI's software up and running on the SGI Altix. "If your goal is to take the same interactive environment and transfer it to a parallel processing system with a lot more memory, then you'll look for the easiest way to get there," said Potts. "NCI is accustomed to working in MATLAB and with certain formatted files, and the Star-P approach retains that environment."
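
For readers who want a concrete picture of the workload described above, the following is a minimal Python sketch of correlating one array against every probe in a database. It is not CORR4DB or Star-P code. The article does not say which correlation measure CORR4DB uses, so the Pearson coefficient is an assumption, and the function names, probe identifiers, and toy values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    # A constant profile has zero variance; report 0.0 instead of dividing by zero.
    return cov / norm if norm else 0.0


def correlate_against_database(query, database):
    """Score one query profile against every probe profile in the database.

    Each probe is scored independently of the others, which is why this
    kind of workload can be divided across many processors.
    """
    return {probe_id: pearson(query, profile)
            for probe_id, profile in database.items()}


def top_matches(scores, k):
    """Return the k probes with the largest absolute correlation."""
    ranked = sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# Hypothetical toy data: three probes measured across five samples.
database = {
    "probe_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "probe_b": [5.0, 4.0, 3.0, 2.0, 1.0],
    "probe_c": [2.0, 2.0, 2.0, 2.0, 2.0],
}
query = [2.0, 4.0, 6.0, 8.0, 10.0]

scores = correlate_against_database(query, database)
best = top_matches(scores, 2)
```

Because no probe's score depends on any other, the loop over the database could be split into chunks and run on separate processors. That independence is the property a parallel platform can exploit, although how Star-P actually distributes the work is not described in the article.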