INDUSTRY
SDSC, TimeLogic and Sun Validate Ultra-Fast Hidden Markov Model Analysis
LA JOLLA, CA -- Sun Microsystems (Nasdaq: SUNW), TimeLogic Corp., manufacturer of the DeCypher® line of biocomputing accelerators, and the San Diego Supercomputer Center (SDSC) at the University of California, San Diego (UCSD) recently completed benchmark testing and scientific validation of large-scale Hidden Markov Model (HMM) bioinformatics analyses, comparing speed and results to those obtained on SDSC's Linux PC cluster.
The tests, designed by Adam Godzik, Ph.D., a researcher at SDSC in the Joint Center for Structural Genomics (JCSG) and associate professor and program director for Bioinformatics and Biological Complexity at The Burnham Institute, compared a cluster of 32 1-GHz Pentium III Linux PCs running the popular HMMER software of Dr. Sean Eddy (Washington University in St. Louis) with an 8-CPU Sun Fire 6800 server containing the DeCypher XD-4G FPGA reconfigurable computing processor array on the Sun Fire's PCI bus.
The benchmark employed NCBI's NR75 non-redundant protein database, containing 372,119 sequences (128.1 million amino acid symbols), as the query set, and SDSC's HMM database of 19,192 models as the target of the search. Fully utilizing the Linux cluster, this analysis requires 144 days of run time (approximately five months of dedicated, uninterrupted processing). The DeCypher-accelerated Sun Fire machine completed the entire task in just 41 hours 46 minutes. SDSC scientists confirmed the validity of the scientific results by comparing them to those of Dr. Eddy's software. SDSC researcher Slawek K. Grzechnik, technical coordinator for the JCSG Bioinformatics Core, conducted the testing.
"The speed was incredible," commented Dr. Godzik. "DeCypher proved to be an ideal tool for a comparison of this magnitude."
"This is the era of high-throughput biology where the same operation is repetitively performed on many thousands of data points," said Philip E. Bourne, professor of Pharmacology at UCSD and the director of SDSC's Integrative Biosciences program. "In these scenarios, results such as these are very significant. The speed improvement makes large scale analyses of this type much more achievable."
Commenting on the dramatic result, Jim Lindelien, founder and CTO of TimeLogic, said, "We're very pleased to have the opportunity to validate our Solaris-based DeCypher Sun Fire accelerator with SDSC's, JCSG's and Sun's fine teams. This benchmark ran 82-fold faster than the Linux cluster's 32 CPUs. That works out to 2,624 CPUs equivalent throughput for this important bioinformatics application, in the footprint and power demand of a single Sun machine -- with impressive price-performance improvement over what is widely thought to be the most cost-effective technology available."
Sia Zadeh, Ph.D., group manager for Life Sciences at Sun, stated, "The value benefit of the Sun-TimeLogic technology combination is highlighted by this outcome. Certainly, server farms play an important role in life science, and Sun's Grid Engine supports that role. Yet, server farm operators face daunting administrative workload, scaling and load balancing challenges, and per-CPU licensing costs as farms scale to huge sizes. Life science organizations struggling with IT scaling now have a competitive road map toward smooth bioinformatics performance scaling beyond 1,000 CPUs for these compute intensive critical applications."
Zadeh added, "While the initial cost of non-Sun clusters may appear attractive, once you factor in the cluster management and the R&D delay time, this hybrid solution's total cost of ownership -- including the TimeLogic accelerators -- is much lower and represents a significant cost savings for discovery institutions."
The Joint Center for Structural Genomics (JCSG), a consortium of California scientific research organizations including The Scripps Research Institute (TSRI)/Genomics Institute of the Novartis Research Foundation (GNF); SDSC; and the Stanford Synchrotron Radiation Laboratory (SSRL, a division of the Stanford Linear Accelerator Center, SLAC) at Stanford University, has received a National Institutes of Health (NIH) grant to develop and integrate high-throughput robotic technologies to greatly accelerate the discovery of the 3-D structure of proteins, providing key insights about their biological function.
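For readers who want to check the arithmetic behind the quoted figures, the 82-fold speedup and the 2,624-CPU equivalence can be reproduced from the two reported run times and the cluster size. The short Python sketch below is only an illustration: the variable names are invented here, and truncating the ratio to a whole number is an assumption, since the release does not say how the 82-fold figure was rounded.

```python
from datetime import timedelta

# Figures reported in the release.
cluster_time = timedelta(days=144)                   # 32-CPU Linux cluster
accelerated_time = timedelta(hours=41, minutes=46)   # DeCypher-accelerated Sun Fire 6800
cluster_cpus = 32

# Dividing one timedelta by another yields a plain float ratio.
speedup = cluster_time / accelerated_time

# Assumption: the quoted "82-fold" is the ratio truncated to a whole number.
whole_speedup = int(speedup)

# CPU-equivalent throughput: speedup relative to the cluster times its CPU count.
cpu_equivalents = whole_speedup * cluster_cpus

print(f"speedup: {speedup:.2f}x (quoted as {whole_speedup}-fold)")
print(f"CPU equivalents: {cpu_equivalents:,}")
```

The untruncated ratio is slightly above 82, so the quoted figure is, if anything, on the conservative side.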