SRI International Applies AI Expertise to Advance Systems Biology Research

MENLO PARK, CA -- SRI International, a leading research institute based in Silicon Valley, has developed a novel software system that uses artificial intelligence and symbolic computing to help scientists understand and manipulate the overwhelming volume of information produced by genome data. SRI's software, Pathway Tools, encodes the large and complex scientific theories that are the basis of systems biology, leading to a greater understanding of biological systems, improved scientific collaboration, and more rapid integration of disparate sources of bioscience knowledge.

Pathway Tools is a software environment that supports query, analysis, and visualization operations for pathway/genome databases. Systems biology studies the complex interactions between all levels of biological information -- genomic DNA, informational pathways, and informational networks -- to understand how they work together.

Pathway Tools forms the basis of the EcoCyc(TM) Project (http://www.ecocyc.org/), a symbolic pathway database that describes the metabolic, transport, and genetic-regulatory networks of Escherichia coli (E. coli), a bacterium studied extensively worldwide because of its role in disease and its usefulness in biotechnology. Pathway databases describe biochemical pathways and their component reactions, enzymes, and substrates.

Science Magazine Details AI-Based Advancements in Systems Biology

A study outlining the role of pathway databases in systems biology research will be published in the September 14, 2001 issue of Science magazine. The study emphasizes the need for artificial intelligence to construct and manipulate systems biology theories, which are growing too large and complex to be grasped by the mind of a single scientist. Encoding the theories in a symbolic, computable form allows computers to verify a theory's internal consistency, its global properties, and its consistency with external data.
This offers a significant advantage over conventional approaches, which typically take the form of text-based repositories of theories and cannot support computer-based reasoning.

"The EcoCyc database is the largest and most detailed theory encoded within the Pathway Tools to date. It integrates information about the bacterium E. coli in one place -- in a computable symbolic form -- allowing us to visualize the largest known genetic network of any organism," explained Dr. Peter Karp, Director of SRI International's Bioinformatics Research Group. "We can now begin to use that network to help interpret the huge volumes of gene expression data being generated by gene-chip experiments."

Pathway Databases Enable Bioinformatic Collaboration

Although biologists had determined approximately 25 percent of E. coli's genetic network prior to the EcoCyc project, this information was spread across hundreds of articles in different scientific journals. A symbolic database such as the one used in the EcoCyc project allows scientists working in different physical locations to integrate their findings within an interactive, searchable database, giving the scientific community a computational theory of the full system.

Completing the DNA sequences of organisms such as humans and E. coli has effectively determined their molecular blueprints. Systems biology seeks to define the interactions among the parts those blueprints specify and the manner in which those interactions yield system-wide behaviors of the organism. For this reason, Pathway Tools is being applied to a number of other organisms at SRI and by other researchers.

For nearly a decade, the EcoCyc project has been funded by the National Center for Research Resources (NCRR), a component of the National Institutes of Health. "This database will provide biomedical investigators nationwide with a valuable, cost-effective tool that has great potential for unlocking the secrets of pathogenicity of E. coli and a number of other disease-causing microorganisms," said Dr. Michael Chang of the NCRR Comparative Medicine Division. "When many researchers have access to a shared national resource such as this encyclopedia database, not only is the effectiveness of the resource maximized relative to its cost, but the scientific impact is greatly increased. EcoCyc is a unique and powerful resource that will increase our understanding of metabolic pathways and will greatly aid drug research and development."

Dr. Karp presented the results of the NIH-supported research at last week's Genomes to Life conference in Washington, D.C., sponsored by the Department of Energy.