Crowdsourced Supercomputing Project Seeks Better Understanding of DNA, Improved Quality of Life

IBM teams with Australian university and Brazilian health institute to examine big genomic data using IBM's World Community Grid

What does the DNA in Australian seaweed, Amazon River water, tropical plants, and forest soil have in common?

Lots, say scientists. And understanding the genetic similarities of disparate life forms could enable researchers to produce compounds for new medicines, eco-friendly materials, more resilient crops, and cleaner air, water, and energy.

Comparing the proteins encoded by the genes from a variety of life forms is the goal of Uncovering Genome Mysteries, a new project hosted on IBM's World Community Grid that debuted today. Administered by UNSW Australia and the Oswaldo Cruz Institute of Brazil, the project seeks to make about 20 quadrillion comparisons of 200 million proteins underlying a wide variety of organisms.

That herculean effort would normally require that a PC spend 40,000 continuous years performing calculations, but the computing power of World Community Grid will reduce the task to months.
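The figures above can be sanity-checked with a short calculation. The sketch below assumes the comparison count comes from pairing every protein with every other protein once (an all-versus-all comparison), which the project description does not state explicitly. The function names and the figure of 80,000 full-time-equivalent devices are hypothetical, chosen only to illustrate the arithmetic.

```python
def pairwise_comparisons(n: int) -> int:
    """Count unordered pairs among n items (all-vs-all, no self-pairs)."""
    return n * (n - 1) // 2


def wall_clock_years(pc_years: float, full_time_pcs: int) -> float:
    """Elapsed years if the work is split evenly across full_time_pcs machines.

    Assumes perfect parallelism and machines equivalent to the reference PC,
    which is an idealization.
    """
    return pc_years / full_time_pcs


proteins = 200_000_000  # "200 million proteins" from the announcement
pairs = pairwise_comparisons(proteins)
print(f"comparisons: {pairs:.2e}")  # about 2 x 10^16, i.e. roughly 20 quadrillion

# Hypothetical: 80,000 full-time-equivalent devices working on 40,000 PC-years
years = wall_clock_years(40_000, 80_000)
print(f"elapsed: {years * 12:.0f} months")
```

Under these assumptions, 200 million proteins yield roughly 20 quadrillion pairs, consistent with the stated figure, and tens of thousands of devices working in parallel would bring 40,000 PC-years down to a matter of months.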

This is possible because IBM's World Community Grid, the free, crowdsourced supercomputer celebrating a decade of cutting-edge research on global humanitarian issues, taps into the goodwill and computer power of hundreds of thousands of volunteers spanning the globe. They've all downloaded an app that borrows the unused power of their computing devices when the devices are not otherwise needed, such as when users take a brief or extended break from their computers. The scalability of this virtual supercomputer gives scientists nearly limitless capacity to work with large amounts of data at no cost to them.

While the project will process protein sequences from various forms of life, it will pay special attention to microorganisms because of their ubiquity and importance. For example, there are about 10 times more microorganisms living in and upon human bodies than actual human cells. They control a huge variety of natural processes involved in human health (gut bacteria aid digestion and reduce allergies), food production (baker's yeast increases yields, speeds preparation and improves taste), and agriculture and aquaculture (bacteria remove impurities). Microorganisms have been used to clean water in sewage treatment plants and even help consume oil spills. Microorganisms in exotic tropical plants show promise as efficient, sustainable fuel sources.

However, most of these discoveries were made through time-consuming trial and error. A better understanding of microorganisms' genes and the corresponding proteins might speed development of practical technologies and solutions. Despite their importance for our planet's health, microorganisms are hard to analyze because of their tiny size, great numbers, and dizzying variety. Scientists who want to search for useful genes in unknown organisms face a daunting task. A small sample of water or soil can contain tens of thousands of organisms, and each organism may have thousands of genes. The acceleration of climate change and the disappearance of habitat have made the identification and analysis of DNA a race against time.

One approach to identifying nature's hidden "superpowers" is to analyze the genetic makeup of different organisms to help understand how they function. Traditionally, this has been an expensive and time-consuming process, but in recent years, scientists have developed more affordable and effective methods to decode DNA. However, scientists must still conduct further studies to discover the function of each gene and its corresponding protein.

Consequently, the Uncovering Genome Mysteries project intends to produce a database of protein sequence comparison information for all scientists to reference. Project leaders hope this can lead to the identification of new gene functions, discoveries of how organisms interact with each other and the environment, and a better understanding of how microorganisms change under environmental stresses, such as climate change.

Created and managed by IBM, World Community Grid provides computing power to scientists by harnessing the unused cycle time of volunteers' computers and mobile devices. The software receives, completes, and returns small computational assignments to scientists. The combined power contributed by hundreds of thousands of volunteers has created one of the fastest virtual supercomputers on the planet, advancing scientific work by hundreds of years.

Nearly three million computers and mobile devices used by over 670,000 people and 460 institutions from 80 countries have contributed virtual supercomputing power for projects on World Community Grid over the last 10 years. Since the program's inception, World Community Grid volunteers have powered more than 20 research projects, donating nearly a million years of computing time to scientific research and enabling important scientific advances in health and sustainability. IBM invites researchers to submit research project proposals to receive this free resource, and invites members of the public to donate their unused computing power to these efforts at worldcommunitygrid.org.

World Community Grid is enabled by software developed in 2002 by Berkeley Open Infrastructure for Network Computing (BOINC) at the University of California, Berkeley, with support from the National Science Foundation. The BOINC project choreographs the technical aspects of volunteer computing.

For more information about IBM's philanthropic efforts, please visit www.CitizenIBM.com