Faster, cheaper way to find disease genes in human genome passes initial test
Method makes it feasible to search for disease genes in unrelated people with same condition
University of Washington (UW) researchers have successfully developed a novel genome-analysis strategy for more rapid, lower cost discovery of possible gene-disease links. By saving time and lowering expenses, the approach makes it feasible for scientists to search for disease-causing genes in people with the same inherited disorder but without any family ties to each other.
The strategy also might be extended to common medical conditions with complex genetics by making it more cost-effective and efficient to study the genomes of large groups of people.
Such large-scale research hasn't been undertaken because it has been prohibitively expensive, cumbersome, and time-consuming to sequence, compare and interpret entire human genomes.
The study, published today in Nature by lead author Sarah B. Ng, a graduate student in the UW Department of Genome Sciences, was conducted as a proof-of-concept to see if a more targeted analysis and newer technology could identify candidate genes for Mendelian disorders. These are diseases like cystic fibrosis or sickle cell anemia that are caused by a mutation in a single gene and are passed along through generations in a simple inheritance pattern. In this study, the rare Mendelian disorder picked to evaluate the strategy in unrelated, affected individuals was Freeman-Sheldon syndrome.
The study's senior author is Jay Shendure, UW assistant professor of genome sciences. In addition to the Shendure lab, the UW labs of Deborah Nickerson, Genome Sciences; Michael Bamshad, Pediatrics; and Evan Eichler, Genome Sciences, played key roles in the collaborative study.
To make progress in disease genetics, new strategies such as this are vital. Shendure gave an example: "The genetics of thousands of rare diseases remains unsolved because sufficient numbers of families with individuals affected by those disorders are not easily available. Even with such families, mapping and identifying the causative gene can take many years."
From attempts to determine the genetics of cancer, diabetes, and heart disease, scientists now realize that common variations in the human genome account for only a small fraction of the risk of these common diseases. The new strategy allows researchers to investigate the contributions of rare variants and might be extended to larger population studies to untangle the complex genetics underlying the leading causes of death and disability.
Shendure explained the team's approach: "We decided to focus only on the 1 percent of the human genome which codes for proteins. This portion is called the exome. In other words, we determined the genetic variation in these areas, and ignored the rest. We used new technologies to capture these specific regions in the genomes of 12 people, 4 of whom were affected by the same Mendelian disorder. None of the subjects were relatives. We then decoded these selected parts of the genome through massively parallel DNA sequencing, a technology that allows one to sequence hundreds of millions of DNA fragments in parallel." When the researchers intersected these data, they found that only a single gene, MYH3, contained novel mutations in the exomes of all four affected individuals.
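The intersection step can be pictured with a small sketch. This is not the study's actual pipeline; it is a minimal illustration that assumes each exome has already been reduced to a set of (gene, variant) pairs, and that "novel" means the variant is absent from a catalog of previously seen variants. The function name `candidate_genes`, the sample labels, the placeholder genes `GENE_A` to `GENE_C`, and all variant labels are hypothetical; only the gene name MYH3 comes from the study.

```python
def candidate_genes(affected, known_variants):
    """Return genes that carry at least one novel variant in every affected exome.

    affected: dict mapping a sample label to a set of (gene, variant) pairs.
    known_variants: set of (gene, variant) pairs already seen elsewhere
    (an assumption standing in for whatever filter defines "novel").
    """
    per_person = []
    for variants in affected.values():
        # Keep only the genes in which this person has a variant not seen before.
        novel_genes = {gene for gene, variant in variants
                       if (gene, variant) not in known_variants}
        per_person.append(novel_genes)
    # A candidate gene must appear in the novel-gene set of every affected person.
    return set.intersection(*per_person) if per_person else set()


# Toy data: four unrelated affected individuals (all labels are made up).
affected = {
    "A1": {("MYH3", "m1"), ("GENE_A", "a1"), ("GENE_B", "b1")},
    "A2": {("MYH3", "m2"), ("GENE_A", "a1")},
    "A3": {("MYH3", "m1"), ("GENE_C", "c1")},
    "A4": {("MYH3", "m3"), ("GENE_B", "b2"), ("GENE_A", "a1")},
}
known = {("GENE_A", "a1"), ("GENE_C", "c1")}

print(candidate_genes(affected, known))
```

The comparison is made at the level of the gene rather than the exact mutation, so unrelated people who carry different novel mutations in the same gene still point to that gene.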
The UW was one of three institutions, along with Harvard Medical School and the Broad Institute, funded in 2008 for The Exome Project by the National Heart, Lung and Blood Institute of the National Institutes of Health. The project aims at developing technologies to selectively sequence the human exome.
Shendure pointed out that a limitation of sequencing only exomes is that it doesn't reveal the regulatory, structural or other non-coding differences between human genomes.
Despite this limitation, exome-focused sequencing has several advantages: "Our focus on the protein-coding subset of the genome enables us to do at least 20 times more samples than could be done with whole genome sequencing with equivalent effort," Shendure said. The data-gathering for this project started in November 2008 and finished in February 2009. However, with the technical advances the researchers have achieved, a similar type of rare disease could be solved in a matter of weeks, and in the future even more rapidly.
As "second-generation" DNA sequencing technologies such as this expand in their use and overcome obstacles in the cost and time for collecting data, Shendure predicts different challenges will follow each step. For example, the amount of raw data that is collected by these sequencing instruments at the UW alone soon will be measured in petabytes. A petabyte is one quadrillion bytes of computer data, roughly the equivalent of 6 billion Web photos. New computational approaches for data analysis are a major part of the UW efforts, and they are expanding the new information that can be obtained with an exome-based approach.
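As a rough check on that comparison, the short calculation below works out the photo size it implies. It assumes the decimal definition of a petabyte (10^15 bytes) and takes the figure of 6 billion photos at face value; the variable names are illustrative only.

```python
PETABYTE_BYTES = 10**15      # assumed: decimal petabyte, one quadrillion bytes
WEB_PHOTOS = 6 * 10**9       # figure quoted in the comparison

# Average size per photo implied by the comparison, in kilobytes (1 kB = 1000 bytes).
kilobytes_per_photo = PETABYTE_BYTES / WEB_PHOTOS / 1000

print(f"{kilobytes_per_photo:.0f} kB per photo")
```

The implied size, about 167 kB per photo, is plausible for a compressed Web image.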
"Massively parallel technologies that make it possible to study individual genomes have only recently emerged," Shendure said, "but hold significant promise for gaining new insights in human biology and medicine. This approach to human exome sequencing will be the key in scaling everyone's efforts to explore the genetics of both susceptibility and resistance to more complex human diseases such as heart disease, cancer, and infectious diseases."