Beijing Genomics Institute Advances Findings in Rice Genome Sequencing

BEIJING, CHINA -- The Genomics and Bioinformatics Center of the Chinese Academy of Sciences / Beijing Genomics Institute (BGI) has announced the completion of a working draft of the rice genome, a major crop genome sequenced after the human genome. Both the genomic and cDNA contig sequence data have been released and can be viewed at http://btn.genomics.org.cn/rice. The rice genome scaffold was assembled by a new algorithm on a Sun Enterprise (TM) 10000 server and Chinese-made Dawning 3000 and 2000 supercomputers.

"This is a very important accomplishment," said Dr. Francis Collins, director of the National Human Genome Research Institute in the United States and coordinator of the International Human Genome Sequencing Consortium. "Our Chinese colleagues have given the world a wonderful gift by deriving a highly useful draft of the instruction book for this incredibly important crop species."

"The public availability of rice genome sequence will have an immediate and salutary effect on the scientific community," commented Dr. Eric Lander, director of the Whitehead Institute Center for Genome Research, which is also a sister center of BGI.

The BGI team sequenced the genome of the rice subspecies indica. The sequenced cultivar is the paternal parent of a Chinese 'Super Hybrid Rice.' Developed by Chinese breeder Dr. Yuan Longping over twenty years, the Super Hybrid Rice has a yield per hectare 20-30 percent higher than the average of other rice crops in the field.

"This is a very significant milestone accomplished by Chinese scientists independently. It shows that China has become another country with large-scale sequencing and assembly capability, after the United States and the United Kingdom," said Professor Yang Huanming, director of BGI.

The rice genome is about one seventh the size of the human genome. The working draft has four-fold coverage, with over 95 percent of the gene coding region identified.

The strategy used presented a new challenge for genome assembly and annotation. The scaffolds were assembled using a new repeat-masking algorithm developed at BGI. BGI has also developed a gene-finding algorithm to identify genes in the rice genome. To further assist gene surveys across different rice varieties, tissues and developmental stages, over 77,000 cDNAs (expressed gene sequences) were sequenced and assembled into gene contigs.

The genome assembly was conducted on high-performance computer servers, including the Sun E10K and the Chinese-made Dawning 3000 and 2000. With its superior architecture and reliability, the Sun E10K server has proven to be the workhorse for genome assembly, which is very CPU-intensive, while the Dawning servers have handled the sequence similarity comparisons.

"We are very pleased to see that our large memory, multi-processor E10K servers have been part of the successful story of rice genome assembly. The large memory of the E10K provides substantial benefits in such large computational projects. BGI was recently named a Sun Center of Excellence in Genomics and this is an excellent beginning to our multi-year partnership to provide new algorithms and capabilities," said Dr. Stefan Unger, business development manager for computational biology in Sun's Global Education and Research group.

"This showcases that computational technology plays an essential role in genomics in particular, and in biology in general. Bioinformatics not only analyzes data, it also directs the project strategy and experimental design," said Dr. Matthew Huang, deputy director of BGI. "Without our new algorithms and the firepower of these high performance computer servers such as E10K, we wouldn't have accomplished our task -- which was labeled 'impossible' at its beginning."

Further information is available at www.genomics.org.cn or www.sun.com.