Salk prof Michael shines a light into black holes in the Arabidopsis genome

Salk scientists, collaborating with researchers from the University of Cambridge and Johns Hopkins University, have sequenced the genome of the world’s most widely used model plant species, Arabidopsis thaliana, at a level of detail never previously achieved. The study, published in Science on November 12, 2021, reveals the secrets of Arabidopsis chromosome regions called centromeres. The findings shed light on centromere evolution and provide insights into the genomic equivalent of black holes.

“Just over 20 years ago the Arabidopsis genome was published, and it has been the gold standard plant genome since, giving rise to amazing advances from models to crops,” says Todd Michael, a research professor in the Plant Molecular and Cellular Biology Laboratory. “Our new assembly resolves the final missing pieces of the genome, paving the way for exciting research on chromosome architecture and evolution, which will be critical for our efforts to engineer plants to address climate change in the future.”

Arabidopsis thaliana was adopted as a model plant due to its short generation time, small size, ease of growth, and prolific seed production through self-pollination. Its fast life cycle and small genome make it well suited to genetics research and to mapping the key genes that underpin traits of interest. It has led to a multitude of discoveries, and in 2000 it became the first plant to have its genome sequenced. This initial genome release was of an excellent standard in the chromosome arms, where most of the genes are located, but could not resolve the highly repetitive and complex regions known as centromeres, telomeres, and ribosomal DNA. Now, due to advances in sequencing technologies, these challenging regions have been assembled for the first time.

The study is the first to successfully perform long-read sequencing and assembly of the Arabidopsis thaliana centromeres. Since the genome was first sequenced in 2000, long-read sequencing technologies have advanced, allowing researchers to read the genome in pieces of more than 100,000 nucleotides, rather than pieces of 100 to 200 nucleotides. These data, combined with algorithmic advances in assembling the reads, mean that the “genomic jigsaw puzzle” can now be completed.

“The centromeres are some of the most interesting, but also the most difficult, regions of the genome to analyze; they are like endless ‘blue sky’ within a jigsaw puzzle,” says co-corresponding author Professor Mike Schatz, from Johns Hopkins University. “Fortunately, advances in sequencing paired with advances in the computational methods for genome assembly now make it possible to accurately assemble even the most challenging of sequences,” such as the genetic makeup of the centromere.

For decades, researchers have been trying to understand the paradox of how and why centromeric DNA evolves with extraordinary rapidity, whilst remaining stable enough to perform its job during cell division. Other ancient parts of the cell that have conserved roles, such as ribosomes, which make proteins from mRNA, tend to evolve very slowly. Yet the centromere, despite its conserved role in cell division, is the fastest evolving part of the genome. This study, by revealing the genetic and epigenetic topography of Arabidopsis centromeres, marks a step change in our understanding of this paradox.

The centromere maps compiled in the study provide new insights into the “repeat ecosystem” found in the centromere. The maps reveal the architecture of the repeat arrays, which has implications for how they evolve, and for the chromatin and epigenetic states of the centromeres. Moving forward, the scientists want to use these maps as a foundation to understand how and why centromeres are evolving so rapidly.

“It’s fantastic to be able to see into the centromeres for the first time and use this to understand their unusual modes of evolution,” says co-corresponding author Professor Ian Henderson, from the University of Cambridge’s Department of Plant Sciences.

Next, the scientists plan to use this approach to map centromeres from diverse Arabidopsis species, and ultimately more widely throughout plants.

Other authors of the study are Bradley W. Abramson, Nolan Hartwick and Kelly Colt of Salk; Matthew Naish, Piotr Wlodzimierz, Andrew J. Tock, Christophe Lambing, Pallas Kuo and Natasha Yelina of the University of Cambridge; Michael Alonge of Johns Hopkins University; Anna Schmücker, Bhagyshree Jamge and Frédéric Berger of the Austrian Academy of Sciences; Terezie Mandáková and Martin A. Lysak of Masaryk University in the Czech Republic; Lisa Smith and Jurriaan Ton of the University of Sheffield; Tetsuji Kakutani of the University of Tokyo; Robert A. Martienssen of the Howard Hughes Medical Institute; Korbinian Schneeberger of LMU Munich; and Alexandros Bousios of the University of Sussex.