The newly assembled genomes of 26 different genetic lines of corn illustrate the crop’s rich genetic diversity and lay the groundwork for a better understanding of what genetic mechanisms account for crop traits prized by farmers.
The mapping of the 26 genomes, published recently in the journal Science, was a team effort co-led by University of Georgia’s Kelly Dawe that will help scientists piece together the puzzle of corn genetics. Using these new genomes as references, plant scientists can more effectively select for genes likely to improve crop yields or stress tolerance.
“For much of the modern genetic era, we relied on a single genome and compared everything else to it. However, we have learned that one genome doesn’t have all the genes,” said Dawe, UGA Athletic Association Professor in Plant Genetics. “It is like having one golf club, one socket wrench or one set of clothes. We, as a community, have been slowly trying to shift our approach to include multiple references. Our goal here was to shift all of maize genomics in one large leap from one reference to 26.”
Dawe worked on the project with a team including Matthew Hufford, first author on the paper and associate professor of ecology, evolution and organismal biology at Iowa State University, where the analysis was performed.
It started with one genetic line
The first corn genome to be mapped, completed in 2009, belonged to B73, a genetic line developed at Iowa State. Since then, B73 has served as the primary reference genome for corn, with a handful of additional genome assemblies becoming available only in the last few years. That means scientists have a limited understanding of genetic sequences that are present in other corn genomes but absent from B73.
But the 26 genomes mapped in the new study encompass a wide range of genetic diversity, covering everything from popcorn to sweetcorn to field corn from various geographical and environmental conditions. This provides much more reference data for scientists searching for genetic targets that could lead to better crop performance.
Hufford said the sheer genetic diversity present in corn created major hurdles for the assembly of the new genomes. He said 85% of the corn genome is composed of transposable elements, or patterns that repeat throughout the genome. Hufford compared those transposable elements to a jigsaw puzzle in which the vast majority of pieces are a single color. All that repetition makes it difficult to figure out how the parts fit together.
“If you can’t find a unique color or shape that tells you where to put the puzzle piece, you’re in a world of hurt,” Hufford said. “But if you get slightly larger puzzle pieces with unique features, that simplifies the process.”
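Hufford’s puzzle analogy can be illustrated with a small toy model. The Python sketch below is not the study’s assembly method, and every name in it (`toy_genome`, `unique_fraction`, the repeat and spacer lengths) is a hypothetical choice made for illustration. It builds an artificial sequence in which one 50-letter repeat is copied 20 times, separated by short random spacers, and then measures what share of fixed-length “pieces” occur exactly once. Real transposable elements are far more varied than identical copies, so this only conveys the general idea that longer pieces are easier to place.

```python
import random
from collections import Counter


def toy_genome(seed: int = 0, copies: int = 20,
               repeat_len: int = 50, spacer_len: int = 10) -> str:
    """Build an artificial sequence: one repeat copied many times,
    with a short random spacer after each copy (illustrative only)."""
    rng = random.Random(seed)
    repeat = "".join(rng.choice("ACGT") for _ in range(repeat_len))
    parts = []
    for _ in range(copies):
        parts.append(repeat)
        parts.append("".join(rng.choice("ACGT") for _ in range(spacer_len)))
    return "".join(parts)


def unique_fraction(genome: str, piece_len: int) -> float:
    """Share of sliding windows of length piece_len whose sequence
    appears exactly once in the whole genome."""
    windows = [genome[i:i + piece_len]
               for i in range(len(genome) - piece_len + 1)]
    counts = Counter(windows)
    return sum(1 for w in windows if counts[w] == 1) / len(windows)


genome = toy_genome()
short_frac = unique_fraction(genome, 20)  # pieces shorter than the repeat
long_frac = unique_fraction(genome, 80)   # pieces long enough to span a spacer

print(f"unique 20-letter pieces: {short_frac:.2f}")
print(f"unique 80-letter pieces: {long_frac:.2f}")
```

In this toy setup, more than half of the 20-letter pieces fall entirely inside the repeat and cannot be placed unambiguously, while each 80-letter piece spans at least one full spacer and is therefore much more likely to be unique.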
Additional partners involved in the analysis included Cold Spring Harbor Laboratory and Corteva Agriscience.
A lot of genome data to crunch
“The first genome was invaluable, providing an initial parts list and partial wiring diagram. But we knew it was not complete,” said Doreen Ware, adjunct associate professor and research scientist in the U.S. Department of Agriculture’s Agricultural Research Center, located in Cold Spring Harbor, New York. “It was critical to develop other genome references to understand the genetic architecture and other important agricultural traits.”
The project’s primary challenge, according to Dawe, was the enormity of the data and the difficulty of integrating it all into a single resource within the allotted two-year time frame, from funding in January 2018 to initial release in January 2020. “Long-read” sequencing technology developed in 2018 was the “special sauce” that allowed the team to assemble the genomes with accuracy that greatly exceeded all other maize genome assemblies.
“The massive sequencing effort required the simultaneous efforts of four public sequencing facilities in four states,” he said. “We worked with private industry to carry out a lot of the computational work.”
Dawe’s lab provided expertise in the regions of the genome that lie between genes. They interpreted the sequences of the centromeres, attachment domains that move chromosomes during cell division. Led by Jonathan Gent, senior research associate, the UGA researchers were also responsible for the DNA methylation analysis and annotation, where they identified all the parts of the genome that turn genes on and off.
A pan-genome reference
“All maize breeding activities require complete genomes. Every useful trait is directly referenced to genes, like a page number or Wikipedia entry,” Dawe said. “The process is vastly more accurate when there are more genomes available. We have created a pan-genome reference that has more than doubled the number of referenceable genes.”
All of the data has been integrated into the MaizeGDB resource, funded by the USDA, so the results will be accessible to all maize researchers for decades to come.
“The effect on maize will be immediate, and the effect on other crops will be apparent in the next few years,” Dawe said, “as other major-crop genetics communities strive to meet the new standard we have set for maize.”
Co-authors at UGA include Gent, Dong won Kim, Jianing Liu, Alexandre P. Marand, Rebecca D. Piri, William A. Ricci, Robert J. Schmitz, Na Wang and Yibing Zeng. Additional co-authors include 37 researchers at Cold Spring Harbor Laboratory; Corteva Agriscience; Iowa State University; University of Arizona; University of California, Berkeley; University of California, Davis; University of Minnesota; USDA ARS National Animal Disease Center; and the USDA ARS North Atlantic Area Robert W. Holley Center for Agriculture and Health.
The research was funded by the National Science Foundation Plant Genome Research Program.