Xiaoke Robot (小柯机器人)

Study reports progress toward complete, error-free genome assemblies for multiple vertebrate species
2021-04-30 15:49

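Erich D. Jarvis of The Rockefeller University (USA) and collaborators report progress toward complete, error-free genome assemblies for multiple vertebrate species. The paper was published in Nature on 28 April 2021.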

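The researchers note that high-quality, complete reference genome assemblies are fundamental to applying genomics to biology, disease, and biodiversity conservation. However, such assemblies exist for only a few non-microbial species. To address this, the international Genome 10K (G10K) consortium spent five years evaluating and developing cost-effective methods for assembling highly accurate and nearly complete reference genomes.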

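The researchers report assemblies for 16 species representing six major vertebrate lineages. They confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Their assemblies correct substantial errors, add missing sequence to some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, lineage-specific chromosome rearrangements, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Building on these results, the researchers have launched the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help enable a new era of discovery across the life sciences.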

Appendix: original English text

Title: Towards complete and error-free genome assemblies of all vertebrate species

Author: Arang Rhie, Shane A. McCarthy, Olivier Fedrigo, Joana Damas, Giulio Formenti, Sergey Koren, Marcela Uliano-Silva, William Chow, Arkarachai Fungtammasan, Juwan Kim, Chul Lee, Byung June Ko, Mark Chaisson, Gregory L. Gedman, Lindsey J. Cantin, Francoise Thibaud-Nissen, Leanne Haggerty, Iliana Bista, Michelle Smith, Bettina Haase, Jacquelyn Mountcastle, Sylke Winkler, Sadye Paez, Jason Howard, Sonja C. Vernes, Tanya M. Lama, Frank Grutzner, Wesley C. Warren, Christopher N. Balakrishnan, Dave Burt, Julia M. George, Matthew T. Biegler, David Iorns, Andrew Digby, Daryl Eason, Bruce Robertson, Taylor Edwards, Mark Wilkinson, George Turner, Axel Meyer, Andreas F. Kautt, Paolo Franchini, H. William Detrich, Hannes Svardal, Maximilian Wagner, Gavin J. P. Naylor, Martin Pippel, Milan Malinsky, Mark Mooney, Maria Simbirsky, Brett T. Hannigan, Trevor Pesout, Marlys Houck, Ann Misuraca, Sarah B. Kingan, Richard Hall, Zev Kronenberg, Ivan Sović, Christopher Dunn, Zemin Ning, Alex Hastie, Joyce Lee, Siddarth Selvaraj, Richard E. Green, Nicholas H. Putnam, Ivo Gut, Jay Ghurye

Issue&Volume: 2021-04-28

Abstract: High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species [1,2,3,4]. To address this issue, the international Genome 10K (G10K) consortium [5,6] has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.

DOI: 10.1038/s41586-021-03451-0

Source: https://www.nature.com/articles/s41586-021-03451-0

Nature: founded in 1869 and published by Springer Nature. Latest impact factor (IF): 69.504
Official website: http://www.nature.com/
Submission link: http://www.nature.com/authors/submit_manuscript.html


Article in this issue: Nature, published online
