Xiaoke Robot (小柯机器人)

Scientists develop validation and polishing strategies for telomere-to-telomere genome assemblies
2022-04-04 12:28

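Arang Rhie, Adam M. Phillippy and colleagues at the US National Institutes of Health have developed validation and polishing strategies for telomere-to-telomere genome assemblies. The work was published online in the international journal Nature Methods on March 31, 2022.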

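According to the researchers, advances in long-read sequencing technologies and genome assembly methods recently enabled the completion of the first telomere-to-telomere human genome assembly, which resolves complex segmental duplications and large tandem repeats, including the centromeric satellite arrays of a complete hydatidiform mole (CHM13). Although the assembly was derived from highly accurate sequences, evaluation revealed small errors and structural misassemblies in the initial draft.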

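To correct these errors, the researchers designed a new repeat-aware polishing strategy that makes accurate corrections in large repeats without overcorrection. It ultimately fixed 51% of the existing errors and raised the assembly quality value, measured from PacBio high-fidelity (HiFi) and Illumina k-mers, from 70.2 to 73.9. By comparing these results with standard automated polishing tools, the researchers outline common polishing errors and offer practical suggestions for genome projects with limited resources. They also show how sequencing biases in both HiFi and Oxford Nanopore Technologies reads cause signature assembly errors, which can be corrected with a diverse panel of sequencing technologies.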
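To give a sense of scale for the quality values quoted above, the following is a minimal sketch that converts them into per-base error rates. It assumes the quality value follows the standard Phred scale, QV = -10 log10(error rate); the function names are illustrative and are not taken from the paper or its tools.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability.

    Assumption: QV = -10 * log10(error_rate), the standard Phred definition.
    """
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Inverse conversion: per-base error probability to Phred-scaled QV."""
    return -10 * math.log10(error_rate)


# Quality values reported for the assembly before and after polishing.
before = qv_to_error_rate(70.2)
after = qv_to_error_rate(73.9)

print(f"QV 70.2: about one error per {1 / before:,.0f} bases")
print(f"QV 73.9: about one error per {1 / after:,.0f} bases")
print(f"Relative drop in estimated error rate: {1 - after / before:.0%}")
```

Under this assumption, the two values correspond to roughly one error per 10 million bases before polishing and roughly one per 25 million bases afterwards. The relative drop printed here is derived from the k-mer-based quality value alone, so it is a separate measure and need not match the 51% of existing errors reported as fixed.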

Appendix: original English text

Title: Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies

Author: Mc Cartney, Ann M., Shafin, Kishwar, Alonge, Michael, Bzikadze, Andrey V., Formenti, Giulio, Fungtammasan, Arkarachai, Howe, Kerstin, Jain, Chirag, Koren, Sergey, Logsdon, Glennis A., Miga, Karen H., Mikheenko, Alla, Paten, Benedict, Shumate, Alaina, Soto, Daniela C., Sović, Ivan, Wood, Jonathan M. D., Zook, Justin M., Phillippy, Adam M., Rhie, Arang

Issue&Volume: 2022-03-31

Abstract: Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first telomere-to-telomere human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Although derived from highly accurate sequences, evaluation revealed evidence of small errors and structural misassemblies in the initial draft assembly. To correct these errors, we designed a new repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly quality value from 70.2 to 73.9 measured from PacBio high-fidelity and Illumina k-mers. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both high-fidelity and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies. The work describes the validation and polishing strategies developed by the telomere-to-telomere consortium for evaluating and improving the first complete human genome assembly.

DOI: 10.1038/s41592-022-01440-3

Source: https://www.nature.com/articles/s41592-022-01440-3

Nature Methods: founded in 2004 and published by the Springer Nature group. Latest IF: 47.99
Official website: https://www.nature.com/nmeth/
Submission link: https://mts-nmeth.nature.com/cgi-bin/main.plex


Article in this issue: Nature Methods, published online
