Xiaoke Robot

Scientists produce a draft human pangenome reference
2023-05-18 10:29

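Benedict Paten of the University of California, Santa Cruz, and colleagues have produced a draft human pangenome reference. The paper was published in Nature on 10 May 2023.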

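The Human Pangenome Reference Consortium has presented a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base-pair levels. From alignments of the assemblies, the researchers generated a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci.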

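Relative to the existing reference GRCh38, the researchers also added 119 million base pairs of euchromatic polymorphic sequence and 1,115 gene duplications. Roughly 90 million of the additional base pairs derive from structural variation. Compared with GRCh38-based workflows, using the draft pangenome to analyse short-read data reduced small-variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104%, which enabled the typing of the vast majority of structural variant alleles per sample.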

Appendix: original English text

Title: A draft human pangenome reference

Author: Liao, Wen-Wei, Asri, Mobin, Ebler, Jana, Doerr, Daniel, Haukness, Marina, Hickey, Glenn, Lu, Shuangjia, Lucas, Julian K., Monlong, Jean, Abel, Haley J., Buonaiuto, Silvia, Chang, Xian H., Cheng, Haoyu, Chu, Justin, Colonna, Vincenza, Eizenga, Jordan M., Feng, Xiaowen, Fischer, Christian, Fulton, Robert S., Garg, Shilpa, Groza, Cristian, Guarracino, Andrea, Harvey, William T., Heumos, Simon, Howe, Kerstin, Jain, Miten, Lu, Tsung-Yu, Markello, Charles, Martin, Fergal J., Mitchell, Matthew W., Munson, Katherine M., Mwaniki, Moses Njagi, Novak, Adam M., Olsen, Hugh E., Pesout, Trevor, Porubsky, David, Prins, Pjotr, Sibbesen, Jonas A., Sirén, Jouni, Tomlinson, Chad, Villani, Flavia, Vollger, Mitchell R., Antonacci-Fulton, Lucinda L., Baid, Gunjan, Baker, Carl A., Belyaeva, Anastasiya, Billis, Konstantinos, Carroll, Andrew, Chang, Pi-Chuan, Cody, Sarah, Cook, Daniel E., Cook-Deegan, Robert M., Cornejo, Omar E., Diekhans, Mark, Ebert, Peter, Fairley, Susan, Fedrigo, Olivier, Felsenfeld, Adam L., Formenti, Giulio, Frankish, Adam, Gao, Yan, Garrison, Nanibaa A., Giron, Carlos Garcia, Green, Richard E., Haggerty, Leanne, Hoekzema, Kendra, Hourlier, Thibaut

Issue&Volume: 2023-05-10

Abstract: Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.

DOI: 10.1038/s41586-023-05896-x

Source: https://www.nature.com/articles/s41586-023-05896-x

Nature: founded in 1869 and published by Springer Nature. Latest impact factor: 69.504
Official website: http://www.nature.com/
Submission link: http://www.nature.com/authors/submit_manuscript.html


Article in this issue: Nature, published online
