TickingClock's personal blog http://blog.sciencenet.cn/u/TickingClock


The Plant Journal: an update to the sorghum reference genome

Read 4043 times | 2018-1-2 08:28 | Personal category: Daily Abstract | System category: Paper Exchange | Keywords: scholar




The Sorghum bicolor reference genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of genome organization


First author: Ryan F. McCormick; Affiliation: Texas A&M University, College Station, TX, USA
Corresponding author: John E. Mullet


Sorghum bicolor is a drought tolerant C4 grass used for the production of grain, forage, sugar, and lignocellulosic biomass and a genetic model for C4 grasses due to its relatively small genome (approximately 800 Mbp), diploid genetics, diverse germplasm, and colinearity with other C4 grass genomes. In this study, deep sequencing, genetic linkage analysis, and transcriptome data were used to produce and annotate a high-quality reference genome sequence. Reference genome sequence order was improved, 29.6 Mbp of additional sequence was incorporated, the number of genes annotated increased 24% to 34,211, average gene length and N50 increased, and error frequency was reduced 10-fold to 1 per 100 kbp. Subtelomeric repeats with characteristics of Tandem Repeats in Miniature (TRIM) elements were identified at the termini of most chromosomes. Nucleosome occupancy predictions identified nucleosomes positioned immediately downstream of transcription start sites and at different densities across chromosomes. Alignment of more than 50 resequenced genomes from diverse sorghum genotypes to the reference genome identified approximately 7.4 M single nucleotide polymorphisms (SNPs) and 1.9 M indels. Large-scale variant features in euchromatin were identified with periodicities of approximately 25 kbp. A transcriptome atlas of gene expression was constructed from 47 RNA-seq profiles of growing and developed tissues of the major plant organs (roots, leaves, stems, panicles, and seed) collected during the juvenile, vegetative and reproductive phases. Analysis of the transcriptome data indicated that tissue type and protein kinase expression had large influences on transcriptional profile clustering. The updated assembly, annotation, and transcriptome data represent a resource for C4 grass research and crop improvement.
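The headline figures in the abstract can be put in perspective with a little arithmetic. The sketch below is only a back-of-the-envelope check, not anything from the paper: it assumes the genome is exactly 800 Mbp, that the 24% increase is exact, and that errors and SNPs are spread evenly along the genome, none of which the abstract states. All variable names are made up for illustration.

```python
# Back-of-the-envelope arithmetic on figures quoted in the abstract.
# Assumptions (not from the paper): genome size is exactly 800 Mbp,
# the 24% gene-count increase is exact, and errors/SNPs are uniform.

GENOME_BP = 800e6          # "approximately 800 Mbp"
GENES_NOW = 34_211         # genes annotated in the updated reference
GENE_INCREASE = 0.24       # "increased 24%"
BP_PER_ERROR = 100e3       # "1 per 100 kbp"
ERROR_FOLD_REDUCTION = 10  # "reduced 10-fold"
SNP_COUNT = 7.4e6          # "approximately 7.4 M" SNPs

# Gene count implied for the earlier annotation.
genes_before = GENES_NOW / (1 + GENE_INCREASE)

# Expected number of residual errors genome-wide, now and before.
errors_now = GENOME_BP / BP_PER_ERROR
errors_before = errors_now * ERROR_FOLD_REDUCTION

# Average spacing between SNPs across the resequenced panel.
bp_per_snp = GENOME_BP / SNP_COUNT

print(f"implied earlier gene count: about {genes_before:.0f}")
print(f"expected errors now: about {errors_now:.0f}")
print(f"expected errors before: about {errors_before:.0f}")
print(f"average SNP spacing: about {bp_per_snp:.0f} bp")
```

Under these assumptions the earlier annotation would have held roughly 27,600 genes, and the updated sequence would carry on the order of 8,000 residual errors genome-wide.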




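Sorghum is a drought-tolerant C4 grass widely used to produce grain, forage, sugar, and lignocellulosic biomass. Its genome is relatively small (about 800 Mbp), it is diploid, its germplasm is diverse, and it is colinear with other C4 grass genomes, which makes it a good genetic model for C4 grasses. This study used deep sequencing, genetic linkage analysis, and transcriptome data to produce and annotate a high-quality sorghum reference genome. The ordering of the reference sequence was improved and 29.6 Mbp of additional sequence was incorporated into the assembly. The number of annotated genes rose 24% to 34,211, average gene length and N50 both increased, and the error rate fell about 10-fold to roughly one error per 100 kbp. The authors also identified subtelomeric repeats with characteristics of TRIM elements at the ends of most chromosomes. Nucleosome occupancy predictions placed nucleosomes immediately downstream of transcription start sites and showed that nucleosome density varies across chromosomes. Aligning more than 50 resequenced genomes from diverse sorghum genotypes to the reference identified about 7.4 M SNPs and 1.9 M indels. The authors further found large-scale variant features in euchromatin that recur with a periodicity of about 25 kbp. A gene expression atlas was built from 47 RNA-seq profiles of roots, stems, leaves, panicles, and seed sampled during the juvenile, vegetative, and reproductive phases. Analysis of the transcriptome data showed that tissue type and protein kinase expression strongly influence how transcriptional profiles cluster. The updated assembly, annotation, and transcriptome data provide a resource for C4 grass research and crop improvement.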
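For readers unfamiliar with N50, the assembly statistic mentioned above, here is a minimal sketch of how it is commonly computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the total assembly size. The function name `n50` and the toy lengths are made up for illustration and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in bp, purely illustrative.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70: 80 + 70 = 150, half of the 300 bp total
```

A higher N50 means that more of the assembly sits in long contiguous pieces, which is why an increase is reported as an improvement.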



Corresponding author: John E. Mullet (https://biochemistry.tamu.edu/people/mullet-john/)


Biography: bachelor's degree, Colgate University, 1976; Ph.D., University of Illinois, 1980; postdoctoral fellow, Rockefeller University, 1980-1983.


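Research interests: functional genomics, including building genome maps and DNA diagnostic chips to analyze global gene expression and biodiversity.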


doi: 10.1111/tpj.13781


Journal: The Plant Journal
Publication date: 28 December 2017

P.S. You are welcome to follow the WeChat public account: Plant_Frontiers


P.S. The first sorghum genome paper was published in Nature in 2009 (doi:10.1038/nature07723), under the title "The Sorghum bicolor genome and the diversification of grasses".


https://m.sciencenet.cn/blog-3158122-1092631.html
