woodcorpse's personal blog http://blog.sciencenet.cn/u/woodcorpse

QIIME 2 Tutorial 17: Identifying and filtering chimeras with q2-vsearch (2020.11)

Posted 2021-1-30 16:05 | Category: QIIME2 | System category: Research notes

Identifying and filtering chimeric feature sequences with q2-vsearch

https://docs.qiime2.org/2020.11/tutorials/chimera/

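Note: this series is best read in order. To start directly with this chapter, first complete at least Part 1, "Introduction and installation".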

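Chimera checking in QIIME 2 is performed on FeatureTable[Frequency] and FeatureData[Sequence] artifacts. QIIME 2 wraps vsearch's implementation of the UCHIME de novo and reference-based chimera-checking pipelines. For details of the method, see the UCHIME paper and the vsearch documentation. (The USEARCH website has a fairly detailed tutorial and is recommended; the vsearch help is less convenient to read.)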

This section uses the feature table from Part 6, "Atacama soil analysis".

Obtain the data

mkdir -p chimera
cd chimera

wget -c https://data.qiime2.org/2020.11/tutorials/chimera/atacama-table.qza
wget -c https://data.qiime2.org/2020.11/tutorials/chimera/atacama-rep-seqs.qza
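
Before running the chimera check, it can help to confirm that both downloads completed. The sketch below assumes QIIME 2 is on the PATH and uses `qiime tools peek` to print each artifact's UUID, type, and data format. The helper name `check_artifact` is made up for this example and is not part of the tutorial.

```shell
# Hypothetical helper: report a missing or empty file, otherwise print the
# artifact's UUID, semantic type, and data format with `qiime tools peek`.
check_artifact() {
  if [ ! -s "$1" ]; then
    echo "missing or empty: $1" >&2
    return 1
  fi
  qiime tools peek "$1"
}

# Only runs when QIIME 2 is installed; otherwise the block just defines the helper.
if command -v qiime >/dev/null 2>&1; then
  for f in atacama-table.qza atacama-rep-seqs.qza; do
    check_artifact "$f"
  done
fi
```

The table should be reported as FeatureTable[Frequency] and the sequences as FeatureData[Sequence].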

Run de novo chimera checking

# 4s/11s
time qiime vsearch uchime-denovo \
  --i-table atacama-table.qza \
  --i-sequences atacama-rep-seqs.qza \
  --output-dir uchime-dn-out

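Output artifacts:

  • uchime-dn-out/nonchimeras.qza: sequences identified as non-chimeric.
  • uchime-dn-out/chimeras.qza: sequences identified as chimeric.
  • uchime-dn-out/stats.qza: chimera-checking statistics.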

Note: for reference-based chimera checking, see vsearch uchime-ref.
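
For comparison, a reference-based run might look like the sketch below. It is not part of the original tutorial: the reference file name `reference-seqs.qza`, the output directory `uchime-ref-out`, and the wrapper function `run_uchime_ref` are assumptions for illustration, and the reference must be a FeatureData[Sequence] artifact that you supply yourself.

```shell
# Hypothetical reference database; replace with a real FeatureData[Sequence] artifact.
REF_SEQS="reference-seqs.qza"

# Wrapper around `qiime vsearch uchime-ref`.
# $1: feature table, $2: representative sequences,
# $3: reference sequences, $4: output directory (must not exist yet)
run_uchime_ref() {
  qiime vsearch uchime-ref \
    --i-table "$1" \
    --i-sequences "$2" \
    --i-reference-sequences "$3" \
    --output-dir "$4"
}

# Run only when QIIME 2 and the reference file are both available.
if command -v qiime >/dev/null 2>&1 && [ -f "$REF_SEQS" ]; then
  run_uchime_ref atacama-table.qza atacama-rep-seqs.qza "$REF_SEQS" uchime-ref-out
fi
```

The output directory should contain the same three kinds of artifacts as the de novo run, so the filtering commands later in this chapter can be pointed at it by changing the directory name.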

Visualize summary stats

qiime metadata tabulate \
  --m-input-file uchime-dn-out/stats.qza \
  --o-visualization uchime-dn-out/stats.qzv

Output visualization:

  • uchime-dn-out/stats.qzv: chimera-checking summary statistics.
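
The statistics can also be tallied on the command line. The sketch below rests on two assumptions that are not stated in the tutorial: that `qiime tools export` writes the statistics to a file named `stats.tsv`, and that the file follows vsearch's tab-separated `--uchimeout` layout, whose last column is Y (chimera), N (non-chimera), or ? (borderline). The helper name `count_uchime_flags` is hypothetical.

```shell
# Count UCHIME verdicts in a tab-separated stats file.
# Assumption: the last column holds Y (chimera), N (non-chimera) or ? (borderline).
count_uchime_flags() {
  awk -F '\t' '
    { n[$NF]++ }
    END { printf "Y\t%d\nN\t%d\n?\t%d\n", n["Y"], n["N"], n["?"] }
  ' "$1"
}

# Run only when QIIME 2 and the stats artifact are available.
if command -v qiime >/dev/null 2>&1 && [ -f uchime-dn-out/stats.qza ]; then
  qiime tools export \
    --input-path uchime-dn-out/stats.qza \
    --output-path uchime-dn-out/stats-export
  # Assumed file name inside the export directory.
  count_uchime_flags uchime-dn-out/stats-export/stats.tsv
fi
```

If the exported file has a different name or layout, adjust the path or the column that the awk script reads.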

Filter input tables and sequences

Exclude chimeras and “borderline chimeras”

qiime feature-table filter-features \
  --i-table atacama-table.qza \
  --m-metadata-file uchime-dn-out/nonchimeras.qza \
  --o-filtered-table uchime-dn-out/table-nonchimeric-wo-borderline.qza
qiime feature-table filter-seqs \
  --i-data atacama-rep-seqs.qza \
  --m-metadata-file uchime-dn-out/nonchimeras.qza \
  --o-filtered-data uchime-dn-out/rep-seqs-nonchimeric-wo-borderline.qza
qiime feature-table summarize \
  --i-table uchime-dn-out/table-nonchimeric-wo-borderline.qza \
  --o-visualization uchime-dn-out/table-nonchimeric-wo-borderline.qzv

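Output artifacts:

  • uchime-dn-out/rep-seqs-nonchimeric-wo-borderline.qza: representative sequences with chimeras and borderline chimeras removed.
  • uchime-dn-out/table-nonchimeric-wo-borderline.qza: feature table with chimeras and borderline chimeras removed.

Output visualization:

  • uchime-dn-out/table-nonchimeric-wo-borderline.qzv: feature table summary.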
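
To see how many features the filter removed, one option is to export the sequences and count FASTA headers. This is a minimal sketch rather than part of the tutorial: the helper names and the `export/` scratch directories are hypothetical, and it relies on `qiime tools export` writing FeatureData[Sequence] data to `dna-sequences.fasta`.

```shell
# Count records in a FASTA file by counting header lines.
# grep -c prints 0 but exits non-zero when nothing matches, so mask the status.
count_fasta_records() {
  grep -c '^>' "$1" || true
}

# Export a FeatureData[Sequence] artifact and print its record count.
# $1: artifact path, $2: scratch directory for the export (hypothetical layout)
export_and_count() {
  qiime tools export --input-path "$1" --output-path "$2" >/dev/null
  count_fasta_records "$2/dna-sequences.fasta"
}

# Run only when QIIME 2 and the filtered sequences are available.
if command -v qiime >/dev/null 2>&1 &&
   [ -f uchime-dn-out/rep-seqs-nonchimeric-wo-borderline.qza ]; then
  echo "input:    $(export_and_count atacama-rep-seqs.qza export/input)"
  echo "filtered: $(export_and_count uchime-dn-out/rep-seqs-nonchimeric-wo-borderline.qza export/wo-borderline)"
fi
```

The difference between the two counts is the number of features flagged as chimeric or borderline.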

Exclude chimeras but retain “borderline chimeras”

qiime feature-table filter-features \
  --i-table atacama-table.qza \
  --m-metadata-file uchime-dn-out/chimeras.qza \
  --p-exclude-ids \
  --o-filtered-table uchime-dn-out/table-nonchimeric-w-borderline.qza
qiime feature-table filter-seqs \
  --i-data atacama-rep-seqs.qza \
  --m-metadata-file uchime-dn-out/chimeras.qza \
  --p-exclude-ids \
  --o-filtered-data uchime-dn-out/rep-seqs-nonchimeric-w-borderline.qza
qiime feature-table summarize \
  --i-table uchime-dn-out/table-nonchimeric-w-borderline.qza \
  --o-visualization uchime-dn-out/table-nonchimeric-w-borderline.qzv

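Output artifacts:

  • uchime-dn-out/table-nonchimeric-w-borderline.qza: feature table with chimeras removed and borderline chimeras retained.
  • uchime-dn-out/rep-seqs-nonchimeric-w-borderline.qza: representative sequences with chimeras removed and borderline chimeras retained.

Output visualization:

  • uchime-dn-out/table-nonchimeric-w-borderline.qzv: feature table summary.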
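
The two strategies differ only in the borderline sequences, so the IDs kept by the second strategy but dropped by the first are the borderline chimeras. The sketch below lists them from exported FASTA files. The helper names and the `export/` directories are hypothetical, and the export file name `dna-sequences.fasta` is assumed from how `qiime tools export` handles FeatureData[Sequence].

```shell
# Print the sorted, unique record IDs of a FASTA file (text after '>' up to
# the first whitespace).
fasta_ids() {
  grep '^>' "$1" | sed 's/^>//; s/[[:space:]].*$//' | sort -u
}

# List IDs present in the first FASTA but absent from the second.
# $1: sequences retaining borderline chimeras, $2: sequences excluding them
borderline_ids() {
  _w=$(mktemp)
  _wo=$(mktemp)
  fasta_ids "$1" > "$_w"
  fasta_ids "$2" > "$_wo"
  comm -23 "$_w" "$_wo"   # lines unique to the first sorted list
  rm -f "$_w" "$_wo"
}

# Run only when QIIME 2 and both filtered artifacts are available.
if command -v qiime >/dev/null 2>&1 &&
   [ -f uchime-dn-out/rep-seqs-nonchimeric-w-borderline.qza ] &&
   [ -f uchime-dn-out/rep-seqs-nonchimeric-wo-borderline.qza ]; then
  qiime tools export \
    --input-path uchime-dn-out/rep-seqs-nonchimeric-w-borderline.qza \
    --output-path export/w-borderline
  qiime tools export \
    --input-path uchime-dn-out/rep-seqs-nonchimeric-wo-borderline.qza \
    --output-path export/wo-borderline
  borderline_ids export/w-borderline/dna-sequences.fasta \
                 export/wo-borderline/dna-sequences.fasta
fi
```

An empty listing would mean that no sequences were classified as borderline, in which case both strategies give the same result.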

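About the translator

Yong-Xin Liu, PhD, is a member of the Youth Innovation Promotion Association of the Chinese Academy of Sciences and a contributor to the QIIME 2 project. Liu graduated from Northeast Agricultural University in 2008 with a degree in microbiology, received a PhD in bioinformatics from the University of Chinese Academy of Sciences in 2014, and completed postdoctoral training in genetics in 2016 before staying on at the institute as an engineer. Liu's current research focuses on metagenomic data analysis, with more than 30 papers published in journals including Science, Nature Biotechnology, Protein & Cell, and Current Opinion in Microbiology, cited more than 2,000 times. In July 2017 Liu founded the "宏基因组" (Metagenome) official account, which has shared more than 2,400 original articles on metagenomics and amplicon analysis. Representative works include "Amplicon trilogy: figure interpretation, analysis pipeline, and statistical plotting (21 articles)", "Microbiome Experiment Manual", and "Microbiome Data Analysis". The account has more than 110,000 followers and over 21 million cumulative reads.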

Reference

https://docs.qiime2.org/2020.11/

Evan Bolyen, Jai Ram Rideout, Matthew R. Dillon, Nicholas A. Bokulich, Christian C. Abnet, Gabriel A. Al-Ghalith, Harriet Alexander, Eric J. Alm, Manimozhiyan Arumugam, Francesco Asnicar, Yang Bai, Jordan E. Bisanz, Kyle Bittinger, Asker Brejnrod, Colin J. Brislawn, C. Titus Brown, Benjamin J. Callahan, Andrés Mauricio Caraballo-Rodríguez, John Chase, Emily K. Cope, Ricardo Da Silva, Christian Diener, Pieter C. Dorrestein, Gavin M. Douglas, Daniel M. Durall, Claire Duvallet, Christian F. Edwardson, Madeleine Ernst, Mehrbod Estaki, Jennifer Fouquier, Julia M. Gauglitz, Sean M. Gibbons, Deanna L. Gibson, Antonio Gonzalez, Kestrel Gorlick, Jiarong Guo, Benjamin Hillmann, Susan Holmes, Hannes Holste, Curtis Huttenhower, Gavin A. Huttley, Stefan Janssen, Alan K. Jarmusch, Lingjing Jiang, Benjamin D. Kaehler, Kyo Bin Kang, Christopher R. Keefe, Paul Keim, Scott T. Kelley, Dan Knights, Irina Koester, Tomasz Kosciolek, Jorden Kreps, Morgan G. I. Langille, Joslynn Lee, Ruth Ley, Yong-Xin Liu, Erikka Loftfield, Catherine Lozupone, Massoud Maher, Clarisse Marotz, Bryan D. Martin, Daniel McDonald, Lauren J. McIver, Alexey V. Melnik, Jessica L. Metcalf, Sydney C. Morgan, Jamie T. Morton, Ahmad Turan Naimey, Jose A. Navas-Molina, Louis Felix Nothias, Stephanie B. Orchanian, Talima Pearson, Samuel L. Peoples, Daniel Petras, Mary Lai Preuss, Elmar Pruesse, Lasse Buur Rasmussen, Adam Rivers, Michael S. Robeson, Patrick Rosenthal, Nicola Segata, Michael Shaffer, Arron Shiffer, Rashmi Sinha, Se Jin Song, John R. Spear, Austin D. Swafford, Luke R. Thompson, Pedro J. Torres, Pauline Trinh, Anupriya Tripathi, Peter J. Turnbaugh, Sabah Ul-Hasan, Justin J. J. van der Hooft, Fernando Vargas, Yoshiki Vázquez-Baeza, Emily Vogtmann, Max von Hippel, William Walters, Yunhu Wan, Mingxun Wang, Jonathan Warren, Kyle C. Weber, Charles H. D. Williamson, Amy D. Willis, Zhenjiang Zech Xu, Jesse R. Zaneveld, Yilong Zhang, Qiyun Zhu, Rob Knight & J. Gregory Caporaso#. 
Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology. 2019, 37: 852-857. doi:10.1038/s41587-019-0209-9



https://m.sciencenet.cn/blog-3334560-1269755.html
