Exporting your data to QIIME2
By Angus Ball
Introduction
Sometimes life sucks, and this has been a multi-day effort in sucking. I don't like qiime2's file structure and neither should you. Either way, let's get this over with: we need 3 separate files to use in qiime, a metadata file, an OTU table, and a taxonomy table. Let's start simple.
Getting a metadata file
First you need Keemei. Download this package and follow the pictures and instructions to format your metadata file.
Getting OTU and taxonomy tables
Then go to your Linux box and download biom-format with pip (it provides the biom command used below). PS. pip is a Python command, so you should have access, but if you don't, get rekt.
Then in R follow this tutorial
Very painfully preparing your phyloseq object for qiime2
Then bring all these files over to the Linux box and run the following command
biom convert -i OTU.txt -o OTUbiomv210.biom --table-type="OTU table" --to-hdf5
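If biom convert chokes, the layout of OTU.txt is a good first suspect. Here is a minimal sketch of a sanity check, assuming the classic tab-separated layout: a header row on the first line (conventionally starting with #OTU ID) followed by the sample names, then one row per OTU with one count per sample. The file example_OTU.txt, its contents, and the function name check_otu_table are all made up for illustration.

```shell
# Build a tiny made-up table so the check can be run on its own.
printf '#OTU ID\tSampleA\tSampleB\n'  > example_OTU.txt
printf 'OTU_1\t12\t0\n'              >> example_OTU.txt
printf 'OTU_2\t3\t7\n'               >> example_OTU.txt

# check_otu_table FILE: prints "ok" if every row has the same number of
# tab-separated fields as the first (header) row; otherwise prints the
# number of the first row that does not.
check_otu_table() {
    awk -F '\t' '
        NR == 1 { n = NF; next }                       # remember the header width
        NF != n { print "bad line " NR; bad = 1; exit }
        END     { if (!bad) print "ok" }
    ' "$1"
}

check_otu_table example_OTU.txt
```

Point check_otu_table at your real OTU.txt before converting. A stray space-separated row or a missing count shows up as a bad line number.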
Then load up qiime2
conda activate qiime2-amplicon-2024.2
and start running these commands
Import your OTU table:
qiime tools import --input-path OTUbiomv210.biom --type 'FeatureTable[Frequency]' --input-format BIOMV210Format --output-path table.qza
Imported OTUbiomv210.biom as BIOMV210Format to table.qza
Import your taxa table:
qiime tools import --type 'FeatureData[Taxonomy]' --input-format HeaderlessTSVTaxonomyFormat --input-path tax.txt --output-path taxonomy.qza
Imported tax.txt as HeaderlessTSVTaxonomyFormat to taxonomy.qza
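The headerless taxonomy import is picky about layout. A rough sketch of a check, assuming tax.txt should be tab-separated with no header row and at least two columns (feature ID first, then a single taxonomy string). The file example_tax.txt, the taxonomy strings in it, and the function name count_short_rows are hypothetical.

```shell
# Tiny made-up taxonomy file: feature ID, then one taxonomy string, no header.
printf 'OTU_1\tk__Bacteria; p__Firmicutes\n'      > example_tax.txt
printf 'OTU_2\tk__Bacteria; p__Proteobacteria\n' >> example_tax.txt

# count_short_rows FILE: prints how many non-blank, non-comment rows have
# fewer than two tab-separated columns (0 means the layout looks fine).
count_short_rows() {
    awk -F '\t' '
        /^#/ || /^[[:space:]]*$/ { next }   # skip comment and blank lines
        NF < 2 { short++ }
        END { print short + 0 }
    ' "$1"
}

count_short_rows example_tax.txt
```

Anything other than 0 on your real tax.txt usually means the file was written with spaces instead of tabs, or the taxonomy column went missing on some rows.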
Import your sequence information:
qiime tools import --input-path rep-seqs.fna --type 'FeatureData[Sequence]' --output-path rep-seqs.qza
Imported rep-seqs.fna as DNASequencesDirectoryFormat to rep-seqs.qza
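The imports succeed even when the feature IDs in the table and the FASTA file disagree, and the mismatch only bites later. A small sketch for catching it early, assuming the table keeps its IDs in the first tab-separated column and the FASTA headers start with the same IDs. The example files and the function name ids_missing_sequences are invented for illustration.

```shell
# Made-up inputs standing in for OTU.txt and rep-seqs.fna.
printf '#OTU ID\tSampleA\nOTU_1\t5\nOTU_2\t9\n' > example_table.txt
printf '>OTU_1\nACGT\n>OTU_3\nTTGA\n'           > example_seqs.fna

# ids_missing_sequences TABLE FASTA: prints the IDs that appear in the
# table but have no matching FASTA header.
ids_missing_sequences() {
    # Table IDs: first column of every non-comment row.
    grep -v '^#' "$1" | cut -f 1 | sort -u > table_ids.tmp
    # FASTA IDs: header lines, minus the leading ">" and any description.
    grep '^>' "$2" | sed 's/^>//; s/[[:space:]].*$//' | sort -u > fasta_ids.tmp
    # comm -23 keeps lines found only in the first (sorted) file.
    comm -23 table_ids.tmp fasta_ids.tmp
    rm -f table_ids.tmp fasta_ids.tmp
}

ids_missing_sequences example_table.txt example_seqs.fna
```

An empty result on your real files means every OTU in the table has a sequence. Swap the two arguments around (with the grep patterns adjusted) if you also care about sequences with no counts.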
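Check your metadata (this builds a visualization of it rather than importing it; qiime reads the metadata file directly):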
qiime metadata tabulate \
--m-input-file SAM.txt \
--o-visualization SAM.qzv
Saved Visualization to: SAM.qzv
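Tabulate only looks at the metadata file on its own, so it won't tell you whether the sample names line up with the OTU table. A quick sketch of that comparison, assuming the table's first row holds the sample names after the first column and the metadata file keeps sample IDs in its first column under a single header row. The example files and the function name samples_missing_metadata are made up.

```shell
# Made-up inputs standing in for OTU.txt and SAM.txt.
printf '#OTU ID\tSampleA\tSampleB\nOTU_1\t5\t2\n' > example_table.txt
printf 'sample-id\tsite\nSampleA\tbog\n'          > example_meta.txt

# samples_missing_metadata TABLE METADATA: prints the sample names that
# appear in the table header but not in the metadata file.
samples_missing_metadata() {
    # Sample names: header row of the table, minus the first column.
    head -n 1 "$1" | cut -f 2- | tr '\t' '\n' | sort -u > table_samples.tmp
    # Metadata IDs: first column, skipping the header row and comment rows.
    tail -n +2 "$2" | grep -v '^#' | cut -f 1 | sort -u > meta_samples.tmp
    # comm -23 keeps lines found only in the first (sorted) file.
    comm -23 table_samples.tmp meta_samples.tmp
    rm -f table_samples.tmp meta_samples.tmp
}

samples_missing_metadata example_table.txt example_meta.txt
```

No output on your real files means every sample in the table has a metadata row.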
Citations
The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome. Daniel McDonald, Jose C. Clemente, Justin Kuczynski, Jai Ram Rideout, Jesse Stombaugh, Doug Wendel, Andreas Wilke, Susan Huse, John Hufnagle, Folker Meyer, Rob Knight, and J. Gregory Caporaso. GigaScience 2012, 1:7. doi:10.1186/2047-217X-1-7
Keemei: cloud-based validation of tabular bioinformatics file formats in Google Sheets. Rideout JR, Chase JH, Bolyen E, Ackermann G, González A, Knight R, Caporaso JG. GigaScience. 2016;5:27. http://dx.doi.org/10.1186/s13742-016-0133-6