Filtering out Low Abundance OTUS Sequence Variants - meyermicrobiolab/Meyer_Lab_Resources GitHub Wiki

Removing samples

If you need to remove samples, do that first, then filter out low-abundance OTUs.

Keep only OTUs that have more than 1 read in at least half of the samples (all other OTUs are removed)

    filterlow <- genefilter_sample(ps, filterfun_sample(function(x) x>1),A=0.5*nsamples(ps))
    ps1<-prune_taxa(filterlow,ps)
    ntaxa(ps1)
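
To see what this filter keeps, here is a minimal base-R sketch on a made-up count matrix (the matrix, its OTU and sample names, and the variable names are all hypothetical). It mirrors the logic of the call above: count the samples in which each OTU has more than 1 read, then compare that number with half the number of samples.

```r
# Hypothetical count matrix: 3 OTUs (rows) x 4 samples (columns)
counts <- matrix(c(5, 0, 3, 2,
                   1, 1, 0, 9,
                   0, 2, 4, 3),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("OTU1", "OTU2", "OTU3"),
                                 c("S1", "S2", "S3", "S4")))

# Number of samples in which each OTU has more than 1 read
n_pass <- rowSums(counts > 1)

# Number of samples required (plays the role of the A argument above)
A <- 0.5 * ncol(counts)

# Keep the OTUs that pass in enough samples
keep <- n_pass >= A
counts_filtered <- counts[keep, , drop = FALSE]
rownames(counts_filtered)
```

In this toy case OTU2 is dropped because it exceeds 1 read in only one of the four samples.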

Remove samples by sample name (!= means exclude; == keeps the matching samples instead, but chained with & it can only match one name)

    ps2 = subset_samples(ps, sample_names(ps) != "F10" & sample_names(ps) != "G2" & sample_names(ps) != "J10" &
    sample_names(ps) != "K3" & sample_names(ps) != "K8" & sample_names(ps) != "K4")
    nsamples(ps2)
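
When the list of samples is long, chaining comparisons gets unwieldy. One possible alternative is to put the names in a vector and test membership with `%in%`. The sketch below uses a made-up vector of sample names in place of `sample_names(ps)`. With a real phyloseq object, the resulting logical vector could be passed to `prune_samples()`, which accepts either sample names or a logical vector of samples to keep.

```r
# Hypothetical sample names standing in for sample_names(ps)
all_samples <- c("F10", "G2", "H1", "J10", "K3", "K4", "K8", "L5")

# Samples to remove (the same six names as in the command above)
drop <- c("F10", "G2", "J10", "K3", "K8", "K4")

# TRUE for samples to keep, FALSE for samples to drop
keep <- !(all_samples %in% drop)
all_samples[keep]

# With phyloseq loaded, the equivalent on a real object would be:
# ps2 <- prune_samples(!(sample_names(ps) %in% drop), ps)
# nsamples(ps2)
```

Dropping the `!` turns the same vector into a list of samples to keep.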

To keep multiple samples, separate the conditions with "or" (|) instead of "and" (&)

    ps10raG = subset_samples(ps10ra, Coral == "MC2" | Coral == "OF2")

Remove samples by metadata. Here "Coral" is a column name in my metadata, and the command excludes samples matching "Diploria labyrinthiformis" and samples matching "Dichocoenia stokesi".

    ps_MO = subset_samples(ps_nopink, Coral != "Diploria labyrinthiformis" & Coral != "Dichocoenia stokesi")
    nsamples(ps_MO)

Filtering out Low Abundance

Filter out low-abundance OTUs: only OTUs with a mean relative abundance greater than 10^-5 (0.001%) are kept.

    ps2 <- transform_sample_counts(ps, function(OTU) OTU/sum(OTU))
    ps2f <- filter_taxa(ps2, function(x) mean(x) > 1e-5, TRUE)
    ntaxa(ps2f)

Note: after this step the OTU table holds relative abundances rather than read counts, so it cannot be used as input to CoDaSeq.
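
If read counts are still needed downstream (for example for CoDaSeq), one option is to use the relative abundances only to decide which OTUs to keep, and then subset the original count table. The base-R sketch below shows the idea on a made-up matrix; all names and values are hypothetical. The commented lines at the end show how the same idea might look with phyloseq, relying on `filter_taxa()` returning a logical vector when `prune` is left at its default of `FALSE`.

```r
# Hypothetical count matrix: 3 OTUs (rows) x 3 samples (columns)
counts <- matrix(c(600000, 500000, 700000,
                   399999, 500000, 300000,
                        1,      0,      0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("OTU1", "OTU2", "OTU3"),
                                 c("S1", "S2", "S3")))

# Relative abundance within each sample (each column sums to 1)
rel <- sweep(counts, 2, colSums(counts), "/")

# Decide which OTUs to keep from the relative abundances...
keep <- rowMeans(rel) > 1e-5

# ...but subset the original counts, so the result is still a count table
counts_filtered <- counts[keep, , drop = FALSE]
counts_filtered

# Possible phyloseq equivalent (not run here):
# ps_ra <- transform_sample_counts(ps, function(OTU) OTU / sum(OTU))
# keep  <- filter_taxa(ps_ra, function(x) mean(x) > 1e-5)
# ps_f  <- prune_taxa(keep, ps)
```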

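Filtering used by Bian et al. (mSphere): filter out low-abundance OTUs, keeping only OTUs that are greater than 0.1% relative abundance in any sample and occur in at least 20% of samples. As written, the code below applies both thresholds together, so an OTU must exceed 0.1% relative abundance in at least 20% of samples. The resulting object holds relative abundances, not counts.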

    ps2 <- transform_sample_counts(ps, function(OTU) OTU/sum(OTU))
    filterlow <- genefilter_sample(ps2, filterfun_sample(function(x) x> 1e-3),A=0.2*nsamples(ps2))
    ps3<-prune_taxa(filterlow,ps2)
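
The description of this filter can be read in two ways, and they do not always agree. The sketch below compares them on a made-up matrix of relative abundances. The values are hypothetical and, for brevity, the columns are not normalised to sum to 1. Which reading matches the published method is not confirmed here.

```r
# Hypothetical relative abundances: 4 OTUs (rows) x 10 samples (columns)
rel <- rbind(
  OTU1 = rep(0.5, 10),                          # abundant everywhere
  OTU2 = c(0.002, rep(0, 9)),                   # above 0.1% once, present in 10% of samples
  OTU3 = c(rep(0.0005, 3), rep(0, 7)),          # present in 30% of samples, never above 0.1%
  OTU4 = c(0.002, 0.0005, 0.0005, rep(0, 7))    # above 0.1% once, present in 30% of samples
)
min_samples <- 0.2 * ncol(rel)

# Reading 1: above 0.1% in any sample AND present (> 0) in at least 20% of samples
keep_two <- apply(rel, 1, max) > 1e-3 & rowSums(rel > 0) >= min_samples

# Reading 2 (what the genefilter_sample call above does):
# above 0.1% in at least 20% of samples
keep_combined <- rowSums(rel > 1e-3) >= min_samples

rbind(keep_two, keep_combined)
```

OTU4 is kept under the first reading and dropped under the second, so the second reading is the stricter filter.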

Keep only taxa with a mean read count across all samples greater than 10 (this is what worked well on my project).

    ntaxa(ps)
    ps10 <- filter_taxa(ps, function(x) mean(x) > 10, TRUE)
    ntaxa(ps10)

Keep only taxa with a mean read count across all samples greater than 5

    ps5<-filter_taxa(ps, function(x) mean(x) >5, TRUE)
    ntaxa(ps5)
    get_taxa_unique(ps5, "Phylum")
    get_taxa_unique(ps5, "Order")
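
When choosing between thresholds such as 5 and 10, it can help to check how many taxa and what share of the reads survive each one. The base-R sketch below does this on a made-up count matrix; `summarise_threshold` is a hypothetical helper, not a phyloseq function. On a real object, `ntaxa()` and `taxa_sums()` give the same quantities.

```r
# Hypothetical count matrix: 4 OTUs (rows) x 3 samples (columns)
counts <- matrix(c(120, 80, 100,
                    12,  9,  15,
                     6,  7,   8,
                     0,  3,   0),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("OTU", 1:4), c("S1", "S2", "S3")))

# Hypothetical helper: taxa and fraction of reads retained at a threshold
summarise_threshold <- function(counts, threshold) {
  keep <- rowMeans(counts) > threshold
  c(threshold  = threshold,
    taxa_kept  = sum(keep),
    reads_kept = sum(counts[keep, , drop = FALSE]) / sum(counts))
}

# One row per threshold
res <- t(sapply(c(5, 10), function(th) summarise_threshold(counts, th)))
res

# Possible phyloseq equivalent for one threshold (not run here):
# ntaxa(ps5)
# sum(taxa_sums(ps5)) / sum(taxa_sums(ps))
```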

After filtering taxa with phyloseq, export the OTU table, taxa table, and metadata from the phyloseq object for input to CoDaSeq

    otu = as(otu_table(ps5), "matrix")
    taxon = as(tax_table(ps5), "matrix")
    metadata = as(sample_data(ps5), "matrix")
    write.table(otu,"filtered_otu_table_DiseaseOutbreak_gg_nochloromito.txt",sep="\t",col.names=NA)
    write.table(taxon,"filtered_taxa_table_DiseaseOutbreak_gg_nochloromito.txt",sep="\t",col.names=NA)
    write.table(metadata,"filtered_metadata.txt",sep="\t",col.names=NA)
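
Before moving on, it may be worth checking that an exported table reads back with the expected shape. The sketch below round-trips a small made-up matrix through a temporary file using the same `write.table()` arguments as above. The matrix, its names, and the file location are hypothetical, and the `read.table()` arguments are one reasonable choice rather than the ones a particular pipeline requires.

```r
# Hypothetical OTU table: 2 samples (rows) x 2 sequence variants (columns)
otu <- matrix(c(10L, 0L, 3L, 7L), nrow = 2,
              dimnames = list(c("S1", "S2"), c("ASV1", "ASV2")))

# Write with the same arguments as the export commands above
out_file <- tempfile(fileext = ".txt")
write.table(otu, out_file, sep = "\t", col.names = NA)

# Read back: first column becomes row names, header names are left untouched
otu_back <- read.table(out_file, sep = "\t", header = TRUE,
                       row.names = 1, check.names = FALSE)
dim(otu_back)
rownames(otu_back)
```

Whether samples end up as rows or columns depends on how the phyloseq object was built; `taxa_are_rows(ps5)` reports the orientation.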

Now you can import the filtered otu table, taxa table, and updated metadata files into the CoDaSeq pipeline for analysis.
