TaxaPlot.R Tutorial

This tutorial can follow usage of Merged_Abundance_Processing.R or can be used with any relative abundance table in the format of Samples as columns and Taxa as rows.

Example Input Files

If starting from Merged_Abundance_Processing.R:

merged_abundance_table_example.txt

If starting from a relative abundance table:

Step 1: Process Merged Abundance File with Merged_Abundance_Processing.R

1. Read the merged abundance file into R

merged <- read.table("/path/to/file/merged_abundance_example.txt", header = FALSE, sep = "\t")

2. Set an output directory

output_dir <- ("/path/to/file/output")

3. Utilize merged_abundance_processing.R script

relabun_list <- merged_abundance_processing(merged, output_dir)

Step 2: Load metadata

1. Read metadata into R

met <- read.csv("/path/to/file/metadata_example.csv")

met <- tibble::column_to_rownames(met, var = "X")

Step 3: Subset merged abundance output (optional)

1. Subset the metadata and make the metadata groups by which to facet_wrap

change this to however you want to subset for your example data

samples <- met %>% filter(str_detect(SampleID, regex("\\bSample", ignore_case = TRUE)))

2. Extract sample ID's for the metadata group

subset_samples <- samples$SampleID

3. Subset the merged abundance output using subset_by_colnames_and_save (if working from merged abundance table)

new_dir <- ("/path/to/file/output/subset")

subset_relabun_list <- subset_by_colnames_and_save(relabun_list, subset_samples, prefix = "Subset", output_dir = new_dir)

Step 4: Subset most abundant taxa using mostAbundant

Genus_otu_most = mostAbundant(Subset_Genus_RelAbun, N = 15, items = NULL, others = TRUE, rescale = TRUE)

Species_otu_most = mostAbundant(Subset_Species_RelAbun, N = 15, items = NULL, others = TRUE, rescale = TRUE)

Step 5: Create metagroups for plots

1. Set metagroups by which to facet_wrap the plots

sample_metagroup <- list('All_Samples' = subset_samples)

2. If you have multiple groups to define, define the samples in each group, then the metagroup

five_samples <- subset(samples, Timepoint == "5_month")

five_samples <- five_samples$SampleID

two_samples <- subset(samples, Timepoint == "2_week")

two_samples <- two_samples$SampleID

time_samples <- (c(two_samples, five_samples))

timepoint_metagroup <- list('5_month' = five_samples, '2_week' = two_samples)

3. If your metagroups are a subset, make sure to subset the relative abundance as well

subset_genus <- Genus_otu_most[,subset_samples]

subset_species <- Species_otu_most[,subset_samples]

timepoint_genus <- Genus_otu_most[,time_samples]

timepoint_species <- Species_otu_most[,time_samples]

Step 5: Set colors and labels for plot

color_taxa_gen <- color_taxa(Genus_otu_most)

color_taxa_spec <- color_taxa(Species_otu_most)

label_fill_gen <- rownames(Genus_otu_most)

label_fill_spec <- rownames(Species_otu_most)

Step 6: Create and save taxa plot

1. Create plot

change height and width of the pdf as needed
data: relative abundance table
label_y: y label
color: colors for taxa
label_fill: labels for taxa
base_size: text size
nrow & ncol: number of columns and rows for facet_wrap
metadata_groups: groups by which to facet_wrap

pdf("/path/to/file/status.genus.pdf", width = 8, height = 5) #change the height and width of the pdf as you desire

taxa_plot(data = timepoint_genus, label_y = "Relative Abundance", color = color_taxa_gen, label_fill = label_fill_gen , base_size = 12, nrow = 1,ncol = 2,metadata_groups = timepoint_metagroup)

dev.off()

2. Save the data used in the plot

write.csv(timepoint_genus, "/path/to/file/timepoint.genus.csv")

Example Output Files

status.genus.pdf

timepoint.genus.csv

Plotting more taxa and setting unique colors

Step 1: Subset species by top 30 most abundant using mostAbundant

Species_otu_most = mostAbundant(Subset_Species_RelAbun, N = 30, items = NULL, others = TRUE, rescale = TRUE)

Step 2: Convert rownames into column

Species_otu_most <- Species_otu_most %>% rownames_to_column(var = "Species")

Step 3: Create custom order of taxa as I'd like them to appear in the plot

custom_order <- c("Other", "Bifidobacterium bifidum", "Bifidobacterium breve", "Bifidobacterium dentium", "Bifidobacterium longum", "Bifidobacterium pseudocatenulatum", "Klebsiella michiganensis", "Klebsiella oxytoca", "Klebsiella pneumoniae", "Klebsiella variicola", "Escherichia coli", "Veillonella parvula", "Veillonella ratti", "Veillonella rogosae", "Bacteroides fragilis", "Bacteroides ovatus", "Citrobacter freundii", "Citrobacter sp RHBSTW 00671", "Phocaeicola dorei", "Phocaeicola vulgatus", "Streptococcus salivarius", "Parabacteroides distasonis", "Parabacteroides merdae", "Staphylococcus epidermidis", "Erysipelatoclostridium ramosum", "Enterococcus faecalis", "Clostridium butyricum", "Hungatella hathewayi", "Megasphaera sp MJR8396C", "Ruminococcus gnavus", "Ruminococcus torques")

Step 4: Reorder the taxa by custom order

Species_otu_most <- Species_otu_most[match(custom_order, Species_otu_most$Species), ]

Step 5: Reset the rownames as taxa

rownames(Species_otu_most) = NULL

Species_otu_most <- Species_otu_most %>% column_to_rownames(var = "Species")

Step 6: Set custom colors for taxa list

color_taxa_spec <- c("Other" = "#bdbcbc", "Bifidobacterium bifidum" = "#85A1EF", "Bifidobacterium breve" = "#B7D8E8", "Bifidobacterium dentium" = "#7EC8E3", "Bifidobacterium longum" = "#0F52BA", "Bifidobacterium pseudocatenulatum" = "#00FFFF", "Klebsiella michiganensis" = "#b2d689", "Klebsiella oxytoca" = "#B4C424", "Klebsiella pneumoniae" = "#7CFC00", "Klebsiella variicola" = "#DFFF00", "Escherichia coli" = '#FFEA00', "Veillonella parvula" = "#f7766d", "Veillonella ratti" = "#b54e4b", "Veillonella rogosae"= "#732728", "Bacteroides fragilis" = "#196619", "Bacteroides ovatus" = "#123D12", "Citrobacter freundii" = "#D4FBE7", "Citrobacter sp RHBSTW 00671"= "#75D7A2", "Phocaeicola dorei" = "#0000FF", "Phocaeicola vulgatus" = "#00008B", "Streptococcus salivarius" = "#f5bcbc", "Parabacteroides distasonis" = "#AB6666", "Parabacteroides merdae" = "#F5DEB3", "Staphylococcus epidermidis"= "#f59a40", "Erysipelatoclostridium ramosum"= "#c8b1d5", "Enterococcus faecalis" = "#D22B2B", "Clostridium butyricum" = "#FF7518", "Hungatella hathewayi"= "#FF10F0", "Megasphaera sp MJR8396C"= "#9F2B68", "Ruminococcus gnavus" ="#5D3FD3", "Ruminococcus torques" = "#800080")

Step 7: Set labels for plot

label_fill_spec <- rownames(Species_otu_most)

Step 8: Subset relative abundance table as needed

timepoint_species <- Species_otu_most[,time_samples]

Step 9: Create and save the plot

pdf("/path/to/file/timepoint.species_extended.pdf", width = 12, height = 5) #change the height and width as needed

taxa_plot(timepoint_species, label_y = "Relative Abundance", color = color_taxa_spec, label_fill = label_fill_spec, base_size = 12, nrow = 1, ncol = 2, metadata_groups = timepoint_metagroup)

dev.off()

Step 10: Save the data used to make the plot

write.csv(timepoint_species, "/path/to/file/timepoint.species_extended.csv")

Example Output

timepoint.species_extended.pdf

timepoint.species_extended.csv

TaxaPlot.R Tutorial - sciencesharon/MicrobialSeq GitHub Wiki

TaxaPlot.R Tutorial

Example Input Files

Step 1: Process Merged Abundance File with Merged_Abundance_Processing.R

1. Read the merged abundance file into R

2. Set an output directory

3. Utilize merged_abundance_processing.R script

Step 2: Load metadata

1. Read metadata into R

Step 3: Subset merged abundance output (optional)

1. Subset the metadata and make the metadata groups by which to facet_wrap

2. Extract sample ID's for the metadata group

3. Subset the merged abundance output using subset_by_colnames_and_save (if working from merged abundance table)

Step 4: Subset most abundant taxa using mostAbundant

Step 5: Create metagroups for plots

1. Set metagroups by which to facet_wrap the plots

2. If you have multiple groups to define, define the samples in each group, then the metagroup

3. If your metagroups are a subset, make sure to subset the relative abundance as well

Step 5: Set colors and labels for plot

Step 6: Create and save taxa plot

1. Create plot

2. Save the data used in the plot

Example Output Files

Plotting more taxa and setting unique colors

Step 1: Subset species by top 30 most abundant using mostAbundant

Step 2: Convert rownames into column

Step 3: Create custom order of taxa as I'd like them to appear in the plot

Step 4: Reorder the taxa by custom order

Step 5: Reset the rownames as taxa

Step 6: Set custom colors for taxa list

Step 7: Set labels for plot

Step 8: Subset relative abundance table as needed

Step 9: Create and save the plot

Step 10: Save the data used to make the plot

Example Output

⚠️ GitHub.com Fallback ⚠️

TaxaPlot.R Tutorial - sciencesharon/MicrobialSeq GitHub Wiki

TaxaPlot.R Tutorial

Example Input Files

Step 1: Process Merged Abundance File with Merged_Abundance_Processing.R

1. Read the merged abundance file into R

2. Set an output directory

3. Utilize merged_abundance_processing.R script

Step 2: Load metadata

1. Read metadata into R

Step 3: Subset merged abundance output (optional)

1. Subset the metadata and make the metadata groups by which to facet_wrap

2. Extract sample ID's for the metadata group

3. Subset the merged abundance output using subset_by_colnames_and_save (if working from merged abundance table)

Step 4: Subset most abundant taxa using mostAbundant

Step 5: Create metagroups for plots

1. Set metagroups by which to facet_wrap the plots

2. If you have multiple groups to define, define the samples in each group, then the metagroup

3. If your metagroups are a subset, make sure to subset the relative abundance as well

Step 5: Set colors and labels for plot

Step 6: Create and save taxa plot

1. Create plot

2. Save the data used in the plot

Example Output Files

Plotting more taxa and setting unique colors

Step 1: Subset species by top 30 most abundant using mostAbundant

Step 2: Convert rownames into column

Step 3: Create custom order of taxa as I'd like them to appear in the plot

Step 4: Reorder the taxa by custom order

Step 5: Reset the rownames as taxa

Step 6: Set custom colors for taxa list

Step 7: Set labels for plot

Step 8: Subset relative abundance table as needed

Step 9: Create and save the plot

Step 10: Save the data used to make the plot

Example Output

⚠️ **GitHub.com Fallback** ⚠️

⚠️ GitHub.com Fallback ⚠️