6. Data visualisation - VascoElbrecht/JAMP GitHub Wiki
Sequences_lost()

This function is used between modules to indicate how many sequences are discarded in each processing step (in red). When plotting relative sequence proportions, the input files are used. If you want to plot the relative abundance of reads with respect to the raw sequence data, set `rel=F` and divide `Reads_in` and `Reads_out` by the initial number of raw reads.
- `Reads_in` is the number of reads in each sample before the respective module is applied. If you just want to plot the reads remaining after the module was applied, set this to `NA`.
- `Reads_out` is the number of sequences remaining after the module was applied.
- Set `rel=T` if you want to plot relative proportions in percent.
- The figure title can be defined with `main` and omitted by setting `main=""`.
- If a filename is given in `out`, the plot will be saved as a PDF under that file name; otherwise it will be plotted in R.
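
The scaling against the raw data described above can be sketched in plain base R. The read counts and the `raw_reads` vector are made up for illustration, and `raw_reads` is not a JAMP argument. The final call is commented out and assumes only the argument names listed above.

```r
# Hypothetical read counts for three samples (illustration only)
raw_reads <- c(S1 = 10000, S2 = 8000, S3 = 12000)  # reads in the raw data
Reads_in  <- c(S1 = 9000,  S2 = 7600, S3 = 10800)  # reads entering the module
Reads_out <- c(S1 = 8100,  S2 = 7220, S3 = 9720)   # reads remaining afterwards

# Express both counts as a proportion of the raw reads
in_vs_raw  <- Reads_in  / raw_reads
out_vs_raw <- Reads_out / raw_reads

# Share of the raw reads that this module discarded
lost_vs_raw <- in_vs_raw - out_vs_raw
print(round(lost_vs_raw, 4))

# Assumed call, using only the argument names described above:
# Sequences_lost(Reads_in = in_vs_raw, Reads_out = out_vs_raw, rel = F,
#                main = "Reads relative to raw data", out = "lost.pdf")
```

Because the values are already proportions, `rel` stays on `F` so that they are not rescaled a second time.
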
Length_distribution()

Function for plotting the length distribution of any fasta or fastq file. Does not work on wrapped fasta files!
- `sequFile`: provide the path to the fasta or fastq file here. The sequence format should be detected automatically, but can also be specified with `fastq=T`.
- `col`: the colors used for the read abundances, provided as a vector.
- `maxL=600`: the maximum length shown in the plot. It depends on the number of cycles / read length of the sequencer used. Usually 500 or 600 is used for Illumina sequencing.
- If a filename is given in `out`, the plot will be saved as a PDF under that file name; otherwise it will be plotted in R.
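
To illustrate what a length distribution is, and why wrapped fasta files are a problem, here is a minimal base R sketch. It is not the JAMP implementation: the file content and the small `maxL` value are made up for the example, and the approach (one line equals one sequence) is an assumption about how such a count can be done.

```r
# Write a tiny unwrapped fasta file (one line per sequence) for illustration
fasta <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "ACGTACGT",
             ">seq2", "ACGT",
             ">seq3", "ACGTACGT"), fasta)

lines <- readLines(fasta)

# In an unwrapped fasta file every non-header line is one complete sequence.
# In a wrapped file a sequence spans several lines, so this count would be wrong.
seqs <- lines[!startsWith(lines, ">")]
seq_lengths <- nchar(seqs)

# Count how many sequences have each length from 1 to maxL
maxL <- 10
length_counts <- tabulate(seq_lengths, nbins = maxL)
print(length_counts)
```

The resulting vector can be passed to `barplot(length_counts)` to draw the distribution.
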
OTU_heatmap()

This function can be used to read the JAMP OTU tables and visualize them as heatmaps. Instead of a .csv file you can also use data.frames directly, as long as the OTU IDs are present as `row.names()` and all values in the table are numeric. The function will convert the read counts to relative abundance and highlight reads with an abundance between 100% and 0.001% with a color gradient. The color gradient is based on a log10 scale.
- Heatmaps can be saved as PDF files if the name is given in `out`. If no name is given, the heatmap is returned as a plot within R.
- If `abundance=T`, the absolute read counts for each OTU and sample are plotted. They can also be converted to relative abundance using `rel=T`. Also, text can be omitted for entries where zero reads were detected, with `plot0=F`. By default, no read counts are plotted.
- The gradient color can be customised using `col=c("blue3", "white")`. Change e.g. the `"Purple"` to `"Orange"` (figure above), or use a light gray to differentiate more strongly between 0 and low-abundance reads, e.g. `c("Red", "gray95")`.
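
The conversion to relative abundance and the log10 color scale can be sketched in base R as follows. This is only an illustration under assumptions, not the code used by `OTU_heatmap()`: the OTU table is made up, and both the linear mapping of log10 values onto the gradient and its direction (white for low, `"blue3"` for high) are guesses at how such a scale can be built.

```r
# Hypothetical OTU table: OTU IDs as row names, all values numeric
otu <- data.frame(SampleA = c(900, 99, 1),
                  SampleB = c(500, 500, 0),
                  row.names = c("OTU_1", "OTU_2", "OTU_3"))

# Convert read counts to relative abundance (percent) within each sample
rel_abund <- sweep(as.matrix(otu), 2, colSums(otu), "/") * 100

# Map 0.001% .. 100% onto 0 .. 1 on a log10 scale
lo <- log10(0.001)
hi <- log10(100)
scaled <- (log10(rel_abund) - lo) / (hi - lo)
scaled[rel_abund <= 0] <- NA          # zero reads get no gradient color
scaled <- pmin(pmax(scaled, 0), 1)    # clamp values outside the range

# Look up one color per cell from a 101-step gradient
gradient <- colorRampPalette(c("white", "blue3"))(101)
cell_col <- gradient[round(scaled * 100) + 1]
dim(cell_col) <- dim(scaled)          # indexing drops the matrix shape
print(round(scaled, 3))
```

On this scale an abundance of 0.1% lands at 0.4 and 100% at 1, so low-abundance OTUs remain visible instead of being compressed near zero as they would be on a linear scale.
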
Denoise_barplot()

Plot the distribution of haplotypes within each OTU as a barplot. This plot is generated automatically as part of the Denoise() function (in the _stats folder). However, you can also use the Denoise_barplot() function and the haplotype table e.g. E_haplo_table.csv to further customize the plot.
- Use `table` to import a standard haplotype table csv or directly supply a data.frame with OTU names in the first column.
- By default, samples in which no haplotypes of the respective OTU were detected are plotted as white cells in the barplot, indicating that no haplotypes were detected in that number of samples. If you would like to omit the empty OTUs and only plot the relative proportions of OTUs, change `emptyOTUs=T` to `F`. If turned off, no axis labels will be plotted, as the barplots do not indicate the actual distribution of OTUs.
- Specify the name of the PDF to save in `out`, e.g. `out="MyPlot.pdf"`. The dimensions of the PDF can be adjusted with `height=6` and `width=7`. If `out` is left on `NA`, the plot is returned within R.
- To control how the plots are arranged in rows and columns, use `mfrow=c(5, 40)`.