6. Data visualisation - VascoElbrecht/JAMP GitHub Wiki
Sequences_lost()
This function is used between modules to indicate how many sequences are discarded in each processing step (shown in red). Relative sequence proportions are calculated from the input files of the respective module. If you want to plot the abundance of reads relative to the raw sequence data instead, set rel=F and divide Reads_in and Reads_out by the initial number of raw reads.
- Reads_in is the number of reads in each sample before the respective module is applied. If you just want to plot the reads remaining after the module was applied, set this to NA.
- Reads_out is the number of sequences remaining after the module was applied.
- Set rel=T if you want to plot relative proportions in percent.
- The figure title can be defined with main and omitted by setting main="".
- If a filename is given in out, the plot will be saved as a PDF under that file name; otherwise it will be plotted in R.
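The rescaling to raw reads described above can be sketched in base R. The sample names and read counts are made up for illustration. The final call runs only if the JAMP package is installed, and it assumes that Sequences_lost() accepts the arguments under the names used in this section.

```r
# Hypothetical read counts for three samples (illustrative values only).
raw_reads <- c(S1 = 100000, S2 = 80000, S3 = 120000)  # initial raw reads
reads_in  <- c(S1 = 90000,  S2 = 60000, S3 = 96000)   # before the module
reads_out <- c(S1 = 81000,  S2 = 48000, S3 = 90000)   # after the module

# Express both counts as a proportion of the raw reads of each sample.
in_vs_raw  <- reads_in  / raw_reads
out_vs_raw <- reads_out / raw_reads

# Proportion of the raw reads that this module discarded.
lost_vs_raw <- in_vs_raw - out_vs_raw
print(round(lost_vs_raw, 3))

# Plot only if JAMP is available; argument names are taken from the text above.
if (requireNamespace("JAMP", quietly = TRUE)) {
  JAMP::Sequences_lost(Reads_in = in_vs_raw, Reads_out = out_vs_raw,
                       rel = FALSE, main = "Reads relative to raw data")
}
```

Because rel is left off here, the values are plotted as supplied, so the bars reflect the share of the original raw data rather than the share of the module's own input.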
Length_distribution()
Function for plotting the length distribution of any fasta or fastq file. Does not work on wrapped fasta files!
- sequFile: provide the path to the fasta or fastq file here. The sequence format should be detected automatically, but can also be set explicitly with fastq=T.
- col: colors used for the read abundances, provided as a vector.
- maxL=600: maximum sequence length shown in the plot. This depends on the number of cycles / read length of the sequencer used. Usually 500 or 600 is used for Illumina sequencing.
- If a filename is given in out, the plot will be saved as a PDF under that file name; otherwise it will be plotted in R.
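To illustrate what a length distribution of an unwrapped fasta file looks like, here is a minimal base R sketch. It is not the code used by Length_distribution(); it simply assumes a line-based reading strategy (one header line followed by one sequence line per record), which also shows why wrapped files would give wrong lengths under that assumption. The sequences are made up.

```r
# Write a tiny unwrapped FASTA file: one header line and one sequence line per record.
fasta <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "ACGTACGTAC",
             ">seq2", "ACGTAC",
             ">seq3", "ACGTACGTAC"), fasta)

fa_lines <- readLines(fasta)

# In an unwrapped file every second line is a complete sequence.
seqs <- fa_lines[seq(2, length(fa_lines), by = 2)]
seq_lengths <- nchar(seqs)

# Count how many sequences have each length.
len_table <- table(seq_lengths)
print(len_table)

# In a wrapped file a sequence spans several lines, so taking every second
# line would pick up sequence fragments instead of complete sequences.
maxL <- 600  # x-axis limit, analogous to the maxL argument described above
plot(as.integer(names(len_table)), as.integer(len_table), type = "h",
     xlim = c(0, maxL), xlab = "Sequence length (bp)",
     ylab = "Number of sequences")
```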
OTU_heatmap()
This function can be used to read JAMP OTU tables and visualize them as heatmaps. Instead of a .csv file you can also use data.frames directly, as long as the OTU IDs are present as row.names() and all values in the table are numeric. The function converts the read counts to relative abundance and highlights abundances between 100% and 0.001% with a color gradient. The color gradient is based on a log10 scale.
- Heatmaps can be saved as PDF files if the name is given in out. If no name is given, the heatmap is returned as a plot within R.
- If abundance=T, the absolute read counts for each OTU and sample are plotted. They can also be converted to relative abundance using rel=T. Text can be omitted for entries where zero reads were detected with plot0=F. By default, no read counts are plotted.
- The gradient color can be customised using col=c("blue3", "white"). Change e.g. the "Purple" to "Orange" (figure above), or use a light gray to differentiate more strongly between 0 and low-abundance reads, e.g. c("Red", "gray95").
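The conversion to relative abundance and the log10 color scale can be sketched in base R as follows. This is an illustration, not the code of OTU_heatmap(): the OTU table is made up, and it is an assumption that relative abundance is calculated per sample (column) and that the gradient is spread linearly over the log10 range from 0.001% to 100%.

```r
# Hypothetical OTU table: OTU IDs as row names, one numeric column per sample.
otu <- data.frame(SampleA = c(9000, 999, 1, 0),
                  SampleB = c(500, 500, 0, 0),
                  row.names = c("OTU_1", "OTU_2", "OTU_3", "OTU_4"))

# All values in the table have to be numeric.
stopifnot(all(sapply(otu, is.numeric)))

# Assumption: relative abundance is calculated per sample (column), in percent.
rel <- sweep(as.matrix(otu), 2, colSums(otu), "/") * 100

# log10 scale: 100% maps to 2 and 0.001% maps to -3; zero reads become -Inf.
log_rel <- log10(rel)

# Position of each value on a 0-1 gradient between 0.001% and 100%.
# Values below 0.001% (including zero reads) are clamped to the lower end.
gradient_pos <- (pmin(pmax(log_rel, -3), 2) + 3) / 5
print(round(gradient_pos, 2))
```

A data.frame shaped like otu meets the two requirements named above (OTU IDs as row names, numeric values only), so with JAMP installed it could presumably be handed to OTU_heatmap() in place of a .csv file.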
Denoise_barplot()
Plot the distribution of haplotypes within each OTU as a barplot. This plot is generated automatically as part of the Denoise() function (in the _stats folder). However, you can also use the Denoise_barplot() function and the haplotype table, e.g. E_haplo_table.csv, to further customize the plot.
- Use table to import a standard haplotype table csv or directly supply a data.frame with OTU names in the first column.
- By default (emptyOTUs=T), samples in which no haplotypes of the respective OTU were detected are plotted as white cells in the barplot, indicating in how many samples the OTU is empty. If you would like to omit the empty OTUs and only plot the relative proportions, set emptyOTUs=F. If turned off, no axis labels will be plotted, as the barplots then do not indicate the actual distribution of OTUs.
- Specify the name of the PDF to save in out, e.g. out="MyPlot.pdf". The dimensions of the PDF can be adjusted with height=6 and width=7. If out is left at NA, the plot is returned within R.
- To control the number of rows and columns of plots, use mfrow=c(5, 40).
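The options above could be combined in a single call along these lines. This is a sketch only: the argument names are those listed in this section, the file name is the example from the text, and the call runs only if JAMP is installed and the table exists in the working directory. That mfrow is interpreted like par(mfrow = c(rows, columns)) in base R is an assumption.

```r
# Arguments as described in this section; values are the examples from the text.
barplot_args <- list(table = "E_haplo_table.csv",  # haplotype table from Denoise()
                     emptyOTUs = FALSE,            # omit the empty OTUs
                     out = "MyPlot.pdf",           # save as PDF instead of plotting in R
                     height = 6, width = 7,        # PDF dimensions
                     mfrow = c(5, 40))             # plot layout

# Assuming mfrow works like par(mfrow = c(rows, columns)),
# this layout provides 5 * 40 plot positions.
panels <- prod(barplot_args$mfrow)
print(panels)

# Run only if JAMP is installed and the haplotype table is present.
if (requireNamespace("JAMP", quietly = TRUE) && file.exists(barplot_args$table)) {
  do.call(JAMP::Denoise_barplot, barplot_args)
}
```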