Postprocessing: DMR Detection
informME provides a utility, jsDMR.R, that detects differentially methylated regions (DMRs) using the Jensen-Shannon distance (JSD), based on the method described in [2]. This utility must be run within an R session.
Usage in R (when replicate reference data is available):
setwd("/path/to/informME/src/R_src/")
source("jsDMR.R")
runReplicateDMR(refVrefFiles,testVrefFiles,inFolder,outFolder)
where
- refVrefFiles is a vector of BIGWIG file names that contain the JSD values of all pairwise reference comparisons
- testVrefFiles is a vector of BIGWIG file names that contain the JSD values of test/reference comparisons
- inFolder is the directory that contains all JSD files
- outFolder is the directory used to write the results
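A minimal sketch of how the two file vectors might be assembled, assuming the JSD files follow the JSD-<sample>-VS-<sample>.bw naming used in the README excerpt on this page. The helper `splitJSDFiles` and its `refLabel`/`testLabel` arguments are hypothetical and not part of informME; the demo creates empty placeholder files in a temporary folder only to show the matching.

```r
# Hypothetical helper (not part of informME): list the JSD BIGWIG files in
# inFolder and split them into reference/reference and test/reference sets.
# Assumes names of the form JSD-<label>-<n>-VS-<label>-<n>.bw and that the
# labels contain no regular-expression metacharacters.
splitJSDFiles <- function(inFolder, refLabel, testLabel) {
  files <- list.files(inFolder, pattern = "^JSD-.*\\.bw$")
  # Reference/reference: the reference label appears on both sides of "VS".
  refPattern <- paste0("^JSD-", refLabel, "-[0-9]+-VS-", refLabel, "-[0-9]+\\.bw$")
  # Test/reference: test label first, reference label second.
  testPattern <- paste0("^JSD-", testLabel, "-[0-9]+-VS-", refLabel, "-[0-9]+\\.bw$")
  list(refVrefFiles = files[grepl(refPattern, files)],
       testVrefFiles = files[grepl(testPattern, files)])
}

# Demo with empty placeholder files in a fresh temporary folder.
inFolder <- tempfile("jsd-demo-")
dir.create(inFolder)
demoFiles <- c("JSD-lungnormal-1-VS-lungnormal-2.bw",
               "JSD-lungnormal-3-VS-lungnormal-1.bw",
               "JSD-lungnormal-3-VS-lungnormal-2.bw",
               "JSD-lungcancer-1-VS-lungnormal-1.bw",
               "JSD-lungcancer-2-VS-lungnormal-2.bw")
invisible(file.create(file.path(inFolder, demoFiles)))

inputs <- splitJSDFiles(inFolder, refLabel = "lungnormal", testLabel = "lungcancer")

# With real JSD files and jsDMR.R sourced, the call would then be:
# runReplicateDMR(inputs$refVrefFiles, inputs$testVrefFiles, inFolder, outFolder)
```

Listing the folder rather than typing each name by hand reduces the chance of passing a file name that does not exist in inFolder.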
Usage in R (when no replicate reference data is available):
setwd("/path/to/informME/src/R_src/")
source("jsDMR.R")
runNoReplicateDMR(JSDfile,inFolder,outFolder)
where
- JSDfile is the name of a BIGWIG file that contains the JSD values of a test/reference comparison
- inFolder is the directory that contains the JSD file
- outFolder is the directory used to write the result
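Before calling the utility, it can help to confirm that the inputs are in place. The sketch below is one way to do that; `checkDMRInputs` is a hypothetical helper, not part of informME. It appends a trailing slash to the folder paths because the README examples are written that way, which is an assumption about what jsDMR.R expects rather than a documented requirement.

```r
# Hypothetical pre-flight check (not part of informME).
checkDMRInputs <- function(JSDfile, inFolder, outFolder) {
  # Make both folder paths end with "/" as in the README examples.
  addSlash <- function(p) if (grepl("/$", p)) p else paste0(p, "/")
  inFolder <- addSlash(inFolder)
  outFolder <- addSlash(outFolder)

  # Stop early if any JSD file is missing from inFolder.
  notFound <- JSDfile[!file.exists(paste0(inFolder, JSDfile))]
  if (length(notFound) > 0) {
    stop("JSD file(s) not found in inFolder: ", paste(notFound, collapse = ", "))
  }

  # Create the output folder if it does not exist yet
  # (the trailing slash is stripped for the directory check).
  outDir <- sub("/+$", "", outFolder)
  if (!dir.exists(outDir)) dir.create(outDir, recursive = TRUE)

  list(JSDfile = JSDfile, inFolder = inFolder, outFolder = outFolder)
}

# Demo with an empty placeholder file in a temporary folder.
inFolder <- tempfile("jsd-in-")
dir.create(inFolder)
outFolder <- file.path(tempfile("jsd-out-"), "results")
JSDfile <- "JSD-lungcancer-1-VS-lungnormal-1.bw"
invisible(file.create(file.path(inFolder, JSDfile)))

checked <- checkDMRInputs(JSDfile, inFolder, outFolder)

# With a real JSD file and jsDMR.R sourced, the call would then be:
# runNoReplicateDMR(checked$JSDfile, checked$inFolder, checked$outFolder)
```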
NOTE 1: For these utilities, the following R packages must be installed: rtracklayer, logitnorm, mixtools, annotatr, Homo.sapiens, and ggplot2
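A short sketch for checking which of these packages are already installed. The package list combines NOTE 1 with the requirements in the README excerpt on this page, which also names ggplot2. The variable names are illustrative, and the installation commands in the comments assume the usual sources (CRAN for logitnorm, mixtools, and ggplot2; Bioconductor for rtracklayer, annotatr, and Homo.sapiens).

```r
# Packages listed in NOTE 1 and in the README requirements.
required <- c("rtracklayer", "logitnorm", "mixtools",
              "annotatr", "Homo.sapiens", "ggplot2")

# system.file() returns "" for a package that is not installed,
# so this check does not load any of the packages.
isInstalled <- vapply(required,
                      function(pkg) nzchar(system.file(package = pkg)),
                      logical(1))
missingPkgs <- required[!isInstalled]

if (length(missingPkgs) > 0) {
  message("Missing packages: ", paste(missingPkgs, collapse = ", "))
  # Possible installation commands (run manually as needed):
  # install.packages(c("logitnorm", "mixtools", "ggplot2"))
  # BiocManager::install(c("rtracklayer", "annotatr", "Homo.sapiens"))
} else {
  message("All required packages are installed.")
}
```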
NOTE 2: More information about these utilities can be found in informME/src/R_src/postprocess/README.txt, with a relevant excerpt reproduced below:
jsDMR.R
-------
This is an R script that performs DMR detection using
the Jensen-Shannon distance (JSD) based on the method
described in [1]. Once DMR detection is done, the script
generates helpful annotation tables and figures. It should
be run within an R session.
syntax (replicate reference data is available):
source("jsDMR.R")
runReplicateDMR(refVrefFiles,testVrefFiles,inFolder,outFolder,
correction='XX',pAdjThresh=value)
# refVrefFiles is a vector of BIGWIG file names that contain
# the JSD values of all pairwise reference comparisons.
# For example: if
#
# JSD-lungnormal-1-VS-lungnormal-2.bw
# JSD-lungnormal-3-VS-lungnormal-1.bw
# JSD-lungnormal-3-VS-lungnormal-2.bw
#
# are available, then set
#
# file1 <- "JSD-lungnormal-1-VS-lungnormal-2.bw"
# file2 <- "JSD-lungnormal-3-VS-lungnormal-1.bw"
# file3 <- "JSD-lungnormal-3-VS-lungnormal-2.bw"
# refVrefFiles <- c(file1,file2,file3)
#
# testVrefFiles is a vector of BIGWIG file names that contain
# the JSD values of test/reference comparisons.
# For example: if
#
# JSD-lungcancer-1-VS-lungnormal-1.bw
# JSD-lungcancer-2-VS-lungnormal-2.bw
# JSD-lungcancer-3-VS-lungnormal-3.bw
#
# are available, then set
#
# file4 <- "JSD-lungcancer-1-VS-lungnormal-1.bw"
# file5 <- "JSD-lungcancer-2-VS-lungnormal-2.bw"
# file6 <- "JSD-lungcancer-3-VS-lungnormal-3.bw"
# testVrefFiles <- c(file4,file5,file6)
#
# inFolder is the directory that contains all JSD files
# outFolder is the directory used to write the results
#
# For example:
# inFolder <- "/path/to/in-folder/"
# outFolder <- "/path/to/out-folder/"
#
# correction is an optional argument that specifies the
# type of multiple hypothesis correction used:
#
# BY: Benjamini & Yekutieli (default).
# BH: Benjamini & Hochberg
#
# pAdjThresh is an optional argument that specifies the
# adjusted p-value threshold used (0.01 by default).
#
# Examples:
#
# Default usage: BY applied with FDR control at 0.01.
# runReplicateDMR(refVrefFiles,testVrefFiles,inFolder,outFolder)
#
# Alternative usage: BY applied with FDR control at 0.05.
# runReplicateDMR(refVrefFiles,testVrefFiles,inFolder,outFolder,
# pAdjThresh=0.05)
#
# Alternative usage: BH applied with FDR control at 0.05.
# runReplicateDMR(refVrefFiles,testVrefFiles,inFolder,outFolder,
# correction='BH',pAdjThresh=0.05)
syntax (no replicate reference data is available):
source("jsDMR.R")
runNoReplicateDMR(JSDfile,inFolder,outFolder,
correction='XX',pAdjThresh=value)
# JSDfile is the name of a BIGWIG file that contains
# the JSD values of a test/reference comparison.
# For example: if
#
# JSD-lungcancer-1-VS-lungnormal-1.bw
#
# is available, then set
#
# JSDfile <- "JSD-lungcancer-1-VS-lungnormal-1.bw"
#
# inFolder is the directory that contains the JSD file
# outFolder is the directory used to write the result
#
# For example:
# inFolder <- "/path/to/in-folder/"
# outFolder <- "/path/to/out-folder/"
#
# correction is an optional argument that specifies the
# type of multiple hypothesis correction used:
#
# BY: Benjamini & Yekutieli (default).
# BH: Benjamini & Hochberg
#
# pAdjThresh is an optional argument that specifies the
# adjusted p-value threshold used (0.01 by default).
#
# Examples:
#
# Default usage: BY applied with FDR control at 0.01.
# runNoReplicateDMR(JSDfile,inFolder,outFolder)
#
# Alternative usage: BY applied with FDR control at 0.05.
# runNoReplicateDMR(JSDfile,inFolder,outFolder,pAdjThresh=0.05)
#
# Alternative usage: BH applied with FDR control at 0.05.
# runNoReplicateDMR(JSDfile,inFolder,outFolder,
# correction='BH',pAdjThresh=0.05)
requirements:
The following R libraries must be installed:
- rtracklayer
- logitnorm
- mixtools
- annotatr
- Homo.sapiens
- ggplot2
REFERENCES
----------
[1] Jenkinson, G., Abante, J., Feinberg, A.P., and
Goutsias, J. (2018) An information-theoretic approach
to the modeling and analysis of whole-genome bisulfite
sequencing data, BMC Bioinformatics, 19:87,
https://doi.org/10.1186/s12859-018-2086-5.
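
To illustrate the optional correction and pAdjThresh arguments described in the excerpt above, the sketch below applies both corrections to made-up p-values using base R's `p.adjust`. It assumes only that BY and BH refer to the standard Benjamini & Yekutieli and Benjamini & Hochberg procedures; how jsDMR.R applies them internally, and whether it compares with `<` or `<=`, is not shown on this page.

```r
# Made-up p-values for illustration only (not informME output).
pValues <- c(0.0001, 0.001, 0.008, 0.02, 0.04, 0.3)

# Adjust with the two corrections named in the README.
pAdjBY <- p.adjust(pValues, method = "BY")
pAdjBH <- p.adjust(pValues, method = "BH")

# Default adjusted p-value threshold from the README.
pAdjThresh <- 0.01

# Number of hypotheses that would pass under each correction
# (using <= here is an assumption).
nBY <- sum(pAdjBY <= pAdjThresh)
nBH <- sum(pAdjBH <= pAdjThresh)

print(data.frame(pValues, pAdjBY, pAdjBH))
```

BY-adjusted values are never smaller than BH-adjusted values, so at the same threshold BY passes at most as many hypotheses as BH.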