Postprocessing: DMR Detection - GarrettJenkinson/informME GitHub Wiki

A provided utility performs differentially methylated region (DMR) detection using the Jensen-Shannon distance (JSD), based on the method described in [1]. This utility must be run within an R session.

Usage in R (when replicate reference data is available):

    setwd("/path/to/informME/src/R_src/")
    source("jsDMR.R") 
    runReplicateDMR(refVrefFiles,testVrefFiles,inFolder,outFolder)

where

  • refVrefFiles is a vector of BIGWIG file names that contain the JSD values of all pairwise reference comparisons

  • testVrefFiles is a vector of BIGWIG file names that contain the JSD values of test/reference comparisons

  • inFolder is the directory that contains all JSD files

  • outFolder is the directory used to write the results
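When inFolder holds many JSD files, the two file vectors can be assembled programmatically rather than typed by hand. The following is a minimal sketch, not part of informME: `splitJSDFiles` and its `refPrefix` argument are hypothetical, and it assumes the `JSD-<sampleA>-VS-<sampleB>.bw` naming pattern used in the README examples, with every reference sample name starting with the same prefix.

```r
# Hypothetical helper (not part of informME): split the JSD BIGWIG files in
# inFolder into reference/reference and test/reference comparisons.
# Assumes names of the form JSD-<sampleA>-VS-<sampleB>.bw and that every
# reference sample name starts with refPrefix.
splitJSDFiles <- function(inFolder, refPrefix) {
  files <- list.files(inFolder, pattern = "^JSD-.+-VS-.+\\.bw$")
  # Strip the "JSD-" prefix and ".bw" suffix, then split on "-VS-".
  core <- sub("\\.bw$", "", sub("^JSD-", "", files))
  sides <- strsplit(core, "-VS-", fixed = TRUE)
  left <- vapply(sides, function(s) s[1], character(1))
  right <- vapply(sides, function(s) s[2], character(1))
  leftIsRef <- startsWith(left, refPrefix)
  rightIsRef <- startsWith(right, refPrefix)
  list(refVrefFiles = files[leftIsRef & rightIsRef],
       testVrefFiles = files[!leftIsRef & rightIsRef])
}

# Possible usage once jsDMR.R has been sourced:
# files <- splitJSDFiles(inFolder, refPrefix = "lungnormal")
# runReplicateDMR(files$refVrefFiles, files$testVrefFiles, inFolder, outFolder)
```

Files whose right-hand side is not a reference sample are left out of both vectors, so it is worth inspecting the returned list before running the detection.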

Usage in R (when no replicate reference data is available):

    setwd("/path/to/informME/src/R_src/")
    source("jsDMR.R") 
    runNoReplicateDMR(JSDfile,inFolder,outFolder)

where

  • JSDfile is the name of a BIGWIG file that contains the JSD values of a test/reference comparison

  • inFolder is the directory that contains the JSD file

  • outFolder is the directory used to write the result
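Because both functions take file names relative to inFolder, a quick pre-flight check can catch path mistakes before a long run. The sketch below is an illustration only: `checkJSDInputs` is a hypothetical helper, not something jsDMR.R provides, and whether the script itself creates outFolder is not stated here.

```r
# Hypothetical pre-flight check (not part of informME): confirm that every
# JSD file exists in inFolder and make sure outFolder is available.
checkJSDInputs <- function(jsdFiles, inFolder, outFolder) {
  paths <- file.path(inFolder, jsdFiles)
  missing <- jsdFiles[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("JSD file(s) not found in inFolder: ",
         paste(missing, collapse = ", "))
  }
  # Create the output directory if it does not exist yet.
  if (!dir.exists(outFolder)) {
    dir.create(outFolder, recursive = TRUE)
  }
  invisible(TRUE)
}

# Possible usage:
# checkJSDInputs(JSDfile, inFolder, outFolder)
# runNoReplicateDMR(JSDfile, inFolder, outFolder)
```

The README examples write both folders with a trailing slash; since this page does not document how jsDMR.R joins folder and file names, keeping the trailing slash is the safer choice.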

NOTE 1: These utilities require the following R packages to be installed: rtracklayer, logitnorm, mixtools, annotatr, Homo.sapiens, and ggplot2.
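One way to check the requirements before sourcing jsDMR.R is sketched below. The helper `missingPkgs` is hypothetical; the package list combines NOTE 1 with the README requirements list, which also names ggplot2. The installation lines are left commented out and assume BiocManager is the preferred installer, since several of these packages are distributed through Bioconductor.

```r
# Packages named in NOTE 1 and in the README requirements list.
requiredPkgs <- c("rtracklayer", "logitnorm", "mixtools",
                  "annotatr", "Homo.sapiens", "ggplot2")

# Hypothetical helper: return the packages that are not yet installed.
missingPkgs <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

toInstall <- missingPkgs(requiredPkgs)
if (length(toInstall) > 0) {
  message("Missing packages: ", paste(toInstall, collapse = ", "))
  # Uncomment to install (assumption: BiocManager is used for all of them).
  # if (!requireNamespace("BiocManager", quietly = TRUE)) {
  #   install.packages("BiocManager")
  # }
  # BiocManager::install(toInstall)
}
```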

NOTE 2: More information about these utilities can be found in informME/src/R_src/postprocess/README.txt, with a relevant excerpt reproduced below:


jsDMR.R
-------

This is an R script that performs DMR detection using 
the Jensen-Shannon distance (JSD) based on the method 
described in [1]. Once DMR detection is done, the script
generates helpful annotation tables and figures. It should 
be run within an R session.

  syntax (replicate reference data is available): 
 
   source("jsDMR.R") 
   runReplicateDMR(refVrefFiles,testVrefFiles,inFolder,outFolder,
   correction='XX',pAdjThresh=value) 
     
   # refVrefFiles is a vector of BIGWIG file names that contain  
   # the JSD values of all pairwise reference comparisons. 
   # For example: if 
   #
   # JSD-lungnormal-1-VS-lungnormal-2.bw
   # JSD-lungnormal-3-VS-lungnormal-1.bw 
   # JSD-lungnormal-3-VS-lungnormal-2.bw 
   #
   # are available, then set
   #
   # file1 <- "JSD-lungnormal-1-VS-lungnormal-2.bw"
   # file2 <- "JSD-lungnormal-3-VS-lungnormal-1.bw"
   # file3 <- "JSD-lungnormal-3-VS-lungnormal-2.bw"
   # refVrefFiles <- c(file1,file2,file3)
   #
   # testVrefFiles is a vector of BIGWIG file names that contain 
   # the JSD values of test/reference comparisons.
   # For example: if 
   #
   # JSD-lungcancer-1-VS-lungnormal-1.bw
   # JSD-lungcancer-2-VS-lungnormal-2.bw 
   # JSD-lungcancer-3-VS-lungnormal-3.bw 
   #
   # are available, then set
   #
   # file4 <- "JSD-lungcancer-1-VS-lungnormal-1.bw"
   # file5 <- "JSD-lungcancer-2-VS-lungnormal-2.bw"
   # file6 <- "JSD-lungcancer-3-VS-lungnormal-3.bw"
   # testVrefFiles <- c(file4,file5,file6)
   #
   # inFolder is the directory that contains all JSD files
   # outFolder is the directory used to write the results
   #
   # For example:
   # inFolder <- "/path/to/in-folder/"
   # outFolder <- "/path/to/out-folder/"
   #
   # correction is an optional argument that specifies the 
   # type of multiple hypothesis correction used:
   #
   # BY: Benjamini & Yekutieli (default).
   # BH: Benjamini & Hochberg
   # 
   # pAdjThresh is an optional argument that specifies the 
   # adjusted p-value threshold used (0.01 by default). 
   #
   # Examples: 
   #
   # Default usage: BY applied with FDR control at 0.01.
   #   runReplicateDMR(refVrefFiles,testVrefFiles,inFolder,outFolder)
   # 
   # Alternative usage: BY applied with FDR control at 0.05.
   #   runReplicateDMR(refVrefFiles,testVrefFiles,inFolder,outFolder,
   #                   pAdjThresh=0.05)
   #
   # Alternative usage: BH applied with FDR control at 0.05.
   #   runReplicateDMR(refVrefFiles,testVrefFiles,inFolder,outFolder,
   #                   correction='BH',pAdjThresh=0.05)
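
To see how the two correction options and the threshold interact, the sketch below applies base R's `p.adjust` to made-up p-values. It only illustrates the BY and BH procedures themselves; whether jsDMR.R calls `p.adjust` internally, and whether it compares with `<=` or `<`, is not stated in this excerpt.

```r
# Made-up p-values, for illustration only.
pVals <- c(0.0001, 0.0004, 0.004, 0.03, 0.2)

# Base R implements both corrections named above.
pBY <- p.adjust(pVals, method = "BY")  # Benjamini & Yekutieli
pBH <- p.adjust(pVals, method = "BH")  # Benjamini & Hochberg

# BY is the more conservative of the two: each BY-adjusted p-value is at
# least as large as the corresponding BH-adjusted one.
pAdjThresh <- 0.01
nBY <- sum(pBY <= pAdjThresh)  # 2 of the 5 values pass under BY
nBH <- sum(pBH <= pAdjThresh)  # 3 of the 5 values pass under BH
```

With the same threshold, switching from BY to BH can therefore only keep or increase the number of regions called significant.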

  syntax (no replicate reference data is available):
   
   source("jsDMR.R") 
   runNoReplicateDMR(JSDfile,inFolder,outFolder,
                     correction='XX',pAdjThresh=value)

   # JSDfile is the name of a BIGWIG file that contains 
   # the JSD values of a test/reference comparison. 
   # For example: if 
   #
   # JSD-lungcancer-1-VS-lungnormal-1.bw
   #
   # is available, then set
   #
   # JSDfile <- "JSD-lungcancer-1-VS-lungnormal-1.bw"
   #
   # inFolder is the directory that contains the JSD file
   # outFolder is the directory used to write the result
   #
   # For example:
   # inFolder <- "/path/to/in-folder/"
   # outFolder <- "/path/to/out-folder/"
   #
   # correction is an optional argument that specifies the 
   # type of multiple hypothesis correction used:
   #
   # BY: Benjamini & Yekutieli (default).
   # BH: Benjamini & Hochberg
   # 
   # pAdjThresh is an optional argument that specifies the 
   # adjusted p-value threshold used (0.01 by default). 
   #
   # Examples: 
   #
   # Default usage: BY applied with FDR control at 0.01.
   #   runNoReplicateDMR(JSDfile,inFolder,outFolder)
   # 
   # Alternative usage: BY applied with FDR control at 0.05.
   #   runNoReplicateDMR(JSDfile,inFolder,outFolder,pAdjThresh=0.05)
   # 
   # Alternative usage: BH applied with FDR control at 0.05.
   #   runNoReplicateDMR(JSDfile,inFolder,outFolder,
   #                     correction='BH',pAdjThresh=0.05)

  requirements: 

   The following R libraries must be installed: 
     - rtracklayer
     - logitnorm
     - mixtools 
     - annotatr
     - Homo.sapiens
     - ggplot2


REFERENCES
----------

[1] Jenkinson, G., Abante, J., Feinberg, A.P., and 
    Goutsias, J. (2018) An information-theoretic approach 
    to the modeling and analysis of whole-genome bisulfite 
    sequencing data, BMC Bioinformatics, 19:87, 
    https://doi.org/10.1186/s12859-018-2086-5.