Testing Your Installation - GarrettJenkinson/informME GitHub Wiki
informME is distributed with a small but comprehensive "toy model" intended for testing and debugging your local installation and for familiarizing yourself with the tool. The reference genome consists of five chromosomes of length 10 kb each. The WGBS reads were simulated so that the resulting mean methylation level is known in advance and the cancer sample exhibits genome-wide hypomethylation. Processing the entire toy example takes about 15 minutes. Once informME has been installed through install.sh, follow the steps below to test the toy example:
- If you are on a server that uses modules to load dependencies, load MATLAB and SAMtools:
module load matlab
module load samtools
- Reference Genome. Run the following:
cd informME/src/bash_src/parseBamFile/fastaToCpg/main
./main.sh
Then run ls -lthr to check that a file CpGlocationChrX.mat with a size of approximately 3.2K has been created for each of the five chromosomes:
total 76K
-rw-rw-r-- 1 usr usr 49K Apr 20 17:23 toy_genome.fa
-rwxrwxr-x 1 usr usr 1.1K Jun 5 15:43 main.sh
-rw-rw-r-- 1 usr usr 3.2K Jun 5 15:50 CpGlocationChr1.mat
-rw-rw-r-- 1 usr usr 3.2K Jun 5 15:50 CpGlocationChr2.mat
-rw-rw-r-- 1 usr usr 3.1K Jun 5 15:50 CpGlocationChr3.mat
-rw-rw-r-- 1 usr usr 3.1K Jun 5 15:50 CpGlocationChr4.mat
-rw-rw-r-- 1 usr usr 3.1K Jun 5 15:50 CpGlocationChr5.mat
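Rather than reading the listing by eye, the same check can be scripted. The following is a minimal sketch, not part of informME: the function name check_cpg_files is hypothetical, and it only tests that each expected file exists and is non-empty, without comparing sizes.

```shell
# Hypothetical helper (not shipped with informME): verify that
# CpGlocationChr1.mat ... CpGlocationChr5.mat exist and are non-empty
# in the given directory (default: current directory).
check_cpg_files() {
    dir="${1:-.}"
    missing=0
    for i in 1 2 3 4 5; do
        f="$dir/CpGlocationChr$i.mat"
        if [ -s "$f" ]; then
            echo "ok: $f"
        else
            echo "MISSING or empty: $f"
            missing=$((missing + 1))
        fi
    done
    # Non-zero exit status signals that at least one file is absent.
    [ "$missing" -eq 0 ]
}

# Example call, run from the fastaToCpg/main directory:
# check_cpg_files . && echo "reference genome step looks fine"
```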
- Generate input matrices by running the following:
cd informME/src/bash_src/parseBamFile/getMatrices/main
./main.sh
Then run ls -lthrR out/ to check that files toy_normal_pe_matrices.mat and toy_cancer_pe_matrices.mat with a size of approximately 70K have been created for each of the five chromosomes:
out/:
total 20K
drwxrwxr-x 2 usr usr 4.0K Jun 5 15:53 chr5
drwxrwxr-x 2 usr usr 4.0K Jun 5 15:53 chr4
drwxrwxr-x 2 usr usr 4.0K Jun 5 15:53 chr3
drwxrwxr-x 2 usr usr 4.0K Jun 5 15:52 chr2
drwxrwxr-x 2 usr usr 4.0K Jun 5 15:52 chr1
out/chr5:
total 144K
-rw-rw-r-- 1 usr usr 69K Jun 5 15:53 toy_cancer_pe_matrices.mat
-rw-rw-r-- 1 usr usr 69K Jun 5 15:52 toy_normal_pe_matrices.mat
out/chr4:
total 136K
-rw-rw-r-- 1 usr usr 67K Jun 5 15:53 toy_cancer_pe_matrices.mat
-rw-rw-r-- 1 usr usr 67K Jun 5 15:52 toy_normal_pe_matrices.mat
out/chr3:
total 144K
-rw-rw-r-- 1 usr usr 70K Jun 5 15:53 toy_cancer_pe_matrices.mat
-rw-rw-r-- 1 usr usr 70K Jun 5 15:52 toy_normal_pe_matrices.mat
out/chr2:
total 144K
-rw-rw-r-- 1 usr usr 70K Jun 5 15:52 toy_cancer_pe_matrices.mat
-rw-rw-r-- 1 usr usr 69K Jun 5 15:52 toy_normal_pe_matrices.mat
out/chr1:
total 144K
-rw-rw-r-- 1 usr usr 72K Jun 5 15:52 toy_cancer_pe_matrices.mat
-rw-rw-r-- 1 usr usr 72K Jun 5 15:51 toy_normal_pe_matrices.mat
- Run informME using the following:
cd informME/src/bash_src/informME_run/main
./main.sh
Then run ls -lthrR out/ to check that analysis files for the normal, cancer, and pooled models, each with a size of approximately 68K, have been created for each of the five chromosomes:
out/:
total 20K
drwxrwxr-x 2 usr usr 4.0K Jun 5 16:01 chr5
drwxrwxr-x 2 usr usr 4.0K Jun 5 16:01 chr4
drwxrwxr-x 2 usr usr 4.0K Jun 5 16:00 chr3
drwxrwxr-x 2 usr usr 4.0K Jun 5 16:00 chr2
drwxrwxr-x 2 usr usr 4.0K Jun 5 15:59 chr1
out/chr5:
total 204K
-rw-rw-r-- 1 usr usr 68K Jun 5 16:01 toy_pooled_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:59 toy_cancer_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:56 toy_normal_analysis.mat
out/chr4:
total 204K
-rw-rw-r-- 1 usr usr 68K Jun 5 16:01 toy_pooled_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:58 toy_cancer_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:56 toy_normal_analysis.mat
out/chr3:
total 204K
-rw-rw-r-- 1 usr usr 68K Jun 5 16:00 toy_pooled_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:57 toy_cancer_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:55 toy_normal_analysis.mat
out/chr2:
total 204K
-rw-rw-r-- 1 usr usr 68K Jun 5 16:00 toy_pooled_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:57 toy_cancer_analysis.mat
-rw-rw-r-- 1 usr usr 68K Jun 5 15:55 toy_normal_analysis.mat
out/chr1:
total 216K
-rw-rw-r-- 1 usr usr 69K Jun 5 15:59 toy_pooled_analysis.mat
-rw-rw-r-- 1 usr usr 69K Jun 5 15:56 toy_cancer_analysis.mat
-rw-rw-r-- 1 usr usr 69K Jun 5 15:54 toy_normal_analysis.mat
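The per-chromosome layout used by both the getMatrices and informME_run steps (out/chrN/SAMPLE_SUFFIX.mat) lends itself to an automated check. The sketch below is an assumption-laden convenience, not an informME script: the function name check_chr_outputs and its argument order are hypothetical, and it checks only existence and non-emptiness, not file size.

```shell
# Hypothetical helper (not shipped with informME): for chr1..chr5 under an
# output directory, verify that <sample><suffix> exists and is non-empty
# for every listed sample.
# Usage: check_chr_outputs OUT_DIR SUFFIX SAMPLE...
check_chr_outputs() {
    out_dir="$1"
    suffix="$2"
    shift 2
    missing=0
    for i in 1 2 3 4 5; do
        for sample in "$@"; do
            f="$out_dir/chr$i/${sample}${suffix}"
            if [ ! -s "$f" ]; then
                echo "MISSING or empty: $f"
                missing=$((missing + 1))
            fi
        done
    done
    echo "$missing file(s) missing"
    # Non-zero exit status signals that at least one file is absent.
    [ "$missing" -eq 0 ]
}

# Example calls, run from the directory that contains informME/:
# check_chr_outputs informME/src/bash_src/parseBamFile/getMatrices/main/out _pe_matrices.mat toy_normal toy_cancer
# check_chr_outputs informME/src/bash_src/informME_run/main/out _analysis.mat toy_normal toy_cancer toy_pooled
```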
- Obtain bedGraph output for the single-sample analysis and check that the mean methylation level is approximately 0.8 for normal and 0.5 for cancer by looking at files MML-toy_normal.bed and MML-toy_cancer.bed, respectively:
cd informME/src/bash_src/analysis/singleAnalysis/singleMethAnalysisToBed/main
./main.sh
awk 'NR>1{total+=$4}END{print total/(NR-1)}' out/MML-toy_normal.bed
awk 'NR>1{total+=$4}END{print total/(NR-1)}' out/MML-toy_cancer.bed
Also run ls -lthr out/ to see the following files with similar file sizes:
total 236K
-rw-rw-r-- 1 usr usr 113 Jun 5 16:02 VAR-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 TURN-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 RDE-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 NME-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 MSI-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 MML-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 METH-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 ESI-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 ENTR-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 CAP-toy_normal.bed
-rw-rw-r-- 1 usr usr 7.7K Jun 5 16:02 VAR-toy_cancer.bed
-rw-rw-r-- 1 usr usr 8.1K Jun 5 16:02 TURN-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 RDE-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 NME-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 MSI-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 MML-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 METH-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 ESI-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 ENTR-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 CAP-toy_cancer.bed
-rw-rw-r-- 1 usr usr 161 Jun 5 16:02 VAR-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 TURN-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 RDE-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 NME-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 MSI-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 MML-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 METH-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 ESI-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 ENTR-toy_pooled.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:02 CAP-toy_pooled.bed
- Obtain bedGraph output for the differential analysis and check that the mean JSD is approximately 0.68 by looking at file JSD-toy_normal-VS-toy_cancer.bed:
cd informME/src/bash_src/analysis/diffAnalysis/diffMethAnalysisToBed/main
./main.sh
awk 'NR>1{total+=$4}END{print total/(NR-1)}' out/JSD-toy_normal-VS-toy_cancer.bed
Also run ls -lthr out/ to see the following files with similar file sizes:
total 80K
-rw-rw-r-- 1 usr usr 8.0K Jun 5 16:05 JSD-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:05 dRDE-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 8.3K Jun 5 16:05 dNME-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 8.0K Jun 5 16:05 DMU-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:05 dMSI-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 8.0K Jun 5 16:05 dMML-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 8.3K Jun 5 16:05 DEU-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 7.9K Jun 5 16:05 dESI-toy_normal-VS-toy_cancer.bed
-rw-rw-r-- 1 usr usr 8.0K Jun 5 16:05 dCAP-toy_normal-VS-toy_cancer.bed
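The mean checks in the last two steps (about 0.8 and 0.5 for MML, about 0.68 for JSD) can be turned into pass/fail tests. The following sketch is not part of informME and rests on two assumptions: that the first line of each .bed file is a header (as the awk commands above imply by skipping it), and that a tolerance of 0.05 is an acceptable reading of "approximately". The names bed_mean and approx_equal are hypothetical.

```shell
# Hypothetical helper: mean of column 4 over all data lines of a bed file,
# assuming line 1 is a header and is therefore excluded from the average.
bed_mean() {
    awk 'NR > 1 { total += $4; n++ } END { if (n > 0) printf "%.4f\n", total / n }' "$1"
}

# Hypothetical helper: succeed (exit status 0) when |value - target| <= tol.
# Usage: approx_equal VALUE TARGET TOL
approx_equal() {
    awk -v v="$1" -v t="$2" -v tol="$3" \
        'BEGIN { d = v - t; if (d < 0) d = -d; exit (d <= tol ? 0 : 1) }'
}

# Example calls (the 0.05 tolerance is an assumption, not an informME value):
# m=$(bed_mean out/MML-toy_normal.bed); approx_equal "$m" 0.8 0.05 && echo "normal MML ok: $m"
# m=$(bed_mean out/MML-toy_cancer.bed); approx_equal "$m" 0.5 0.05 && echo "cancer MML ok: $m"
# m=$(bed_mean out/JSD-toy_normal-VS-toy_cancer.bed); approx_equal "$m" 0.68 0.05 && echo "JSD ok: $m"
```

Each example must be run from the main directory of the corresponding step, since out/ is relative to it.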
This concludes the test of the toy model included as part of the repository.