140716_M01132_0090_000000000 A9WAH Project_Diag TruSight01 2014 07 03 - yvancouver/Workflow GitHub Wiki

Get a copy of the pipeline HowTo

Text in italic is the original text.
Text in bold is the corrections needed to make it work on Yvan's machine

Requirements

1. Mount the following drives:

  • smb://192.168.1.101:/runScratch/ OK
  • smb://192.168.1.31:/data.odin/ OK
  • smb://192.168.1.31:/data1.odin/ OK
  • "condor_sh" should be under root OK
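The mount check above can be scripted. The following is a minimal sketch, not part of the pipeline: `check_dirs` is a hypothetical helper name, and the `/Volumes/...` mount points in the example call are assumptions based on the paths used later on this page.

```shell
#!/usr/bin/env bash
# Report which of the given directories exist.
# Prints "OK: <dir>" or "MISSING: <dir>" for each argument and
# returns non-zero if at least one directory is absent.
check_dirs() {
    local missing=0 dir
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "OK: $dir"
        else
            echo "MISSING: $dir"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (mount points are assumptions, adjust to your machine):
# check_dirs /Volumes/runScratch /Volumes/data.odin /Volumes/data1.odin /condor_sh
```

A non-zero return value makes it easy to stop a wrapper script before any copy or mapping step is attempted.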

2. R and Python libraries

  • pcus572:~ yvans$ python --version
    Python 2.7.7

  • Packages:
    ConfigParser
    argparse
    os
    re
    sys
    pyvcf
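One way to confirm that the packages listed above are importable is to ask the interpreter directly. This is a sketch under assumptions: `check_python_modules` is a hypothetical helper, and the interpreter name `python` in the example call must point at the Python 2.7 installation shown above. Note that the PyVCF package is imported under the module name `vcf`.

```shell
#!/usr/bin/env bash
# Try to import each module with the given interpreter and report failures.
# Usage: check_python_modules <python-executable> <module>...
# Returns non-zero if at least one import fails.
check_python_modules() {
    local python="$1" failed=0 module
    shift
    for module in "$@"; do
        if "$python" -c "import $module" >/dev/null 2>&1; then
            echo "OK: $module"
        else
            echo "MISSING: $module"
            failed=1
        fi
    done
    return "$failed"
}

# Example call (PyVCF is imported as "vcf"):
# check_python_modules python ConfigParser argparse os re sys vcf
```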

3, SNP fingerprinting results should be under /Volumn/data1.odin/001_exome/diagnosticSamples/Taqman_results (path right?)

** 3. SNP fingerprinting results should be under /Volumes/data1.odin/001_exome/diagnosticSamples/TaqManSNP-ID/ **

Running Variant Calling Pipeline

1. Mapping the reads

The fastq files are NOT in the default place; they are located in:

/Volumes/runScratch/completed/1407/140716_M01132_0090_000000000-A9WAH/Data/Intensities/BaseCalls  
  • Copy fastq files to "condor_sh/projects"
cd /Volumes/runScratch/completed/1407/140716_M01132_0090_000000000-A9WAH/Data/Intensities/BaseCalls/
mkdir /condor_sh/projects/Project_Diag-TruSight01-2014-07-03/
rsync -avn 140716_M01132.Project_Diag-TruSight01-2014-07-03/ /condor_sh/projects/Project_Diag-TruSight01-2014-07-03/
rsync -avz 140716_M01132.Project_Diag-TruSight01-2014-07-03/ /condor_sh/projects/Project_Diag-TruSight01-2014-07-03/
  • Change ownership and permissions of the project folder

  • log into tor, move to the project.

$cd /condor_sh/projects/  
su  
root@tor:/condor_sh/projects #chmod -R 777 Project_Diag-TruSight01-2014-07-03/  
root@tor:/condor_sh/projects #chown -R SBSUser:SBSUser Project_Diag-TruSight01-2014-07-03/
  • logout from tor
  • Run the perl script (generateCMDmappingCondor.pl) to prepare the condor job
perl /Volumes/data.odin/diagnosticBundle/script/amg/variantcalling/pipeline/pipeline_current/generateCMDmappingCondor.pl  

Have a look at the script output, generateCMDmappingCondorProject_Diag-TruSight01-2014-07-03.md.

This generateCMDmapping script failed…

  • Start the condor job:
    • Log on to tor (192.168.1.32).
    • For each sample, run the "commandMapping.bash" located in the data directory. You can submit 4 to 5 alignments per day.
cd /condor_sh/projects/Project_Diag-excap20-2014-02-24
bash SampleXXXXX/data/commandMapping.bash
  • Check the results:
    • Mapping takes around 10 hours on the current setup. Once the mapping has finished, the mapping directory should be around 80-100 GB.
    • The /mapping_*/log directory should contain three log files: *sam.out should contain "Done", and collectAlignmentSummary.out and *bam.out should contain no "Exception".
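The log checks described above can be automated along these lines. This is only a sketch: `check_mapping_logs` is a hypothetical helper, and it assumes the three log files sit directly inside the log directory passed as argument.

```shell
#!/usr/bin/env bash
# Check the log directory of one mapping run.
# Usage: check_mapping_logs <log-directory>
# Returns non-zero if a log is missing, unfinished, or reports an exception.
check_mapping_logs() {
    local logdir="$1" status=0 f
    # Every *sam.out file must contain "Done".
    for f in "$logdir"/*sam.out; do
        if [ ! -f "$f" ]; then
            # An unmatched glob stays literal, so this also catches "no sam log".
            echo "MISSING: $f"
            status=1
        elif ! grep -q "Done" "$f"; then
            echo "NOT_DONE: $f"
            status=1
        fi
    done
    # Neither the summary log nor the bam logs may mention "Exception".
    for f in "$logdir"/collectAlignmentSummary.out "$logdir"/*bam.out; do
        if [ ! -f "$f" ]; then
            echo "MISSING: $f"
            status=1
        elif grep -q "Exception" "$f"; then
            echo "EXCEPTION: $f"
            status=1
        fi
    done
    return "$status"
}

# Example call (the location of mapping_* under each Sample folder is an assumption):
# for d in Sample*/mapping_*/log; do check_mapping_logs "$d"; done
```

The function prints nothing when all three logs look healthy, so any output at all is a reason to open the corresponding file.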

2. Starting the variant calling

  • On the working machine, create a directory named after the project.
mkdir Project_Diag-excap20-2014-02-24
  • Move into the newly created directory and start the pipeline by calling runGroupPipeline.pl with group.conf as its argument.
cd Project_Diag-excap20-2014-02-24
perl /Volumes/data.odin/diagnosticBundle/script/amg/variantcalling/pipeline/pipeline_current/runGroupPipeline.pl /condor_sh/projects/Project_Diag-excap20-2014-02-24/group.conf > output.txt

3. Checks

  • The fastest way to check that everything went well is to look at the size and/or the content of the files starting with err*. Most of them should be empty, and none should contain "Exception".
  • Here is the list of files that should be empty if the process went well:

errIndelRealigner

errRealignerTargetCreator

errAnalyzeCovariates

errBaseRecalibratorPost

errBaseRecalibratorPre

errPrintReads

errSelectVariantsIndel

errSelectVariantsSNP

errSnpFingerPrintingTestUnifiedGenotyper

errUnifiedGenotyper

errCombineVariants

errVariantFiltrationIndel

errVariantRecalibratorSNP

  • And the files which should not contain "Exception":

errMarkDup

errCollectAlignmentSummaryMetrics

errCollectInsertSizeMetrics

errCalculateHsMetrics

errConvert2annovarAll

errConvert2annovarInCand

errTableAnnovarAll

errTableAnnovarInCand
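The two lists above translate directly into a small check. The sketch below is not part of the pipeline: `check_err_files` is a hypothetical helper, the directory holding the err* files is passed as an argument because its location is not stated here, and files that do not exist yet are silently skipped.

```shell
#!/usr/bin/env bash
# Files that should be empty after a clean run (names copied from the list above).
MUST_BE_EMPTY="errIndelRealigner errRealignerTargetCreator errAnalyzeCovariates
errBaseRecalibratorPost errBaseRecalibratorPre errPrintReads
errSelectVariantsIndel errSelectVariantsSNP
errSnpFingerPrintingTestUnifiedGenotyper errUnifiedGenotyper
errCombineVariants errVariantFiltrationIndel errVariantRecalibratorSNP"

# Files that may have content but must not mention "Exception".
NO_EXCEPTION="errMarkDup errCollectAlignmentSummaryMetrics
errCollectInsertSizeMetrics errCalculateHsMetrics errConvert2annovarAll
errConvert2annovarInCand errTableAnnovarAll errTableAnnovarInCand"

# Usage: check_err_files <directory-containing-err-files>
# Returns non-zero if any check fails; absent files are skipped.
check_err_files() {
    local dir="$1" status=0 name
    for name in $MUST_BE_EMPTY; do
        # -s is true when the file exists and is not empty.
        if [ -s "$dir/$name" ]; then
            echo "NOT_EMPTY: $dir/$name"
            status=1
        fi
    done
    for name in $NO_EXCEPTION; do
        if [ -f "$dir/$name" ] && grep -q "Exception" "$dir/$name"; then
            echo "EXCEPTION: $dir/$name"
            status=1
        fi
    done
    return "$status"
}

# Example call from the directory where the pipeline was started:
# check_err_files .
```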

  • If the pipeline crashes, one can restart it and it should pick up where it stopped.
  • The fingerprinting test scripts are incorporated in the variant calling pipeline. If any of the tests fail, the result file will have the _NEED_REVIEW extension and one should open the file to check the problem. The file is located in: Project_Diag-excap20-2014-02-24/SampleXXX/070_QC/
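To spot failed fingerprinting tests across all samples at once, something like the following could be used. It is a sketch under assumptions: `find_need_review` is a hypothetical helper, and it assumes that flagged files carry `_NEED_REVIEW` somewhere in their name and live inside a `070_QC` folder, as described above.

```shell
#!/usr/bin/env bash
# List fingerprinting result files flagged for review in a project folder.
# Usage: find_need_review <project-directory>
# Prints the flagged files and returns non-zero when at least one is found.
find_need_review() {
    local project="$1" hits
    hits=$(find "$project" -type f -path '*/070_QC/*' -name '*_NEED_REVIEW*')
    if [ -n "$hits" ]; then
        echo "$hits"
        return 1
    fi
    return 0
}

# Example call from the directory that contains the project folder:
# find_need_review Project_Diag-excap20-2014-02-24
```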

4. Post-processing

  • Once the variant calling pipeline has finished and all checks have been done, the last step is to run the postProcess.pl script from the project folder. The script will check, rename and delete redundant files in the different Sample folders. Once the script finishes, one can safely transfer the Sample folders to their respective location on loki:/data/diag/samples
perl /Volumes/data.odin/diagnosticBundle/script/amg/variantcalling/pipeline/pipeline_current/postProcess.pl

5. Fill in the "Bioinformatic" section of the "HTS database" on the hospital computer under /R/lab/HTS?
