140716_M01132_0090_000000000 A9WAH Project_Diag TruSight01 2014 07 03 - yvancouver/Workflow GitHub Wiki
Text in italic is the original text.
Text in bold gives the corrections needed to make it work on Yvan's machine.
- smb://192.168.1.101:/runScratch/ OK
- smb://192.168.1.31:/data.odin/ OK
- smb://192.168.1.31:/data1.odin/ OK
- "condor_sh" should be located directly under the root directory (/condor_sh) OK
- Python version:
pcus572:~ yvans$ python --version
Python 2.7.7
Packages:
ConfigParser
argparse
os
re
sys
pyvcf
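The package list above can be checked before starting the pipeline. The following is a minimal sketch, not part of the original workflow: the function name `missing_modules` is hypothetical, and it assumes that PyVCF is imported under the name `vcf` and that `ConfigParser` is the Python 2 module name (it is called `configparser` on Python 3).

```python
import importlib

# Import names for the packages listed above. Assumptions: PyVCF is
# imported as "vcf"; "ConfigParser" is the Python 2 name of the module.
REQUIRED_MODULES = ["ConfigParser", "argparse", "os", "re", "sys", "vcf"]


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Print one line per module that could not be imported.
    for name in missing_modules(REQUIRED_MODULES):
        print("missing module: " + name)
```

An empty result means every listed module could be imported by the interpreter that ran the check.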
*3. SNP fingerprinting results should be under /Volumn/data1.odin/001_exome/diagnosticSamples/Taqman_results (path right?)*
**3. SNP fingerprinting results should be under /Volumes/data1.odin/001_exome/diagnosticSamples/TaqManSNP-ID/**
The fastq files are NOT in the default place; they are located in:
/Volumes/runScratch/completed/1407/140716_M01132_0090_000000000-A9WAH/Data/Intensities/BaseCalls
- Copy fastq files to "condor_sh/projects"
cd /Volumes/runScratch/completed/1407/140716_M01132_0090_000000000-A9WAH/Data/Intensities/BaseCalls/
mkdir /condor_sh/projects/Project_Diag-TruSight01-2014-07-03/
rsync -avn 140716_M01132.Project_Diag-TruSight01-2014-07-03/ /condor_sh/projects/Project_Diag-TruSight01-2014-07-03/
rsync -avz 140716_M01132.Project_Diag-TruSight01-2014-07-03/ /condor_sh/projects/Project_Diag-TruSight01-2014-07-03/
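The two rsync calls above follow a common pattern: `-n` makes the first call a dry run that only lists what would be copied, and the second call performs the transfer with compression (`-z`). A small sketch of the same two-step copy is shown here; the function names and the `runner` parameter are hypothetical and only serve to make the sequence explicit and testable.

```python
import subprocess


def _with_slash(path):
    """Ensure a trailing slash so rsync copies the directory contents."""
    return path if path.endswith("/") else path + "/"


def rsync_commands(source, destination):
    """Build the dry-run and the real rsync command, in that order."""
    src = _with_slash(source)
    dst = _with_slash(destination)
    dry_run = ["rsync", "-avn", src, dst]   # -n: list changes, copy nothing
    transfer = ["rsync", "-avz", src, dst]  # -z: compress during transfer
    return dry_run, transfer


def copy_project(source, destination, runner=subprocess.check_call):
    """Run the dry run first, then the real transfer.

    `runner` is called with each command list; the default raises an
    error if rsync exits with a non-zero status.
    """
    for command in rsync_commands(source, destination):
        runner(command)
```

In practice one would inspect the dry-run output before letting the second command run; the sketch runs both back to back only to show the order.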
- Change ownership and permissions of the project folder
- Log into tor and move to the projects directory.
$ cd /condor_sh/projects/
$ su
root@tor:/condor_sh/projects # chmod -R 777 Project_Diag-TruSight01-2014-07-03/
root@tor:/condor_sh/projects # chown -R SBSUser:SBSUser Project_Diag-TruSight01-2014-07-03/
- Log out from tor
- Run the perl script (generateCMDmappingCondor.pl) to prepare the condor job
perl /Volumes/data.odin/diagnosticBundle/script/amg/variantcalling/pipeline/pipeline_current/generateCMDmappingCondor.pl
Have a look at the script output, generateCMDmappingCondorProject_Diag-TruSight01-2014-07-03.md.
This generateCMDmapping script failed…
- Start the condor job
* Log on to Tor (192.168.1.32).
* For each sample, run the "commandMapping.bash" located in the data directory. You can submit 4 to 5 alignments per day.
cd /condor_sh/projects/Project_Diag-excap20-2014-02-24
bash SampleXXXXX/data/commandMapping.bash
- Check results
* Mapping takes around 10 hours on the current setup; once the mapping has finished, the mapping directory should be around 80-100 GB. One can also look into the /mapping_*/log directory, which should contain three log files: there should be "Done" in *sam.out, and no "Exception" in the collectAlignmentSummary.out and *bam.out files.
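The log checks described above ("Done" in *sam.out, no "Exception" in collectAlignmentSummary.out and *bam.out) can be scripted. This is a minimal sketch under stated assumptions: the function name `check_mapping_logs` is hypothetical, and the caller is assumed to pass the path of one mapping log directory.

```python
import glob
import io
import os


def _contains(path, text):
    """Return True if the file at `path` contains `text`."""
    with io.open(path, "r", errors="replace") as handle:
        return text in handle.read()


def check_mapping_logs(log_dir):
    """Return a list of problems found in one mapping log directory."""
    problems = []

    # "Done" must appear in every *sam.out file.
    sam_logs = sorted(glob.glob(os.path.join(log_dir, "*sam.out")))
    if not sam_logs:
        problems.append("no *sam.out file found")
    for path in sam_logs:
        if not _contains(path, "Done"):
            problems.append('"Done" missing in ' + os.path.basename(path))

    # "Exception" must not appear in the other two kinds of log file.
    others = sorted(glob.glob(os.path.join(log_dir, "*bam.out")))
    others += glob.glob(os.path.join(log_dir, "collectAlignmentSummary.out"))
    for path in others:
        if _contains(path, "Exception"):
            problems.append('"Exception" found in ' + os.path.basename(path))
    return problems
```

An empty list means the three checks passed; the directory size check (80-100 GB) is not covered by this sketch.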
- On the working machine, create a directory named after the project.
mkdir Project_Diag-excap20-2014-02-24
- Move into the newly created directory and start the pipeline by calling the runGroupPipeline.pl with the group.conf as an argument.
cd Project_Diag-excap20-2014-02-24
perl /Volumes/data.odin/diagnosticBundle/script/amg/variantcalling/pipeline/pipeline_current/runGroupPipeline.pl /condor_sh/projects/Project_Diag-excap20-2014-02-24/group.conf > output.txt
- The fastest way of checking whether everything went well is to check the size and/or the content of the files whose names start with "err". Most of them should be empty; the others should not contain "Exception".
- Here is the list of files that should be empty if the process went well:
errIndelRealigner
errRealignerTargetCreator
errAnalyzeCovariates
errBaseRecalibratorPost
errBaseRecalibratorPre
errPrintReads
errSelectVariantsIndel
errSelectVariantsSNP
errSnpFingerPrintingTestUnifiedGenotyper
errUnifiedGenotyper
errCombineVariants
errVariantFiltrationIndel
errVariantRecalibratorSNP
- And the files that should not contain "Exception":
errMarkDup
errCollectAlignmentSummaryMetrics
errCollectInsertSizeMetrics
errCalculateHsMetrics
errConvert2annovarAll
errConvert2annovarInCand
errTableAnnovarAll
errTableAnnovarInCand
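The two lists above lend themselves to an automated check. The sketch below assumes that the listed names are the exact file names and that all err* files sit in one directory passed by the caller; neither is stated in the text, and the function name `check_err_files` is hypothetical. Files that do not exist yet are skipped, since the pipeline may not have reached that step.

```python
import io
import os

# Files that should be empty if the process went well.
MUST_BE_EMPTY = [
    "errIndelRealigner", "errRealignerTargetCreator", "errAnalyzeCovariates",
    "errBaseRecalibratorPost", "errBaseRecalibratorPre", "errPrintReads",
    "errSelectVariantsIndel", "errSelectVariantsSNP",
    "errSnpFingerPrintingTestUnifiedGenotyper", "errUnifiedGenotyper",
    "errCombineVariants", "errVariantFiltrationIndel",
    "errVariantRecalibratorSNP",
]

# Files that may have content but should not contain "Exception".
MUST_LACK_EXCEPTION = [
    "errMarkDup", "errCollectAlignmentSummaryMetrics",
    "errCollectInsertSizeMetrics", "errCalculateHsMetrics",
    "errConvert2annovarAll", "errConvert2annovarInCand",
    "errTableAnnovarAll", "errTableAnnovarInCand",
]


def check_err_files(directory):
    """Return a list of problems found among the err* files."""
    problems = []
    for name in MUST_BE_EMPTY:
        path = os.path.join(directory, name)
        if os.path.exists(path) and os.path.getsize(path) > 0:
            problems.append(name + " is not empty")
    for name in MUST_LACK_EXCEPTION:
        path = os.path.join(directory, name)
        if os.path.exists(path):
            with io.open(path, "r", errors="replace") as handle:
                if "Exception" in handle.read():
                    problems.append(name + ' contains "Exception"')
    return problems
```

An empty list means no listed file violated its rule; any entry names the file to open and read by hand.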
- If the pipeline crashes, one can restart it and it should pick up where it stopped.
- For the fingerprinting test, the scripts are incorporated in the variant calling pipeline. If any of the tests fails, the result file will have the _NEED_REVIEW extension and one should open the file to check the problem. The file is located in:
Project_Diag-excap20-2014-02-24/SampleXXX/070_QC/
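To avoid opening every sample folder by hand, the flagged files can be listed in one pass. This is a sketch only: the function name `find_need_review` is hypothetical, and it assumes that sample folder names start with "Sample" and that "_NEED_REVIEW" appears somewhere in the result file name.

```python
import glob
import os


def find_need_review(project_dir):
    """Return sorted paths of QC result files flagged _NEED_REVIEW."""
    pattern = os.path.join(project_dir, "Sample*", "070_QC", "*_NEED_REVIEW*")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Run from inside the project directory.
    for path in find_need_review("."):
        print(path)
```

An empty result means no sample was flagged under these naming assumptions; it does not replace reading the QC files when a test is known to have failed.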
- Once the variant calling pipeline has finished and all checks have been done, the last step is to run the postProcess.pl script from the project folder. The script will check, rename and delete redundant files in the different Sample folders. Once the script has finished, one can safely transfer the Sample folders to their respective location on loki:/data/diag/samples
perl /Volumes/data.odin/diagnosticBundle/script/amg/variantcalling/pipeline/pipeline_current/postProcess.pl