Usage - shengqh/glmvc GitHub Wiki

[shengq1@cqs2 glmvc]$ mono glmvc.exe
General Linear Model Based Somatic Mutation Caller (GLMVC) - 1.3.0 - Quanhu SHENG ([email protected]/[email protected]) - CQS/VUMC
Those commands are available :
        call    Call somatic mutation and filter candidates by logistic regression model and annotate result
        annotation      Annotate mutation using varies tools.
        validate        Validate somatic mutation in vcf/bed file.
        extract extract base composition of sites in vcf/bed file.

call

Call somatic mutations, filter candidates with a logistic regression model, and annotate the results.

[shengq1@cqs2 glmvc]$ mono glmvc.exe call
General Linear Model Based Somatic Mutation Caller (GLMVC) 1.3.0

  --help                             Display this help screen.

  --error_rate=DOUBLE                (Default: 0.01) Sequencing error rate for
                                     normal sample test

  --glm_pvalue=DOUBLE                (Default: 0.05) Maximum adjusted pvalue
                                     used for GLM test

  --annovar_set_default              (Default: false) Set current setting as
                                     annovar default setting

  --annovar_buildver=STRING          Annovar database version, for example:
                                     hg19)

  --annovar_db=DIRECTORY             The directory contains annovar databases

  --annovar_protocol=STRING          Annovar protocols, for example:
                                     refGene,snp138,cosmic68

  --annovar_operation=STRING         Annovar operation, must match with annovar
                                     protocols, for example: g,f,f

  --distance_insertion_bed=FILE      Insertion bed file for distance
                                     annotation.

  --distance_deletion_bed=FILE       Deletion bed file for distance annotation.

  --distance_junction_bed=FILE       Junction bed file for distance annotation.

  --distance_exon_gtf=FILE           Exon gtf file for distance annotation.

  --rnaediting_db=FILE               The rna editing database file

  -t STRING, --type=STRING           Required. Where to read/generate mpileup
                                     result file (bam/mpileup/console)

  --normal=FILE                      Bam file from normal sample (when
                                     type==bam)

  --tumor=FILE                       Bam file from tumor sample (when
                                     type==bam)

  -r STRING, --chromosomes=STRING    (Default: all chromosomes in genome fasta
                                     file) Chromosome names (separted by ',')
                                     (when type==bam)

  -m FILE, --mpileup=FILE            Input samtools mpileup result file (when
                                     type==mpileup)

  -d INT, --read_depth=INT           (Default: 10) Minimum read depth of base
                                     passed mapping quality filter in each
                                     sample

  --fisher_pvalue=DOUBLE             (Default: 0.05) Maximum pvalue used for
                                     fisher exact test

  --max_normal_percentage=DOUBLE     (Default: 0.01) Maximum percentage of
                                     minor allele at normal sample

  --min_tumor_read=INT               (Default: 5) Minimum read count of minor
                                     allele at tumor sample

  --min_tumor_percentage=DOUBLE      (Default: 0.1) Minimum percentage of minor
                                     allele at tumor sample

  -c INT, --thread_count=INT         (Default: 1) Number of thread, only valid
                                     when type is bam

  -o STRING, --output=STRING         Required. Output file suffix

  --no-BAQ=BOOL                      disable BAQ (per-Base Alignment Quality)
                                     for samtools mpileup (when type==bam)

  --read_quality=INT                 (Default: 20) Minimum mapQ of read for
                                     samtools mpileup (when type==bam)

  --base_quality=INT                 (Default: 20) Minimum base quality

  -f FILE, --fasta=FILE              Genome fasta file for samtools mpileup
                                     (when type==bam)
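
The count and percentage thresholds above can be read as a simple per-site pre-filter. The sketch below is only one interpretation of the help text, not GLMVC's actual implementation: the function name `passes_thresholds`, the inclusive comparisons, and the example counts are all assumptions, and the Fisher and GLM tests are not modeled.

```shell
# Illustrative pre-filter for a single site, using the default thresholds.
# Usage: passes_thresholds NORMAL_DEPTH NORMAL_MINOR TUMOR_DEPTH TUMOR_MINOR
# (hypothetical helper; inclusive comparisons are an assumption)
passes_thresholds() {
  awk -v nd="$1" -v nm="$2" -v td="$3" -v tm="$4" 'BEGIN {
    min_depth      = 10    # --read_depth
    max_normal_pct = 0.01  # --max_normal_percentage
    min_tumor_read = 5     # --min_tumor_read
    min_tumor_pct  = 0.1   # --min_tumor_percentage
    ok = (nd >= min_depth && td >= min_depth &&
          nm / nd <= max_normal_pct &&
          tm >= min_tumor_read &&
          tm / td >= min_tumor_pct)
    exit !ok               # exit status 0 means the site passes
  }'
}

# Made-up counts: 0 of 100 minor-allele reads in normal, 8 of 50 in tumor.
if passes_thresholds 100 0 50 8; then echo "candidate"; else echo "filtered"; fi
# Same tumor counts, but 3 of 100 minor-allele reads in normal.
if passes_thresholds 100 3 50 8; then echo "candidate"; else echo "filtered"; fi
```

Under these assumptions the first site is kept and the second is dropped, because 3% minor allele in the normal sample exceeds the 1% limit.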

annotation

Annotate mutations using various tools.

[shengq1@cqs2 glmvc]$ mono glmvc.exe annotation
General Linear Model Based Somatic Mutation Caller (GLMVC) 1.3.0
Copyright (C) 2013-2015 Center for Quantitative Sciences/VUMC

  --help                           Display this help screen.

  -i FILE, --input=FILE            Required. Input file generated by filter
                                   function.

  -o FILE, --output=FILE           Output annotated file.

  --annovar_set_default            (Default: false) Set current setting as
                                   annovar default setting

  --annovar_buildver=STRING        Annovar database version, for example: hg19)

  --annovar_db=DIRECTORY           The directory contains annovar databases

  --annovar_protocol=STRING        Annovar protocols, for example:
                                   refGene,snp138,cosmic68

  --annovar_operation=STRING       Annovar operation, must match with annovar
                                   protocols, for example: g,f,f

  --distance_insertion_bed=FILE    Insertion bed file for distance annotation.

  --distance_deletion_bed=FILE     Deletion bed file for distance annotation.

  --distance_junction_bed=FILE     Junction bed file for distance annotation.

  --distance_exon_gtf=FILE         Exon gtf file for distance annotation.

  --rnaediting_db=FILE             The rna editing database file
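
Because `--annovar_operation` must match `--annovar_protocol`, it can help to check both lists before launching the annotation. The following is a minimal sketch: the file names `candidates.tsv` and `candidates.annotated.tsv` are hypothetical, `count_fields` is a helper introduced here, and "match" is assumed to mean one operation per protocol.

```shell
# Hypothetical settings; replace with your own databases and files.
protocol="refGene,snp138,cosmic68"
operation="g,f,f"

# Count the comma-separated entries in a list (helper for this sketch).
count_fields() {
  echo "$1" | awk -F',' '{ print NF }'
}

if [ "$(count_fields "$protocol")" -ne "$(count_fields "$operation")" ]; then
  echo "annovar_protocol and annovar_operation differ in length" >&2
else
  # Dry run: remove the leading "echo" to actually run the annotation.
  echo mono glmvc.exe annotation \
    -i candidates.tsv \
    -o candidates.annotated.tsv \
    --annovar_buildver hg19 \
    --annovar_db human_db \
    --annovar_protocol "$protocol" \
    --annovar_operation "$operation"
fi
```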

validate

Validate somatic mutation sites using a second normal/tumor paired dataset.

[shengq1@cqs2 glmvc]$ mono glmvc.exe validate
General Linear Model Based Somatic Mutation Caller (GLMVC) 1.3.0

  --help                             Display this help screen.

  -v FILE, --validation_file=FILE    Bed format file for somatic mutation
                                     validation

  --glm_pvalue=DOUBLE                (Default: 0.05) Maximum adjusted pvalue
                                     used for GLM test

  --error_rate=DOUBLE                (Default: 0.01) Sequencing error rate for
                                     normal sample test

  -t STRING, --type=STRING           Required. Where to read/generate mpileup
                                     result file (bam/mpileup/console)

  --normal=FILE                      Bam file from normal sample (when
                                     type==bam)

  --tumor=FILE                       Bam file from tumor sample (when
                                     type==bam)

  -r STRING, --chromosomes=STRING    (Default: all chromosomes in genome fasta
                                     file) Chromosome names (separted by ',')
                                     (when type==bam)

  -m FILE, --mpileup=FILE            Input samtools mpileup result file (when
                                     type==mpileup)

  -d INT, --read_depth=INT           (Default: 10) Minimum read depth of base
                                     passed mapping quality filter in each
                                     sample

  --fisher_pvalue=DOUBLE             (Default: 0.05) Maximum pvalue used for
                                     fisher exact test

  --max_normal_percentage=DOUBLE     (Default: 0.01) Maximum percentage of
                                     minor allele at normal sample

  --min_tumor_read=INT               (Default: 5) Minimum read count of minor
                                     allele at tumor sample

  --min_tumor_percentage=DOUBLE      (Default: 0.1) Minimum percentage of minor
                                     allele at tumor sample

  -c INT, --thread_count=INT         (Default: 1) Number of thread, only valid
                                     when type is bam

  -o STRING, --output=STRING         Required. Output file suffix

  --no-BAQ=BOOL                      disable BAQ (per-Base Alignment Quality)
                                     for samtools mpileup (when type==bam)

  --read_quality=INT                 (Default: 20) Minimum mapQ of read for
                                     samtools mpileup (when type==bam)

  --base_quality=INT                 (Default: 20) Minimum base quality

  -f FILE, --fasta=FILE              Genome fasta file for samtools mpileup
                                     (when type==bam)
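
A validation run might look like the sketch below, which only uses options listed above. All file names are hypothetical placeholders, and the command is printed rather than executed so it can be reviewed first.

```shell
# Hypothetical inputs: candidate sites plus an independent normal/tumor pair.
sites=candidates.bed
normal=validation-normal.bam
tumor=validation-tumor.bam
fasta=hg19_16569_M.fa
out=validation-TP-NT

# Build the command step by step so each group of options stays readable.
cmd="mono glmvc.exe validate -t bam -c 8"
cmd="$cmd -v $sites --normal $normal --tumor $tumor"
cmd="$cmd -f $fasta -o $out"

# Dry run: print the command; run it with: eval "$cmd"
echo "$cmd"
```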

extract

Extract allele frequencies at predefined sites from one or more BAM files.

[shengq1@cqs2 glmvc]$ mono glmvc.exe extract
General Linear Model Based Somatic Mutation Caller (GLMVC) 0.9.9.0
Copyright (C) 2013-2015 Center for Quantitative Sciences/VUMC

  --help                        Display this help screen.

  -v FILE, --bed_file=FILE      Bed format file for sites

  --bam_files=FILES             Required. Bam files, separated by ','

  --bam_names=STRINGS           Required. Bam names, separated by ','

  -o STRING, --output=STRING    Required. Output file

  --no-BAQ=BOOL                 disable BAQ (per-Base Alignment Quality) for
                                samtools mpileup (when type==bam)

  --read_quality=INT            (Default: 20) Minimum mapQ of read for samtools
                                mpileup (when type==bam)

  --base_quality=INT            (Default: 20) Minimum base quality

  -f FILE, --fasta=FILE         Genome fasta file for samtools mpileup (when
                                type==bam)
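
As a sketch of an `extract` call, the example below passes two BAM files with one display name each. The file names are hypothetical, and the one-name-per-file pairing is an assumption based on the option descriptions, so the script checks the two lists before printing the command.

```shell
# Hypothetical inputs; replace with your own files.
bam_files="normal.bam,tumor.bam"
bam_names="normal,tumor"

# Count entries in each comma-separated list.
n_files=$(echo "$bam_files" | awk -F',' '{ print NF }')
n_names=$(echo "$bam_names" | awk -F',' '{ print NF }')

if [ "$n_files" -ne "$n_names" ]; then
  echo "bam_files has $n_files entries but bam_names has $n_names" >&2
else
  cmd="mono glmvc.exe extract -v sites.bed"
  cmd="$cmd --bam_files $bam_files --bam_names $bam_names"
  cmd="$cmd -f hg19_16569_M.fa -o sites.allele_freq.tsv"
  # Dry run: print the command; run it with: eval "$cmd"
  echo "$cmd"
fi
```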

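example

The following command calls somatic mutations from paired normal/tumor BAM files using 8 threads, then annotates the result: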

mono glmvc.exe \
     call \
     -c 8 -t bam --max_normal_percentage 0.01 --glm_pvalue 0.05 \
     -f hg19_16569_M.fa \
     --rnaediting_db hg19.txt \
     --annovar_buildver hg19 \
     --annovar_protocol refGene,snp138,cosmic70 \
     --annovar_operation g,f,f \
     --annovar_db human_db \
     --distance_exon_gtf Homo_sapiens.GRCh37.75.M.gtf \
     --normal TCGA-A7-A0D9-RNA-NT.rmdup.split.recal.bam \
     --tumor TCGA-A7-A0D9-RNA-TP.rmdup.split.recal.bam \
     -o TCGA-A7-A0D9-RNA-TP-NT
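
The same call can presumably be split into two steps with `-t mpileup`, which reads a precomputed samtools mpileup file through `-m`. The sketch below is untested against GLMVC: the intermediate file name `pair.mpileup` is hypothetical, and listing the normal BAM before the tumor BAM is an assumption about the column order GLMVC expects. The `-q 20 -Q 20` values mirror the `--read_quality` and `--base_quality` defaults.

```shell
# Inputs taken from the example above; pair.mpileup is a hypothetical name.
fasta=hg19_16569_M.fa
normal=TCGA-A7-A0D9-RNA-NT.rmdup.split.recal.bam
tumor=TCGA-A7-A0D9-RNA-TP.rmdup.split.recal.bam
pileup=pair.mpileup

# Step 1: pile up both samples (normal first is an assumption).
# -q is the minimum mapping quality, -Q the minimum base quality.
mpileup_cmd="samtools mpileup -q 20 -Q 20 -f $fasta $normal $tumor"

# Step 2: call mutations from the saved pileup.
call_cmd="mono glmvc.exe call -t mpileup -m $pileup -o TCGA-A7-A0D9-RNA-TP-NT"

# Dry run: print both steps for review before running them by hand.
echo "$mpileup_cmd > $pileup"
echo "$call_cmd"
```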