INDEP - winkusch/Easy2 GitHub Wiki
FUNCTION | PARAMETER | DEFAULT | DESCRIPTION |
---|---|---|---|
INDEP | --rcdCriterion | | Criterion that defines the SNPs to be used for independentization. Required. |
INDEP | --arcdCriterion | | array of R expression criteria (use if clumping should be done on multiple columns/traits/ancestries simultaneously) |
INDEP | --acolPval | | array of P value columns (usually only one column; if multiple columns are given, clumping is done for each column and the results are then combined/compared, which is useful for multi-trait/group/ancestry clumping) |
INDEP | --astrPvalTag | | tag for indep P value (tag for P values in clump groups) |
INDEP | --astrTag | ||
INDEP | --anumPvalLim | 1 | array of P value limits (use an array if thresholds should vary between clump groups) |
INDEP | --colInChr | | Column name of the input chromosome. |
INDEP | --colInPos | | Column name of the input position column. |
INDEP | --numPosLim | 500000 | distance threshold for region-based clumping (minimum distance between clumps of genome-wide significant variants) |
INDEP | --numPosRegionExtension | -1 | base positions by which genome-wide significant clump coordinates will be extended to define a region (by default this is --numPosLim/2, which ensures non-overlapping "extended" regions) |
INDEP | --acolIndep | | array of columns that will be used for independentization (minimized or maximized per region; alternative for --acolPval) |
INDEP | --astrIndepTag | | tag for indep (alternative for --astrPvalTag) |
INDEP | --anumIndepLim | 1 | array of numeric values for the independentization limit (alternative for --anumPvalLim) |
INDEP | --strIndepDir | min | clumping direction ('min' to minimize or 'max' to maximize per clump; useful when a logarithmized P value column is given at --acolIndep) |
INDEP | --fileClumpBed | | bed file(s); if defined, LD-based clumping within regions is performed (if a placeholder is used in --fileClumpBed, the function loops over chromosomes) |
INDEP | --fileClumpSample | | if bed files are defined, an optional sample file to subset the bed files can be given |
INDEP | --numR2Thrs | 0.2 | LD clumping r2 threshold (clumps within region are combined by this threshold) |
INDEP | --blnParal | FALSE | logical whether clumping process should be parallelized by chromosome (requires placeholder in --fileClumpBed) |
INDEP | --pathLibLoc | ||
INDEP | --blnClumpInSignal | TRUE | |
INDEP | --blnAddIndepInfo | FALSE | logical whether indep columns should be added to larger data set (helpful if filtering in later functions should be done on INDEP results) |
INDEP | --colInMarker | | Column name of the input marker column. |
INDEP | --strTag | character | Tag for the function step that will be added to related variables in the REPORT and to related output to ensure unique and easily recognizable file names and REPORT variable names. |
Distance (+/-500kb) based clumping on variants with Pvalue<5e-8:
INDEP --rcdCriterion Pvalue<5e-8
--acolPval Pvalue
--colInChr chr
--colInPos pos
--numPosLim 500000
--colInMarker MarkerName
--strTag INDEP.d500kb
## results will be indicated by *region*
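The idea behind distance-based clumping can be illustrated outside of the tool. The following Python sketch is not the INDEP implementation; it assumes that significant variants on the same chromosome are merged into one region as long as the gap to the previous significant variant does not exceed the distance threshold, and that the region lead is the variant with the smallest P value. The function names `distance_clump` and `summarise` and the dictionary keys are hypothetical.

```python
from collections import defaultdict


def summarise(chrom, members):
    """Describe one region; members must be sorted by position."""
    lead = min(members, key=lambda v: v["pval"])  # smallest P value is the lead
    return {
        "chr": chrom,
        "start": members[0]["pos"],
        "end": members[-1]["pos"],
        "lead": lead["marker"],
        "markers": [v["marker"] for v in members],
    }


def distance_clump(variants, pval_lim=5e-8, pos_lim=500_000):
    """Group significant variants into regions by position gap (assumed rule)."""
    by_chr = defaultdict(list)
    for v in variants:
        if v["pval"] < pval_lim:  # plays the role of --rcdCriterion Pvalue<5e-8
            by_chr[v["chr"]].append(v)

    regions = []
    for chrom in sorted(by_chr):
        current = []
        for v in sorted(by_chr[chrom], key=lambda x: x["pos"]):
            # A gap larger than pos_lim closes the current region.
            if current and v["pos"] - current[-1]["pos"] > pos_lim:
                regions.append(summarise(chrom, current))
                current = []
            current.append(v)
        regions.append(summarise(chrom, current))
    return regions


# Toy input, invented for illustration only.
variants = [
    {"marker": "rs1", "chr": 1, "pos": 1_000_000, "pval": 1e-9},
    {"marker": "rs2", "chr": 1, "pos": 1_300_000, "pval": 1e-12},
    {"marker": "rs3", "chr": 1, "pos": 2_500_000, "pval": 3e-8},
    {"marker": "rs4", "chr": 1, "pos": 2_600_000, "pval": 0.01},
    {"marker": "rs5", "chr": 2, "pos": 500_000, "pval": 2e-10},
]
for region in distance_clump(variants):
    print(region["chr"], region["start"], region["end"], region["lead"])
```

Whether the real function treats a gap of exactly the threshold as the same region or a new one is not stated in the table above, so the comparison operator here is a guess.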
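Distance (+/-500kb) and LD (r2 threshold 0.1) based clumping on variants with Pvalue<5e-8, parallelized by chromosome:
INDEP --rcdCriterion Pvalue<5e-8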
--acolPval Pvalue
--colInChr chr
--colInPos pos
--numPosLim 500000
--fileClumpBed /path/to/bedfiles/1000g_topmed_imputed_chr<CHR>.hqx.cpaid.maf001
--fileClumpSample /path/to/bedfiles/1000g_topmed_imputed_chr<CHR>.hqx.cpaid.maf001
--numR2Thrs 0.1
--blnParal 1
--colInMarker MarkerName
--strTag INDEP.d500kb.r201
## results will be indicated by *region* (distance based) and *locus* (LD based)
Distance (+/-500kb) based clumping on variants with Pvalue<5e-8 and using max log-Pvalues to define lead variants:
INDEP --rcdCriterion Pvalue<5e-8
--acolIndep logPvalue
--strIndepDir max
--colInChr chr
--colInPos pos
--numPosLim 500000
--colInMarker MarkerName
--strTag INDEP.d500kb.log
## results will be indicated by *region* and region lead variants are defined by max(logPvalue) per region
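Selecting the lead by the maximum of a logarithmized P value should give the same variant as selecting it by the minimum raw P value. The Python sketch below illustrates this under the assumption that `logPvalue` holds -log10(P); the helper `pick_lead` is hypothetical and only mimics the role of `--strIndepDir`.

```python
import math


def pick_lead(members, col, direction="min"):
    """Return the region member with the smallest or largest value in col."""
    if direction not in ("min", "max"):
        raise ValueError("direction must be 'min' or 'max'")
    chooser = min if direction == "min" else max
    return chooser(members, key=lambda m: m[col])


# Toy region, invented for illustration only.
region = [
    {"marker": "rs1", "Pvalue": 1e-9},
    {"marker": "rs2", "Pvalue": 1e-12},
    {"marker": "rs3", "Pvalue": 3e-8},
]
for m in region:
    m["logPvalue"] = -math.log10(m["Pvalue"])  # assumed definition of logPvalue

lead_by_p = pick_lead(region, "Pvalue", "min")
lead_by_log = pick_lead(region, "logPvalue", "max")
print(lead_by_p["marker"], lead_by_log["marker"])
```

Working on the log scale is mainly useful when P values are too small to be stored as ordinary floating point numbers.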
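Distance (+/-500kb) based clumping on variants with Pmen<5e-8 or Pwomen<5e-8, tagging regions by MEN and WOMEN:
INDEP --rcdCriterion Pmen<5e-8|Pwomen<5e-8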
--acolPval Pmen;Pwomen
--astrPvalTag MEN;WOMEN
--colInChr chr
--colInPos pos
--numPosLim 500000
--colInMarker MarkerName
--strTag INDEP.d500kb.men_women
## results will be indicated by *region* and indicator columns will be added that show whether each region contains MEN and/or WOMEN significant variants
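The multi-column case can be pictured as clumping on the union of significant variants and then flagging, per region, which group contributed a significant variant. The Python sketch below is an illustration under that assumption, not the INDEP implementation; `clump_multi`, its arguments, and the output keys are hypothetical, and the gap rule is the same assumed rule as for single-column distance-based clumping.

```python
def clump_multi(variants, pval_cols, tags, pval_lim=5e-8, pos_lim=500_000):
    """Gap-based regions over several P value columns with per-tag flags."""
    # Keep variants significant in at least one column (Pmen<5e-8|Pwomen<5e-8).
    hits = [v for v in variants if any(v[c] < pval_lim for c in pval_cols)]
    hits.sort(key=lambda v: (v["chr"], v["pos"]))

    regions, current = [], []
    for v in hits:
        starts_new = current and (
            v["chr"] != current[-1]["chr"]
            or v["pos"] - current[-1]["pos"] > pos_lim
        )
        if starts_new:
            regions.append(current)
            current = []
        current.append(v)
    if current:
        regions.append(current)

    out = []
    for members in regions:
        row = {
            "chr": members[0]["chr"],
            "start": members[0]["pos"],
            "end": members[-1]["pos"],
        }
        # One indicator per tag: does the region hold a significant variant?
        for col, tag in zip(pval_cols, tags):
            row[tag] = any(m[col] < pval_lim for m in members)
        out.append(row)
    return out


# Toy input, invented for illustration only.
variants = [
    {"chr": 1, "pos": 100_000, "Pmen": 1e-9, "Pwomen": 0.2},
    {"chr": 1, "pos": 300_000, "Pmen": 0.5, "Pwomen": 1e-10},
    {"chr": 1, "pos": 5_000_000, "Pmen": 0.3, "Pwomen": 2e-9},
    {"chr": 2, "pos": 100_000, "Pmen": 1e-8, "Pwomen": 0.9},
    {"chr": 2, "pos": 200_000, "Pmen": 0.4, "Pwomen": 0.6},
]
for row in clump_multi(variants, ["Pmen", "Pwomen"], ["MEN", "WOMEN"]):
    print(row)
```

In this toy input the first region is flagged for both groups, while the other two regions are each flagged for only one group.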