INDEP - winkusch/Easy2 GitHub Wiki

| FUNCTION | PARAMETER | DEFAULT | DESCRIPTION |
|---|---|---|---|
| INDEP | --rcdCriterion | | Criterion that defines the SNPs to be used for independentization. Required. |
| INDEP | --arcdCriterion | | Array of R expression criteria (used if clumping should be done on multiple columns/traits/ancestries simultaneously). |
| INDEP | --acolPval | | Array of P value columns. Usually only one column is given; if multiple columns are given, clumping is done for each column and the results are then combined/compared. This is useful for multi-trait/group/ancestry clumping. |
| INDEP | --astrPvalTag | | Tag for the indep P value (tag for P values in clump groups). |
| INDEP | --astrTag | | |
| INDEP | --anumPvalLim | 1 | Array of P value limits (use an array if thresholds should vary between clump groups). |
| INDEP | --colInChr | | Column name of the input chromosome column. |
| INDEP | --colInPos | | Column name of the input position column. |
| INDEP | --numPosLim | 500000 | Distance threshold for region-based clumping (minimum distance between clumps of genome-wide significant variants). |
| INDEP | --numPosRegionExtension | -1 | Number of base positions by which genome-wide significant clump coordinates are extended to define a region (by default this is --numPosLim/2, which ensures non-overlapping "extended" regions). |
| INDEP | --acolIndep | | Array of columns that will be used for independentization (minimized or maximized per region; alternative to --acolPval). |
| INDEP | --astrIndepTag | | Tag for indep (alternative to --astrPvalTag). |
| INDEP | --anumIndepLim | 1 | Array of numeric values for the independentization limit (alternative to --anumPvalLim). |
| INDEP | --strIndepDir | min | Clumping direction: minimize ('min') or maximize ('max') per clump. 'max' is useful when a logarithmized P value column is given at --acolIndep. |
| INDEP | --fileClumpBed | | Bed file(s). If defined, LD-based clumping within regions is performed. If a placeholder is used in --fileClumpBed, the function loops over chromosomes. |
| INDEP | --fileClumpSample | | Optional sample file to subset the bed files (only used if bed files are defined). |
| INDEP | --numR2Thrs | 0.2 | LD clumping r2 threshold (clumps within a region are combined by this threshold). |
| INDEP | --blnParal | FALSE | Logical: whether the clumping process should be parallelized by chromosome (requires a placeholder in --fileClumpBed). |
| INDEP | --pathLibLoc | | |
| INDEP | --blnClumpInSignal | TRUE | |
| INDEP | --blnAddIndepInfo | FALSE | Logical: whether indep columns should be added to the larger data set (helpful if filtering in later functions should be done on INDEP results). |
| INDEP | --colInMarker | | Column name of the input marker column. |
| INDEP | --strTag | | Character tag for the function step. It is added to related variables in the REPORT and to related output to ensure unique and easily recognizable file names and REPORT variable names. |
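
The default for --numPosRegionExtension can be checked with a little arithmetic. The following is a minimal Python sketch, not Easy2 code: the helper name `extend_region` and the handling of -1 as "use --numPosLim/2" are assumptions based only on the parameter description above.

```python
def extend_region(start, end, pos_lim=500_000, extension=-1):
    """Extend clump coordinates on both sides (hypothetical helper).

    extension=-1 is assumed to mean "use the default", i.e. pos_lim / 2.
    """
    ext = pos_lim // 2 if extension == -1 else extension
    return start - ext, end + ext


# Two clumps whose gap (500,001 bp) is just larger than pos_lim.
region_a = extend_region(1_000_000, 1_300_000)
region_b = extend_region(1_800_001, 1_900_000)
print(region_a, region_b)
```

Because two separate clumps are more than --numPosLim apart, extending each by half of that distance cannot make the extended regions overlap.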

Example code:

Distance (+/-500kb) based clumping on variants with Pvalue<5e-8:

INDEP --rcdCriterion Pvalue<5e-8
--acolPval Pvalue
--colInChr chr
--colInPos pos
--numPosLim 500000
--colInMarker MarkerName
--strTag INDEP.d500kb
## results will be indicated by *region*
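
To illustrate what distance-based clumping does conceptually, here is a minimal Python sketch. It is not the Easy2 implementation: it assumes that significant variants on the same chromosome are chained into one region as long as the gap to the previous significant variant does not exceed the distance limit, and that the region lead is the variant with the smallest P value. The function name `distance_clump` and the dictionary keys are hypothetical.

```python
def distance_clump(variants, p_lim=5e-8, pos_lim=500_000):
    """Group significant variants into regions by positional gap (sketch).

    variants: list of dicts with keys 'marker', 'chr', 'pos', 'p'.
    Returns a list of regions with 'chr', 'start', 'end', 'lead', 'markers'.
    """
    # Keep only variants passing the criterion (analogous to --rcdCriterion).
    sig = sorted((v for v in variants if v["p"] < p_lim),
                 key=lambda v: (v["chr"], v["pos"]))
    regions = []
    for v in sig:
        last = regions[-1] if regions else None
        # New region on a new chromosome or when the gap exceeds pos_lim.
        if (last is None or last["chr"] != v["chr"]
                or v["pos"] - last["end"] > pos_lim):
            regions.append({"chr": v["chr"], "start": v["pos"],
                            "end": v["pos"], "lead": v,
                            "markers": [v["marker"]]})
        else:
            last["end"] = v["pos"]
            last["markers"].append(v["marker"])
            if v["p"] < last["lead"]["p"]:
                last["lead"] = v
    # Report the lead variant by marker name only.
    for r in regions:
        r["lead"] = r["lead"]["marker"]
    return regions
```

Called on a list of variants, the function returns one entry per region, which corresponds loosely to the *region* indication mentioned in the example.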

Distance (+/-500kb) and LD based (r2 threshold 0.1) clumping on variants with Pvalue<5e-8:

INDEP --rcdCriterion Pvalue<5e-8
--acolPval Pvalue
--colInChr chr
--colInPos pos
--numPosLim 500000
--fileClumpBed /path/to/bedfiles/1000g_topmed_imputed_chr<CHR>.hqx.cpaid.maf001
--fileClumpSample /path/to/bedfiles/1000g_topmed_imputed_chr<CHR>.hqx.cpaid.maf001
--numR2Thrs 0.1
--blnParal 1
--colInMarker MarkerName
--strTag INDEP.d500kb.r201
## results will be indicated by *region* (distance based) and *locus* (LD based)
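
The LD step within a region can be pictured as greedy clumping. The Python sketch below is an illustration under stated assumptions, not the Easy2 algorithm: pairwise r2 values are passed in as a plain dictionary (the real function derives LD from the bed files), variants with r2 at or above the threshold to a lead are assigned to that lead's locus, and the names `ld_clump`, `pvals` and `r2` are hypothetical.

```python
def ld_clump(markers, pvals, r2, r2_thr=0.1):
    """Greedy LD clumping within one region (sketch).

    markers: list of marker names in the region.
    pvals:   dict marker -> P value.
    r2:      dict frozenset({a, b}) -> pairwise r2; missing pairs count as 0.
    Returns a dict marker -> lead marker of its locus.
    """
    # Process variants from most to least significant.
    remaining = sorted(markers, key=lambda m: pvals[m])
    locus_of = {}
    while remaining:
        lead = remaining.pop(0)
        locus_of[lead] = lead
        still_free = []
        for m in remaining:
            if r2.get(frozenset((lead, m)), 0.0) >= r2_thr:
                locus_of[m] = lead      # in LD with the lead: same locus
            else:
                still_free.append(m)    # may become or join another locus
        remaining = still_free
    return locus_of
```

The number of distinct leads in the returned dictionary is the number of loci in the region under this simplified scheme.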

Distance (+/-500kb) based clumping on variants with Pvalue<5e-8, using the maximum of a log-transformed P value column (logPvalue) to define lead variants:

INDEP --rcdCriterion Pvalue<5e-8
--acolIndep logPvalue
--strIndepDir max
--colInChr chr
--colInPos pos
--numPosLim 500000
--colInMarker MarkerName
--strTag INDEP.d500kb.log
## results will be indicated by *region* and region lead variants are defined by max(logPvalue) per region
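
The effect of the clumping direction can be shown with a short Python sketch. It assumes that logPvalue holds -log10(P), which the example above does not state explicitly; the helper `pick_lead` and the toy data are hypothetical.

```python
import math


def pick_lead(region, col, direction="min"):
    """Return the marker with the smallest ('min') or largest ('max') value in col."""
    if direction not in ("min", "max"):
        raise ValueError("direction must be 'min' or 'max'")
    chooser = min if direction == "min" else max
    return chooser(region, key=lambda v: v[col])["marker"]


region = [
    {"marker": "rs1", "Pvalue": 1e-9},
    {"marker": "rs2", "Pvalue": 1e-12},
    {"marker": "rs3", "Pvalue": 3e-8},
]
for v in region:
    v["logPvalue"] = -math.log10(v["Pvalue"])  # assumed transformation

print(pick_lead(region, "Pvalue", "min"), pick_lead(region, "logPvalue", "max"))
```

Under this assumption, minimizing the P value and maximizing the log-transformed column select the same lead variant. A log-transformed column is typically preferred when P values are too small to be stored as ordinary floating-point numbers.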

Distance (+/-500kb) based clumping on multiple Pvalue columns:

INDEP --rcdCriterion Pmen<5e-8|Pwomen<5e-8
--acolPval Pmen;Pwomen
--astrPvalTag MEN;WOMEN
--colInChr chr
--colInPos pos
--numPosLim 500000
--colInMarker MarkerName
--strTag INDEP.d500kb.men_women
## results will be indicated by *region* and indicator columns will be added that show whether each region contains significant variants for MEN and/or WOMEN
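
The per-region indicator columns can be sketched in Python as follows. This is an illustration only: the function name `region_indicators` and the output layout are assumptions, and the actual column names written by INDEP may differ.

```python
def region_indicators(region, p_cols, tags, p_lim=5e-8):
    """For one region, flag which P value columns contain a significant variant.

    region: list of dicts, one per variant, with one key per P value column.
    p_cols: P value column names (analogous to --acolPval).
    tags:   matching tags (analogous to --astrPvalTag).
    """
    return {tag: any(v[col] < p_lim for v in region)
            for col, tag in zip(p_cols, tags)}


region = [
    {"marker": "rs1", "Pmen": 1e-9, "Pwomen": 0.2},
    {"marker": "rs2", "Pmen": 1e-3, "Pwomen": 1e-7},
]
print(region_indicators(region, ["Pmen", "Pwomen"], ["MEN", "WOMEN"]))
```

In this toy region only the MEN column reaches the threshold, so the region would be flagged for MEN but not for WOMEN.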
