Discovery of hidden confounders of QTLs
This tool can be used to identify modulators of previously identified QTL effects, as we have shown here: Hypothesis-free identification of modulators of genetic risk factors.
The source code is available here: https://github.com/molgenis/systemsgenetics/tree/master/eQTLInteractionAnalyser
The last build can be downloaded here
Input data
We require a folder with the following files:
- A file with genotypes called `Genotypes.binary`
- A file with expression of the eQTLs called `Expression.binary`
- A file with covariate data called `Covariates.binary`
Optionally, you can also provide a file with known eQTL effects, used to correct the covariate data, and a list of samples to include in the analysis.
The plain-text versions of the required files are described below. The following command converts them to the .binary format:
java -jar ../eQTLInteractionAnalyser-1.2-SNAPSHOT-jar-with-dependencies.jar \
--convertMatrix \
-i inputfile.txt \
-o output.binary
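Since three matrices need converting, the command above can be generated once per file. The following is a minimal sketch, not part of the tool: it only builds the argument lists, using the `--convertMatrix`, `-i` and `-o` options shown above. The plain-text input names (`Genotypes.txt` and so on) and the function name are assumptions made for illustration.

```python
def conversion_commands(jar_path, matrix_names=("Genotypes", "Expression", "Covariates")):
    """Build one conversion command per input matrix.

    Assumes each plain-text matrix is called <name>.txt (hypothetical naming)
    and should be written to <name>.binary, the name the tool expects.
    """
    commands = []
    for name in matrix_names:
        commands.append([
            "java", "-jar", jar_path,
            "--convertMatrix",
            "-i", f"{name}.txt",
            "-o", f"{name}.binary",
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(command, check=True)` from the folder that holds the text files.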
:exclamation: Important: the rows (variants) in the genotype file must correspond one-to-one to the rows (genes) in the expression file, so both files must have the same number of rows. If a variant affects two genes, include that variant twice in the genotype file. The sample order must also be exactly the same in all files.
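The row correspondence described above can be produced from a list of variant-gene pairs. This is a minimal sketch under assumptions: the genotype and expression data are held in dictionaries keyed by variant and gene ID, and `align_rows` is a hypothetical helper name, not something the tool provides.

```python
def align_rows(genotypes, expression, eqtl_pairs):
    """Return genotype and expression rows in matching order.

    genotypes:  dict mapping variant ID -> list of dosages per sample
    expression: dict mapping gene ID -> list of expression values per sample
    eqtl_pairs: list of (variant ID, gene ID) tuples, one per eQTL
    """
    genotype_rows = []
    expression_rows = []
    for variant, gene in eqtl_pairs:
        # A variant that affects several genes is written once per gene,
        # so row i of both outputs always describes the same eQTL.
        genotype_rows.append([variant] + list(genotypes[variant]))
        expression_rows.append([gene] + list(expression[gene]))
    return genotype_rows, expression_rows
```

With pairs `[("rs1", "geneA"), ("rs1", "geneB")]`, the genotype output contains the `rs1` row twice, matching the two gene rows.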
Genotype dosage data
Tab-separated matrix with variants in rows and samples in columns.
eQTL expression data
Tab-separated matrix with genes in rows and samples in columns.
Covariate expression data
Tab-separated matrix with the covariates to test in rows (proxy gene expression and other potential covariates such as PCs) and samples in columns, in the same order as in the other files.
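Before converting, it can help to check that the three text matrices agree on row count and sample order. The sketch below is an illustration only. It assumes each file starts with a header line whose first cell labels the row-ID column and whose remaining cells are sample IDs; the function names are hypothetical.

```python
import csv


def read_matrix_layout(lines):
    """Return (sample_ids, row_ids) for a tab-separated matrix.

    `lines` is any iterable of text lines, such as an open file handle.
    Assumes a header line: row-ID label first, then one sample ID per column.
    """
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    row_ids = [row[0] for row in reader if row]
    return header[1:], row_ids


def check_inputs(genotype_lines, expression_lines, covariate_lines):
    """Return a list of problems; an empty list means the layouts match."""
    geno_samples, geno_rows = read_matrix_layout(genotype_lines)
    expr_samples, expr_rows = read_matrix_layout(expression_lines)
    cov_samples, _ = read_matrix_layout(covariate_lines)

    problems = []
    if len(geno_rows) != len(expr_rows):
        problems.append(
            f"row count differs: {len(geno_rows)} variants vs {len(expr_rows)} genes"
        )
    if geno_samples != expr_samples:
        problems.append("sample order differs between genotypes and expression")
    if geno_samples != cov_samples:
        problems.append("sample order differs between genotypes and covariates")
    return problems
```

To run it on files, open each one with `open(path, newline="")` and pass the three handles to `check_inputs`.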
QTL file
An eQTL result file as produced by our QTL mapping pipeline.
Columns |
---|
PValue |
SNPName |
SNPChr |
SNPChrPos |
ProbeName |
ProbeChr |
ProbeCenterChrPos |
CisTrans |
SNPType |
AlleleAssessed |
OverallZScore |
DatasetsWhereSNPProbePairIsAvailableAndPassesQC |
DatasetsZScores |
DatasetsNrSamples |
IncludedDatasetsMeanProbeExpression |
IncludedDatasetsProbeExpressionVariance |
HGNCName |
IncludedDatasetsCorrelationCoefficient |
Meta-Beta (SE) |
Beta (SE) |
FoldChange |
FDR |
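The variant-gene pairs that define the row order of the genotype and expression matrices can be read from such a file. This is a minimal sketch, assuming the file is tab-separated and begins with a header line containing the column names listed above; `read_eqtl_pairs` is a hypothetical helper name.

```python
import csv


def read_eqtl_pairs(lines):
    """Return (SNPName, ProbeName) tuples in file order.

    `lines` is any iterable of text lines, such as an open file handle.
    Assumes a tab-separated file with a header line of column names.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    return [(row["SNPName"], row["ProbeName"]) for row in reader]
```

A variant that appears on several lines with different genes yields several pairs, which is the case where the variant must be repeated in the genotype file.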
Samples to include file
One sample to include per line. No header.
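A file in this layout can be generated from the sample IDs of the input matrices. The sketch below is an illustration, not part of the tool; the function name and the `excluded` parameter are assumptions.

```python
def included_samples_text(sample_ids, excluded=()):
    """Return the file content: one kept sample ID per line, no header."""
    excluded = set(excluded)
    kept = [sample for sample in sample_ids if sample not in excluded]
    return "".join(f"{sample}\n" for sample in kept)
```

Writing the returned text to a file such as `includedSamples.txt` gives something that can be passed with `-is`.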
Example command
java -Xmx80g -jar ../eQTLInteractionAnalyser-1.2-SNAPSHOT-jar-with-dependencies.jar \
-i /inputfolder/ \
-o /outputfolder/ \
-e eQTLs.txt \
-c gender MEDIAN_3PRIME_BIAS MEDIAN_5PRIME_BIAS GC PCT_INTRONIC_BASES \
-c2 Batch1 Batch2 \
-is includedSamples.txt \
-n 20 \
-pc 20 \
-nt 8
Options
Short | Long | Description |
---|---|---|
-dif | --chi2sumDiff | Find chi2sum differences for each covariate between 2 consecutive interaction runs |
-nn | --noNormalization | Skip all normalization steps. n must be 1 |
-perm | --permute | Run permutation |
-nt | --threads | Number of threads |
-ncn | --noCovNormalization | Skip covariate normalization step. n must be 1 |
-ec | --eqtlsCovariates | Path to the eQTL file to correct covariates |
-c | --cov | Covariates to correct for using an interaction term before running the interaction analysis |
-cf | --covFile | File containing the covariates to correct for using an interaction term before running the interaction analysis. No header, each covariate on a separate line |
-sw | --swap | File containing the SNPs to swap |
-e | --eqtls | Path to the eQTL file to test for interactions |
-ch | --cohorts | Covariates to correct for without interaction term before running the interaction analysis |
-i | --input | Path to the folder containing expression and genotype data |
-cm | --convertMatrix | Convert matrix |
-is | --includedSamples | Included samples |
-it | --interpret | Interpret the z-score matrices |
-snps | --snpsToTest | SNPs to test |
-n | --maxcov | Maximum number of covariates to regress out |
-o | --output | Path to the output folder |
-c2 | --cov2 | Covariates to correct for without interaction term before running the interaction analysis |
-ct | --covTest | Covariates to test in interaction analysis. Optional, all are tested if not used |
-s | --start | Start round for chi2sumDiff option |
-pc | --numpc | Number of PCs to correct for |
-thr | --threshold | Z-score difference threshold for interpretation |