Simulating quantitative traits - gc5k/GEAR GitHub Wiki

Simulation for quantitative traits


Options

--sample-size/--n

Specify the sample size. 100 by default.

--marker/--m

Specify the number of total markers. 100 by default.

--null-marker

Specify the number of markers from null distribution. 0 by default.

--freq

Specify the frequencies for the markers. 0.5 by default.

--unif-freq

It generates the allele frequency spectrum from a uniform distribution between 0.01 and 0.5.

--freq-file

Specify the file that has frequencies for the reference alleles. One element per line.
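
As an illustration, a frequency file in the one-value-per-line layout described above could be produced with awk. This is only a sketch: the file name frq.txt is borrowed from the examples further down, while the marker count, the seed, and the 0.01 to 0.5 range are assumptions chosen for the demonstration.

```shell
# Write m allele frequencies, one per line, drawn uniformly from [0.01, 0.5).
# m=1000 mirrors --m 1000 in the examples; the seed is arbitrary (assumption).
m=1000
awk -v m="$m" -v seed=2024 'BEGIN {
    srand(seed)
    for (i = 0; i < m; i++) printf "%.4f\n", 0.01 + rand() * 0.49
}' > frq.txt
```

The resulting file can then be passed with --freq-file frq.txt.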

--poly-effect

It generates polygenic effects from the standard normal distribution.

--poly-effect-sort

It generates polygenic effects from the standard normal distribution. Different from --poly-effect, this option will sort the genetic effects in ascending order, so that the first marker has the smallest effect and the last the biggest.

--effect

Specify the universal effect for each locus. It defaults to 0.5.

--effect-file

Specify the file that has the effect for each locus, one element per line.
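
A hedged sketch of how such an effect file might be prepared: the snippet below writes one standard-normal effect per line using a Box-Muller transform in awk. The file name eff.txt comes from the examples further down; the marker count and seed are assumptions, and the snippet assumes the file should contain one line per marker.

```shell
# Write m effects, one per line, drawn from a standard normal distribution
# via the Box-Muller transform. m and the seed are illustrative assumptions.
m=1000
awk -v m="$m" -v seed=7 'BEGIN {
    srand(seed)
    pi = atan2(0, -1)
    for (i = 0; i < m; i++) {
        u1 = 1 - rand()   # in (0, 1], so log(u1) is finite
        u2 = rand()
        printf "%.6f\n", sqrt(-2 * log(u1)) * cos(2 * pi * u2)
    }
}' > eff.txt
```

The resulting file can then be passed with --effect-file eff.txt.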

--ld

Specify LD as Lewontin's D', a value between -1 and 1. It defaults to 0, i.e. linkage equilibrium between markers.

--rand-ld

It generates Lewontin's D' from a uniform distribution between -1 and 1.

--ld-file

Specify the file that has the LD for each pair of consecutive markers. Given m markers, this file has m-1 lines.
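
To make the m-1 line layout concrete, the sketch below writes an LD file with a simple block pattern: strong LD inside blocks of 10 markers and no LD across block boundaries. The block size, the value 0.8, and the reading that line i holds the LD between markers i and i+1 are all assumptions made for illustration; the file name ld.txt comes from the examples further down.

```shell
# Write m-1 LD values, one per line, for consecutive marker pairs.
# Assumed layout: line i is the LD between marker i and marker i+1.
# Pairs that straddle a block boundary get 0; all other pairs get 0.8.
m=1000
awk -v m="$m" -v block=10 'BEGIN {
    for (i = 1; i < m; i++) {
        ld = (i % block == 0) ? 0 : 0.8
        print ld
    }
}' > ld.txt
```

The resulting file can then be passed with --ld-file ld.txt.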

--hsq

Specify the heritability. It defaults to 0.5.
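
For orientation, the standard definition of heritability relates the genetic variance to the residual variance as shown below. This is the textbook relation, not a statement of how GEAR computes the residual internally, which this page does not describe.

```latex
h^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}
\quad\Longrightarrow\quad
\sigma_e^2 = \sigma_g^2 \, \frac{1 - h^2}{h^2}
```

Under this definition, --hsq 0.25 corresponds to a residual variance three times the genetic variance, and the default of 0.5 to equal genetic and residual variances.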

--rep

Specify the number of replications for the simulation. It defaults to 1.

--make-bed

It generates genotypes in bed format.

--fam-prefix

Specify the prefix for family ids.

Examples

gear simuqt --n 1000 --m 1000 --null-marker 900 --freq 0.45 --ld 0.3 --hsq 0.25 --poly-effect --out test
gear simuqt --n 1000 --m 1000 --null-marker 500 --unif-freq --rand-ld --hsq 0.2 --effect-file eff.txt --out test
gear simuqt --n 1000 --m 1000 --freq-file frq.txt --poly-effect --ld 0.8 --out test
gear simuqt --n 1000 --m 1000 --freq-file frq.txt --poly-effect --ld-file ld.txt --out test

The output files include *.bim, *.fam, and *.bed (the genotype file in PLINK binary format).

*.phe: the first two columns are the family ID and the individual ID. The third column is the phenotypic value. When the number of replications is greater than 1, the third and each following column hold the phenotypic values for successive replications.
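
A quick sanity check on a *.phe file is to summarise each phenotype column. The helper below is a sketch; phe_summary is a hypothetical name, not part of GEAR, and it assumes the file is whitespace-delimited with no header line, which this page does not confirm.

```shell
# Print the mean and sample variance of every phenotype column
# (column 3 onward, one column per replication).
# Assumes whitespace-delimited rows and no header line.
phe_summary() {
    awk '{
        for (j = 3; j <= NF; j++) { sum[j] += $j; sumsq[j] += $j * $j }
        if (NF > ncol) ncol = NF
    }
    END {
        if (NR == 0) exit
        for (j = 3; j <= ncol; j++) {
            mean = sum[j] / NR
            var = (NR > 1) ? (sumsq[j] - NR * mean * mean) / (NR - 1) : 0
            printf "rep%d mean=%.4f var=%.4f\n", j - 2, mean, var
        }
    }' "$1"
}
```

With the examples above, which use --out test, it would be called as phe_summary test.phe.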

*.breed: genotypic values for the simulated population.

*.rnd: there are three columns. The first is the marker name, the second is the reference allele, and the third is its additive effect.

*.add: the genotype in additive model coding scheme.
