GENESIS: GRM to GWAS - pcgoddard/Burchardlab_Tutorials GitHub Wiki
GENESIS PC-Relate Tutorial
Pagé Goddard
Sep 28, 2017
Update: Oct 24, 2017
GENESIS
is an R-based pipeline developed by TOPMed. It uses PC-AiR
for population structure inference that is robust to known or cryptic relatedness, and it uses PC-Relate
for accurate relatedness estimation in the presence of population structure, admixutre, and departures from Hardy-Weinberg equilibrium.
Contents
- Resources
- Required Libraries
- 1. Input File Options
- 2. Calculating pairwise measures of ancestry divergence
- 3. PC-AiR estimation of population structure
- 4. PC-Relate estimation of recent relatedness
- 5. Convert to GCTA-friendly format
- GENESIS to GWAS Pipeline
Resources
Genesis
- uses sample HapMap data
- details on the concepts behind the programs used
External packages
Libraries
# be sure to install biocLite
source("https://bioconductor.org/biocLite.R")
biocLite(c('GENESIS',"GWASTools", "SNPRelate"))
libs <- c("GENESIS", "GWASTools", "SNPRelate", "gdsfmt")
lapply(libs, require, character.only = TRUE)
rm(libs)
Errors:
If you are having difficulty installing the necessary packages, you may need to update your R environment. To check version, run version()
. In R-studio, run updateR()
.
1. Input Files
GENESIS uses a model-free approach and thus requires only the genotype file and no externally calculated ancestry proportions. The functions in the GENESIS
package read genotype data from a GenotypeData class object created by the GWASTools
package. Through the use of GWASTools
, a GenotypeData
class object can easily be created from:
Your input can be in one of these three forms
- Plink Files (.bim .bed .fam) instructions
- GDS File instructions
- R Matrix of Genotype Data instructions
PLINK files
The SNPRelate
package provides the snpgdsBED2GDS
function to convert binary PLINK files into a GDS file.
file | description |
---|---|
bed.fn |
path to PLINK.bed file |
bim.fn |
path to PLINK.bim file |
fam.fn |
path to PLINK.fam file |
out.gdsfn |
path for output GDS file |
# convert to gds with SNPRelate
snpgdsBED2GDS(bed.fn = "genotype.bed", bim.fn = "genotype.bim", fam.fn = "genotype.fam", out.gdsfn = "mygenotype.gds")
# Then continue with GDS file instructions:
mygeno <- GdsGenotypeReader(filename = "PATH_TO/mygenotype.gds")
myenoData <- GenotypeData(geno)
GDS files
mygeno <- GdsGenotypeReader(filename = "PATH_TO/mygenotype.gds")
myenoData <- GenotypeData(geno)
R Matrix
file | description |
---|---|
genotype |
matrix of genotype values coded as 0 / 1 / 2; rows index SNPs; columns index samples |
snpID |
integer vector of unique SNP IDs |
chromosome |
integer vector specifying chr of each snp |
position |
integer vector psecifying position of each SNP |
scanID |
vector of unique individual IDs |
mygeno <- MatrixGenotypeReader(genotype = genotype, snpID = snpID, chromosome = chromosome, position = position, scanID = scanID)
mygenoData <- GenotypeData(mygeno)
2. Calculating pairwise measures of ancestry divergence
Identifying a mutually unrelated and ancestry representative subset of individuals. KING-robust
kinship coefficient estimator provides negative estimates for unrelated pairs with divergent ancestry to prioritize ancestrally-diverse individuals for the representative subset. Can be calculated with KING-robust
from GENESIS
or snpgdsIBDKING
from SNPRelate
.
with SNPRelate
prefered since it can be done without leaving R
The vignette states: "Alternative to running the KING software, the snpgdsIBDKING
function from the SNPRelate
package can be used to calculate the KING-robust estimates directly from a GDS file. The ouput of this function contains a matrix of pairwise estimates, which can be used by the GENESIS
functions" this is a lie You must extract the matrix and prepend the IIDs as row and column names.
Input: mygenotype.gds
Command: snpgdsIBDKING
Output: matrix of pairwise estimates
Options: type ("KING-robust" - for admixed pop,"KING-homo" - for homogeneous pop), sample.id (choose samples subset; default all), snp.id (choose SNPs subset; default all), autosome.only (default TRUE), maf (to use the SNPs with ">=maf" only; default no threshold) family.id (default NULL; all individuals treated as singletons. If provided, within- and between- family relationships are estimated differently), verbose
# snpgdsIBDKING requires a gds object; you cannot just point command to a .gds file
# read in the GDS file you just generated and verify its class
gdsfile <- snpgdsOpen(paste(genotype,".gds",sep=""))
class(gdsfile) #check for "gds.class"
# calculate KING IBD Kinship coefficients
ibd_king <- snpgdsIBDKING(gdsfile, type="KING-robust", verbose=TRUE)
class(ibd_king)
# snpgdsIBDClass (5 elements)
names(ibd_king)
# [1] "sample.id" "snp.id" "afreq" "IBS0" "kinship"
# extract kinship matrix
KINGmat = as.matrix(ibd_king$kinship)
# check output
KINGmat[1:5,1:5]
## 0.5000000000 -0.004695793 0.001941412 -0.0009816093 -0.01903618
## -0.0046957930 0.500000000 -0.009200767 -0.0070120462 -0.03430987
## 0.0019414117 -0.009200767 0.500000000 -0.0085147706 -0.01809720
## -0.0009816093 -0.007012046 -0.008514771 0.5000000000 -0.02480543
## -0.0190361844 -0.034309866 -0.018097203 -0.0248054274 0.50000000
#add row and column labels
rownames(KINGmat) <- c(ibd_king$sample.id)
colnames(KINGmat) <- c(ibd_king$sample.id)
# SAGE2 output using SNPRelate
KINGmat[1:5,1:5]
## CH30380 VA30171 VA30167 CH30357 VA70028
## CH30380 0.5000000000 -0.004695793 0.001941412 -0.0009816093 -0.01903618
## VA30171 -0.0046957930 0.500000000 -0.009200767 -0.0070120462 -0.03430987
## VA30167 0.0019414117 -0.009200767 0.500000000 -0.0085147706 -0.01809720
## CH30357 -0.0009816093 -0.007012046 -0.008514771 0.5000000000 -0.02480543
## VA70028 -0.0190361844 -0.034309866 -0.018097203 -0.0248054274 0.50000000
with KING-robust
program)
KING (external- input: PLINK binary ped bile
genotype.bed
- command: king -b
genotype.bed
--kinship - output:
king.kin
,king.kin0
KING Output: .kin
and .kin0
text files. king2mat
extracts kinship coefficients for GENESIS
functions.
king -b genotype.bed --kinship
# sample output
head king.kin # within family
## FID ID1 ID2 N_SNP Z0 Phi HetHet IBS0 Kinship Error
## 28 1 2 2359853 0.000 0.2500 0.162 0.0008 0.2459 0
## 28 1 3 2351257 0.000 0.2500 0.161 0.0008 0.2466 0
## 28 2 3 2368538 1.000 0.0000 0.120 0.0634 -0.0108 0
## 117 1 2 2354279 0.000 0.2500 0.163 0.0006 0.2477 0
## 117 1 3 2358957 0.000 0.2500 0.164 0.0006 0.2490 0
head king.kin0 # between family
## FID1 ID1 FID2 ID2 N_SNP HetHet IBS0 Kinship
## 28 3 117 1 2360618 0.143 0.0267 0.1356
## 28 3 117 2 2352628 0.161 0.0009 0.2441
## 28 3 117 3 2354540 0.120 0.0624 -0.0119
## 28 3 1344 1 2361807 0.093 0.1095 -0.2295
## 28 3 1344 12 2367180 0.094 0.1091 -0.2225
Convert to matrix with king2mat (GENESIS package)
- input:
.kin
and.kin0
andiid
- command:
king2mat
- output: matrix
iid
= text file containing iids in order from GenotypeData
# read individual IDs from GenotypeData object
iids <- getScanID(genoData)
head(iids)
# create matrix of KING estimates
KINGmat <- king2mat(file.kin0 = system.file("wrkdir", "data.kin0", package="GENESIS"),
file.kin = system.file("wrkdir", "data.kin", package="GENESIS"),
iids = iids)
# sample output
KINGmat[1:5,1:5]
## NA19919 NA19916 NA19835 NA20282 NA19703
## NA19919 0.5000 -0.0009 -0.0059 -0.0080 0.0014
## NA19916 -0.0009 0.5000 -0.0063 -0.0150 -0.0039
## NA19835 -0.0059 -0.0063 0.5000 -0.0094 -0.0104
## NA20282 -0.0080 -0.0150 -0.0094 0.5000 -0.0134
## NA19703 0.0014 -0.0039 -0.0104 -0.0134 0.5000
PC-AiR
3. Estimate principle components in related samples Uses pairwise measure sof kinship and ancestry divergence to determine the unrelated and representative subset for analysis.
The KING-robust estimates are always used as measures of ancestry divergence for unrelated pairs of individuals; can also be used as measures of kinship for relatives (NOTE: they may be biased measures of kinship for admixed relatives with different ancestry)
Input:
input | description |
---|---|
genoData |
GenotypeData class object |
kinMat |
matrix of pairwise kinship coefficient estimates (KING-robust estimates or other source) |
divMat |
matrix of pairwise measures of ancestry divergence (KING-robust estimates) |
Command:
# run PC-AiR
mypcair <- pcair(genoData = mygenoData, kinMat = KINGmat, divMat = KINGmat)
You should see the following verbage:
Partitioning Samples into Related and Unrelated Sets...
Unrelated Set: 1665 Samples
Related Set: 320 Samples
Running Analysis with 748665 SNPs - in 75 Block(s)
Computing Genetic Correlation Matrix for the Unrelated Set: Block 1 of 75 ...
...
Computing Genetic Correlation Matrix for the Unrelated Set: Block 75 of 75 ...
Performing PCA on the Unrelated Set...
Predicting PC Values for the Related Set: Block 1 of 75 ...
...
Predicting PC Values for the Related Set: Block 75 of 75 ...
Concatenating Results...
Output:
summary(mypcair)
Call:
pcair(genoData = myGenoData, kinMat = KINGmat, divMat = KINGmat)
PCA Method: PC-AiR
Sample Size: 1985
Unrelated Set: 1665 Samples
Related Set: 320 Samples
Kinship Threshold: 0.02209709
Divergence Threshold: -0.02209709
Principal Components Returned: 20
Eigenvalues: 11.286 2.216 1.95 1.924 1.914 1.866 1.849 1.837 1.815 1.797 ...
MAF Filter: 0.01
SNPs Used: 644529
Options
using a reference population alongside sample; perhaps to determine which PCs capture which ancestral groups
pcair(genoData = mygenoData, unrel.set = IDs) # IDs of individuals from ref panel included in mygenoDate
pcair(genoData = mygenoData, kinMat = KINGmat, divMat = KINGmat, unrel.set = IDs) # IDs of individuals from ref panel included in mygenoDate; partition IDs first, then sample
Plotting PC-Air PCs
plot
method provided by GENESIS
package. Each point represents one individual. Visualization of population structure to identify clusters of individuals with similar ancestry. Can by altered by standard plot
function manipulation. Basis and More Details
default: black dots = unrelated subset**;** blue pluses = related subsets
# plot top 2 PCs
plot(mypcair)
# plot PCs 3 and 4
plot(mypcair, vx = 3, vy = 4)
PC-Relate
estimation of recent relatedness
4.Provides genetic relatedness estimates. Uses top PCs from PC-Air (ancestry capturing components) to adjust for population structure & individual ancestry in sample.
Input:
input | file | description |
---|---|---|
training.set |
mypcair $unrels |
vector of IIDs specifying unrelated subset to be used for ancestry adjustment per SNP |
genoData |
mygenoData |
GenotypeData class object |
pcMat |
mypcair $vectors[,1:n] |
matrix; columns are PCs 1 - n |
Command:
# run PC-Relate
mypcrelate <- pcrelate(genoData = mygenoData, pcMat = mypcair$vectors[,1:2], training.set = mypcair$unrels)
You should see the following verbiage:
Running Analysis with 748665 SNPs - in 75 Block(s)
Running Analysis with 1985 Samples - in 1 Block(s)
Using 2 PC(s) in pcMat to Calculate Adjusted Estimates
Using 1665 Samples in training.set to Estimate PC effects on Allele Frequencies
Computing PC-Relate Estimates...
...SNP Block 1 of 75 Completed - 7.165 mins
...
...SNP Block 75 of 75 Completed - 5.876 mins
Performing Small Sample Correction...
(optional) format output as gds
The pcrelate function will either return an object of class pcrelate
(when the input write.to.gds = FALSE
) or it will save the output to a GDS file (when the input write.to.gds = TRUE
). Saving the output to a GDS file is useful for large samples, as it allows for efficient storage and access to the estimates (see the package gdsfmt
for more details). The following command can be used to read in the results from a previous PC-Relate analysis saved to the GDS file “tmp_pcrelate.gds”
write.to.gds = FALSE
= pcrelate obj (default)write.to.gds = TRUE
= gds filetmp_pcrelate.gds
to read tmp_pcrelate.gds
:
library(gdsfmt)
mypcrelate <- openfn.gds("tmp_pcrelate.gds")
Output:
command | output |
---|---|
pcrelateReadKinship |
make a table of pairwise relatedness estimates |
pcrelateReadInbreed |
make a table of individual inbreeding coeficients |
pcrelateMakeGRM |
make a genetic relatedness matrix |
option | description |
---|---|
pcrelObj |
output from pcrelate; either a class pcrelate object or a GDS file |
scan.include |
vector of individual IDs specifying which individuals to include in the table or matrix; default NULL |
kin.thresh |
minimum kinship coefficient value to include in the table; default NULL which includes all pairs |
f.thresh |
minimum inbreeding coefficient value to include in the table |
scaleKin |
factor to multiply the kinship coefficients by in the GRM; default 2 |
# make pairwise relatedness estimates table
relatepairs.tbl <- pcrelateReadKinship(pcrelObj = mypcrelate, kin.thresh = NULL)
## ID1 ID2 nsnp kin k0 k1 k2
## 7 NA19919 NA19908 19437 0.24668482 0.0015676062 1.0157801 -0.017347662
## 17 NA19919 NA19909 19626 0.24792942 0.0000000000 0.9911198 0.008880185
## 138 NA20282 NA20301 19765 0.32326286 0.0573556808 0.5780613 0.364583049
## 186 NA19902 NA19901 19640 0.25122134 0.0000000000 0.9883502 0.011649796
## 212 NA19902 NA19900 19771 0.23177090 0.0007574672 1.0228805 -0.023637964
# make inbreeding coefficient table
inbreedcoef.tbl <- pcrelateReadInbreed(pcrelObj = mypcrelate, f.thresh = 2^(-11/2))
## ID nsnp f
## 11 NA20335 19949 0.03210545
## 13 NA19904 19814 0.02276732
## 20 NA20317 19933 0.02302650
## 30 NA20294 19827 0.02705174
## 37 NA20344 19905 0.03607983
# make GRM
mygrm <- pcrelateMakeGRM(pcrelObj = mypcrelate, scan.include = iids[1:5], scaleKin = 2)
# grm output
mygrm[1:5,1:5]
## CH30380 VA30171 VA30167 CH30357 VA70028
## 0.9921236602 0.001587362 0.004589604 0.002636372 0.0008911428
## 0.0015873618 1.006070553 0.002143034 0.001831966 -0.0025865272
## 0.0045896042 0.002143034 1.002365658 -0.005183818 0.0049334393
## 0.0026363721 0.001831966 -0.005183818 1.003906724 0.0062427112
## 0.0008911428 -0.002586527 0.004933439 0.006242711 1.0005800629
quantile(mygrm)
## 0% 25% 50% 75% 100%
## -3.260058e-02 -3.014012e-03 -5.799979e-05 2.950375e-03 1.169809e+00
save files
# for .csv output: sep=',' and change suffix to ".csv"
write.table(mygrm, paste("GENESIS_Kincoef_matrix_",genotype,date,".txt",sep=""), row.names = F, quote = F)
write.table(relatepairs.tbl.nothresh, paste("GENESIS_relatedpairs_",genotype,date,".txt",sep=""), row.names = F, quote = F)
write.table(inbreedcoef.tbl.nothresh, paste("GENESIS_inbreedcoef_",genotype,date,".txt",sep=""), row.names = F, quote = F)
5. Convert to GCTA-friendly format
Our goal is to run a GWAS using our new GRM to correct for population structure and recent genetic relatedness in a more robust way than relying on cryptic relatedness filters in PLINK and correcting for % global African Ancestry.
Approach:
- convert GENESIS grm to the gcta grm.gz format
- use gcta to convert the grm.gz to a gcta-compatible grm
grm.gz format: Estimate the GRM, save the lower triangle elements to a compressed text file (e.g. test.grm.gz) and save the IDs in a plain text file (e.g. test.grm.id). source; grm.gz description
Output file format test.grm.gz (no header line; columns are indices of pairs of individuals (row numbers of the test.grm.id), number of non-missing SNPs and the estimate of genetic relatedness)
1 1 1000 1.0021
2 1 998 0.0231
2 2 999 0.9998
3 1 1000 -0.0031
...
test.grm.id (no header line; columns are family ID and individual ID)
011 0101
012 0102
013 0103
...
Reformat to GCTA-friendly .grm.gz format
## R
# get id file
ids <- as.data.frame(colnames(grm))
colnames(ids) <- "fid" # since our cohort is (supposedly) unrelated, fid=iid
ids$iid <- colnames(grm)
head(ids)
# fid iid
# CH30380 CH30380
# VA30171 VA30171
# VA30167 VA30167
# CH30357 CH30357
# VA70028 VA70028
# CH30420 CH30420
write.table(ids, paste(geno,"_",date,"_genesis.grm.id", sep=""), quote = F, row.names = F, col.names = F) # col.names=F excludes header row
# get pairwaise kinship measure in linear dataframe
grm[upper.tri(grm, diag = F)] <- NA # set upper triangle of matrix to NA
cbind <- cbind(which(!is.na(grm),arr.ind = TRUE),na.omit(as.vector(grm)))
cbind <- as.data.frame(cbind)
head(cbind)
# [rownames] row col V3
# CH30380 1 1 0.9921236602
# VA30171 2 1 0.0015873618
# VA30167 3 1 0.0045896042
# CH30357 4 1 0.0026363721
# VA70028 5 1 0.0008911428
# CH30420 6 1 0.0036559534
Now we are just missing the nsnp column indicating the number of non-missing snps between individuals. This information is stored in the pcrelate
object we created earlier.
# take a look
mypcrelate$nsnp[1:5,1:5]
# [,1] [,2] [,3] [,4] [,5]
# [1,] 628577 NA NA NA NA
# [2,] 613092 618998 NA NA NA
# [3,] 620244 613901 640392 NA NA
# [4,] 608682 608707 619293 623879 NA
# [5,] 618972 608727 630423 612669 688307
# get nsnp between samples
nsnp <- cbind(which(!is.na(mypcrelate$nsnp),arr.ind = TRUE),na.omit(as.vector(mypcrelate$nsnp)))
nsnp <- as.data.frame(nsnp)
head(nsnp)
# row col
# [1,] 1 1 628577
# [2,] 2 1 613092
# [3,] 3 1 620244
# [4,] 4 1 608682
# [5,] 5 1 618972
# [6,] 6 1 556901
# merge into gcta dataframe
gcta <- as.data.frame(cbind$row)
gcta$col <- cbind$col
gcta$nsnp <- nsnp$V3
gcta$kin <- cbind$V3
head(gcta)
# index$X1 X2 nsnp kin
# 1 2 613092 0.0015873618
# 1 3 620244 0.0045896042
# 1 4 608682 0.0026363721
# 1 5 618972 0.0008911428
# 1 6 556901 0.0036559534
# 1 7 578414 -0.0004089310
write.table(gcta, paste(geno,"_",date,"_genesis.grm.gz", sep=""), quote = F, row.names = F, col.names = F)
Convert from .grm.gz to gcta binary format
## commandline
geno="mydata.prefix"
bin="/media/BurchardRaid01/LabShare/Home/pgoddard/gcta1.26.0"
$bin/gcta64 --grm-gz ${geno} --make-grm --out ${geno}_gcta
You are now ready to do a GWAS in GCTA using your shiny new custom grm
GENESIS Pipeline
Here I have concatenated all the commands for the GENESIS grm development to gwas pipeline with minimal commenting. Pipeline assumes plink bindary files. Extensive explanations of each can be found above in the vignette.
## r
# set up
source("https://bioconductor.org/biocLite.R")
biocLite(c('GENESIS',"GWASTools", "SNPRelate"))
libs <- c("GENESIS", "GWASTools", "SNPRelate", "gdsfmt")
lapply(libs, require, character.only = TRUE)
rm(libs)
geno = "SAGE_mergedLAT-LATP_030816_rsIDs" # set to your plink data prefix
date=yymmdd # set date
setwd("wrkdir") # set wrkdir
# read in genotype data
snpgdsBED2GDS(bed.fn = paste(geno,".bed",sep=""), bim.fn = paste(geno,".bim",sep=""), fam.fn = paste(geno,".fam",sep=""), out.gdsfn = paste(geno,".gds",sep=""))
mygeno <- GdsGenotypeReader(paste(geno,".gds",sep=""))
mygenoData <- GenotypeData(mygeno)
close(mygeno)
## Pairwise measures of ancestry divergence
# calculate KING scores
gdsfile <- snpgdsOpen(paste(geno,".gds",sep="")) # read in GDS in SNPRelate format
class(gdsfile) # verify class "gds.class"
ibd_king <- snpgdsIBDKING(gdsfile, type="KING-robust", verbose=TRUE) # calculate king coefficients
KINGmat = as.matrix(ibd_king$kinship) # extract kinship matrix
rownames(KINGmat) <- c(ibd_king$sample.id)
colnames(KINGmat) <- c(ibd_king$sample.id)
# run pc-air
mypcair <- pcair(genoData = mygenoData, kinMat = KINGmat, divMat = KINGmat)
# plot PCs
plot(mypcair)
## Estimation of recent relatedness
# run pc-relate # this is VERY slow
mypcrelate <- pcrelate(genoData = myenoData, pcMat = mypcair$vectors[,1:2], training.set = mypcair$unrels)
# get kinship table, inbreeding table and grm
relatepairs.tbl.diag <- pcrelateReadKinship(pcrelObj = mypcrelate, kin.thresh = NULL)
inbreedcoef.tbl <- pcrelateReadInbreed(pcrelObj = mypcrelate, f.thresh = 2^(-11/2))
grm <- pcrelateMakeGRM(pcrelObj = mypcrelate, scaleKin = 2)
write.table(grm, paste("GENESIS_Kincoef_matrix_",geno,"_",date,".txt",sep=""), row.names = F, quote = F)
write.table(relatepairs.tbl, paste("GENESIS_relatedpairs_",geno,"_",date,".txt",sep=""), row.names = F, quote = F)
write.table(inbreedcoef.tbl, paste("GENESIS_inbreedcoef_",geno,"_",date,".txt",sep=""), row.names = F, quote = F)
## Reformat to GCTA-friendly .grm.gz format
ids <- as.data.frame(colnames(grm))
colnames(ids) <- "fid"
ids$iid <- colnames(grm)
# get pairwaise kinship measure in linear dataframe
grm[upper.tri(grm, diag = F)] <- NA
cbind <- cbind(which(!is.na(grm),arr.ind = TRUE),na.omit(as.vector(grm)))
# get nsnp between samples
nsnp <- cbind(which(!is.na(mypcrelate$nsnp),arr.ind = TRUE),na.omit(as.vector(mypcrelate$nsnp)))
nsnp <- as.data.frame(nsnp)
# merge into gcta dataframe
cbind <- as.data.frame(cbind)
gcta <- as.data.frame(cbind$row)
gcta$col <- cbind$col
gcta$nsnp <- nsnp$V3
gcta$kin <- cbind$V3
# save files
write.table(ids, paste(geno,"_",date,"_genesis.grm.id", sep=""), quote = F, row.names = F, col.names = F)
write.table(gcta, paste(geno,"_",date,"_genesis.grm.gz", sep=""), quote = F, row.names = F, col.names = F)
## commandline
# set variables
date="yymmdd" # update
geno="mydata.prefix_${date}" # point to your .grm.gz and .grm.id files
bin="/media/BurchardRaid01/LabShare/Home/pgoddard/gcta1.26.0" # wherever gcta is installed
phenofile="myphenotypes"
covarcat="my_categorical_covariates"
covarquant="my_numeric_covariates"
outname="${geno}_gcta_mlma"
tread="32" # adjust based on your system; typical is 10
# convert from .gz grm to gcta-ready grm
$bin/gcta64 --grm-gz ${geno}_genesis --make-grm --out ${geno}_gcta
# run GWAS in gcta
$bin/gcta64 --mlma --bfile $geno --grm $geno_gcta --pheno $phenofile --covar $covarcat --qcovar $covarquant --out $outname --thread-num $thread
# QED #
note: to look at just the first 5 rows and first 5 columns of matrix
# in shell
cut -d' ' -f 1-5 test.file | head -5 -
# in r
test.file[1:5,1:5]