GWAS built in functions - hello31337/BI-SGX GitHub Wiki
Qliphoth provides built-in functions for GWAS on VCF-format data uploaded by data owners.
Obtain the list of filenames of available VCF files that match the conditions (chromosome number, population, and disease) given as arguments. Note that these conditions are registered arbitrarily by data owners when they upload their VCF files. The list is written directly to the result message.
- string: Chromosome number. It should be 1 to 22, X, or Y.
- string: Population. For more info, see this website.
- string: Disease type.
If you pass an empty string for an argument, no condition is applied for that argument. For example, if you pass the empty string "" as the first argument, all chromosome numbers are included in the search.
The resulting filename list is written directly to the result string.
- int: Exit status. 0 means success, -1 means error.
status = searchAnnotation("21", "JPT", "")
if status != 0
println "Error has occurred while inquiring available VCF list."
end
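The empty-string rule above can be sketched in Python as follows. This is only an illustration of the matching logic, not the BI-SGX implementation: the `VcfEntry` record, the `inquire_vcf` helper, and the sample filenames are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class VcfEntry:
    # Hypothetical record of the context a data owner registered at upload time.
    filename: str
    chromosome: str
    population: str
    disease: str

def inquire_vcf(entries, chromosome="", population="", disease=""):
    """Return filenames whose registered context matches every non-empty condition."""
    def matches(value, condition):
        # An empty condition means "do not filter on this field".
        return condition == "" or value == condition

    return [
        e.filename
        for e in entries
        if matches(e.chromosome, chromosome)
        and matches(e.population, population)
        and matches(e.disease, disease)
    ]

# Made-up entries for illustration only.
entries = [
    VcfEntry("a.vcf", "21", "JPT", "none"),
    VcfEntry("b.vcf", "21", "CHB", "none"),
    VcfEntry("c.vcf", "X", "JPT", "diabetes"),
]
print(inquire_vcf(entries, "21", "JPT", ""))
```

Passing "" for all three arguments in this sketch returns every registered filename.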
Search the local database for the annotation whose reference ID (such as rsXXXXXXXXX) equals the given argument. You have to prepare a database of VCF-format data beforehand.
- string: Reference ID.
- int: Display mode. 0 or 1 can be used. 0 is for raw VCF format, 1 is for listed format.
- int: Whether to include disease information from the ClinVar DB table, if it exists.
- int: Exit status. 0 means success, -1 means error.
The matching annotations are written directly to the result string.
status = searchAnnotation("rs1337534909", 1, 1)
if status != 0
println "Error has occurred while searching annotation."
end
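As a rough illustration of the lookup and the two display modes, here is a sketch that uses an in-memory SQLite database as a stand-in for the local database. The table name, columns, `search_annotation` helper, and the sample row (including its placeholder ID) are assumptions made for this example, not the actual BI-SGX schema, and the ClinVar option is omitted.

```python
import sqlite3

FIELDS = ("CHROM", "POS", "ID", "REF", "ALT")

def search_annotation(conn, ref_id, mode=0):
    """Look up rows by reference ID; mode 0 = raw tab-separated line, mode 1 = listed."""
    rows = conn.execute(
        "SELECT chrom, pos, id, ref, alt FROM annotation WHERE id = ?", (ref_id,)
    ).fetchall()
    if mode == 0:
        # Raw format: one tab-separated line per record, as in a VCF body line.
        return ["\t".join(str(v) for v in row) for row in rows]
    # Listed format: one "FIELD: value" line per column.
    return ["\n".join(f"{k}: {v}" for k, v in zip(FIELDS, row)) for row in rows]

# Hypothetical table and made-up row for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE annotation (chrom TEXT, pos INTEGER, id TEXT, ref TEXT, alt TEXT)"
)
conn.execute("INSERT INTO annotation VALUES ('21', 10417440, 'rsEXAMPLE1', 'A', 'G')")

for entry in search_annotation(conn, "rsEXAMPLE1", mode=1):
    print(entry)
```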
Calculate the allele frequency of the personal genotype data in the uploaded VCF files that match the conditions given as arguments.
- string: Chromosome number. It should be 1 to 22, X, or Y.
- int: Position. Note that this is NOT a reference ID: it is the integer in the POS column of the VCF file, not the value of ID.
- string: Population. For more info, see this website.
- string: Disease type.
- int: Exit status. 0 means success, -1 means error.
The allele frequencies are written directly to the result string, one for each major/minor allele, together with the alleles themselves.
status = alleleFreq("21", 10417440, "JPT", "none")
if status != 0
println "Error has occurred while calculating allele frequency."
end
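For reference, allele frequency at a single position can be computed from VCF genotype (GT) fields roughly as below. This is a minimal sketch, not the code BI-SGX runs: the `allele_frequencies` helper and the sample genotypes are made up for illustration.

```python
from collections import Counter

def allele_frequencies(ref, alts, genotypes):
    """Count alleles over GT strings such as "0|1" or "1/1" and return frequencies."""
    alleles = [ref] + list(alts)  # GT index 0 is REF, 1.. are the ALT alleles
    counts = Counter()
    for gt in genotypes:
        for idx in gt.replace("|", "/").split("/"):
            if idx != ".":  # "." marks a missing call and is skipped
                counts[alleles[int(idx)]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {allele: counts[allele] / total for allele in alleles}

# Made-up genotypes of four diploid samples at one position (REF = A, ALT = G).
freqs = allele_frequencies("A", ["G"], ["0|0", "0|1", "1|1", "0|0"])
print(freqs)
```

In this sketch the allele with the higher frequency is the major allele and the other is the minor allele.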
Execute Fisher's exact test (FET) on the personal genotype data in the uploaded VCF files that match the conditions given as arguments. Currently, this function supports FET for differences between populations. The p-values are written directly to the result string.
- string: Chromosome number. It should be 1 to 22, X, or Y.
- string: Position(s). Note that this is NOT a reference ID: it is the integer in the POS column of the VCF file, not the value of ID. To run FET on multiple positions, separate them with ";" as in "1000;2000;3000".
- string: Two populations. Exactly 2 populations MUST be given; a single population or 3 or more populations are currently not allowed.
- string: Disease type to filter.
- int: Exit status. 0 means success, -1 means error.
The p-values are written directly to the result string.
status = fisherExactTest("21", "9411410;10417440", "JPT;CHB", "")
if status != 0
println "Error has occurred while executing Fisher's exact test."
end
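To show what a two-population FET at one position involves, here is a standard-library sketch of the two-sided test on a 2x2 table of allele counts. The page does not state how BI-SGX builds its contingency table, so the layout used here (rows = populations, columns = reference/alternate allele counts) and the counts themselves are assumptions.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Positions are passed as one ";"-separated string, as in the built-in function.
positions = [int(p) for p in "9411410;10417440".split(";")]

# Made-up allele counts: (ref, alt) for the first population, then for the second.
p_value = fisher_exact_two_sided(30, 10, 20, 20)
print(positions, p_value)
```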
Execute learning and prediction with logistic regression (LR) on the personal genotype data in the uploaded VCF files that match the conditions given as arguments. Currently, this function supports LR for differences between populations. It first trains the LR model on the designated positions using all available VCF files, then runs prediction with the trained model for the designated positions and population. The prediction results are written directly to the result string.
- string: Chromosome number. It should be 1 to 22, X, or Y.
- string: Position(s). Note that this is NOT a reference ID: it is the integer in the POS column of the VCF file, not the value of ID. To run LR on multiple positions, separate them with ";" as in "1000;2000;3000".
- string: Populations to predict with the trained model.
- string: Disease type to filter.
- int: Exit status. 0 means success, -1 means error.
The prediction results are written directly to the result string.
status = logisticRegression("21", "9411410;10417440", "JPT", "")
if status != 0
println "Error has occurred while executing logistic regression."
end
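The train-then-predict flow can be illustrated with a small logistic regression fitted by batch gradient descent. This is a sketch under stated assumptions rather than the BI-SGX implementation: the features are assumed to be alternate-allele dosages (0, 1, or 2) at the designated positions, the label is assumed to be membership in the target population, and the data, learning rate, and epoch count are made up.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Predicted probability that sample x belongs to the target population."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Made-up dosages at two positions; label 1 = sample is in the target population.
X = [[0, 0], [0, 1], [1, 0], [2, 2], [2, 1], [1, 2]]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(X, y)
print(predict(w, b, [2, 2]), predict(w, b, [0, 0]))
```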
Execute principal component analysis (PCA) on the personal genotype data in the uploaded VCF files that match the conditions given as arguments.
- string: Chromosome number. It should be 1 to 22, X, or Y.
- string: Position(s). Note that this is NOT a reference ID: it is the integer in the POS column of the VCF file, not the value of ID. To run PCA on multiple positions, separate them with ";" as in "1000;2000;3000".
- string: Population.
- string: Disease type to filter.
- int: Exit status. 0 means success, -1 means error.
The PCA results are written directly to the result string.
status = PCA("21", "9411410;10417440", "JPT", "")
if status != 0
println "Error has occurred while executing PCA."
end
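As an illustration of what PCA over several positions computes, the sketch below finds the first principal component of a samples-by-positions matrix using power iteration on the covariance matrix. Power iteration is a technique chosen here for brevity; the page does not say which method BI-SGX uses, and the dosage matrix is made up.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (unit direction, per-sample scores) of the first principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Deterministic, non-uniform start vector; power iteration converges to the
    # top eigenvector unless the start happens to be orthogonal to it.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance in the data, nothing to find
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Made-up alternate-allele dosages: rows are samples, columns are positions.
dosages = [[0, 0], [1, 1], [2, 2], [1, 1]]
pc, scores = first_principal_component(dosages)
print(pc, scores)
```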