GWAS built in functions - hello31337/BI-SGX GitHub Wiki

Qliphoth provides built-in functions for GWAS that operate on VCF-format data uploaded by data owners.

List of GWAS built-in functions

inquiryVCF

Obtains the list of filenames of available VCF files that match the conditions (chromosome number, population, and disease) given as arguments. Note that data owners register these conditions arbitrarily when they upload their VCF files. The list is written directly to the result message.

arguments

  • string: Chromosome number. It should be 1 to 22 or X or Y.
  • string: Population. For more info, see this website.
  • string: Disease type.

Passing an empty string for an argument means that no condition is set for that field. For example, if you pass the empty string "" as the first argument, files for all chromosome numbers are searched.
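The empty-string rule can be pictured as a wildcard match over the metadata registered with each file. The Python sketch below only illustrates that rule and is not BI-SGX code: the record layout, the names `VcfRecord` and `match_vcf_files`, and the use of exact string equality for non-empty conditions are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class VcfRecord:
    # Hypothetical metadata a data owner registers at upload time.
    filename: str
    chromosome: str
    population: str
    disease: str

def match_vcf_files(records, chromosome="", population="", disease=""):
    """Return filenames whose metadata match every non-empty condition."""
    def ok(condition, value):
        # An empty condition places no constraint on that field.
        return condition == "" or condition == value
    return [r.filename for r in records
            if ok(chromosome, r.chromosome)
            and ok(population, r.population)
            and ok(disease, r.disease)]

# Made-up records, for illustration only.
records = [
    VcfRecord("a.vcf", "21", "JPT", "none"),
    VcfRecord("b.vcf", "21", "CHB", "none"),
    VcfRecord("c.vcf", "X", "JPT", "none"),
]
print(match_vcf_files(records, "21", "JPT", ""))
```

With these made-up records, leaving the disease argument empty still narrows the result by chromosome and population, while passing three empty strings would list every file.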

return value

  • int: Exit status. 0 means success, -1 means error.

example

status = inquiryVCF("21", "JPT", "")

if status != 0
   println "Error has occurred while inquiring available VCF list." 
end



searchAnnotation

Searches the local database for the annotation whose reference ID (like rsXXXXXXXXX) equals the given argument. You have to prepare a database of VCF-format data.

arguments

  • string: Reference ID.
  • int: Display mode, 0 or 1. 0 is for raw VCF format, 1 is for listed format.
  • int: Whether to include information about diseases from the ClinVar DB table, if it exists.

return value

  • int: Exit status. 0 means success, -1 means error.

The annotations found are written directly to the result string.

example

status = searchAnnotation("rs1337534909", 1, 1)

if status != 0
   println "Error has occurred while searching annotation." 
end



alleleFreq

Calculates allele frequencies from the personal genotype data in uploaded VCF files that match the conditions given as arguments.

arguments

  • string: Chromosome number. It should be 1 to 22 or X or Y.
  • int: Position. Note that this is NOT a reference ID: it is the integer in the POS field of the VCF file, not the value of the ID field.
  • string: Population. For more info, see this website.
  • string: Disease type.

return value

  • int: Exit status. 0 means success, -1 means error.

The allele frequencies obtained for each major/minor allele are written directly to the result string, together with the alleles themselves.
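As a rough illustration of what an allele frequency at one position means, the Python sketch below counts alleles over VCF-style GT strings. It is not the BI-SGX implementation; the function name `allele_frequencies`, its inputs, and the handling of missing calls are assumptions made for this example.

```python
from collections import Counter
import re

def allele_frequencies(ref, alt, genotypes):
    """Count alleles over GT strings such as "0|1" or "1/1" and return
    {allele: frequency}. Missing calls (".") are skipped."""
    # Index 0 is REF; indices 1.. are the comma-separated ALT alleles.
    alleles = [ref] + alt.split(",")
    counts = Counter()
    for gt in genotypes:
        for index in re.split(r"[|/]", gt):
            if index != ".":
                counts[alleles[int(index)]] += 1
    total = sum(counts.values())
    return {a: counts[a] / total for a in alleles} if total else {}

# Four made-up diploid samples at a single position.
freqs = allele_frequencies("A", "G", ["0|0", "0|1", "1|1", "0|0"])
print(freqs)
```

In this toy input, A appears 5 times and G 3 times out of 8 called alleles, so A would be the major allele and G the minor one.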

example

status = alleleFreq("21", 10417440, "JPT", "none")
if status != 0
   println "Error has occurred while calculating allele frequency." 
end



fisherExactTest

Executes Fisher's exact test (FET) on the personal genotype data in uploaded VCF files that match the conditions given as arguments. Currently, this function supports FET for differences between populations. p-values are written directly to the result string.

arguments

  • string: Chromosome number. It should be 1 to 22 or X or Y.
  • string: Position(s). Note that this is NOT a reference ID: it is the integer in the POS field of the VCF file, not the value of the ID field. To run FET for multiple positions, separate them with ; like "1000;2000;3000".
  • string: Two populations, separated by ; like "JPT;CHB". Exactly 2 populations MUST be given; 1 population or 3 or more populations are currently not allowed.
  • string: Disease type to filter.
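The semicolon-separated argument format can be sketched in Python as follows. This is only an illustration of the format described above, not BI-SGX code; the helper names `parse_positions` and `parse_two_populations` are hypothetical, and raising an exception for an invalid population count is an assumption made for the example.

```python
def parse_positions(positions):
    """Split "1000;2000;3000" into [1000, 2000, 3000]."""
    return [int(p) for p in positions.split(";")]

def parse_two_populations(populations):
    """Split "JPT;CHB" and insist on exactly two non-empty entries."""
    names = populations.split(";")
    if len(names) != 2 or "" in names:
        raise ValueError("exactly two populations are required")
    return names[0], names[1]

print(parse_positions("9411410;10417440"))
print(parse_two_populations("JPT;CHB"))
```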

return value

  • int: Exit status. 0 means success, -1 means error.

The p-values obtained are written directly to the result string.
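For readers unfamiliar with the test, the following Python sketch computes a two-sided Fisher's exact p-value for a single 2x2 table. It is a generic textbook formulation, not the BI-SGX implementation; how BI-SGX builds the table is not stated here, so the layout assumed below (one row per population, one column per allele) is purely illustrative, as is the function name `fisher_exact_two_sided`.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed
    # one; the small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))

# Assumed layout: rows are two populations, columns are two alleles.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p_value, 5))
```

When several positions are passed, one such table per position would yield one p-value per position, under the same assumption about the table layout.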

example

status = fisherExactTest("21", "9411410;10417440", "JPT;CHB", "")
if status != 0
   println "Error has occurred while executing Fisher's exact test." 
end



logisticRegression

Trains a logistic regression (LR) model and runs prediction with it, using the personal genotype data in uploaded VCF files that match the conditions given as arguments. Currently, this function supports LR for differences between populations. It first trains the model on the designated positions using all available VCF files, then runs prediction with the trained model for the designated positions and population. Prediction results are written directly to the result string.

arguments

  • string: Chromosome number. It should be 1 to 22 or X or Y.
  • string: Position(s). Note that this is NOT a reference ID: it is the integer in the POS field of the VCF file, not the value of the ID field. To run LR for multiple positions, separate them with ; like "1000;2000;3000".
  • string: Populations to predict with the trained model.
  • string: Disease type to filter.

return value

  • int: Exit status. 0 means success, -1 means error.

The prediction results obtained are written directly to the result string.
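The train-then-predict flow can be sketched with a tiny logistic regression in plain Python. This is a generic illustration under stated assumptions, not the BI-SGX implementation: the feature encoding (alternate-allele counts 0, 1, or 2 at each designated position), the labels (1 for the population of interest, 0 otherwise), the batch gradient descent settings, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(weights, bias, x):
    """Probability that sample x belongs to the positive class."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def train_logistic_regression(features, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict_probability(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Made-up data: alternate-allele counts at two positions per sample.
features = [[0, 0], [0, 1], [1, 0], [2, 2], [2, 1], [1, 2]]
labels = [0, 0, 0, 1, 1, 1]  # 1 = population of interest (assumed encoding)

weights, bias = train_logistic_regression(features, labels)
print(predict_probability(weights, bias, [2, 2]) > 0.5)
```

Training uses every sample, while prediction is then run only for the samples of interest, which mirrors the two phases described for this function.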

example

status = logisticRegression("21", "9411410;10417440", "JPT", "")
if status != 0
   println "Error has occurred while executing logistic regression." 
end



PCA

Executes principal component analysis (PCA) on the personal genotype data in uploaded VCF files that match the conditions given as arguments.

arguments

  • string: Chromosome number. It should be 1 to 22 or X or Y.
  • string: Position(s). Note that this is NOT a reference ID: it is the integer in the POS field of the VCF file, not the value of the ID field. To run PCA for multiple positions, separate them with ; like "1000;2000;3000".
  • string: Population.
  • string: Disease type to filter.

return value

  • int: Exit status. 0 means success, -1 means error.

The results of PCA are written directly to the result string.

example

status = PCA("21", "9411410;10417440", "JPT", "")
if status != 0
   println "Error has occurred while executing PCA." 
end