de novo Generate HMMs - SysBioChalmers/RAVEN GitHub Wiki
de novo Generate Hidden Markov Models
Users with their own KEGG FTP subscription can use KEGG FTP dump files as input to getKEGGModelForOrganism. This makes it possible to reconstruct the model from the latest KEGG version available at the time, while tuning all the parameter settings in getKEGGModelForOrganism for the best result. To do so, first delete all kegg***.mat files from RAVENdir/external/kegg, then put the following files into the same directory:
reaction. Can be retrieved from /kegg/ligand/reaction.tar.gz.
reaction.lst. Can be retrieved from /kegg/ligand/reaction.tar.gz.
reaction_mapformula.lst. Can be retrieved from /kegg/ligand/reaction.tar.gz.
compound. This file should be concatenated from two source files. The first file is compound and can be retrieved from /kegg/ligand/compound.tar.gz. The second file is glycan and can be retrieved from /kegg/ligand/glycan.tar.gz.
compound.inchi. Can be retrieved from /kegg/ligand/compound.tar.gz.
ko. Can be retrieved from /kegg/genes/ko.tar.gz.
genes.pep. This file should be concatenated from two source files. The first file is eukaryotes.pep and can be retrieved from /kegg/genes/fasta/eukaryotes.pep.gz. The second file is prokaryotes.pep and can be retrieved from /kegg/genes/fasta/prokaryotes.pep.gz.
taxonomy. Can be retrieved from /kegg/genes/misc/taxonomy.
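The file preparation described above can be sketched in MATLAB as follows. This is a minimal illustration, not part of RAVEN. It assumes the KEGG archives have already been downloaded and extracted into a single folder. The names srcDir and keggDir are hypothetical, and keggDir must be changed to the real location of RAVENdir/external/kegg. The sketch copies the files that are used unchanged, builds compound and genes.pep by concatenation, and then checks that all eight files are present. It uses only built-in MATLAB functions and does not delete the old kegg***.mat files, which still has to be done separately.

```matlab
% Hypothetical locations (assumptions, adjust to your own setup).
srcDir  = 'keggDump';                              % extracted KEGG FTP files
keggDir = fullfile('RAVENdir','external','kegg');  % target directory named in the text

% Files that are used as they are.
plainFiles = {'reaction','reaction.lst','reaction_mapformula.lst', ...
              'compound.inchi','ko','taxonomy'};
for i = 1:numel(plainFiles)
    copyfile(fullfile(srcDir,plainFiles{i}), fullfile(keggDir,plainFiles{i}));
end

% Files that are concatenated from two sources: {target, first, second}.
merged = {'compound',  'compound',       'glycan'; ...
          'genes.pep', 'eukaryotes.pep', 'prokaryotes.pep'};
chunkSize = 2^24;  % copy in 16 MiB chunks, since the .pep files can be very large
for i = 1:size(merged,1)
    out = fopen(fullfile(keggDir,merged{i,1}),'w');
    assert(out ~= -1, 'Cannot open %s for writing', merged{i,1});
    for j = 2:3
        in = fopen(fullfile(srcDir,merged{i,j}),'r');
        assert(in ~= -1, 'Cannot open source file %s', merged{i,j});
        while ~feof(in)
            fwrite(out, fread(in, chunkSize, '*uint8'));
        end
        fclose(in);
    end
    fclose(out);
end

% Verify that every required file is now in place.
required = [plainFiles, merged(:,1)'];
isThere  = cellfun(@(f) exist(fullfile(keggDir,f),'file') == 2, required);
if ~all(isThere)
    error('Missing KEGG files: %s', strjoin(required(~isThere), ', '));
end
```

Keeping the extracted sources in a separate folder avoids overwriting the original compound file while it is being combined with glycan.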
Once all the files are in place, the user can immediately run the reconstruction, e.g.:
model=getKEGGModelForOrganism('hsa','inputFasta.fa','inputDirectory','outputDirectory',true,true,true,true,10^-50,0.8,0.3,-1,inf,1);