4.4.5 WebLogo - WangLabTHU/GPro GitHub Wiki
hcwang and qxdu edited on Aug 4, 2023, 1 version
Seqlogo is an enhanced version of the saliency map that gives a clearer view of the predictor's base preference at each site in the sequence. We also provide a function in utils/base.py that draws sequence logos directly from a set of sequences.
Caution: The sequence and expression files used here should follow the same format as in the QuickStart section.
params | description | default value |
---|---|---|
predictor | the trained predictor class | |
predictor_modelpath | path to the pretrained model checkpoint, in "x/xxx.pth" format | |
predictor_training_datapath | path to the natural sequences; the predictor's training set is the best choice | |
predictor_expression_datapath | path to the expression levels corresponding to the sequences in predictor_training_datapath | |
report_path | saving folder | |
file_tag | saving name | |
num_seqs_to_test | number of sequences sampled for frequency comparison | 200 |
plot_mode | to_type mode of logomaker | "saliency" |
import os
from gpro.evaluator.seqlogo import plot_seqlogos
from gpro.predictor.cnn_k15.cnn_k15 import CNN_K15_language
project_path = "your project path"
# Natural sequences, ideally the predictor's own training set
predictor_training_datapath = os.path.join(project_path, 'data/diffusion_promoter/sequence_data.txt')
# Trained predictor and its checkpoint
predictor = CNN_K15_language(length=50)
predictor_modelpath = os.path.join(project_path, 'checkpoints/cnn_k15/checkpoint.pth')
# Keyword arguments keep the call independent of the positional parameter order
plot_seqlogos(predictor=predictor,
              predictor_training_datapath=predictor_training_datapath,
              predictor_modelpath=predictor_modelpath,
              report_path="./results/", file_tag="CNNK15")
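To build intuition for what a per-site base preference means, the sketch below scores every single-base substitution of a sequence against a toy scoring function. This is in-silico mutagenesis, a simpler technique than a gradient-based saliency map, and it is not a description of how GPro computes its seqlogo. The names `toy_predict` and `base_preference` are hypothetical and exist only for this illustration.

```python
BASES = "ACGT"


def toy_predict(seq):
    """Hypothetical stand-in for a trained predictor (NOT a GPro model):
    the score is the number of times the motif TATA occurs in seq."""
    return float(sum(seq[i:i + 4] == "TATA" for i in range(len(seq) - 3)))


def base_preference(seq, predict=toy_predict):
    """Return one dict per position, mapping each base to the change in
    predicted score when that base is substituted at that position."""
    reference = predict(seq)
    profile = []
    for pos in range(len(seq)):
        effects = {}
        for base in BASES:
            mutant = seq[:pos] + base + seq[pos + 1:]
            effects[base] = predict(mutant) - reference
        profile.append(effects)
    return profile


profile = base_preference("GGTATAGG")
# Substituting the original base leaves the score unchanged, while
# breaking the motif (for example T -> A at index 2) lowers it.
print(profile[2])
```

A real predictor would replace `toy_predict`; positions where substitutions change the score strongly are the ones a logo of this kind would emphasize.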
A direct version for simple sequence analysis is shown below:
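from gpro.utils.base import plot_weblogos
# Assumption: the original snippet does not show where open_fa comes from;
# it is imported here from gpro.utils.utils_predictor, adjust if it lives elsewhere
from gpro.utils.utils_predictor import open_fa
project_path = "your project path"
seqs_datapath = project_path + '/data/diffusion_prediction/seq.txt'
# Load the sequences, then draw the logo to the given image path
seqs = open_fa(seqs_datapath)
plot_weblogos("./results/test.png", seqs)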
You will see the following output:
Results are highly consistent with https://weblogo.berkeley.edu/logo.cgi
The final result will be saved in the ./results directory.
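For readers who want to check a logo by hand, WebLogo-style plots are conventionally scaled by per-position information content in bits. The following is a minimal sketch of that calculation using only the standard library. It assumes equal-length DNA sequences and omits the small-sample correction that WebLogo applies; whether `plot_weblogos` uses the same scaling is not stated here. The function name `information_content` is hypothetical.

```python
import math
from collections import Counter


def information_content(seqs, alphabet="ACGT"):
    """Per-position information content in bits for equal-length sequences.

    Uses IC = log2(len(alphabet)) - H, where H is the Shannon entropy of
    the observed base frequencies. No small-sample correction is applied.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all sequences must have the same length")
    max_bits = math.log2(len(alphabet))
    result = []
    for column in zip(*seqs):  # iterate over alignment columns
        counts = Counter(column)
        total = len(column)
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
        result.append(max_bits - entropy)
    return result


ic = information_content(["TATA", "TATA", "TACA", "TAGA"])
print(ic)
```

Fully conserved positions reach the 2-bit maximum for DNA, while variable positions score lower, which is why conserved bases appear as tall letters in a logo.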