4.3.2 Genetic - WangLabTHU/GPro GitHub Wiki

hcwang and qxdu edited on Aug 4, 2023, 1 version

Introduction

The genetic algorithm (GA), developed by John Holland and his collaborators in the 1960s and 1970s, is a model or abstraction of biological evolution based on Charles Darwin's theory of natural selection. Holland was probably the first to use crossover and recombination, mutation, and selection in the study of adaptive and artificial systems. These genetic operators form the essential part of the genetic algorithm as a problem-solving strategy. Since then, many variants of genetic algorithms have been developed and applied to a wide range of optimization problems, from graph coloring to pattern recognition, from discrete systems (such as the travelling salesman problem) to continuous systems (e.g., the efficient design of airfoils in aerospace engineering), and from financial markets to multi-objective engineering optimization.

A genetic algorithm can be used to optimize the latent (hidden) space of our WGAN model. A schematic diagram of the workflow is shown below [1].

Caution: The current algorithm defaults to using the WGAN generator and the CNNK15 predictor. Please provide models that you have already trained. The program will then search the hidden space for the most effective solutions.
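The idea of searching a hidden space with a genetic algorithm can be illustrated with a minimal sketch. This is not GPro's implementation: the function `evolve_latents`, its parameters (`elite_frac`, `new_frac`, `mut_sigma`) and the toy fitness function are all hypothetical, and the `fitness` callable merely stands in for the generator plus predictor pipeline. It only shows how selection, crossover and mutation might act on latent vectors.

```python
import random


def evolve_latents(fitness, z_dim=8, pop_size=40, n_iter=50,
                   elite_frac=0.25, new_frac=0.25, mut_sigma=0.1, seed=0):
    """Toy genetic algorithm over latent vectors (lists of floats).

    `fitness` maps a latent vector to a score to maximise. All names and
    defaults here are illustrative assumptions, not GPro's API.
    """
    rng = random.Random(seed)

    def new_vec():
        # Sample a fresh latent vector from a standard normal distribution.
        return [rng.gauss(0.0, 1.0) for _ in range(z_dim)]

    population = [new_vec() for _ in range(pop_size)]
    n_elite = max(2, int(elite_frac * pop_size))
    n_new = int(new_frac * pop_size)
    history = []

    for _ in range(n_iter):
        # Selection: rank by fitness and keep the best individuals unchanged.
        population.sort(key=fitness, reverse=True)
        elites = population[:n_elite]
        history.append(fitness(elites[0]))

        children = []
        while len(children) < pop_size - n_elite - n_new:
            a, b = rng.sample(elites, 2)
            # Uniform crossover: each coordinate comes from one of the parents.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # Gaussian mutation: small random perturbation of every coordinate.
            child = [x + rng.gauss(0.0, mut_sigma) for x in child]
            children.append(child)

        # Fresh random vectors keep some diversity in the population.
        population = elites + children + [new_vec() for _ in range(n_new)]

    population.sort(key=fitness, reverse=True)
    return population[0], history


# Toy fitness: the closer every coordinate is to 1.0, the higher the score.
def toy_fitness(z):
    return -sum((x - 1.0) ** 2 for x in z)


best, history = evolve_latents(toy_fitness)
print("best score at first and last iteration:", history[0], history[-1])
```

Because the elites are carried over unchanged, the best score recorded in `history` never decreases from one iteration to the next. In a real setting the fitness evaluation (generating sequences and predicting their expression) would dominate the running time.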

Input Parameters

Initialization params

| params | description | default value |
| --- | --- | --- |
| generator_modelpath | trained model path of generator | None |
| predictor_modelpath | trained model path of predictor | None |
| natural_datapath | natural sequences datapath | None |
| sample_number | default sampling scale at each epoch | None |
| savepath | final results saving directory | None |
| z_dim | dimension of hidden state for WGAN model | 128 |
| seq_len | sequence length | 50 |

Running params

| params | description | default value |
| --- | --- | --- |
| P_rep | dropping rate of delRep | 0.3 |
| P_new | New generation scale | 0.25 |
| P_elite | Elite Probability in Evolutionary Algorithms | 0.25 |
| MaxIter | Maximum Iteration epoch | 1000 |
| MaxPoolsize | length of final selecting results | 2000 |

Demo

Before running the optimizer, you should have trained a generator and a predictor.

A simple demo looks like this:

import os

from gpro.optimizer.heuristic.genetic import GeneticAlgorithm

# (1) define the generator
default_root = "your working directory"
generator_modelpath = os.path.join(default_root, 'checkpoints/wgan/checkpoints/net_G_12.pth')

# (2) define the predictor
predictor_modelpath = os.path.join(default_root, 'checkpoints/cnn_k15/checkpoint.pth')

# (3) select the highly-expressed sequences (natural data)
natural_datapath = os.path.join(default_root, 'data/diffusion_prediction/seq.txt')

# (4) build the optimizer; results are written to savepath
tmp = GeneticAlgorithm(generator_modelpath=generator_modelpath, predictor_modelpath=predictor_modelpath,
                       natural_datapath=natural_datapath, savepath="./optimization/Genetic")

# (5) run the optimization
tmp.run()
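Since the optimizer depends on a trained generator and predictor, it can help to confirm that the checkpoint and data files exist before starting a long run. The helper below is a small sketch for that purpose; `missing_paths` is a hypothetical name and is not part of GPro.

```python
import os


def missing_paths(*paths):
    """Return the subset of `paths` that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


# Hypothetical usage with the variables from the demo:
# missing = missing_paths(generator_modelpath, predictor_modelpath, natural_datapath)
# if missing:
#     raise FileNotFoundError("Missing inputs: " + ", ".join(missing))
```

Collecting every missing path at once, rather than failing on the first one, makes it easier to fix all of the inputs in a single pass.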

Results

The resulting files consist of compared_with_natural.pdf, each_iter_distribution.pdf, ExpIter.txt, and ExpIter.csv.

| files | description |
| --- | --- |
| compared_with_natural.pdf | Box plot comparing model generated results with natural results |
| each_iter_distribution.pdf | Record a boxplot of the improvement effect every 100 epochs |
| ExpIter.txt | Save the FASTA file for the final result sequence |
| ExpIter.csv | Save the sequences and predictions for the final result sequence. Store every 100 epochs. |
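To inspect the final sequences programmatically, the FASTA output can be parsed with a few lines of Python. The sketch below assumes that ExpIter.txt follows the standard FASTA layout (a header line starting with `>` followed by one or more sequence lines); `read_fasta` is a hypothetical helper, and the header names in the example are invented for illustration.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Example with made-up records; for real output, pass an open file handle
# instead, e.g. open(os.path.join(savepath, "ExpIter.txt")) (path assumed).
example = ">seq_0\nACGT\nTTGA\n>seq_1\nGGCC\n".splitlines()
print(read_fasta(example))
```

The function accepts any iterable of lines, so it works equally on an open file or on a list of strings.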

A box plot for compared_with_natural.pdf is shown below.

A box plot for each_iter_distribution.pdf is shown below.

Citations

[1] Woodward, Robert & Kelleher, Edmund. (2016). Towards 'smart lasers': Self-optimisation of an ultrafast pulse source using a genetic algorithm. Scientific Reports, 6. doi:10.1038/srep37616.