VGALize - nordinzakaria/V-GA-lize GitHub Wiki

What is V-GA-lize?

A long-standing problem with randomized, metaheuristic optimization algorithms such as the Genetic Algorithm (GA) is that, while their results can be competitive with those of deterministic approaches such as linear programming and approximation algorithms, they elude analysis. This makes it difficult to explain why a GA sometimes reaches an optimal solution and at other times ends far from optimality. There has been theoretical work on this question, but it generally assumes toy problems.

V-GA-lize is a visualization approach to this problem. While it is still a work in progress (contributions welcome), the vision is to enable dashboard-like analytics of a GA output dump. The dump is assumed to come from a multi-population GA and to comprise the populations of individuals produced by the GA at each generation. The idea is that by analyzing this data, one can gain insight into the inner workings of a GA on a particular problem.

GA Dump Format

The GA data comprises <numpop> populations. A population has <numgen> generations. A generation consists of <numcluster> clusters, and a cluster contains <numind> individuals. The values above (in <>) can vary for each population, generation, or cluster.

An individual contains <numfitness> fitness values and comprises <numgenes> genes. <numfitness> is fixed globally, that is, for all individuals across all populations, generations, and clusters. <numgenes> can vary for each individual.

The fitness values in a dataset can be inverted: value = max_value - value. To do so, set <invert_fitness> to 1 for that fitness value; 0 indicates no inversion.
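
The inversion rule can be sketched in Python as follows. This is only an illustration: the function name `invert_fitness` is hypothetical, and the choice of max_value (here, the largest value in the list unless one is passed in) is an assumption, since the rule above does not say how max_value is determined.

```python
def invert_fitness(values, max_value=None):
    """Return max_value - v for every fitness value v.

    Assumption: when max_value is not supplied, the largest value in
    `values` is used. The wiki does not state how max_value is chosen.
    """
    if max_value is None:
        max_value = max(values)
    return [max_value - v for v in values]


# Fitness values of the kind found in a dump, with <invert_fitness> set to 1.
qualities = [0.4, 0.25, 0.1]
inverted = invert_fitness(qualities)
```

After inversion the ordering is reversed: the individual that had the largest value now has the smallest one.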

The data type of the genes is given by <chromosome-type>, which can be one of the following integers:

  • 0 : FLOAT
  • 1 : INTEGER
  • 2 : DOUBLE
  • 3 : CHAR
  • 4 : STRING
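
A reader of the dump might map these codes to conversions along the following lines. This is a minimal sketch: the names `GeneType`, `CONVERTERS`, and `parse_gene` are hypothetical, and treating FLOAT and DOUBLE identically (both as Python `float`) is an assumption made for illustration.

```python
from enum import IntEnum


class GeneType(IntEnum):
    """Integer codes for gene data types, as listed above."""
    FLOAT = 0
    INTEGER = 1
    DOUBLE = 2
    CHAR = 3
    STRING = 4


# Hypothetical converters from a raw text token to a Python value.
# Assumption: FLOAT and DOUBLE both become Python floats.
CONVERTERS = {
    GeneType.FLOAT: float,
    GeneType.INTEGER: int,
    GeneType.DOUBLE: float,
    GeneType.CHAR: str,
    GeneType.STRING: str,
}


def parse_gene(token, type_code):
    """Convert one gene token according to the integer type code."""
    return CONVERTERS[GeneType(type_code)](token)
```

For example, `parse_gene("0.12", 0)` yields a float, while `parse_gene("7", 1)` yields an integer. An unknown code raises `ValueError` from the `GeneType(...)` lookup.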

The GA data file is assumed to be in the following format (ignore new lines, curly brackets and indentation - those are just to make the following readable):

<chromosome-type>
<numfitness> numfitness*<fitness names>  numfitness*<invert_fitness>
<numpop>
 numpop*{
          <numgen>
           numgen*{
                    <numcluster>
                     numcluster*{
                                  <numind>
                                   numind*{
                                            numfitness*<fitness-value> <rank> <p0> <p1> <numgenes> numgenes*<values>
                                          }
                                }
                  }
        }
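
Because line breaks are not significant, a file in this layout can be read as a flat stream of whitespace-separated tokens. The following Python sketch does that. The function name `parse_ga_dump` and the dictionary keys are hypothetical. It assumes that fitness names and gene values contain no whitespace, that fitness values are numeric, and that <rank>, <p0>, and <p1> are integers. Genes are kept as raw strings rather than converted by type.

```python
def parse_ga_dump(text):
    """Parse GA dump text in the layout above into nested Python data."""
    tokens = iter(text.split())  # line breaks and indentation are ignored

    def next_int():
        return int(next(tokens))

    chromosome_type = next_int()
    numfitness = next_int()
    fitness_names = [next(tokens) for _ in range(numfitness)]
    invert_flags = [next_int() for _ in range(numfitness)]

    populations = []
    for _ in range(next_int()):                # <numpop>
        generations = []
        for _ in range(next_int()):            # <numgen>
            clusters = []
            for _ in range(next_int()):        # <numcluster>
                individuals = []
                for _ in range(next_int()):    # <numind>
                    fitness = [float(next(tokens)) for _ in range(numfitness)]
                    rank, p0, p1 = next_int(), next_int(), next_int()
                    numgenes = next_int()
                    genes = [next(tokens) for _ in range(numgenes)]  # raw strings
                    individuals.append({"fitness": fitness, "rank": rank,
                                        "p0": p0, "p1": p1, "genes": genes})
                clusters.append(individuals)
            generations.append(clusters)
        populations.append(generations)

    return {"chromosome_type": chromosome_type,
            "fitness_names": fitness_names,
            "invert_flags": invert_flags,
            "populations": populations}


# Made-up minimal input: 1 population, 1 generation, 1 cluster, 1 individual.
sample = "0 1 Quality 1 1 1 1 1 0.4 0 -1 -1 2 0.1 0.2"
data = parse_ga_dump(sample)
first = data["populations"][0][0][0][0]  # population, generation, cluster, individual
```

The nesting of the result mirrors the file: `data["populations"][p][g][c][i]` is individual `i` of cluster `c` in generation `g` of population `p`. Truncated input is not handled gracefully in this sketch; it surfaces as a `StopIteration` error.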

An example data file follows, for a GA with float-type genes, 1 fitness value named "Quality" (to be inverted), 1 population, 2 generations, and, at each generation, 1 cluster of 3 individuals with 2 genes each:

0 
1 Quality 1
1
2
1 3
0.4 0 -1 -1 2 0.1 0.2
0.25 1 -1 -1 2 0.12 0.3
0.1 2 -1 -1 2 0.01 0.3
1 3
0.5 0 0 1 2 0.12 0.21
0.3 1 1 2 2 0.22 0.13
0.1 2 0 2 2 0.31 0.23
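
A GA implementation could emit such a file with a small writer like the Python sketch below. The function name `format_ga_dump` and the tuple layout used for individuals are hypothetical. The writer places every count on its own line, which differs from the line layout of the example above but yields the same token sequence, and line breaks are not significant in this format.

```python
def format_ga_dump(chromosome_type, fitness_names, invert_flags, populations):
    """Serialize nested lists into GA dump text.

    populations[p][g][c] is a list of individuals; each individual is a
    tuple (fitness_values, rank, p0, p1, genes). This layout is an
    assumption made for this sketch.
    """
    lines = [str(chromosome_type)]
    header = [len(fitness_names), *fitness_names, *invert_flags]
    lines.append(" ".join(map(str, header)))
    lines.append(str(len(populations)))                  # <numpop>
    for generations in populations:
        lines.append(str(len(generations)))              # <numgen>
        for clusters in generations:
            lines.append(str(len(clusters)))             # <numcluster>
            for individuals in clusters:
                lines.append(str(len(individuals)))      # <numind>
                for fitness, rank, p0, p1, genes in individuals:
                    fields = [*fitness, rank, p0, p1, len(genes), *genes]
                    lines.append(" ".join(map(str, fields)))
    return "\n".join(lines) + "\n"


# The two generations of the example, each holding one cluster of 3 individuals.
gen0 = [[([0.4], 0, -1, -1, [0.1, 0.2]),
         ([0.25], 1, -1, -1, [0.12, 0.3]),
         ([0.1], 2, -1, -1, [0.01, 0.3])]]
gen1 = [[([0.5], 0, 0, 1, [0.12, 0.21]),
         ([0.3], 1, 1, 2, [0.22, 0.13]),
         ([0.1], 2, 0, 2, [0.31, 0.23])]]

text = format_ga_dump(0, ["Quality"], [1], [[gen0, gen1]])
```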