Termination of a BIT_FED run - martaenciso/BIT_FED GitHub Wiki

A major problem with most current swarm techniques is the difficulty of detecting when sufficient sampling has been achieved. BIT_FED uses a fully automatic procedure to decide this, which reduces the subjectivity of the method.

How BIT_FED works

A BIT_FED run may stop for one of two reasons: (a) all the runs in the current epoch have failed, or (b) there are no more relevant configurations to spawn from.

The first cause is usually due to issues in the computer cluster, which depend on the particular architecture and setup of the machines used. The second, in contrast, is a core part of the BIT_FED algorithm and is controlled by the SelectSnapshots.f90 program.

To explain how BIT_FED works, we will use an example with cas=1 (selecting snapshots from the N most populated bins) and forever=1 (eliminating bins that have already been launched from). You can learn more about these options here.

1. For each epoch, BIT_FED randomly selects snapshots from bins that (i) have not been spawned from before (i.e. each bin can be launched from only once if forever=1) and (ii) are among the N most populated bins (to meet the population criterion of cas=1).
2. In the first epochs, step 1 is carried out without further issues, and the files visited.txt and launched.txt are updated each time. As the BIT_FED run continues, there may be no snapshots left that meet these criteria (for instance, if all the runs end in a local minimum that has already been explored). In this case, the new snapshots are selected from those present in visited.txt and not in launched.txt (i.e. BIT_FED recovers previously reached states that have not yet been spawned from).
3. When the situation in step 2 occurs but there are no such snapshots left in visited.txt, SelectSnapshots prints the flag "STOP: Sampling is now complete". This flag is read by BIT_FED.sh, which terminates the simulation.
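
The selection and stopping logic described above can be illustrated with a small Python sketch. This is not the code of SelectSnapshots.f90 or BIT_FED.sh; it is a simplified model under stated assumptions. The function name `select_bins`, its parameters, and the use of in-memory sets of bin ids in place of visited.txt and launched.txt are all hypothetical, and the real program selects snapshots rather than bin labels. Only the flag text is taken from the description above.

```python
import random

# Flag text as quoted in this page; everything else below is illustrative.
STOP_FLAG = "STOP: Sampling is now complete"


def select_bins(populations, visited, launched, n_top, n_spawn, rng):
    """Pick bins to spawn from for one epoch (cas=1, forever=1 behaviour).

    populations: dict mapping bin id -> population (hypothetical input)
    visited:     set of bin ids reached so far (stand-in for visited.txt)
    launched:    set of bin ids already spawned from (stand-in for launched.txt)
    Returns (chosen_bins, stop), where stop is True when nothing is left.
    """
    # Every populated bin counts as visited.
    visited.update(populations)

    # Primary rule: the N most populated bins that were never launched from.
    top = sorted(populations, key=populations.get, reverse=True)[:n_top]
    candidates = [b for b in top if b not in launched]

    # Fallback: previously reached bins that were never launched from.
    if not candidates:
        candidates = sorted(visited - launched)

    # Termination: no bin is left to spawn from.
    if not candidates:
        return [], True

    chosen = rng.sample(candidates, min(n_spawn, len(candidates)))
    launched.update(chosen)
    return chosen, False


def run_demo():
    """Drive epochs until the stop condition, mimicking the outer script."""
    populations = {"A": 10, "B": 5, "C": 1}
    visited, launched = set(), set()
    rng = random.Random(0)
    history = []
    while True:
        chosen, stop = select_bins(populations, visited, launched,
                                   n_top=2, n_spawn=2, rng=rng)
        if stop:
            print(STOP_FLAG)  # the outer loop reacts to this flag and ends
            break
        history.append(sorted(chosen))
    return history


if __name__ == "__main__":
    run_demo()
```

In this toy run the first epoch launches from the two most populated bins, the second epoch falls back to the remaining visited bin, and the third finds nothing left and emits the stop flag. Keeping the histogram fixed between epochs is a simplification; in a real run new trajectories would change the bin populations after every epoch.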