2020 09 09 Threshold offset plots - WojciechMigda/TruRL GitHub Wiki

Threshold offset plot for `max_episode_steps=500`

Run parameters:

Episodes: 1000
max_episode_steps: 500
KBinsDiscretizer({
    {34, -4.800000, 4.800000},
    {34, -2.600000, 2.600000},
    {34, -0.418000, 0.418000},
    {34, -3.000000, 3.000000},})
TsetliniClassifierBitwise({
        "threshold": 32000,
        "s": 4.000000,
        "number_of_regressor_clauses": 3200,
        "number_of_states": 127,
        "boost_true_positive_feedback": 1,
        "random_state": 1,
        "n_jobs": 6,
        "clause_output_tile_size": 16,
        "weighted": true,
        "max_weight": 2147483647,
        "verbose": false
    })
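As a reading aid, the discretizer settings above can be sketched as follows. This assumes each triple means (number of bins, lower bound, upper bound) and that the bins are of uniform width with out-of-range values clipped. The project's actual KBinsDiscretizer may behave differently, and the `BINS` and `discretize` names are hypothetical.

```python
# (n_bins, low, high) triples copied from the run parameters.
# Assumption: uniform-width bins with clipping; the project's real
# KBinsDiscretizer may use a different scheme.
BINS = [
    (34, -4.8, 4.8),
    (34, -2.6, 2.6),
    (34, -0.418, 0.418),
    (34, -3.0, 3.0),
]


def discretize(observation):
    """Map each feature to a bin index in [0, n_bins - 1], clipping outliers."""
    indices = []
    for value, (n_bins, low, high) in zip(observation, BINS):
        # Clamp the value into the configured range before binning.
        clipped = min(max(value, low), high)
        idx = int((clipped - low) / (high - low) * n_bins)
        # A value exactly at the upper bound would land in bin n_bins; fold it back.
        indices.append(min(idx, n_bins - 1))
    return indices
```

Under these assumptions every feature is reduced to one of 34 integer levels, so an observation becomes four small integers before it is encoded for the classifier.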

Four Threshold offset values were examined: 0, 400, 800, and 1600.

[Figure: Plot] [Figure: AUC Plot]

Conclusions:

With the number of episode steps increased 2.5x (500 vs. 200) but the Threshold value kept at 32000, the model learns very slowly.
Using Threshold offsets boosts initial learning speed, but the models then lose performance and even fall behind the non-offset model.
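One way to check whether, and when, the non-offset model catches up is to compare cumulative rewards per episode, as in the AUC plot. The sketch below is an illustration only: the `crossover_episode` helper is hypothetical, it assumes the same column layout as `steps_500_Toff.csv`, and the demo frame is synthetic rather than experiment data.

```python
import numpy as np
import pandas as pd


def crossover_episode(df, baseline="t0"):
    """For each non-baseline column, return the first 1-based episode at which
    the baseline's cumulative reward exceeds that column's, or None if never."""
    cum = df.cumsum(axis=0)
    result = {}
    for col in df.columns:
        if col == baseline:
            continue
        ahead = (cum[baseline] > cum[col]).to_numpy()
        # argmax returns the index of the first True when at least one exists.
        result[col] = int(np.argmax(ahead)) + 1 if ahead.any() else None
    return result


# Synthetic illustration only, not the experiment's data.
demo = pd.DataFrame({
    "t0": [1.0, 1.0, 5.0, 5.0],
    "t400": [3.0, 3.0, 1.0, 1.0],
})
print(crossover_episode(demo))
```

On the real data the frame would be loaded with `pd.read_csv('steps_500_Toff.csv', header=None, names=['t0', 't400', 't800', 't1600'])`, as in the plotting scripts.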

Plotting scripts

steps_200_Toff.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

import matplotlib.pyplot as plt
import pandas as pd
import plac


def main():
    # One row per episode, one column per Threshold offset (rewards averaged over runs)
    df = pd.read_csv('steps_500_Toff.csv', header=None, names=['t0', 't400', 't800', 't1600'])

    plt.figure()

    lw = 2

    plt.plot(df.index + 1, df['t0'], lw=lw, color='orange', alpha=0.7, label='t=0')
    plt.plot(df.index + 1, df['t400'], lw=lw, color='red', alpha=0.7, label='t=400')
    plt.plot(df.index + 1, df['t800'], lw=lw, color='purple', alpha=0.7, label='t=800')
    plt.plot(df.index + 1, df['t1600'], lw=lw, color='navy', alpha=0.7, label='t=1600')

    plt.xlabel("Episode")
    plt.ylabel("Avg. reward")

    plt.xlim(1, 1000)
    plt.ylim(0, 500)

    plt.title("Averaged 10x runs, 500 steps\n[3200 clauses, T=32000, s=4.0]")
    plt.legend(loc='upper left')

    plt.show()
    return 0


if __name__ == '__main__':
    plac.call(main)

steps_200_Toff_AUC.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

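import matplotlib.pyplot as plt
import pandas as pd
import plac


def main():
    # One row per episode, one column per Threshold offset (rewards averaged over runs)
    df = pd.read_csv('steps_500_Toff.csv', header=None, names=['t0', 't400', 't800', 't1600'])

    # Running sum over episodes: cumulative reward, i.e. area under the reward curve
    df = df.cumsum(axis=0)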

    plt.figure()

    lw = 2

    plt.plot(df.index + 1, df['t0'], lw=lw, color='orange', alpha=0.7, label='t=0')
    plt.plot(df.index + 1, df['t400'], lw=lw, color='red', alpha=0.7, label='t=400')
    plt.plot(df.index + 1, df['t800'], lw=lw, color='purple', alpha=0.7, label='t=800')
    plt.plot(df.index + 1, df['t1600'], lw=lw, color='navy', alpha=0.7, label='t=1600')

    plt.xlabel("Episode")
    plt.ylabel("Avg. cumulative reward")

    plt.xlim(1, 1000)
    plt.ylim(0, 180000)

    plt.title("Averaged 10x runs, 500 steps\n[3200 clauses, T=32000, s=4.0]")
    plt.legend(loc='lower right')

    plt.show()
    return 0


if __name__ == '__main__':
    plac.call(main)

Data

Location: /experiments/2020-09-09_step500_T32k_Toff

Logs were created by running the /experiments/run.sh script (invocation parameters are hardcoded inside). The logs were transformed into a CSV file with averaged runs by executing /experiments/run_csv.py:

../run_csv.py steps_500_Toff.csv run_test2.log run_test3.log run_test4.log run_test5.log

Commit