Configure your experiment - PMatthaei/ma-league GitHub Wiki
League Parameters
Variable | Values | Effect |
---|---|---|
team_size | int | Defines the size of a team within the league. Only teams of the same size are currently supported. |
league_play_time_mins | int | Defines how long each league match runs, in minutes. |
league_runtime_hours | int | Defines how long the league runs, in hours. |
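
As a rough illustration of how the two time parameters relate, the sketch below converts them to seconds and estimates how many matches fit into one league run. `LeagueConfig` and its methods are hypothetical names, not part of ma-league, and the estimate assumes matches run strictly one after another, which the league may not do.

```python
from dataclasses import dataclass


@dataclass
class LeagueConfig:
    """Hypothetical container for the three league parameters."""

    team_size: int
    league_play_time_mins: int
    league_runtime_hours: int

    def match_budget_seconds(self) -> int:
        # Time budget of a single league match, in seconds.
        return self.league_play_time_mins * 60

    def league_budget_seconds(self) -> int:
        # Time budget of the whole league run, in seconds.
        return self.league_runtime_hours * 3600

    def max_sequential_matches(self) -> int:
        # Assumption: matches are played back to back without overhead.
        return self.league_budget_seconds() // self.match_budget_seconds()


cfg = LeagueConfig(team_size=3, league_play_time_mins=30, league_runtime_hours=2)
print(cfg.match_budget_seconds(), cfg.max_sequential_matches())
```
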
Environment parameters
Variable | Values | Effect |
---|---|---|
play_mode | normal/self/league | |
headless | True/False | |
record | True/False | |
fps | int | |
draw_grid | True/False | |
infos | True/False | |
global_reward | True/False | |
grid_size | True/False | |
match_build_plan | medium | |
Experiment parameters
Variable | Values | Effect |
---|---|---|
save_model | True/False | Defines whether checkpoints of learners are saved |
save_model_interval | int | Defines the timestep interval at which checkpoints of learners are saved |
--config | qmix | Defines the deployed algorithm |
runner | episode/parallel | Runs one (episode) or multiple (parallel) environments for an episode |
batch_size | int | Number of episodes to train on |
batch_size_run | int | Number of environments to run in parallel |
test_nepisode | int | Number of episodes to test for |
test_interval | int | Test after X timesteps have passed |
test_greedy | True/False | Use greedy evaluation (if False, will set epsilon floor to 0) |
log_interval | int | Log summary of stats after every X timesteps |
runner_log_interval | int | Log runner stats (not test stats) every X timesteps |
learner_log_interval | int | Log training stats every X timesteps |
t_max | int | Stop running after X timesteps |
use_cuda | True/False | Use the GPU by default unless it is not available |
buffer_cpu_only | True/False | If True, the replay buffer is not kept entirely in VRAM |
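
The interval parameters above (`test_interval`, `save_model_interval`, the log intervals) all follow the same "every X timesteps" pattern, bounded by `t_max`. The sketch below shows one way such a check could work. `IntervalTrigger`, `simulate` and `steps_per_episode` are hypothetical and only illustrate the idea; the actual run loop may count timesteps differently.

```python
class IntervalTrigger:
    """Fires once at least `interval` timesteps have passed since it last fired."""

    def __init__(self, interval: int) -> None:
        if interval <= 0:
            raise ValueError("interval must be positive")
        self.interval = interval
        self.last_fired_t = 0

    def should_fire(self, t_env: int) -> bool:
        if t_env - self.last_fired_t >= self.interval:
            self.last_fired_t = t_env
            return True
        return False


def simulate(t_max, steps_per_episode, test_interval, save_model_interval):
    """Advance a fake timestep counter and record when tests and saves happen."""
    test_trigger = IntervalTrigger(test_interval)
    save_trigger = IntervalTrigger(save_model_interval)
    tested_at, saved_at = [], []
    t_env = 0
    while t_env < t_max:  # t_max: stop running after this many timesteps
        t_env += steps_per_episode  # assumption: every episode has a fixed length
        if test_trigger.should_fire(t_env):
            tested_at.append(t_env)
        if save_trigger.should_fire(t_env):
            saved_at.append(t_env)
    return tested_at, saved_at


tested_at, saved_at = simulate(
    t_max=100, steps_per_episode=15, test_interval=30, save_model_interval=50
)
print(tested_at, saved_at)
```

Because the counter only advances in whole episodes, an event fires at the first episode boundary after its interval has elapsed, not at the exact timestep.
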
RL Hyperparameters
Variable | Default | Effect |
---|---|---|
gamma | 0.99 | Discount factor |
batch_size | 32 | Number of episodes to train on |
buffer_size | 32 | Size of the replay buffer |
lr | 0.0005 | Learning rate for agents |
critic_lr | 0.0005 | Learning rate for critics |
optim_alpha | 0.99 | RMSProp alpha |
optim_eps | 0.00001 | RMSProp epsilon |
grad_norm_clip | 10 | Reduce magnitude of gradients above this L2 norm |
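
To make the roles of `lr`, `optim_alpha`, `optim_eps` and `grad_norm_clip` concrete, here is a minimal pure-Python sketch of gradient clipping by L2 norm followed by one textbook RMSProp step. The function names are hypothetical, and the learners in the repository presumably rely on a deep learning framework's optimizer instead, whose exact update may differ in detail.

```python
import math


def clip_by_l2_norm(grads, max_norm):
    """Rescale the gradient vector if its L2 norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


def rmsprop_step(params, grads, square_avg, lr=0.0005, alpha=0.99, eps=0.00001):
    """One RMSProp update; returns new parameters and new running averages."""
    new_params, new_square_avg = [], []
    for p, g, v in zip(params, grads, square_avg):
        # alpha controls how slowly the running average of g^2 changes.
        v = alpha * v + (1.0 - alpha) * g * g
        # eps keeps the division stable when the running average is tiny.
        new_params.append(p - lr * g / (math.sqrt(v) + eps))
        new_square_avg.append(v)
    return new_params, new_square_avg


grads = [30.0, 40.0]  # L2 norm is 50, above grad_norm_clip = 10
clipped = clip_by_l2_norm(grads, max_norm=10)
params, square_avg = rmsprop_step([1.0, 1.0], clipped, [0.0, 0.0])
print(clipped, params)
```
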
Agent parameters
Variable | Default | Effect |
---|---|---|
agent | rnn | Default rnn agent |
rnn_hidden_dim | 64 | Size of hidden state for default rnn agent |
obs_agent_id | True | Include the agent's one_hot id in the observation |
obs_last_action | True | Include the agent's last action (one_hot) in the observation |
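
The effect of `obs_agent_id` and `obs_last_action` on the agent's input can be sketched as follows. `one_hot` and `build_agent_input` are hypothetical helpers, and the order in which the extra features are appended is an assumption, not something the table specifies.

```python
def one_hot(index, size):
    """Return a list of length `size` with a 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_agent_input(obs, agent_id, n_agents, last_action, n_actions,
                      obs_agent_id=True, obs_last_action=True):
    """Extend the raw observation with the optional one-hot features."""
    features = list(obs)
    if obs_last_action:
        features += one_hot(last_action, n_actions)
    if obs_agent_id:
        features += one_hot(agent_id, n_agents)
    return features


x = build_agent_input(obs=[0.5, -1.0], agent_id=1, n_agents=3,
                      last_action=2, n_actions=4)
print(len(x), x)
```

With both flags enabled, the input length grows from the raw observation size by `n_actions + n_agents`, which is the size the agent network's first layer would have to accept.
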