Configure your experiment - PMatthaei/ma-league GitHub Wiki

League Parameters

Variable	Values	Effect
`team_size`	`int`	Defines the size of each team within the league. Only teams of the same size are currently supported.
`league_play_time_mins`	`int`	Defines how long each league match runs, in minutes.
`league_runtime_hours`	`int`	Defines how long the league as a whole runs, in hours.
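
To illustrate how the two time parameters relate, here is a minimal sketch. The values are made up, and the helper `max_sequential_matches` is hypothetical. It assumes matches run strictly one after another, which may not hold if the league runs matches in parallel.

```python
# Hypothetical values; only the key names come from the table above.
league_config = {
    "team_size": 3,
    "league_play_time_mins": 30,
    "league_runtime_hours": 2,
}


def max_sequential_matches(config):
    """Upper bound on full-length matches, ASSUMING they run back to back."""
    total_minutes = config["league_runtime_hours"] * 60
    return total_minutes // config["league_play_time_mins"]


matches = max_sequential_matches(league_config)
print(matches)
```

With these values, the league runtime of 120 minutes leaves room for four 30-minute matches under the sequential assumption.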

Environment parameters

Variable	Values	Effect
`play_mode`	`normal/self/league`
`headless`	`True/False`
`record`	`True/False`
`fps`	`int`
`draw_grid`	`True/False`
`infos`	`True/False`
`global_reward`	`True/False`
`grid_size`	`True/False`
`match_build_plan`	`medium`

Experiment parameters

Variable	Values	Effect
`save_model`	`True/False`	Defines whether checkpoints of learners are saved
`save_model_interval`	`int`	Defines the timestep interval at which checkpoints of learners are saved
`--config`	`qmix`	Defines the deployed algorithm
`runner`	`episode/parallel`	Runs one/multiple env(s) for an episode
`batch_size`	`int`	Number of episodes to train on
`batch_size_run`	`int`	Number of environments to run in parallel
`test_nepisode`	`int`	Number of episodes to test for
`test_interval`	`int`	Test after X timesteps have passed
`test_greedy`	`True/False`	Use greedy evaluation (if False, will set epsilon floor to 0)
`log_interval`	`int`	Log summary of stats after every X timesteps
`runner_log_interval`	`int`	Log runner stats (not test stats) every X timesteps
`learner_log_interval`	`int`	Log training stats every X timesteps
`t_max`	`int`	Stop running after X timesteps
`use_cuda`	`True/False`	Use the GPU by default unless it isn't available
`buffer_cpu_only`	`True/False`	If True, we won't keep all of the replay buffer in VRAM
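
The interval parameters (`test_interval`, `log_interval`) and `t_max` are all counted in environment timesteps. The following sketch shows one plausible way such intervals could be checked inside a training loop. The helper `interval_elapsed`, the fixed episode length, and all values are assumptions for illustration, not the project's actual loop.

```python
def interval_elapsed(t_env, last_t, interval):
    """True once at least `interval` timesteps have passed since `last_t`."""
    return t_env - last_t >= interval


# Hypothetical values; only the key names come from the table above.
config = {"t_max": 100, "test_interval": 30, "log_interval": 50}

episode_length = 10  # assumed fixed episode length, for illustration only
t_env = 0
last_test_t = 0
last_log_t = 0
test_points = []
log_points = []

while t_env < config["t_max"]:
    t_env += episode_length  # stands in for running one episode
    if interval_elapsed(t_env, last_test_t, config["test_interval"]):
        test_points.append(t_env)  # a test run would be triggered here
        last_test_t = t_env
    if interval_elapsed(t_env, last_log_t, config["log_interval"]):
        log_points.append(t_env)  # a stats summary would be logged here
        last_log_t = t_env

print(test_points, log_points)
```

Because checks only happen between episodes, an interval that is not a multiple of the episode length fires at the first episode boundary after it has elapsed.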

RL Hyperparameters

Variable	Default	Effect
`gamma`	`0.99`	Discount factor for future rewards
`batch_size`	`32`	Number of episodes to train on
`buffer_size`	`32`	Size of the replay buffer
`lr`	`0.0005`	Learning rate for agents
`critic_lr`	`0.0005`	Learning rate for critics
`optim_alpha`	`0.99`	RMSProp alpha
`optim_eps`	`0.00001`	RMSProp epsilon
`grad_norm_clip`	`10`	Reduce magnitude of gradients above this L2 norm
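
As an illustration of what `grad_norm_clip` describes, the sketch below rescales a gradient vector whenever its L2 norm exceeds the threshold. It is a plain-Python stand-in with a hypothetical helper name, `clip_by_l2_norm`. A real learner would most likely rely on its deep learning framework's clipping utility instead.

```python
import math


def clip_by_l2_norm(grads, max_norm):
    """Rescale `grads` so their L2 norm is at most `max_norm`.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads), total_norm
    # Shrink every component by the same factor so the direction is preserved.
    return [g * max_norm / total_norm for g in grads], total_norm


# Example gradient with L2 norm 50, clipped with the default threshold of 10.
clipped, norm_before = clip_by_l2_norm([30.0, 40.0], max_norm=10)
print(norm_before, clipped)
```

Gradients whose norm is already at or below the threshold pass through unchanged.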

Agent parameters

Variable	Default	Effect
`agent`	`rnn`	Default rnn agent
`rnn_hidden_dim`	`64`	Size of hidden state for default rnn agent
`obs_agent_id`	`True`	Include the agent's one_hot id in the observation
`obs_last_action`	`True`	Include the agent's last action (one_hot) in the observation
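
To make the effect of `obs_agent_id` and `obs_last_action` concrete, here is a minimal sketch of how an agent's input vector might be assembled. The function `build_agent_input`, the ordering of the appended features, and the example values are assumptions for illustration.

```python
def one_hot(index, size):
    """Return a list of length `size` with a 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_agent_input(obs, agent_id, n_agents, last_action, n_actions,
                      obs_agent_id=True, obs_last_action=True):
    """Append optional one-hot features to the raw observation.

    The order (last action first, then agent id) is an assumption.
    """
    features = list(obs)
    if obs_last_action:
        features += one_hot(last_action, n_actions)
    if obs_agent_id:
        features += one_hot(agent_id, n_agents)
    return features


inputs = build_agent_input(obs=[0.5, -1.0], agent_id=1, n_agents=3,
                           last_action=2, n_actions=4)
print(len(inputs), inputs)
```

With both flags enabled, the input size grows from the raw observation size by `n_actions + n_agents`, which is the size the agent network's first layer would need to accept.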