Instructions - mateuslevisf/xrl-pucrio GitHub Wiki

Installation

Create the environment from the conda_env.yml file:

conda env create -f conda_env.yml

To activate the resulting environment:

conda activate xrlpucrio

Execution

Running through the CLI

To run with default configuration:

python xrlpucrio.py

Add the -h option to the above command to see the available running options. Note that results are saved in the "results" folder as the program runs, but that folder is completely erased at the start of each run. To keep the results of a run, move them elsewhere once execution finishes.
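Since the "results" folder is wiped at the start of every run, it can help to copy it somewhere safe after each execution. The following is a minimal sketch, not part of the project: the function name backup_results and the "results_archive" destination folder are assumptions chosen for illustration.

```python
import shutil
import time
from pathlib import Path


def backup_results(results_dir="results", archive_root="results_archive"):
    """Copy results_dir into a new timestamped folder under archive_root.

    Returns the destination path, or None if there is nothing to copy.
    Both default folder names other than "results" are hypothetical.
    """
    src = Path(results_dir)
    if not src.is_dir():
        return None
    # A timestamped folder name keeps the copies of different runs apart.
    dest = Path(archive_root) / time.strftime("run_%Y%m%d_%H%M%S")
    # copytree creates archive_root if it does not exist yet.
    shutil.copytree(src, dest)
    return dest
```

Calling backup_results() from the project root after a run finishes leaves the original "results" folder in place and returns the path of the copy.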

Example of more detailed configurations:

python xrlpucrio.py -t viper -e blackjack

The above command runs the VIPER technique on the Blackjack environment.

python xrlpucrio.py -t hvalues -e cartpole -n 100000

The above command runs the H-Values/Belief Map technique on the Cart Pole environment for a total of 100,000 episodes.

Using file inputs

A file input gives the user more control over the input parameters and makes the project easier to run, since complex inputs with multiple parameters can be difficult to write through the CLI. To use a file, run the following command:

python xrlpucrio.py -f viper_args.json

Both viper_args.json and h_values_args.json serve as templates that the user can edit. They show the full range of parameters that the user can alter.
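One way to work with the templates without editing them in place is to derive a new arguments file from one of them. The sketch below is only an illustration: make_args_file is a hypothetical helper, and it assumes the template is a JSON object mapping parameter names to values. It refuses any parameter name that is not already in the template, so typos are caught before the run starts.

```python
import json
from pathlib import Path


def make_args_file(template_path, output_path, overrides):
    """Write a copy of a JSON template with some top-level values replaced.

    Assumption: the template is a JSON object (parameter name -> value).
    """
    params = json.loads(Path(template_path).read_text())
    # Only parameters that already exist in the template may be overridden.
    unknown = sorted(set(overrides) - set(params))
    if unknown:
        raise KeyError(f"Not in template: {unknown}")
    params.update(overrides)
    Path(output_path).write_text(json.dumps(params, indent=2))
    return params
```

The file written to output_path can then be passed to the program with the -f option, in the same way as viper_args.json above.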

Testing

To run all tests:

python -m unittest discover -v

The "main" tests (in files "test_run_hvalues.py" and "test_run_viper.py") take quite a while to run (around 10 minutes).