Pool of materials on RL
Please add RL materials here that you find interesting and intend to present during the reading circle.
Theory
Subdivision into pieces:
- Intro pp. 13-22 (the bandit chapter can be skipped because its problem setting differs from that of the other chapters)
- Finite Markov Decision Process (1) pp. 47-57
- Finite Markov Decision Process (2) pp. 58-69
- Dynamic Programming pp. 73-89
- Monte Carlo Method (1) pp. 91-103
- Monte Carlo Method (2) pp. 103-116 (can skip 5.8, 5.9)
- Temporal-Difference Learning (1) pp. 119-129
- Temporal-Difference Learning (2) pp. 129-138
- n-step Bootstrapping pp. 141-157 (can skip 7.4 and 7.6)
- Planning and Learning with Tabular Methods (1) pp. 159-174
- Planning and Learning with Tabular Methods (2) pp. 177-188
- On-policy Prediction with Approximation (1) pp. 197-209
- On-policy Prediction with Approximation (2) pp. 210-222
- On-policy Prediction with Approximation (3) pp. 222-236
- On-policy Control with Approximation pp. 243-256
- Eligibility Trace (1) pp. 288-301
- Eligibility Trace (2) pp. 303-318 (can skip 12.6, 12.9)
- Policy Gradient Methods pp. 321-337
Practice
- Q Learning Tutorial https://pytorch.org/tutorials/intermediate/reinforcement_q_learning.html
- Super Mario Tutorial https://pytorch.org/tutorials/intermediate/mario_rl_tutorial.html#lets-play
- Installation of Super Mario https://pypi.org/project/gym-super-mario-bros/