Pool of materials - rl-reading-circle/board GitHub Wiki

Pool of materials on RL

Please add RL materials here that you find interesting and intend to present during the reading circle.

Subdivision into pieces:

Intro pp. 13-22 (the bandit chapter can be skipped because its problem setting differs from that of every other chapter)
Finite Markov Decision Process (1) pp. 47-57
Finite Markov Decision Process (2) pp. 58-69
Dynamic Programming pp. 73-89
Monte Carlo Methods (1) pp. 91-103
Monte Carlo Methods (2) pp. 103-116 (can skip 5.8, 5.9)
Temporal-Difference Learning (1) pp. 119-129
Temporal-Difference Learning (2) pp. 129-138
n-step Bootstrapping pp. 141-157 (can skip 7.4 and 7.6)
Planning and Learning with Tabular Methods (1) pp. 159-174
Planning and Learning with Tabular Methods (2) pp. 177-188
On-policy Prediction with Approximation (1) pp. 197-209
On-policy Prediction with Approximation (2) pp. 210-222
On-policy Prediction with Approximation (3) pp. 222-236
On-policy Control with Approximation pp. 243-256
Eligibility Traces (1) pp. 288-301
Eligibility Traces (2) pp. 303-318 (can skip 12.6, 12.9)
Policy Gradient Methods pp. 321-337