June 18 Notes and New Reward Function - opendigital/RL-collective-action GitHub Wiki
Nash Equilibrium
Definitions, over 20 rounds:
- E := the total amount everyone else contributes
- i := the total amount you contribute

Payoff components:
- cooperating amount (your share of the pool): 1.6 * (E + i) / 4 = 0.4(E + i)
- competing amount (what you keep): 25 * 20 - i = 500 - i

Total payoff:
0.4(E + i) + 500 - i = 500 - 0.6i + 0.4E
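Compute the best value of i for every single value of E. Because the payoff falls by 0.6 for each unit of i regardless of E, the best response is i = 0 for every E.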
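
The best-response computation can be sketched by brute force. This is a minimal sketch under assumptions not stated in the notes: four players (read off the "/ 4"), whole-number contributions, and E ranging up to what three other players could contribute in total. The names `payoff` and `best_response` are hypothetical.

```python
# Constants read off the notes above.
MULTIPLIER = 1.6
N_PLAYERS = 4                 # assumed from the "/ 4" in the cooperating amount
ENDOWMENT = 25                # per round
ROUNDS = 20
BUDGET = ENDOWMENT * ROUNDS   # 500, the most one player can contribute


def payoff(E, i):
    """Total payoff: share of the multiplied pool plus whatever was kept."""
    return MULTIPLIER * (E + i) / N_PLAYERS + BUDGET - i


def best_response(E):
    """Contribution i in 0..BUDGET that maximizes payoff for a given E."""
    return max(range(BUDGET + 1), key=lambda i: payoff(E, i))


# Assumption: E is the total from the other three players, so it tops out at 1500.
max_E = (N_PLAYERS - 1) * BUDGET
best = {E: best_response(E) for E in range(0, max_E + 1, ENDOWMENT)}

print(set(best.values()))
```

Every E maps to the same best response, which matches the -0.6 coefficient on i in the payoff: contributing nothing is the dominant strategy.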
New Reward Function
- Minimize regret
Regret = |(average change in contribution from all other agents between s and s') - (your change in contribution between s and s')|
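
The regret term above could be computed roughly as follows. This is a sketch, not the project's implementation: it assumes each state is a list with one contribution per agent, that "change" means the contribution in s' minus the contribution in s, and that the reward is simply the negative of the regret. The names `regret` and `reward` are hypothetical.

```python
def regret(prev, curr, agent):
    """|average change of the other agents - this agent's own change|.

    prev, curr: contributions per agent in states s and s' (assumed layout).
    agent: index of the agent being scored.
    """
    other_changes = [
        c - p
        for k, (p, c) in enumerate(zip(prev, curr))
        if k != agent
    ]
    avg_other_change = sum(other_changes) / len(other_changes)
    own_change = curr[agent] - prev[agent]
    return abs(avg_other_change - own_change)


def reward(prev, curr, agent):
    """Assumed reward: minimizing regret is the same as maximizing -regret."""
    return -regret(prev, curr, agent)


# Others raise their contributions by 2, 4 and 6; agent 3 does not move.
print(regret([10, 10, 10, 10], [12, 14, 16, 10], agent=3))
```

Under this definition an agent has zero regret when it moves its contribution by exactly the average amount the others moved theirs.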