June 18 Notes and New Reward Function - opendigital/RL-collective-action GitHub Wiki

Nash Equilibrium

E := the total amount everyone else contributes over 20 rounds

i := the total amount you contribute over 20 rounds

cooperating amount: 1.6 * (E + i) / 4 = 0.4(E + i)

competing amount: 25 * 20 - i = 500 - i

total payoff: 0.4(E + i) + 500 - i = 500 - 0.6i + 0.4E

Compute the best value of i (the best response) for each value of E.
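A minimal sketch of that computation, using only the numbers in the formulas above (multiplier 1.6, divisor 4, 25 per round, 20 rounds). It assumes the divisor 4 is the number of players, that contributions are whole numbers, and that E comes from three other players who can each give up to 500. The names `payoff` and `best_response` are illustrative, not from the project code. Because the payoff 500 - 0.6i + 0.4E falls by 0.6 for every unit of i, the search should return i = 0 for every E.

```python
MULTIPLIER = 1.6          # public-pot multiplier from the notes
N_PLAYERS = 4             # assumption: the "/ 4" is the number of players
ENDOWMENT_PER_ROUND = 25
ROUNDS = 20
MAX_CONTRIBUTION = ENDOWMENT_PER_ROUND * ROUNDS  # 500


def payoff(i, E):
    """Total payoff over all rounds: share of the pot plus what you kept."""
    cooperating = MULTIPLIER * (E + i) / N_PLAYERS
    competing = MAX_CONTRIBUTION - i
    return cooperating + competing


def best_response(E):
    """Brute-force the contribution i in 0..500 that maximizes payoff for a given E."""
    return max(range(MAX_CONTRIBUTION + 1), key=lambda i: payoff(i, E))


if __name__ == "__main__":
    # Assumption: E ranges over 0..1500 (three other players, up to 500 each).
    max_others = (N_PLAYERS - 1) * MAX_CONTRIBUTION
    for E in range(0, max_others + 1, 250):
        print(E, best_response(E), payoff(best_response(E), E))
```

The brute-force search is kept deliberately so the same function still works if the payoff is later changed to something that is not linear in i.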

New Reward Function

  • Minimize regret

regret = |(average increase in contribution from all other agents between s' and s) - (your change in contribution between s' and s)|
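One way this could look in code, as a rough sketch rather than the project's implementation. It assumes s is the earlier state and s' the later one, that each state exposes every agent's contribution, and that the reward is simply the negative of the regret so that maximizing reward minimizes regret. The function and parameter names are hypothetical.

```python
def regret(others_prev, others_next, own_prev, own_next):
    """|average increase in the other agents' contributions - your own change|.

    others_prev / others_next: the other agents' contributions in s and s'.
    own_prev / own_next: your contribution in s and s'.
    """
    if not others_prev or len(others_prev) != len(others_next):
        raise ValueError("need matching, non-empty contribution lists")
    increases = [nxt - prev for prev, nxt in zip(others_prev, others_next)]
    avg_increase = sum(increases) / len(increases)
    own_change = own_next - own_prev
    return abs(avg_increase - own_change)


def reward(others_prev, others_next, own_prev, own_next):
    """Assumption: reward is the negated regret, so lower regret scores higher."""
    return -regret(others_prev, others_next, own_prev, own_next)
```

For example, if the others move from [10, 5, 0] to [12, 8, 1], their average increase is 2. An agent that also raises its contribution by 2 has zero regret, while one that keeps its contribution unchanged has a regret of 2.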