Monte Carlo Control - HiIAmTzeKean/SC3000-Artificial-Intelligence GitHub Wiki


tags:

  • 🌱
  • AI
  • ComputerScience

date: 24--Apr--2023

Monte Carlo Control

Idea

Pseudocode

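Repeat for T iterations
    // Episode generation
    Loop
        Generate an episode by following the current policy pi
    End
    // Policy evaluation
    For each pair (s,a)
        For each episode in which (s,a) occurs
            G <- return G_t following a visit to (s,a)
            Append G to Returns(s,a)
        End
        Q(s,a) <- Average(Returns(s,a))
    End
    // Policy improvement
    For each state s
        a* <- arg max_a Q(s,a)
        Update pi(s) to be epsilon-soft:
            pi(a*|s) <- 1 - epsilon + epsilon/|A(s)|
            pi(a|s)  <- epsilon/|A(s)| for every other action a
    End
End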
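
The loop above can be sketched in Python. This is a minimal illustration under stated assumptions, not a reference implementation: the corridor environment (`step`, `START`, `LEFT_END`, `RIGHT_END`), the helper names, and the values of `epsilon`, `gamma`, and the episode counts are all hypothetical and do not come from these notes. It also assumes first-visit returns and uses epsilon-greedy as the epsilon-soft policy, neither of which the pseudocode specifies.

```python
import random
from collections import defaultdict

# Hypothetical toy environment (an assumption for illustration): a corridor of
# states 0..4. Episodes start in state 2 and end on reaching either end.
# Reaching state 4 pays +1; every other transition pays 0.
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
START, LEFT_END, RIGHT_END = 2, 0, 4


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = state + (1 if action == 1 else -1)
    reward = 1.0 if next_state == RIGHT_END else 0.0
    done = next_state in (LEFT_END, RIGHT_END)
    return next_state, reward, done


def epsilon_greedy(Q, state, epsilon, rng):
    """Epsilon-soft action choice: explore with probability epsilon,
    otherwise act greedily on Q, breaking ties at random."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(Q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if Q[(state, a)] == best])


def generate_episode(Q, epsilon, rng, max_steps=100):
    """Roll out one episode as a list of (state, action, reward).
    Episodes are cut off after max_steps to guarantee termination."""
    episode, state = [], START
    for _ in range(max_steps):
        action = epsilon_greedy(Q, state, epsilon, rng)
        next_state, reward, done = step(state, action)
        episode.append((state, action, reward))
        state = next_state
        if done:
            break
    return episode


def mc_control(iterations=200, episodes_per_iteration=10,
               epsilon=0.1, gamma=0.9, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)        # Q(s,a), initialised to 0
    returns = defaultdict(list)   # Returns(s,a)
    for _ in range(iterations):
        episodes = [generate_episode(Q, epsilon, rng)
                    for _ in range(episodes_per_iteration)]
        # Policy evaluation: record the first-visit return of each (s,a).
        for episode in episodes:
            G, first_visit = 0.0, {}
            for state, action, reward in reversed(episode):
                G = reward + gamma * G
                # Walking backwards, the last write is the earliest visit.
                first_visit[(state, action)] = G
            for pair, g in first_visit.items():
                returns[pair].append(g)
        for pair, values in returns.items():
            Q[pair] = sum(values) / len(values)
        # Policy improvement is implicit: epsilon_greedy reads the new Q
        # when the next batch of episodes is generated.
    greedy_policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)])
                     for s in range(LEFT_END + 1, RIGHT_END)}
    return Q, greedy_policy


if __name__ == "__main__":
    Q, policy = mc_control()
    print(policy)
```

Keeping the policy implicit in `Q` avoids storing a separate table for pi; an explicit table of action probabilities would work just as well and would match the pseudocode more literally.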

Links:
