Epsilon soft policy - HiIAmTzeKean/SC3000-Artificial-Intelligence GitHub Wiki


tags:

  • 🌱
  • AI
  • ComputerScience

date: 20--Feb--2023

Epsilon soft policy

Related to Soft policy

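Unlike a greedy policy, which always selects the highest-valued action, an epsilon-soft policy gives every action at least a $p = \frac{\epsilon}{|A|}$ chance of being selected. In the common epsilon-greedy form, with $a^*$ denoting the greedy action in state $s$ and $|A|$ the number of available actions:

$$\pi(a \mid s) =
\begin{cases} 1-\epsilon+\frac{\epsilon}{|A|}, & a=a^* \\ \frac{\epsilon}{|A|}, & a\ne a^* \end{cases}$$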
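
As a minimal sketch of the probabilities above, the following Python snippet builds the action distribution for one state and samples from it. The function names `epsilon_soft_probs` and `sample_action`, and the use of a list of action-value estimates `q_values` to decide the greedy action, are assumptions made for illustration and are not part of the original note.

```python
import random


def epsilon_soft_probs(q_values, epsilon):
    """Return one probability per action for a single state.

    q_values: hypothetical list of action-value estimates for the state.
    epsilon:  exploration rate in [0, 1].
    """
    n = len(q_values)
    # Greedy action: highest estimated value (ties go to the lowest index).
    greedy = max(range(n), key=lambda a: q_values[a])
    # Every action receives the floor probability epsilon / |A| ...
    probs = [epsilon / n] * n
    # ... and the greedy action receives the remaining 1 - epsilon on top.
    probs[greedy] += 1.0 - epsilon
    return probs


def sample_action(q_values, epsilon, rng=random):
    """Draw an action index according to the epsilon-soft probabilities."""
    probs = epsilon_soft_probs(q_values, epsilon)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

For example, with three actions and $\epsilon = 0.3$, each non-greedy action gets probability $0.1$ and the greedy action gets $0.7 + 0.1 = 0.8$, so the probabilities sum to 1.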


Links:
