Reflexion: Language Agents with Verbal Reinforcement Learning, 2023

The paper "Reflexion: Language Agents with Verbal Reinforcement Learning" introduces a novel framework for reinforcing language agents through linguistic feedback rather than traditional weight updates[1][3]. Key aspects of the Reflexion framework include:

  1. Verbal Reinforcement: Agents verbally reflect on task feedback signals and maintain the resulting reflective text in an episodic memory buffer[1][3].

  2. Flexibility: Reflexion can incorporate various types of feedback (scalar values or free-form language) from different sources (external or internally simulated)[3][5].

  3. Performance Improvements: The approach achieves significant improvements over baseline agents across diverse tasks, including sequential decision-making, coding, and language reasoning[3][5].

  4. Coding Benchmark: On the HumanEval coding benchmark, Reflexion achieves 91% pass@1 accuracy, surpassing the previous state-of-the-art result of 80% set by GPT-4[3][5].

  5. Model Components: The framework uses three distinct models: an Actor (M_a) that generates text and actions, an Evaluator (M_e) that scores the Actor's outputs, and a Self-Reflection model (M_sr) that produces verbal reinforcement cues (see the sketch after this list)[3].

  6. Learning Process: Reflexion converts binary or scalar feedback from the environment into verbal feedback, which is then used as additional context for the language model agent in subsequent episodes[3].

  7. Interpretability: The approach provides a more interpretable and explicit form of episodic memory compared to traditional reinforcement learning methods[4].
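
Taken together, these components form a simple trial-evaluate-reflect loop. The Python sketch below is only an illustration of that loop, not the authors' implementation: a single placeholder `llm()` call stands in for all three models, the Evaluator is reduced to a binary PASS/FAIL judgment, and the function names and prompt wordings are assumptions made for the example.

```python
# Minimal sketch of the Reflexion loop (illustrative only, not the paper's code).
# llm() is a placeholder for any chat-completion call; the Actor, Evaluator, and
# Self-Reflection model are all realized here as prompts to the same LLM.

def llm(prompt: str) -> str:
    """Placeholder for a call to a language model."""
    raise NotImplementedError

def reflexion(task: str, max_trials: int = 4) -> str:
    memory: list[str] = []  # episodic memory buffer of verbal reflections

    attempt = ""
    for trial in range(max_trials):
        # Actor (M_a): generate an attempt, conditioned on past reflections.
        context = "\n".join(memory)
        attempt = llm(
            f"Task:\n{task}\n\nLessons from earlier attempts:\n{context}\n\nAnswer:"
        )

        # Evaluator (M_e): score the attempt; reduced here to a PASS/FAIL verdict.
        verdict = llm(
            f"Task:\n{task}\n\nAttempt:\n{attempt}\n\n"
            "Did the attempt solve the task? Answer PASS or FAIL, then explain."
        )
        if verdict.strip().upper().startswith("PASS"):
            return attempt

        # Self-Reflection (M_sr): convert the sparse failure signal into verbal
        # feedback and store it in the episodic memory for the next trial.
        reflection = llm(
            f"Task:\n{task}\n\nFailed attempt:\n{attempt}\n\n"
            f"Evaluator feedback:\n{verdict}\n\n"
            "In a few sentences, explain what went wrong and how to do better next time."
        )
        memory.append(f"Trial {trial + 1}: {reflection}")

    return attempt  # best effort after exhausting the trial budget
```

Note that in the paper the Evaluator is task-specific rather than a purely LLM-based judge; for the coding tasks, for example, it relies on self-generated unit tests.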

The authors conducted ablation and analysis studies to show how different feedback signals, feedback-incorporation methods, and agent types affect performance[3][5].

Citations: