ReAct (Reasoning + Acting) - joehubert/ai-agent-design-patterns GitHub Wiki
Classification
Intent
ReAct interleaves reasoning steps with action steps to create a structured approach for solving complex problems. It enables LLM agents to think critically about their next move, execute that move, observe the results, and refine their approach based on those observations in an iterative process.
Also Known As
- Reasoning-Action Cycles
- Think-Then-Act Pattern
- Observation-Reasoning-Action Loop
Motivation
Traditional LLM prompting often encounters challenges when dealing with complex reasoning tasks that require multiple steps and external interaction:
- LLMs may jump to conclusions without fully analyzing the problem
- They might fail to leverage external tools effectively
- Complex problems need to be broken down into manageable steps, which a single prompt does not enforce
- Without feedback loops, errors can propagate and compound
The ReAct pattern addresses these challenges by explicitly alternating between reasoning about a situation and taking concrete actions based on that reasoning. Each action yields observations that inform the next reasoning step, creating a feedback loop that leads to more reliable problem solving.
For example, when answering a complex factual question, a ReAct agent might:
- Reason: Analyze what information is needed and identify knowledge gaps
- Act: Query a search engine for relevant facts
- Observe: Process the search results
- Reason: Determine if the information is sufficient or if additional searches are needed
- Act: If needed, perform a more refined search or proceed to synthesize an answer
This structured approach helps reduce hallucination, improves accuracy, and makes the agent's decision-making process transparent.
Applicability
Use the ReAct pattern when:
- Tasks require multiple steps to complete
- External tools or information sources are available
- Problems involve complex reasoning that benefits from being broken down into smaller steps
- Transparency in the agent's decision-making process is important
- Tasks involve exploration of multiple possible solution paths
- Error recovery and self-correction capabilities are needed
- Integration with external systems is required
Structure
flowchart TD
    Start([Task Received]) --> Init[Initialize ReAct Process]
    Init --> Reason[Reason About Current State]
    Reason --> Plan[Plan Next Action]
    Plan --> Decision{Is Task Complete?}
    Decision -- Yes --> Synthesize[Synthesize Final Answer]
    Synthesize --> End([Return Response])
    Decision -- No --> Execute[Execute Planned Action]
    Execute --> Observe[Collect Observations]
    Observe --> UpdateMem[Update Memory & Context]
    UpdateMem --> Reason
    subgraph Reasoning Phase
        Reason
        Plan
        Decision
        Synthesize
    end
    subgraph Action Phase
        Execute
        Observe
        UpdateMem
    end
    classDef startNode fill:#bfb,stroke:#333,stroke-width:1px
    classDef endNode fill:#fbb,stroke:#333,stroke-width:1px
    classDef reasoningNode fill:#f9f,stroke:#333,stroke-width:1px
    classDef actionNode fill:#bbf,stroke:#333,stroke-width:1px
    classDef decisionNode fill:#ffd,stroke:#333,stroke-width:1px
    class Start,Init startNode
    class End endNode
    class Reason,Plan,Synthesize reasoningNode
    class Execute,Observe,UpdateMem actionNode
    class Decision decisionNode
Components
- Reasoning Module: Responsible for analyzing the current state, evaluating options, and planning the next action. This component handles the thinking phase of the cycle.
- Action Controller: Executes the planned actions, which may include calling external APIs, retrieving information, or generating output. This component translates reasoning into concrete operations.
- Observation Collector: Captures the results of actions, such as API responses, search results, or system state changes, and formats them for the reasoning module to process.
- Memory Manager: Maintains context across reasoning-action cycles, storing previous observations, reasoning steps, and actions to inform future decisions.
- Planning Orchestrator: Coordinates the overall flow between reasoning and action steps, determining when to switch between them and tracking progress toward the goal.
Interactions
The ReAct pattern typically follows this sequence:
- The agent receives a task or query that requires multiple steps to resolve.
- The Reasoning Module analyzes the task, breaks it down into sub-goals, and identifies the next action to take.
- The Planning Orchestrator passes control to the Action Controller, which executes the identified action (e.g., search for information, calculate a value, generate content).
- The Observation Collector captures the results of the action and formats them appropriately.
- The Memory Manager updates the context with new information from the observation.
- Control returns to the Reasoning Module, which evaluates the new information, determines if the goal has been met, and plans the next action if needed.
- This cycle continues until the task is complete or a stopping condition is met.
Throughout this process, each component maintains transparency by explicitly documenting its reasoning, actions, and observations.
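The sequence above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `Step`, `Memory`, `scripted_reasoner`, and `run_react` are hypothetical names, the "finish" action is an assumed convention for signalling completion, and the reasoner is a fixed policy standing in for an LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    """One full cycle, recorded so the trace stays auditable."""
    thought: str
    action: str
    argument: str
    observation: str


@dataclass
class Memory:
    """Memory Manager stand-in: keeps every completed cycle."""
    steps: List[Step] = field(default_factory=list)


# (thought, action name, argument); the action "finish" carries the final answer.
Decision = Tuple[str, str, str]
Reasoner = Callable[[str, Memory], Decision]
Tool = Callable[[str], str]


def scripted_reasoner(task: str, memory: Memory) -> Decision:
    """Reasoning Module stand-in: a fixed policy where an LLM call would go."""
    if not memory.steps:
        return ("I lack the fact needed to answer.", "search", task)
    return ("The last observation answers the task.", "finish",
            memory.steps[-1].observation)


def run_react(task: str, reasoner: Reasoner, tools: Dict[str, Tool],
              max_cycles: int = 5) -> Tuple[Optional[str], Memory]:
    """Planning Orchestrator stand-in: alternate reasoning and action."""
    memory = Memory()
    for _ in range(max_cycles):  # stopping condition: cycle limit
        thought, action, argument = reasoner(task, memory)
        if action == "finish":  # stopping condition: task complete
            return argument, memory
        tool = tools.get(action)
        # Action Controller + Observation Collector: run the tool, capture output.
        observation = tool(argument) if tool else f"error: unknown action {action!r}"
        memory.steps.append(Step(thought, action, argument, observation))
    return None, memory  # cycle limit reached without an answer


answer, memory = run_react(
    "capital of France", scripted_reasoner, {"search": lambda query: "Paris"}
)
print(answer)  # Paris
```

Because every cycle is appended to `memory.steps`, the thought, action, and observation behind the final answer can be inspected afterwards.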
Consequences
Benefits
- Improved Accuracy: The iterative feedback loop helps correct errors and refine understanding.
- Greater Transparency: The explicit reasoning steps make the agent's decision process visible and auditable.
- Enhanced Tool Use: The pattern naturally integrates with external tools and APIs.
- Reduced Hallucination: Grounding reasoning in concrete actions and observations reduces the risk of fabricated information.
- Adaptability: The agent can adjust its approach based on intermediate results.
Limitations
- Computational Overhead: The explicit reasoning steps require additional computation compared to simpler approaches.
- Increased Latency: The multi-step nature of ReAct can lead to longer response times.
- Complexity: Implementing the pattern requires more sophisticated orchestration logic.
- Context Management: Maintaining the full reasoning and action history may exceed context limits for complex tasks.
Performance Implications
- Token consumption is higher due to the explicit reasoning steps
- Response time increases with the number of reasoning-action cycles
- Memory usage grows as the interaction history expands
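A rough back-of-the-envelope model shows why these costs add up. The sketch below assumes that every cycle re-sends a fixed base prompt plus the full history, and that each cycle adds a constant number of tokens to that history. Both are simplifying assumptions, and `total_prompt_tokens` is a hypothetical helper.

```python
def total_prompt_tokens(base_tokens: int, step_tokens: int, cycles: int) -> int:
    """Sum the prompt size over all cycles when history is re-sent each time."""
    total = 0
    history = 0
    for _ in range(cycles):
        total += base_tokens + history  # prompt for this cycle
        history += step_tokens          # this cycle's thought/action/observation
    return total


print(total_prompt_tokens(500, 200, 5))   # 4500
print(total_prompt_tokens(500, 200, 10))  # 14000
```

Under these assumptions, doubling the number of cycles roughly triples the total prompt tokens, because the history term grows quadratically with the cycle count.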
Implementation
To implement the ReAct pattern effectively:
- Define a Clear Prompt Structure:
  - Include specific sections for Reasoning, Action, and Observation
  - Provide examples demonstrating the expected format for each section
  - Establish conventions for indicating transitions between phases
- Design an Effective Action Space:
  - Define the set of possible actions the agent can take
  - Create structured formats for action parameters
  - Implement validation logic for action requests
- Establish Observation Formatting:
  - Standardize how observations are presented to the reasoning module
  - Include relevant metadata such as source reliability or timestamp
  - Handle errors and unexpected results gracefully
- Implement Memory Management:
  - Develop strategies for summarizing previous cycles when context length is limited
  - Prioritize retention of critical information
  - Consider using external storage for detailed history
- Create Stopping Criteria:
  - Define clear conditions for task completion
  - Implement timeouts or cycle limits to prevent infinite loops
  - Add error recovery mechanisms for when actions fail
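As an illustration of the first two points, the sketch below parses one model turn and validates the requested action. The `Thought:` / `Action: name[argument]` line format and the three action names are assumptions chosen for this example, not something the pattern prescribes. `parse_step` and `ALLOWED_ACTIONS` are hypothetical names.

```python
import re
from typing import NamedTuple

# Hypothetical action space; a real agent would define its own.
ALLOWED_ACTIONS = {"search", "calculate", "finish"}

THOUGHT_RE = re.compile(r"^Thought:\s*(.+)$", re.MULTILINE)
ACTION_RE = re.compile(r"^Action:\s*(\w+)\[(.*)\]\s*$", re.MULTILINE)


class ParsedStep(NamedTuple):
    thought: str
    action: str
    argument: str


def parse_step(model_output: str) -> ParsedStep:
    """Extract the thought and action, rejecting anything outside the action space."""
    thought = THOUGHT_RE.search(model_output)
    action = ACTION_RE.search(model_output)
    if action is None:
        raise ValueError("no 'Action: name[argument]' line found")
    name = action.group(1).lower()
    if name not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {name}")
    return ParsedStep(
        thought.group(1).strip() if thought else "",
        name,
        action.group(2).strip(),
    )


step = parse_step("Thought: I need the population.\nAction: search[population of Paris]")
print(step.action, "->", step.argument)  # search -> population of Paris
```

Rejecting malformed or unknown actions at parse time lets the orchestrator feed the error back to the model as an observation instead of failing later inside a tool.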
Common pitfalls to avoid:
- Allowing reasoning to become too verbose, consuming valuable context space
- Implementing too many or overly complex actions that the agent struggles to use effectively
- Failing to provide sufficient guidance on when to use which actions
- Not handling error states from external tools gracefully
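One way to avoid the last pitfall is to turn tool failures into ordinary observations that the agent can reason about. The sketch below is an assumption-laden example: `Observation`, `observe`, and the rendered string format are hypothetical, and catching every `Exception` is a deliberate simplification.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Observation:
    ok: bool
    content: str
    source: str
    timestamp: float  # metadata the reasoning step may use

    def render(self) -> str:
        """Format the observation the same way for successes and failures."""
        status = "ok" if self.ok else "error"
        return f"Observation [{self.source}, {status}]: {self.content}"


def observe(source: str, tool: Callable[[str], str], argument: str,
            max_chars: int = 500) -> Observation:
    """Run a tool and always return an Observation, even when the tool raises."""
    try:
        content = tool(argument)
        ok = True
    except Exception as exc:  # simplification: treat any failure as an observation
        content = f"{type(exc).__name__}: {exc}"
        ok = False
    # Truncate so one large result cannot crowd out the rest of the context.
    return Observation(ok, content[:max_chars], source, time.time())


def broken_search(query: str) -> str:
    raise TimeoutError("upstream timed out")


print(observe("search", broken_search, "anything").render())
# Observation [search, error]: TimeoutError: upstream timed out
```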
Code Examples
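The following sketch illustrates one possible memory-management strategy from the Implementation section: once the history exceeds a budget, older cycles are collapsed into a single summary line and the most recent cycles are kept verbatim. Everything here is an assumption made for illustration. `estimate_tokens` counts whitespace-separated words rather than real tokens, and `truncate_summarizer` stands in for what would normally be an LLM summarization call.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def truncate_summarizer(steps: List[str], max_words: int = 8) -> str:
    """Placeholder summarizer: keep the first few words of each step."""
    return " | ".join(" ".join(s.split()[:max_words]) for s in steps)


def compact_history(steps: List[str], budget: int, keep_recent: int = 2,
                    summarize: Callable[[List[str]], str] = truncate_summarizer
                    ) -> List[str]:
    """Collapse older steps into one summary line once the budget is exceeded."""
    if sum(estimate_tokens(s) for s in steps) <= budget:
        return list(steps)
    split = max(len(steps) - keep_recent, 0)
    older, recent = steps[:split], steps[split:]
    if not older:
        return list(recent)
    summary = f"Summary of {len(older)} earlier steps: {summarize(older)}"
    return [summary] + list(recent)


history = [f"Thought {i}: checked source {i} and found nothing new today" for i in range(4)]
for line in compact_history(history, budget=25):
    print(line)
```

Note that this sketch does not guarantee the compacted history fits the budget. A fuller version would re-check the size and summarize again or drop detail as needed.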
Variations
ReAct with Self-Reflection
This variation adds an explicit reflection step after each observation, where the agent evaluates the quality of its previous reasoning and actions before proceeding. This enhances self-correction capabilities but adds computational overhead.
Multi-Agent ReAct
Distributes the reasoning and action responsibilities across multiple specialized agents, such as a "Researcher" agent that performs information gathering actions and a "Synthesizer" agent that integrates the findings into a coherent response.
Hierarchical ReAct
Implements nested ReAct loops at different levels of abstraction, with high-level planning ReAct cycles delegating to more specific ReAct cycles for detailed tasks, similar to hierarchical planning in robotics.
Constrained ReAct
Limits the action space based on user permissions, safety considerations, or resource constraints, ensuring the agent only attempts actions that are appropriate for its context.
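A minimal sketch of this variation, assuming permissions are expressed as a role-to-action-name mapping: the tool registry is filtered before the agent ever sees it. `ROLE_PERMISSIONS`, `constrain_tools`, and the tool names are hypothetical.

```python
from typing import Callable, Dict, Set

Tool = Callable[[str], str]

# Hypothetical permission table: which actions each role may use.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "viewer": {"search"},
    "admin": {"search", "update_record"},
}


def constrain_tools(tools: Dict[str, Tool], role: str) -> Dict[str, Tool]:
    """Return only the tools the given role is allowed to call."""
    allowed = ROLE_PERMISSIONS.get(role, set())  # unknown roles get no tools
    return {name: fn for name, fn in tools.items() if name in allowed}


all_tools: Dict[str, Tool] = {
    "search": lambda query: f"results for {query}",
    "update_record": lambda payload: "record updated",
}

print(sorted(constrain_tools(all_tools, "viewer")))  # ['search']
```

Filtering the registry up front, rather than checking permissions inside each tool, also keeps disallowed actions out of the prompt's action list.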
Real-World Examples
- Research Assistants: Systems like Elicit use ReAct-like patterns to decompose research questions, search for relevant academic papers, extract key findings, and synthesize information across sources.
- Code Generation and Debugging Tools: GitHub Copilot X incorporates reasoning about code structure and purpose before generating or modifying code, then observes the results (such as compiler errors) to refine its approach.
- Customer Support Agents: Systems that need to search knowledge bases, reason about customer problems, query account information, and formulate responses based on company policies.
- Data Analysis Workflows: Agents that reason about data characteristics, execute analysis operations, observe statistical patterns, and determine next analysis steps based on findings.
Related Patterns
- Chain-of-Thought Prompting: Provides the foundational reasoning capabilities that ReAct builds upon, focusing on explicit step-by-step thinking.
- Tool Use Pattern: Complements ReAct by defining how agents interact with external systems during the action phase.
- Reflection Pattern: Often combined with ReAct to enable self-evaluation of reasoning quality and action effectiveness.
- Planner Pattern: Can be used to generate the initial decomposition of a complex task before applying ReAct to each sub-task.
- Router Pattern: May direct different types of tasks to specialized ReAct agents based on the problem domain.