ReAct (Reasoning and Acting) - joehubert/ai-agent-design-patterns GitHub Wiki

Classification

Intent

ReAct interleaves reasoning steps with action steps to create a structured approach for solving complex problems. It enables LLM agents to think critically about their next move, execute that move, observe the results, and refine their approach based on those observations in an iterative process.

Also Known As

Reasoning-Action Cycles
Think-Then-Act Pattern
Observation-Reasoning-Action Loop

Motivation

Traditional LLM prompting often encounters challenges when dealing with complex reasoning tasks that require multiple steps and external interaction:

LLMs may jump to conclusions without fully analyzing the problem
They might fail to leverage external tools effectively
Complex problems need to be broken down into manageable steps
Without feedback loops, errors can propagate and compound

The ReAct pattern addresses these challenges by explicitly alternating between reasoning about a situation and taking concrete actions based on that reasoning. Each action yields observations that inform the next reasoning step, creating a feedback loop that leads to more reliable problem solving.

For example, when answering a complex factual question, a ReAct agent might:

Reason: Analyze what information is needed and identify knowledge gaps
Act: Query a search engine for relevant facts
Observe: Process the search results
Reason: Determine if the information is sufficient or if additional searches are needed
Act: If needed, perform a more refined search or proceed to synthesize an answer

This structured approach helps reduce hallucination, can improve accuracy, and makes the agent's decision-making process more transparent.
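The reason, act, observe cycle above can be sketched as a small loop. This is a minimal illustration, not a reference implementation: `search` is a hypothetical tool backed by a hard-coded dictionary, and `scripted_policy` is a hypothetical stand-in for the LLM call that would normally produce each thought and action.

```python
from typing import Callable

# Hypothetical tool: a stand-in for a real search engine call.
def search(query: str) -> str:
    facts = {"capital of France": "Paris is the capital of France."}
    return facts.get(query, "no results")

# Hypothetical policy standing in for an LLM. Given the question and the
# history so far, it returns (thought, action, argument).
def scripted_policy(question: str, history: list[str]) -> tuple[str, str, str]:
    observations = [h for h in history if h.startswith("Observation: ")]
    if not observations:
        return ("I need to look this up.", "search", "capital of France")
    answer = observations[-1].removeprefix("Observation: ")
    return ("The observation answers the question.", "finish", answer)

def react(
    question: str,
    policy: Callable[[str, list[str]], tuple[str, str, str]],
    tools: dict[str, Callable[[str], str]],
    max_cycles: int = 5,
) -> tuple[str, list[str]]:
    history: list[str] = []
    for _ in range(max_cycles):
        # Reason: decide what to do next based on everything seen so far.
        thought, action, arg = policy(question, history)
        history.append(f"Thought: {thought}")
        if action == "finish":
            return arg, history
        # Act: call the chosen tool.
        history.append(f"Action: {action}[{arg}]")
        observation = tools[action](arg)
        # Observe: feed the result back into the next reasoning step.
        history.append(f"Observation: {observation}")
    return "stopped: cycle limit reached", history

answer, trace = react(
    "What is the capital of France?", scripted_policy, {"search": search}
)
```

The `trace` list is what gives the pattern its transparency: every thought, action, and observation is kept in order and can be inspected after the run.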

Applicability

Use the ReAct pattern when:

Tasks require multiple steps to complete
External tools or information sources are available
Problems involve complex reasoning that benefits from being broken down into smaller steps
Transparency in the agent's decision-making process is important
Tasks involve exploration of multiple possible solution paths
Error recovery and self-correction capabilities are needed
Integration with external systems is required

Structure

flowchart TD
    Start([Task Received]) --> Init[Initialize ReAct Process]
    Init --> Reason[Reason About Current State]
    
    Reason --> Plan[Plan Next Action]
    Plan --> Decision{Is Task Complete?}
    
    Decision -- Yes --> Synthesize[Synthesize Final Answer]
    Synthesize --> End([Return Response])
    
    Decision -- No --> Execute[Execute Planned Action]
    Execute --> Observe[Collect Observations]
    Observe --> UpdateMem[Update Memory & Context]
    UpdateMem --> Reason
    
    subgraph Reasoning Phase
        Reason
        Plan
        Decision
        Synthesize
    end
    
    subgraph Action Phase
        Execute
        Observe
        UpdateMem
    end
    
    classDef startNode fill:#bfb,stroke:#333,stroke-width:1px
    classDef endNode fill:#fbb,stroke:#333,stroke-width:1px
    classDef reasoningNode fill:#f9f,stroke:#333,stroke-width:1px
    classDef actionNode fill:#bbf,stroke:#333,stroke-width:1px
    classDef decisionNode fill:#ffd,stroke:#333,stroke-width:1px
    
    class Start,Init startNode
    class End endNode
    class Reason,Plan,Synthesize reasoningNode
    class Execute,Observe,UpdateMem actionNode
    class Decision decisionNode

Components

Reasoning Module: Responsible for analyzing the current state, evaluating options, and planning the next action. This component handles the thinking phase of the cycle.
Action Controller: Executes the planned actions, which may include calling external APIs, retrieving information, or generating output. This component translates reasoning into concrete operations.
Observation Collector: Captures the results of actions, such as API responses, search results, or system state changes, and formats them for the reasoning module to process.
Memory Manager: Maintains context across reasoning-action cycles, storing previous observations, reasoning steps, and actions to inform future decisions.
Planning Orchestrator: Coordinates the overall flow between reasoning and action steps, determining when to switch between them and tracking progress toward the goal.

Interactions

The ReAct pattern typically follows this sequence:

The agent receives a task or query that requires multiple steps to resolve.
The Reasoning Module analyzes the task, breaks it down into sub-goals, and identifies the next action to take.
The Planning Orchestrator passes control to the Action Controller, which executes the identified action (e.g., search for information, calculate a value, generate content).
The Observation Collector captures the results of the action and formats them appropriately.
The Memory Manager updates the context with new information from the observation.
Control returns to the Reasoning Module, which evaluates the new information, determines if the goal has been met, and plans the next action if needed.
This cycle continues until the task is complete or a stopping condition is met.

Throughout this process, the agent's reasoning, actions, and observations are recorded explicitly, so each step of the cycle can be inspected afterwards.
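One way the five components and their hand-offs might be wired together is sketched below. The class names mirror the component names in this page, but every method name and signature is an assumption made for illustration. The `ReasoningModule` here is a hard-coded stub for an LLM call, and `add` and `mul` are hypothetical tools.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Step:
    thought: str
    action: str
    argument: str
    observation: str

@dataclass
class MemoryManager:
    steps: list[Step] = field(default_factory=list)

    def record(self, step: Step) -> None:
        self.steps.append(step)

class ActionController:
    def __init__(self, tools: dict[str, Callable[[str], object]]) -> None:
        self.tools = tools

    def execute(self, action: str, argument: str) -> object:
        return self.tools[action](argument)

class ObservationCollector:
    def collect(self, raw: object) -> str:
        # Normalize any tool result to text for the reasoning module.
        return str(raw).strip()

class ReasoningModule:
    """Hard-coded stub for an LLM: add 2 and 3, then multiply by 4."""

    def next_step(self, task: str, memory: MemoryManager) -> tuple[str, str, str]:
        if not memory.steps:
            return ("First add 2 and 3.", "add", "2,3")
        last = memory.steps[-1].observation
        if len(memory.steps) == 1:
            return ("Now multiply the sum by 4.", "mul", f"{last},4")
        return ("Both sub-goals are done.", "finish", last)

class PlanningOrchestrator:
    def __init__(self, reasoner, controller, collector, memory, max_cycles=10):
        self.reasoner, self.controller = reasoner, controller
        self.collector, self.memory = collector, memory
        self.max_cycles = max_cycles

    def run(self, task: str) -> str:
        for _ in range(self.max_cycles):
            thought, action, argument = self.reasoner.next_step(task, self.memory)
            if action == "finish":
                return argument
            raw = self.controller.execute(action, argument)
            observation = self.collector.collect(raw)
            self.memory.record(Step(thought, action, argument, observation))
        raise RuntimeError("cycle limit reached before the task completed")

def add(arg: str) -> int:
    a, b = (int(x) for x in arg.split(","))
    return a + b

def mul(arg: str) -> int:
    a, b = (int(x) for x in arg.split(","))
    return a * b

memory = MemoryManager()
orchestrator = PlanningOrchestrator(
    ReasoningModule(), ActionController({"add": add, "mul": mul}),
    ObservationCollector(), memory,
)
result = orchestrator.run("Add 2 and 3, then multiply the result by 4.")
```

Keeping the memory as a separate object means the recorded steps remain available for auditing after `run` returns.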

Consequences

Benefits

Improved Accuracy: The iterative feedback loop helps correct errors and refine understanding.
Greater Transparency: The explicit reasoning steps make the agent's decision process visible and auditable.
Enhanced Tool Use: The pattern naturally integrates with external tools and APIs.
Reduced Hallucination: Grounding reasoning in concrete actions and observations reduces the risk of fabricated information.
Adaptability: The agent can adjust its approach based on intermediate results.

Limitations

Computational Overhead: The explicit reasoning steps require additional computation compared to simpler approaches.
Increased Latency: The multi-step nature of ReAct can lead to longer response times.
Complexity: Implementing the pattern requires more sophisticated orchestration logic.
Context Management: The full reasoning and action history may exceed the model's context limit on complex tasks.

Performance Implications

Token consumption is higher due to the explicit reasoning steps
Response time increases with the number of reasoning-action cycles
Memory usage grows as the interaction history expands
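Because the history grows with every cycle, implementations often cap what is sent back to the model. The sketch below keeps the most recent entries that fit a budget and replaces the rest with a placeholder line. The whitespace word count in `approx_tokens` is a crude assumption, not a real tokenizer, and the placeholder itself is not counted against the budget.

```python
def approx_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())

def trim_history(history: list[str], budget: int) -> list[str]:
    """Keep the newest entries whose combined cost fits within the budget."""
    kept: list[str] = []
    used = 0
    for entry in reversed(history):
        cost = approx_tokens(entry)
        if used + cost > budget:
            break
        kept.append(entry)
        used += cost
    kept.reverse()
    dropped = len(history) - len(kept)
    if dropped:
        # Tell the model that earlier context existed but was removed.
        kept.insert(0, f"[{dropped} earlier entries omitted]")
    return kept

history = ["Thought: a b c", "Observation: d e", "Thought: f"]
trimmed = trim_history(history, budget=5)
```

A summarization step could replace the placeholder line, at the cost of an extra model call per trim.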

Implementation

To implement the ReAct pattern effectively:

Define a Clear Prompt Structure:
- Include specific sections for Reasoning, Action, and Observation
- Provide examples demonstrating the expected format for each section
- Establish conventions for indicating transitions between phases
Design an Effective Action Space:
- Define the set of possible actions the agent can take
- Create structured formats for action parameters
- Implement validation logic for action requests
Establish Observation Formatting:
- Standardize how observations are presented to the reasoning module
- Include relevant metadata such as source reliability or timestamp
- Handle errors and unexpected results gracefully
Implement Memory Management:
- Develop strategies for summarizing previous cycles when context length is limited
- Prioritize retention of critical information
- Consider using external storage for detailed history
Create Stopping Criteria:
- Define clear conditions for task completion
- Implement timeouts or cycle limits to prevent infinite loops
- Add error recovery mechanisms for when actions fail
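As an illustration of the prompt-structure and action-space guidance above, the sketch below parses one action line out of model output and validates it before anything is executed. The `Action: name[argument]` line format and the contents of `ALLOWED_ACTIONS` are assumptions chosen for this example; any unambiguous convention would work.

```python
import re
from dataclasses import dataclass

# Assumed convention: the model emits a line such as "Action: search[query]".
ACTION_RE = re.compile(r"^Action:\s*(\w+)\[(.*)\]\s*$", re.MULTILINE)

# Hypothetical action space for this example.
ALLOWED_ACTIONS = {"search", "lookup", "finish"}

@dataclass(frozen=True)
class ActionRequest:
    name: str
    argument: str

def parse_action(model_output: str) -> ActionRequest:
    match = ACTION_RE.search(model_output)
    if match is None:
        raise ValueError("no 'Action: name[argument]' line found")
    name, argument = match.group(1), match.group(2).strip()
    if name not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {name}")
    if name != "finish" and not argument:
        raise ValueError(f"action '{name}' requires a non-empty argument")
    return ActionRequest(name, argument)

request = parse_action("Thought: I need more detail.\nAction: search[ReAct pattern]")
```

Raising on malformed output gives the orchestration loop a single place to decide whether to re-prompt the model or stop.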

Common pitfalls to avoid:

Allowing reasoning to become too verbose, consuming valuable context space
Implementing too many or overly complex actions that the agent struggles to use effectively
Failing to provide sufficient guidance on when to use which actions
Not handling error states from external tools gracefully
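The last pitfall can be avoided by turning tool failures into observations instead of letting them abort the loop, so the next reasoning step can react to them. A minimal sketch follows; `run_tool`, `format_observation`, and the `flaky_lookup` tool are hypothetical names introduced for this example.

```python
from typing import Callable

def run_tool(tool: Callable[[str], object], argument: str) -> dict:
    """Call a tool and capture either its result or its failure."""
    try:
        return {"ok": True, "content": str(tool(argument))}
    except Exception as exc:
        # Keep the error type and message so the agent can reason about it.
        return {"ok": False, "content": f"{type(exc).__name__}: {exc}"}

def format_observation(result: dict) -> str:
    prefix = "Observation" if result["ok"] else "Observation (tool error)"
    return f"{prefix}: {result['content']}"

# Hypothetical tool that always fails, to show the error path.
def flaky_lookup(argument: str) -> str:
    raise TimeoutError("upstream timed out")

failed = format_observation(run_tool(flaky_lookup, "order 42"))
succeeded = format_observation(run_tool(str.upper, "ok"))
```

Labelling the failed observation explicitly lets the reasoning step choose between retrying, switching tools, or finishing with a partial answer.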

Code Examples

See examples

Variations

ReAct with Self-Reflection

This variation adds an explicit reflection step after each observation, where the agent evaluates the quality of its previous reasoning and actions before proceeding. This enhances self-correction capabilities but adds computational overhead.

Multi-Agent ReAct

Distributes the reasoning and action responsibilities across multiple specialized agents, such as a "Researcher" agent that performs information gathering actions and a "Synthesizer" agent that integrates the findings into a coherent response.

Hierarchical ReAct

Implements nested ReAct loops at different levels of abstraction, with high-level planning ReAct cycles delegating to more specific ReAct cycles for detailed tasks, similar to hierarchical planning in robotics.

Constrained ReAct

Limits the action space based on user permissions, safety considerations, or resource constraints, ensuring the agent only attempts actions that are appropriate for its context.
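One simple way to constrain the action space is to filter the tool table by role before the agent can call anything. The roles, the `PERMISSIONS` table, and the tool names below are hypothetical, chosen only to illustrate the idea.

```python
from typing import Callable

Tool = Callable[[str], str]

# Hypothetical permission table: role -> names of actions it may use.
PERMISSIONS: dict[str, set[str]] = {
    "viewer": {"search"},
    "admin": {"search", "delete_record"},
}

def allowed_tools(tools: dict[str, Tool], role: str) -> dict[str, Tool]:
    granted = PERMISSIONS.get(role, set())
    return {name: fn for name, fn in tools.items() if name in granted}

def execute(tools: dict[str, Tool], role: str, action: str, argument: str) -> str:
    available = allowed_tools(tools, role)
    if action not in available:
        # Return a refusal as an observation so the agent can choose another path.
        return f"Observation: action '{action}' is not permitted for role '{role}'"
    return f"Observation: {available[action](argument)}"

tools: dict[str, Tool] = {
    "search": lambda q: f"results for {q}",
    "delete_record": lambda record_id: f"deleted {record_id}",
}

denied = execute(tools, "viewer", "delete_record", "42")
granted = execute(tools, "admin", "delete_record", "42")
```

Only the filtered tool names would also be listed in the prompt, so the model is not encouraged to attempt actions it cannot perform.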

Real-World Examples

Research Assistants: Systems like Elicit use ReAct-like patterns to decompose research questions, search for relevant academic papers, extract key findings, and synthesize information across sources.
Code Generation and Debugging Tools: Tools such as GitHub Copilot X follow a ReAct-like pattern, reasoning about code structure and purpose before generating or modifying code, then using observed results (such as compiler errors) to refine the output.
Customer Support Agents: Systems that need to search knowledge bases, reason about customer problems, query account information, and formulate responses based on company policies.
Data Analysis Workflows: Agents that reason about data characteristics, execute analysis operations, observe statistical patterns, and determine next analysis steps based on findings.

Related Patterns

Chain-of-Thought Prompting: Provides the foundational reasoning capabilities that ReAct builds upon, focusing on explicit step-by-step thinking.
Tool Use Pattern: Complements ReAct by defining how agents interact with external systems during the action phase.
Reflection Pattern: Often combined with ReAct to enable self-evaluation of reasoning quality and action effectiveness.
Planner Pattern: Can be used to generate the initial decomposition of a complex task before applying ReAct to each sub-task.
Router Pattern: May direct different types of tasks to specialized ReAct agents based on the problem domain.