09 creating agents - the-omics-os/lobster-local GitHub Wiki
Creating Agents - Lobster AI Agent Development Guide
๐ฏ Overview
This guide covers how to create new specialized agents in the Lobster AI system. Agents are the core orchestrators that handle specific bioinformatics domains, coordinate workflows, and manage user interactions through natural language processing.
๐๏ธ Agent Architecture
Agent Responsibilities
- Domain Expertise: Handle specific bioinformatics workflows (RNA-seq, proteomics, etc.)
- Tool Coordination: Orchestrate calls to stateless services
- Data Management: Interact with DataManagerV2 for modality handling
- User Communication: Provide formatted responses and guidance
- Workflow Handoffs: Transfer tasks between specialized agents
Agent Components
Agent Implementation
โโโ Factory Function # Entry point for agent creation
โโโ Agent Tools # Domain-specific functionality (@tool decorators)
โโโ LLM Configuration # Model parameters and callbacks
โโโ State Management # Agent-specific state schema
โโโ Assistant Classes # LLM-powered analysis helpers
โโโ Handoff Integration # Inter-agent communication
๐ Agent Registry System
1. Registry Configuration
All agents must be registered in the centralized registry to be available in the system.
# lobster/config/agent_registry.py
@dataclass
class AgentRegistryConfig:
name: str # Unique identifier
display_name: str # Human-readable name
description: str # Agent capabilities description
factory_function: str # Module path to factory function
handoff_tool_name: Optional[str] # Auto-generated handoff tool name
handoff_tool_description: Optional[str] # Handoff tool description
2. Adding to Registry
# Add to AGENT_REGISTRY dictionary
AGENT_REGISTRY = {
'your_new_agent': AgentRegistryConfig(
name='your_new_agent',
display_name='Your New Agent',
description='Handles your specific bioinformatics workflow',
factory_function='lobster.agents.your_agent.your_agent_factory',
handoff_tool_name='handoff_to_your_agent',
handoff_tool_description='Assign specific workflow tasks to your agent'
),
}
3. Automatic Supervisor Discovery (v0.2+)
Once registered in agent_registry.py, your agent is automatically discovered by the supervisor:
- No manual updates needed: The supervisor dynamically discovers all registered agents
- Automatic delegation rules: The supervisor creates delegation rules based on your agent's description
- Dynamic prompt generation: Your agent appears in the supervisor's available experts list
- Capability extraction: The system can optionally extract your agent's @tool functions
This eliminates the need to manually update supervisor.py or graph.py when adding new agents.
๐จ Creating a New Agent
Step 1: Create Agent Module
Create a new Python file in lobster/agents/:
# lobster/agents/your_agent.py
"""
Your Agent for handling specific bioinformatics workflow.
This agent specializes in [describe the specific domain and capabilities].
"""
from typing import List, Dict, Optional
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
from langchain_aws import ChatBedrockConverse
from lobster.agents.state import YourAgentState # Create if needed
from lobster.config.settings import get_settings
from lobster.core.data_manager_v2 import DataManagerV2
from lobster.utils.logger import get_logger
logger = get_logger(__name__)
def your_agent_factory(
data_manager: DataManagerV2,
callback_handler=None,
agent_name: str = "your_agent",
handoff_tools: List = None
):
"""
Create your specialized agent.
Args:
data_manager: DataManagerV2 instance for modality management
callback_handler: Optional callback for LLM monitoring
agent_name: Agent identifier (should match registry)
handoff_tools: List of handoff tools for agent communication
Returns:
LangGraph agent instance
"""
# Initialize LLM with agent-specific configuration
settings = get_settings()
model_params = settings.get_agent_llm_params('your_agent')
llm = ChatBedrockConverse(**model_params)
if callback_handler and hasattr(llm, 'with_config'):
llm = llm.with_config(callbacks=[callback_handler])
# Define agent-specific tools
tools = [
check_available_modalities,
perform_domain_analysis,
# Add more tools as needed
]
# Add handoff tools if provided
if handoff_tools:
tools.extend(handoff_tools)
# Create agent with system prompt
system_prompt = """You are a specialized bioinformatics agent for [domain].
Your responsibilities include:
- [Specific responsibility 1]
- [Specific responsibility 2]
- [Specific responsibility 3]
Always use the available tools to:
1. Check available data modalities first
2. Validate data before analysis
3. Provide detailed analysis results
4. Suggest next steps or handoffs when appropriate
When analysis is complete, provide a comprehensive summary including:
- What analysis was performed
- Key findings and statistics
- Generated visualizations
- Recommendations for next steps
"""
return create_react_agent(
llm,
tools,
state_schema=YourAgentState, # Optional custom state
system_prompt=system_prompt
)
Step 2: Define Agent Tools
Each tool should follow the standard pattern:
@tool
def check_available_modalities() -> str:
"""Check what data modalities are currently available."""
try:
modalities = data_manager.list_modalities()
if not modalities:
return "No data modalities are currently loaded. Use data loading tools first."
summary = "Available data modalities:\n"
for modality in modalities:
adata = data_manager.get_modality(modality)
summary += f"- {modality}: {adata.n_obs} observations, {adata.n_vars} features\n"
return summary
except Exception as e:
logger.error(f"Error checking modalities: {e}")
return f"Error checking available data: {str(e)}"
@tool
def perform_domain_analysis(
modality_name: str,
parameter1: float = 1.0,
parameter2: str = "default"
) -> str:
"""
Perform domain-specific analysis on the specified modality.
Args:
modality_name: Name of the data modality to analyze
parameter1: Analysis parameter with default value
parameter2: Another analysis parameter
Returns:
Formatted analysis results
"""
try:
# 1. Validate modality exists
if modality_name not in data_manager.list_modalities():
raise ValueError(f"Modality '{modality_name}' not found")
# 2. Get data and call stateless service
adata = data_manager.get_modality(modality_name)
from lobster.tools.your_service import YourService
service = YourService()
result_adata, statistics = service.perform_analysis(
adata,
parameter1=parameter1,
parameter2=parameter2
)
# 3. Store results with descriptive naming
result_modality = f"{modality_name}_analyzed"
data_manager.modalities[result_modality] = result_adata
# 4. Log operation for provenance
data_manager.log_tool_usage(
"perform_domain_analysis",
{"parameter1": parameter1, "parameter2": parameter2},
statistics
)
# 5. Format response
response = f"""Analysis completed successfully!
**Analysis Parameters:**
- Parameter 1: {parameter1}
- Parameter 2: {parameter2}
**Key Results:**
{format_statistics(statistics)}
**New Dataset:** {result_modality}
- Observations: {result_adata.n_obs}
- Features: {result_adata.n_vars}
Use this dataset for downstream analysis or visualization.
"""
return response
except Exception as e:
logger.error(f"Analysis failed: {e}")
return f"Analysis failed: {str(e)}"
def format_statistics(stats: Dict) -> str:
"""Helper function to format analysis statistics."""
formatted = ""
for key, value in stats.items():
if isinstance(value, float):
formatted += f"- {key}: {value:.3f}\n"
else:
formatted += f"- {key}: {value}\n"
return formatted
Step 3: Create Agent State (Optional)
If your agent needs custom state beyond the default:
# lobster/agents/state.py (add to existing file)
from typing import TypedDict, List, Optional, Dict, Any
class YourAgentState(TypedDict):
"""State schema for your specialized agent."""
messages: List[Any]
analysis_parameters: Optional[Dict[str, Any]]
current_modality: Optional[str]
workflow_stage: str
# Add other agent-specific state fields
Step 4: Create Assistant Class (Optional)
For complex LLM-powered analysis:
# lobster/agents/your_agent_assistant.py
"""Assistant class for LLM-powered analysis tasks."""
from typing import Dict, Any, Optional
from dataclasses import dataclass
@dataclass
class YourAnalysisConfig:
"""Configuration for your analysis workflow."""
parameter1: float
parameter2: str
advanced_options: Dict[str, Any]
class YourAgentAssistant:
"""LLM-powered assistant for your domain-specific tasks."""
def __init__(self):
self.llm = None # Initialize if needed for complex tasks
def extract_analysis_parameters(
self,
metadata: Dict[str, Any],
user_input: str
) -> YourAnalysisConfig:
"""
Extract analysis parameters from metadata and user input.
Args:
metadata: Dataset metadata
user_input: User's analysis request
Returns:
Configuration object with extracted parameters
"""
# Implement parameter extraction logic
# This could use LLM for complex parameter inference
return YourAnalysisConfig(
parameter1=1.0, # Extracted or default
parameter2="inferred",
advanced_options={}
)
Step 5: Register Agent
Add your agent to the registry:
# lobster/config/agent_registry.py
AGENT_REGISTRY['your_new_agent'] = AgentRegistryConfig(
name='your_new_agent',
display_name='Your New Agent',
description='Handles your specific bioinformatics workflow with advanced analysis capabilities',
factory_function='lobster.agents.your_agent.your_agent_factory',
handoff_tool_name='handoff_to_your_agent',
handoff_tool_description='Assign workflow tasks to the specialized agent'
)
๐ Agent Communication & Handoffs
Handoff Tools
Handoff tools are automatically generated from registry configuration:
# Handoff tools allow agents to transfer tasks
@tool
def handoff_to_your_agent(
task_description: str,
state: Dict,
tool_call_id: str
) -> Command:
"""Transfer task to your specialized agent."""
# Auto-generated from registry configuration
pass
Best Practices for Handoffs
- Clear Task Description: Provide detailed context when handing off
- Data Validation: Ensure required data is available before handoff
- State Preservation: Important state should be maintained across handoffs
- Error Handling: Graceful handling of handoff failures
# Example handoff logic in agent tool
if requires_specialized_analysis:
return f"""This task requires specialized analysis. I'll transfer you to the {target_agent} agent.
**Transfer Reason:** {reason}
**Current Data:** {current_modality}
**Next Steps:** {recommended_steps}
Please use the handoff tool to continue with specialized analysis.
"""
๐งช Testing Your Agent
Unit Tests
# tests/unit/agents/test_your_agent.py
import pytest
from unittest.mock import Mock
from lobster.agents.your_agent import your_agent_factory
from lobster.core.data_manager_v2 import DataManagerV2
class TestYourAgent:
@pytest.fixture
def mock_data_manager(self):
return Mock(spec=DataManagerV2)
def test_agent_creation(self, mock_data_manager):
"""Test agent factory creates agent successfully."""
agent = your_agent_factory(mock_data_manager)
assert agent is not None
def test_check_modalities_tool(self, mock_data_manager):
"""Test modality checking tool."""
mock_data_manager.list_modalities.return_value = ['test_data']
# Test tool execution
# Implementation depends on your testing framework
Integration Tests
# tests/integration/test_your_agent_integration.py
def test_your_agent_workflow(client, sample_data):
"""Test complete workflow with your agent."""
# Load test data
response = client.query("Load the sample dataset")
assert response['success']
# Test your agent's functionality
response = client.query("Perform your specialized analysis")
assert response['success']
assert 'analyzed' in response['response']
๐ Best Practices
1. Tool Design
- Single Responsibility: Each tool should have a clear, focused purpose
- Error Handling: Comprehensive error handling with specific exceptions
- Parameter Validation: Validate all inputs before processing
- Descriptive Returns: Provide detailed, formatted responses
2. Data Management
- Modality Validation: Always check data exists before processing
- Naming Conventions: Follow professional naming patterns
- Provenance Logging: Log all operations for reproducibility
- Result Storage: Store intermediate and final results appropriately
3. Scientific Rigor
- Parameter Documentation: Document all analysis parameters
- Statistical Validation: Include proper statistical methods
- Quality Control: Implement appropriate QC checks
- Reproducibility: Ensure analyses can be reproduced
4. User Experience
- Clear Communication: Provide informative responses
- Progress Indication: Show progress for long-running operations
- Error Messages: User-friendly error descriptions
- Next Steps: Suggest logical next steps in workflows
๐ Debugging Agents
Common Issues
- Registry Not Found: Check agent name matches registry exactly
- Import Errors: Verify factory function path is correct
- Tool Failures: Check data_manager access and service imports
- Handoff Issues: Ensure handoff tools are properly configured
Debugging Commands
# Check registry
from lobster.config.agent_registry import AGENT_REGISTRY
print(AGENT_REGISTRY.keys())
# Test factory function
from lobster.agents.your_agent import your_agent_factory
agent = your_agent_factory(mock_data_manager)
# Test in CLI
lobster chat
/status # Check system status
/help # List available commands
๐ Agent Performance
Optimization Tips
- Lazy Loading: Load services only when needed
- Caching: Cache expensive computations appropriately
- Memory Management: Handle large datasets efficiently
- Progress Tracking: Provide user feedback for long operations
Monitoring
# Use structured logging
logger.info(f"Starting analysis with parameters: {params}")
logger.debug(f"Processing {adata.n_obs} observations")
# Track performance metrics
import time
start_time = time.time()
# ... analysis ...
duration = time.time() - start_time
logger.info(f"Analysis completed in {duration:.2f} seconds")
๐ฏ Complete Example
See the complete agent implementation template in lobster/agents/template_agent.py for a full working example that demonstrates all the patterns and best practices described in this guide.
This guide provides everything needed to create sophisticated, domain-specific agents that integrate seamlessly with the Lobster AI platform while maintaining scientific rigor and user-friendly interactions.