# Architecture Diagram (the-omics-os/lobster-local GitHub Wiki)
Lobster AI is a multi-agent bioinformatics platform that runs either fully locally or against a managed cloud backend. At startup, the system detects your configuration and routes requests accordingly.
```mermaid
graph TB
    subgraph "User Interface Layer"
        CLI[🦞 Lobster CLI<br/>lobster chat]
        STREAMLIT[Streamlit Web UI<br/>streamlit_app.py]
        API[FastAPI Server<br/>lobster serve]
    end
    subgraph "Smart Client Detection"
        DETECT[Environment Check<br/>LOBSTER_CLOUD_KEY?]
        INIT["init_client()<br/>Automatic Switching"]
    end
    subgraph "☁️ Cloud Mode (LOBSTER_CLOUD_KEY set)"
        CLOUD_CLIENT[CloudLobsterClient<br/>HTTP API Calls]
        CLOUD_API[Lobster Cloud API<br/>api.lobster.omics-os.com]
        AWS_INFRA[AWS Infrastructure<br/>Scalable Compute]
    end
    subgraph "💻 Local Mode (No cloud key or fallback)"
        LOCAL_CLIENT[AgentClient<br/>Local LangGraph Processing]
        LOCAL_AGENTS[Local AI Agents<br/>Full Agent Pipeline]
        LOCAL_DATA[DataManagerV2<br/>Local Data Management]
    end
    subgraph "Unified Interface"
        BASE_CLIENT[BaseClient Interface<br/>Common Methods Contract]
        METHODS["query(), get_status()<br/>read_file(), export_session()"]
    end
    CLI --> INIT
    STREAMLIT --> INIT
    API --> INIT
    INIT --> DETECT
    DETECT --> |Cloud Key Present| CLOUD_CLIENT
    DETECT --> |No Key/Fallback| LOCAL_CLIENT
    CLOUD_CLIENT --> |HTTP/REST| CLOUD_API
    CLOUD_API --> AWS_INFRA
    LOCAL_CLIENT --> LOCAL_AGENTS
    LOCAL_CLIENT --> LOCAL_DATA
    CLOUD_CLIENT -.-> |Implements| BASE_CLIENT
    LOCAL_CLIENT -.-> |Implements| BASE_CLIENT
    BASE_CLIENT --> METHODS
    classDef ui fill:#e3f2fd,stroke:#0277bd,stroke-width:2px
    classDef detect fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    classDef cloud fill:#e8f5e8,stroke:#2e7d32,stroke-width:3px
    classDef local fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    classDef interface fill:#fce4ec,stroke:#c2185b,stroke-width:2px
    class CLI,STREAMLIT,API ui
    class DETECT,INIT detect
    class CLOUD_CLIENT,CLOUD_API,AWS_INFRA cloud
    class LOCAL_CLIENT,LOCAL_AGENTS,LOCAL_DATA local
    class BASE_CLIENT,METHODS interface
```
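The unified interface at the bottom of the diagram can be sketched as an abstract base class. The four method names come from the diagram; the signatures, return types, and the minimal `AgentClient` stub below are assumptions for illustration, not the real implementation:

```python
from abc import ABC, abstractmethod

class BaseClient(ABC):
    """Common contract implemented by both CloudLobsterClient and AgentClient."""

    @abstractmethod
    def query(self, user_input: str) -> dict: ...

    @abstractmethod
    def get_status(self) -> dict: ...

    @abstractmethod
    def read_file(self, path: str) -> str: ...

    @abstractmethod
    def export_session(self) -> str: ...


class AgentClient(BaseClient):
    """Toy local client satisfying the contract (illustrative only)."""

    def query(self, user_input: str) -> dict:
        return {"response": f"local: {user_input}"}

    def get_status(self) -> dict:
        return {"mode": "local", "ok": True}

    def read_file(self, path: str) -> str:
        with open(path) as fh:
            return fh.read()

    def export_session(self) -> str:
        return "session.json"
```

Because both clients honor the same contract, the CLI, Streamlit UI, and FastAPI server can call `client.query(...)` without caring which mode is active.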
```mermaid
sequenceDiagram
    participant User
    participant CLI
    participant init_client
    participant CloudClient
    participant LocalClient
    participant CloudAPI
    User->>CLI: lobster chat
    CLI->>init_client: Initialize client
    Note over init_client: Check LOBSTER_CLOUD_KEY
    alt Cloud Key Present
        init_client->>CloudClient: new CloudLobsterClient(api_key)
        CloudClient->>CloudAPI: Test connection (get_status)
        alt Connection Success
            CloudAPI-->>CloudClient: ✅ Status OK
            CloudClient-->>init_client: ✅ Cloud client ready
            init_client-->>CLI: CloudLobsterClient instance
            CLI-->>User: ☁️ "Cloud mode active"
        else Connection Failed
            CloudAPI-->>CloudClient: ❌ Connection failed
            CloudClient-->>init_client: ❌ Error
            init_client->>LocalClient: Fallback to local
            LocalClient-->>init_client: ✅ Local client ready
            init_client-->>CLI: AgentClient instance
            CLI-->>User: 💻 "Using local mode (cloud unavailable)"
        end
    else No Cloud Key
        init_client->>LocalClient: new AgentClient()
        LocalClient-->>init_client: ✅ Local client ready
        init_client-->>CLI: AgentClient instance
        CLI-->>User: 💻 "Using local mode"
    end
    User->>CLI: "Analyze my RNA-seq data"
    CLI->>init_client: client.query(user_input)
    alt Using Cloud Client
        init_client->>CloudAPI: POST /query
        CloudAPI-->>init_client: Analysis response
    else Using Local Client
        init_client->>LocalClient: Process with local agents
        LocalClient-->>init_client: Analysis response
    end
    init_client-->>CLI: Standardized response
    CLI-->>User: 🦞 Analysis results
```
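The detection-and-fallback sequence above can be sketched in a few lines of Python. This is a hedged illustration, not the actual implementation: the factory parameters are introduced here purely so the sketch is self-contained and testable.

```python
import os

def init_client(cloud_client_factory, local_client_factory):
    """Sketch of the automatic cloud/local switching described above.

    If LOBSTER_CLOUD_KEY is set, try the cloud client and test the
    connection; on any failure, fall back to the local client.
    """
    api_key = os.environ.get("LOBSTER_CLOUD_KEY")
    if api_key:
        try:
            client = cloud_client_factory(api_key)
            client.get_status()  # connection test
            return client, "cloud"
        except Exception:
            pass  # any connection failure falls through to local mode
    return local_client_factory(), "local"
```

The same fallback path handles both "no key" and "cloud unreachable", so the user always gets a working client.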
```mermaid
graph TB
    subgraph "Lobster AI Open Source"
        MAIN[lobster/<br/>Complete Bioinformatics Platform<br/>All Features Included]
        subgraph "Core Components"
            CLI[cli.py<br/>Smart CLI with Cloud Detection]
            AGENTS[agents/<br/>All AI Agents]
            CORE[core/<br/>DataManagerV2 & Client]
            TOOLS[tools/<br/>Analysis Services]
            UTILS[utils/<br/>Utilities & Callbacks]
            CONFIG[config/<br/>Configuration System]
        end
        MAIN --> CLI
        MAIN --> AGENTS
        MAIN --> CORE
        MAIN --> TOOLS
        MAIN --> UTILS
        MAIN --> CONFIG
    end
    subgraph "Smart CLI Detection"
        DETECT[Environment Check<br/>LOBSTER_CLOUD_KEY?]
        NO_KEY[💻 Use Local Mode<br/>Full Functionality]
        WITH_KEY[☁️ Show Cloud Message<br/>→ cloud.lobster.ai<br/>→ Fallback to Local]
    end
    subgraph "Local Processing (100% Functional)"
        CLIENT[AgentClient<br/>Complete Features]
        DM[DataManagerV2<br/>All Data Management]
        AI_PIPELINE[AI Agent Pipeline<br/>Full Analysis Power]
    end
    CLI --> DETECT
    DETECT --> |No Cloud Key| NO_KEY
    DETECT --> |Cloud Key Set| WITH_KEY
    NO_KEY --> CLIENT
    WITH_KEY --> CLIENT
    CLIENT --> DM
    CLIENT --> AI_PIPELINE
    classDef main fill:#4caf50,stroke:#2e7d32,stroke-width:4px
    classDef component fill:#81c784,stroke:#4caf50,stroke-width:2px
    classDef cli fill:#2196f3,stroke:#1976d2,stroke-width:2px
    classDef processing fill:#ff9800,stroke:#f57c00,stroke-width:2px
    class MAIN main
    class CLI,AGENTS,CORE,TOOLS,UTILS,CONFIG component
    class DETECT,NO_KEY,WITH_KEY cli
    class CLIENT,DM,AI_PIPELINE processing
```
```mermaid
graph LR
    LOCAL[🖥️ Local Installation<br/>Free & Open Source] --> UPGRADE{Want Cloud?}
    UPGRADE --> |Yes| CLOUD[☁️ Lobster Cloud<br/>Managed Platform]
    UPGRADE --> |No| LOCAL
    CLOUD --> FEATURES[Scalable Computing<br/>Team Collaboration<br/>Persistent Storage]
    classDef free fill:#4caf50,stroke:#2e7d32,stroke-width:2px
    classDef cloud fill:#2196f3,stroke:#1976d2,stroke-width:3px
    class LOCAL free
    class CLOUD,FEATURES cloud
```
```mermaid
graph TB
    %% Data Sources
    subgraph "Data Sources"
        GEO[GEO Database<br/>GSE Datasets]
        CSV[CSV Files<br/>Local Data]
        EXCEL[Excel Files<br/>Lab Data]
        H5AD[H5AD Files<br/>Processed Data]
        MTX[10X MTX<br/>Single-cell Data]
    end
    %% Agent Registry System
    subgraph "Agent Registry System"
        AREG[Agent Registry<br/>Single Source of Truth]
        ACONF[Agent Configurations<br/>Factory Functions & Metadata]
        HDTOOLS[Handoff Tools<br/>Dynamic Tool Generation]
    end
    %% Supervisor Configuration System (v0.2+)
    subgraph "Supervisor Configuration (v0.2+)"
        SCONF[SupervisorConfig<br/>Dynamic Configuration]
        CAPEXT[AgentCapabilityExtractor<br/>Auto-Discovery]
        MODES[Operation Modes<br/>Research/Production/Dev]
    end
    %% Agents Layer
    subgraph "AI Agents - Dynamically Loaded"
        DE[Data Expert<br/>Data Loading & Management]
        RA[Research Agent<br/>Literature Discovery & Dataset ID<br/>Method Extraction from Publications]
        TE[Transcriptomics Expert<br/>Single-Cell & Bulk RNA-seq Analysis<br/>Unified Transcriptomics Workflows]
        MLE[ML Expert<br/>Machine Learning & scVI]
        VIZ[Visualization Expert<br/>Publication-Quality Plots]
    end
    %% Proteomics Agent (Unified)
    subgraph "Proteomics Agent"
        PE[Proteomics Expert<br/>Mass Spectrometry & Affinity Analysis<br/>Unified Proteomics Workflows]
    end
    %% Analysis Services Layer (Stateless)
    subgraph "Analysis Services - Stateless & Modular"
        PREP[PreprocessingService<br/>Filter & Normalize]
        QUAL[QualityService<br/>QC Assessment]
        CLUST[ClusteringService<br/>Leiden & UMAP]
        SCELL[EnhancedSingleCellService<br/>Doublets & Annotation]
        BULK[BulkRNASeqService<br/>Bulk Analysis & pyDESeq2]
        GEO_SVC[GEOService<br/>Data Download]
        CONCAT_SVC[ConcatenationService<br/>Sample Concatenation & Code Deduplication<br/>Memory-Efficient Multi-Modal Merging]
        subgraph "Pseudobulk & Differential Expression Services"
            PBULK[PseudobulkService<br/>Single-cell to Pseudobulk Aggregation]
            FORMULA[DifferentialFormulaService<br/>R-style Formula Parsing & Design Matrix<br/>Agent-Guided Formula Construction<br/>Iterative Analysis Support]
            PBADAP[PseudobulkAdapter<br/>Schema Validation & QC]
            WFLOW[WorkflowTracker<br/>DE Iteration Management<br/>Result Comparison & Analytics]
        end
        subgraph "Proteomics Services - Professional Grade"
            PPREP[ProteomicsPreprocessingService<br/>MS/Affinity Filtering & Normalization]
            PQUAL[ProteomicsQualityService<br/>Missing Value & CV Analysis]
            PANAL[ProteomicsAnalysisService<br/>Statistical Testing & PCA]
            PDIFF[ProteomicsDifferentialService<br/>Differential Expression & Pathways]
            PVIS[ProteomicsVisualizationService<br/>Volcano Plots & Networks]
        end
    end
    %% Publication Services Layer (v0.2.0+ Three-Tier Architecture)
    subgraph "Publication & Literature Services"
        CONTENT_SVC[ContentAccessService<br/>Capability-Based Coordinator]:::coordinator
        PROVIDER_REG[ProviderRegistry<br/>Priority Routing]:::service
        ABSTRACT_PROV[AbstractProvider<br/>Priority 10: Fast Abstracts]:::tier1
        PMC_PROV[PMCProvider<br/>Priority 10: PMC XML API]:::tier1
        PUBMED[PubMedProvider<br/>Priority 10: Literature Search]:::tier1
        GEOPROV[GEOProvider<br/>Priority 10: Dataset Discovery]:::tier1
        WEBPAGE_PROV[WebpageProvider<br/>Priority 50: Webpage Fallback]:::tier2
        DOCLING_SVC[DoclingService<br/>Internal to WebpageProvider]:::foundation
        METADATA_VAL[MetadataValidationService<br/>Dataset Validation]:::service
    end
    %% DataManagerV2 Orchestration
    subgraph "DataManagerV2 Orchestration"
        DM2[DataManagerV2<br/>Modality Coordinator]
        MODALITIES[Modality Storage<br/>AnnData Objects]
        PROV[Provenance Tracker<br/>Analysis History]
        ERROR[Error Handling<br/>Professional Exceptions]
    end
    %% Modality Adapters
    subgraph "Modality Adapters"
        TRA[TranscriptomicsAdapter<br/>RNA-seq Loading]
        PRA[ProteomicsAdapter<br/>Protein Loading]
        subgraph "Transcriptomics Types"
            TRSC[Single-cell RNA-seq<br/>Schema & Validation]
            TRBL[Bulk RNA-seq<br/>Schema & Validation]
        end
        subgraph "Proteomics Types"
            PRMS[Mass Spectrometry<br/>Missing Value Handling]
            PRAF[Affinity Proteomics<br/>Antibody Arrays]
        end
        TRA --> TRSC
        TRA --> TRBL
        PRA --> PRMS
        PRA --> PRAF
    end
    %% Storage Backends
    subgraph "Storage Backends"
        H5BE[H5ADBackend<br/>Single Modality<br/>S3-Ready]
        MUBE[MuDataBackend<br/>Multi-Modal<br/>Integrated Analysis]
    end
    %% Schema & Validation
    subgraph "Schema System"
        TSCH[TranscriptomicsSchema<br/>RNA-seq Rules]
        PSCH[ProteomicsSchema<br/>Protein Rules]
        FVAL[FlexibleValidator<br/>Warning-based QC]
    end
    %% Interfaces
    subgraph "Core Interfaces"
        IBACK[IDataBackend<br/>Storage Contract]
        IADAP[IModalityAdapter<br/>Processing Contract]
        IVAL[IValidator<br/>Validation Contract]
    end
    %% Data Flow Connections
    GEO --> DE
    CSV --> DE
    EXCEL --> DE
    H5AD --> DE
    MTX --> DE
    %% Agent to Service connections
    DE --> GEO_SVC
    RA --> GEOPROV
    %% Three-tier publication access flow (v0.2.0+ with capability-based routing)
    RA --> CONTENT_SVC
    CONTENT_SVC --> PROVIDER_REG
    PROVIDER_REG --> ABSTRACT_PROV
    PROVIDER_REG --> PMC_PROV
    PROVIDER_REG --> PUBMED
    PROVIDER_REG --> GEOPROV
    PROVIDER_REG --> WEBPAGE_PROV
    WEBPAGE_PROV --> DOCLING_SVC
    ABSTRACT_PROV -.-> PUBMED
    %% Transcriptomics Expert connections (handles both single-cell and bulk)
    TE --> PREP
    TE --> QUAL
    TE --> CLUST
    TE --> SCELL
    TE --> PBULK
    TE --> FORMULA
    TE --> BULK
    %% Proteomics Expert connections (handles both MS and affinity)
    PE --> PPREP
    PE --> PQUAL
    PE --> PANAL
    PE --> PDIFF
    PE --> PVIS
    %% Service to DataManager connections
    PREP --> |AnnData Processing| DM2
    QUAL --> |QC Metrics| DM2
    CLUST --> |Clustering Results| DM2
    SCELL --> |Annotations| DM2
    GEO_SVC --> |Dataset Loading| DM2
    CONTENT_SVC --> |Publication Metadata| DM2
    PBULK --> |Pseudobulk Matrices| DM2
    FORMULA --> |Design Matrices| DM2
    BULK --> |DE Results| DM2
    %% Proteomics Service to DataManager connections
    PPREP --> |Proteomics Processing| DM2
    PQUAL --> |Proteomics QC| DM2
    PANAL --> |Statistical Analysis| DM2
    PDIFF --> |Differential Results| DM2
    PVIS --> |Visualization Data| DM2
    %% DataManager orchestration
    DM2 --> TRA
    DM2 --> PRA
    DM2 --> MODALITIES
    DM2 --> PROV
    DM2 --> ERROR
    TRA --> TSCH
    PRA --> PSCH
    TSCH --> FVAL
    PSCH --> FVAL
    MODALITIES --> H5BE
    MODALITIES --> MUBE
    %% Interface implementations
    TRA -.-> IADAP
    PRA -.-> IADAP
    H5BE -.-> IBACK
    MUBE -.-> IBACK
    FVAL -.-> IVAL
    %% Styling
    classDef agent fill:#e1f5fe,stroke:#01579b,stroke-width:2px
    classDef service fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
    classDef orchestrator fill:#f3e5f5,stroke:#4a148c,stroke-width:3px
    classDef adapter fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    classDef backend fill:#fce4ec,stroke:#c2185b,stroke-width:2px
    classDef schema fill:#f1f8e9,stroke:#388e3c,stroke-width:2px
    classDef interface fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,stroke-dasharray: 5 5
    classDef source fill:#f5f5f5,stroke:#616161,stroke-width:1px
    classDef tier1 fill:#90EE90,stroke:#228B22,stroke-width:3px,color:#000
    classDef tier2 fill:#87CEEB,stroke:#4682B4,stroke-width:3px,color:#000
    classDef coordinator fill:#FFB6C1,stroke:#C71585,stroke-width:3px,color:#000
    classDef foundation fill:#DDA0DD,stroke:#8B008B,stroke-width:3px,color:#000
    class DE,RA,TE,PE,MLE,VIZ agent
    class PREP,QUAL,CLUST,SCELL,BULK,GEO_SVC,CONCAT_SVC,PBULK,FORMULA,PBADAP,WFLOW,PPREP,PQUAL,PANAL,PDIFF,PVIS service
    class DM2,MODALITIES,PROV,ERROR orchestrator
    class TRA,PRA,TRSC,TRBL,PRMS,PRAF adapter
    class H5BE,MUBE backend
    class TSCH,PSCH,FVAL schema
    class IBACK,IADAP,IVAL interface
    class GEO,CSV,EXCEL,H5AD,MTX source
```
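The DataManagerV2 orchestration shown above (named modality storage plus a provenance trail) can be illustrated with a toy sketch. The method names follow the diagram and the sequence below; everything else, including the in-memory dict storage, is an assumption for illustration:

```python
class DataManagerV2:
    """Toy sketch of the modality coordinator: named storage + provenance."""

    def __init__(self):
        self.modalities = {}   # modality name -> data object (AnnData in the real system)
        self.provenance = []   # ordered log of operations

    def load_modality(self, name, data, modality_type):
        # In the real system a modality adapter loads and validates the
        # source; here the caller supplies the object directly.
        self.modalities[name] = data
        self.provenance.append(("load_modality", name, modality_type))

    def get_modality(self, name):
        if name not in self.modalities:
            raise KeyError(f"Modality '{name}' not found")
        return self.modalities[name]

    def list_modalities(self):
        return list(self.modalities)

    def log_tool_usage(self, tool_name, params):
        self.provenance.append((tool_name, params))
```

Every analysis step reads a modality, writes a new one, and appends to the provenance log, which is what makes the workflows below traceable.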
```mermaid
sequenceDiagram
    participant User
    participant DataExpert as Data Expert Agent
    participant TransExpert as Transcriptomics Expert
    participant DM2 as DataManagerV2
    participant Service as Analysis Service
    participant Adapter as Modality Adapter
    participant Schema as Schema Validator
    participant Backend as Storage Backend
    %% Data Loading Flow
    User->>DataExpert: "Download GSE12345"
    DataExpert->>DM2: load_modality("geo_gse12345", source, "transcriptomics_single_cell")
    DM2->>Adapter: from_source(source_data)
    Adapter->>Adapter: Detect format, load data
    Adapter->>Schema: validate(adata)
    Schema-->>Adapter: ValidationResult (warnings/errors)
    Adapter-->>DM2: AnnData with schema compliance
    DM2->>DM2: Store as modality
    DM2-->>DataExpert: Modality loaded successfully
    DataExpert-->>User: "Loaded geo_gse12345: 5000 cells × 20000 genes"
    %% Modular Analysis Flow
    User->>TransExpert: "Filter and normalize the data"
    TransExpert->>DM2: get_modality("geo_gse12345")
    DM2-->>TransExpert: AnnData object
    TransExpert->>Service: PreprocessingService.filter_and_normalize_cells(adata, params)
    Service->>Service: Professional QC filtering
    Service->>Service: Scanpy normalization
    Service-->>TransExpert: (processed_adata, processing_stats)
    TransExpert->>DM2: Store new modality("geo_gse12345_filtered_normalized")
    TransExpert->>DM2: log_tool_usage(operation_details)
    TransExpert-->>User: "Filtering complete: 4500 cells retained (90%)"
    User->>TransExpert: "Run clustering analysis"
    TransExpert->>DM2: get_modality("geo_gse12345_filtered_normalized")
    DM2-->>TransExpert: Processed AnnData
    TransExpert->>Service: ClusteringService.cluster_and_visualize(adata, params)
    Service->>Service: HVG detection, PCA, neighbors graph
    Service->>Service: Leiden clustering, UMAP embedding
    Service-->>TransExpert: (clustered_adata, clustering_stats)
    TransExpert->>DM2: Store new modality("geo_gse12345_clustered")
    TransExpert->>DM2: log_tool_usage(clustering_results)
    TransExpert-->>User: "Clustering complete: 8 clusters identified"
    User->>TransExpert: "Find marker genes"
    TransExpert->>DM2: get_modality("geo_gse12345_clustered")
    TransExpert->>Service: EnhancedSingleCellService.find_marker_genes(adata, params)
    Service->>Service: Differential expression analysis
    Service-->>TransExpert: (marker_adata, marker_stats)
    TransExpert->>DM2: Store new modality("geo_gse12345_markers")
    TransExpert-->>User: "Marker genes identified for all clusters"
    %% Provenance and Error Handling
    Note over DM2: All operations tracked<br/>Professional error handling<br/>Complete provenance trail
```
```mermaid
graph LR
    subgraph "Agents → DataManagerV2"
        DE[Data Expert] --> |load_modality<br/>save_modality| DM2[DataManagerV2]
        RA[Research Agent] --> |literature_context<br/>method_extraction| DM2
        SCE[Single-Cell Expert] --> |get_modality<br/>cluster_analyze| DM2
        BRE[Bulk RNA-seq Expert] --> |get_modality<br/>differential_expression| DM2
        PE[Proteomics Expert] --> |get_modality<br/>analyze_patterns| DM2
    end
    subgraph "DataManagerV2 → Adapters"
        DM2 --> |from_source| TRA[TranscriptomicsAdapter]
        DM2 --> |from_source| PRA[ProteomicsAdapter]
    end
    subgraph "Adapters → Validation"
        TRA --> |validate| TSCH[TranscriptomicsSchema]
        PRA --> |validate| PSCH[ProteomicsSchema]
        TSCH --> FVAL[FlexibleValidator]
        PSCH --> FVAL
    end
    subgraph "DataManagerV2 → Storage"
        DM2 --> |save/load| H5BE[H5ADBackend]
        DM2 --> |save/load| MUBE[MuDataBackend]
    end
    classDef agent fill:#e1f5fe,stroke:#01579b,stroke-width:2px
    classDef orchestrator fill:#f3e5f5,stroke:#4a148c,stroke-width:3px
    classDef adapter fill:#e8f5e8,stroke:#1b5e20,stroke-width:2px
    classDef backend fill:#fff3e0,stroke:#e65100,stroke-width:2px
    classDef schema fill:#fce4ec,stroke:#880e4f,stroke-width:2px
    class DE,RA,SCE,BRE,PE agent
    class DM2 orchestrator
    class TRA,PRA adapter
    class H5BE,MUBE backend
    class TSCH,PSCH,FVAL schema
```
## Centralized Agent Registry System
### Overview
The Lobster AI system now features a **centralized agent registry** that serves as the single source of truth for all agent configurations. This eliminates redundancy and reduces errors when adding new agents to the system.
### Agent Registry Architecture
```mermaid
graph TB
    subgraph "Agent Registry System"
        AREG[Agent Registry<br/>lobster/config/agent_registry.py]
        ACONF[AgentConfig Objects<br/>Metadata & Factory Functions]
        HELPERS["Helper Functions<br/>get_worker_agents()<br/>get_all_agent_names()"]
    end
    subgraph "System Integration"
        GRAPH[Graph Creation<br/>lobster/agents/graph.py]
        CALLBACKS[Callback System<br/>lobster/utils/callbacks.py]
        SETTINGS[Settings Integration<br/>lobster/config/settings.py]
    end
    subgraph "Dynamic Loading"
        IMPORT["Dynamic Import<br/>import_agent_factory()"]
        TOOLS["Tool Generation<br/>create_custom_handoff_tool()"]
        DETECT["Agent Detection<br/>get_all_agent_names()"]
    end
    subgraph "Supervisor Configuration (v0.2+)"
        SCONF[SupervisorConfig<br/>Dynamic Configuration]
        CAPEXT[AgentCapabilityExtractor<br/>Auto-Discovery]
        MODES[Operation Modes<br/>Research/Production/Dev]
    end
    AREG --> ACONF
    AREG --> HELPERS
    HELPERS --> GRAPH
    HELPERS --> CALLBACKS
    HELPERS --> SETTINGS
    ACONF --> IMPORT
    ACONF --> TOOLS
    HELPERS --> DETECT
    %% Supervisor Configuration connections
    SCONF --> MODES
    CAPEXT --> AREG
    CAPEXT --> ACONF
    SCONF --> GRAPH
    MODES --> SCONF
    classDef registry fill:#e8f5e8,stroke:#2e7d32,stroke-width:3px
    classDef integration fill:#e1f5fe,stroke:#01579b,stroke-width:2px
    classDef dynamic fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    classDef config fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    class AREG,ACONF,HELPERS registry
    class GRAPH,CALLBACKS,SETTINGS integration
    class IMPORT,TOOLS,DETECT dynamic
    class SCONF,CAPEXT,MODES config
```
Each agent in the registry is defined using an `AgentConfig` dataclass:

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class AgentConfig:
    """Configuration for an agent in the system."""
    name: str                                # Unique agent identifier
    display_name: str                        # Human-readable name
    description: str                         # Agent's purpose/capability
    factory_function: str                    # Module path to factory function
    handoff_tool_name: Optional[str]         # Name of handoff tool
    handoff_tool_description: Optional[str]  # Tool description


AGENT_REGISTRY: Dict[str, AgentConfig] = {
    'data_expert_agent': AgentConfig(
        name='data_expert_agent',
        display_name='Data Expert',
        description='Handles data fetching and download tasks',
        factory_function='lobster.agents.data_expert.data_expert',
        handoff_tool_name='handoff_to_data_expert',
        handoff_tool_description='Assign data fetching/download tasks to the data expert'
    ),
    'transcriptomics_expert': AgentConfig(
        name='transcriptomics_expert',
        display_name='Transcriptomics Expert',
        description='Handles both single-cell and bulk RNA-seq analysis tasks',
        factory_function='lobster.agents.transcriptomics_expert.transcriptomics_expert',
        handoff_tool_name='handoff_to_transcriptomics_expert',
        handoff_tool_description='Assign transcriptomics analysis tasks (single-cell or bulk RNA-seq) to the transcriptomics expert'
    ),
    'research_agent': AgentConfig(
        name='research_agent',
        display_name='Research Agent',
        description='Handles literature discovery, dataset identification, and method extraction from publications. Replaces the deprecated method_expert_agent.',
        factory_function='lobster.agents.research_agent.research_agent',
        handoff_tool_name='handoff_to_research_agent',
        handoff_tool_description='Assign literature search, dataset discovery, PDF extraction, and method extraction tasks to the research agent'
    ),
    'machine_learning_expert_agent': AgentConfig(
        name='machine_learning_expert_agent',
        display_name='ML Expert',
        description='Handles machine learning tasks such as transforming data into the desired format for downstream tasks',
        factory_function='lobster.agents.machine_learning_expert.machine_learning_expert',
        handoff_tool_name='handoff_to_machine_learning_expert',
        handoff_tool_description='Assign all machine learning tasks (scVI, classification, etc.) to the machine learning expert agent'
    ),
    'visualization_expert_agent': AgentConfig(
        name='visualization_expert_agent',
        display_name='Visualization Expert',
        description='Creates publication-quality visualizations through supervisor-mediated workflows',
        factory_function='lobster.agents.visualization_expert.visualization_expert',
        handoff_tool_name='handoff_to_visualization_expert',
        handoff_tool_description='Delegate visualization tasks to the visualization expert agent'
    ),
    'proteomics_expert': AgentConfig(
        name='proteomics_expert',
        display_name='Proteomics Expert',
        description='Handles both mass spectrometry and affinity proteomics analysis tasks',
        factory_function='lobster.agents.proteomics_expert.proteomics_expert',
        handoff_tool_name='handoff_to_proteomics_expert',
        handoff_tool_description='Assign proteomics analysis tasks (mass spectrometry or affinity proteomics) to the proteomics expert'
    ),
}
```

At startup, the graph builder and callback system consume the registry:

```mermaid
sequenceDiagram
    participant AREG as Agent Registry
    participant GRAPH as Graph Creation
    participant CB as Callbacks
    participant AGENT as Created Agent
    Note over AREG: System Startup
    GRAPH->>AREG: get_worker_agents()
    AREG-->>GRAPH: Dict[agent_name, AgentConfig]
    loop For each agent config
        GRAPH->>AREG: import_agent_factory(config.factory_function)
        AREG-->>GRAPH: Agent factory function
        GRAPH->>GRAPH: Create agent instance
        GRAPH->>GRAPH: Create handoff tool
    end
    Note over GRAPH: All agents loaded dynamically
    Note over CB: Runtime Execution
    CB->>AREG: get_all_agent_names()
    AREG-->>CB: List of all agent names
    CB->>CB: Monitor for agent transitions
    CB->>AGENT: Detect agent handoffs
```
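The dynamic-import step in this flow can be sketched with the standard library. The function name `import_agent_factory` matches the document; the body below is an assumed implementation for illustration:

```python
import importlib

def import_agent_factory(factory_path: str):
    """Resolve a dotted 'package.module.attr' path to a callable.

    Sketch of the registry's dynamic loading step; the real version
    likely adds richer error reporting.
    """
    module_path, _, attr = factory_path.rpartition(".")
    module = importlib.import_module(module_path)
    factory = getattr(module, attr)
    if not callable(factory):
        raise TypeError(f"{factory_path} is not callable")
    return factory
```

Because the registry stores factory functions as strings, agents are imported only when the graph is built, keeping startup imports lightweight.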
Previously, adding a new agent required updates in several places:

```text
├── lobster/agents/graph.py        # Import statements
├── lobster/agents/graph.py        # Agent creation code
├── lobster/agents/graph.py        # Handoff tool definitions
├── lobster/utils/callbacks.py     # Hardcoded agent name list
└── Multiple imports throughout the codebase
```

With the registry, adding a new agent requires a single change:

```text
└── lobster/config/agent_registry.py   # Single registry entry
```

Everything else is handled automatically:

- ✅ Dynamic agent loading
- ✅ Automatic handoff tool creation
- ✅ Callback system integration
- ✅ Type-safe configuration
- ✅ Professional error handling
To add a new agent, implement a factory function and register it:

```python
# lobster/agents/new_agent.py
def new_agent(data_manager, callback_handler=None, agent_name='new_agent', handoff_tools=None):
    """Create a new specialized agent."""
    # Agent implementation
    return agent_instance
```

```python
# lobster/config/agent_registry.py
AGENT_REGISTRY = {
    # ... existing agents ...
    'new_agent': AgentConfig(
        name='new_agent',
        display_name='New Agent',
        description='Handles specialized new functionality',
        factory_function='lobster.agents.new_agent.new_agent',
        handoff_tool_name='handoff_to_new_agent',
        handoff_tool_description='Assign specialized tasks to the new agent'
    ),
}
```

The system automatically handles:

- ✅ Agent loading in graph creation
- ✅ Handoff tool generation
- ✅ Callback system detection
- ✅ Error handling and logging
- ✅ Integration with existing workflows
The registry provides several utility functions:

```python
# Get all worker agents with configurations
worker_agents = get_worker_agents()             # Returns: Dict[str, AgentConfig]

# Get all agent names (including system agents)
all_agents = get_all_agent_names()              # Returns: List[str]

# Get a specific agent configuration
config = get_agent_config('data_expert_agent')  # Returns: AgentConfig or None

# Dynamically import an agent factory
factory = import_agent_factory('lobster.agents.data_expert.data_expert')  # Returns: Callable
```

The registry system prevents common errors through:

- ✅ Factory function existence validation
- ✅ Import path verification
- ✅ Configuration completeness checks
- ✅ Duplicate agent name detection

It also improves maintainability:

- ✅ Type hints for all configurations
- ✅ Consistent naming conventions
- ✅ Comprehensive error messages
- ✅ Centralized documentation
- ✅ Single source of truth
- ✅ Easy to audit and review
- ✅ Reduced cognitive load
- ✅ Professional code organization
The system includes comprehensive testing:

```python
# tests/test_agent_registry.py
def test_agent_registry():
    """Test the agent registry functionality."""
    # Test 1: Verify all agents are registered
    worker_agents = get_worker_agents()
    assert len(worker_agents) > 0

    # Test 2: Validate factory function imports
    for agent_name, config in worker_agents.items():
        factory = import_agent_factory(config.factory_function)
        assert callable(factory)

    # Test 3: Check agent name consistency
    all_agents = get_all_agent_names()
    assert 'supervisor' in all_agents
    assert 'data_expert_agent' in all_agents
```

Run the test with:

```bash
python tests/test_agent_registry.py
```

This centralized approach keeps agent management professional, maintainable, and consistent across the entire Lobster AI system.
The ConcatenationService is a critical architectural improvement that eliminates code duplication and provides memory-efficient, modality-agnostic concatenation of biological samples. This service addresses the code redundancy problem that existed between data_expert.py and geo_service.py.
```mermaid
graph TB
    subgraph "Before: Code Duplication Problem"
        DE_OLD["data_expert.py<br/>concatenate_samples()<br/>200+ lines of code"]
        GEO_OLD["geo_service.py<br/>_concatenate_stored_samples()<br/>300+ lines of code"]
        DUPLICATION[❌ 450+ lines of duplicated logic<br/>❌ Memory inefficiency<br/>❌ Maintenance overhead]
        DE_OLD -.-> DUPLICATION
        GEO_OLD -.-> DUPLICATION
    end
    subgraph "After: Centralized Service"
        CONCAT_SERVICE[ConcatenationService<br/>Single Source of Truth<br/>810 lines of professional code]
        subgraph "Strategy Pattern"
            SMART[SmartSparseStrategy<br/>Single-cell optimized]
            MEMORY[MemoryEfficientStrategy<br/>Large dataset chunked processing]
        end
        subgraph "Refactored Clients"
            DE_NEW["data_expert.py<br/>concatenate_samples()<br/>30 lines (delegates to service)"]
            GEO_NEW["geo_service.py<br/>_concatenate_stored_samples()<br/>20 lines (delegates to service)"]
        end
        CONCAT_SERVICE --> SMART
        CONCAT_SERVICE --> MEMORY
        DE_NEW --> CONCAT_SERVICE
        GEO_NEW --> CONCAT_SERVICE
    end
    classDef old fill:#ffebee,stroke:#c62828,stroke-width:2px
    classDef problem fill:#ffcdd2,stroke:#d32f2f,stroke-width:3px
    classDef new fill:#e8f5e8,stroke:#2e7d32,stroke-width:3px
    classDef strategy fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    classDef client fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    class DE_OLD,GEO_OLD old
    class DUPLICATION problem
    class CONCAT_SERVICE new
    class SMART,MEMORY strategy
    class DE_NEW,GEO_NEW client
```
- data_expert.py: 200+ lines → 30 lines (85% reduction)
- geo_service.py: 300+ lines → 20 lines (93% reduction)
- Total elimination: 450+ lines of duplicated code
- Smart memory estimation with automatic strategy recommendation
- Chunked processing for datasets exceeding memory limits
- 50%+ memory reduction for large concatenation operations
- Real-time memory monitoring during processing
- Strategy Pattern: Different algorithms for different data types
- Single-cell optimization: Sparse matrix handling with batch tracking
- Bulk transcriptomics: Optimized for dense matrix operations
- Proteomics support: Handle missing values appropriately
- Single source of truth for all concatenation logic
- Comprehensive error handling with custom exceptions
- Progress tracking with Rich console integration
- Extensive testing with 400+ lines of unit tests
```python
# Primary concatenation method
concatenated_adata, statistics = concat_service.concatenate_samples(
    sample_adatas=sample_list,
    strategy=ConcatenationStrategy.SMART_SPARSE,
    batch_key="batch",
    use_intersecting_genes_only=True
)

# Concatenate from modality names
concatenated_adata, statistics = concat_service.concatenate_from_modalities(
    modality_names=["sample1", "sample2", "sample3"],
    output_name="concatenated_dataset",
    use_intersecting_genes_only=True
)

# Auto-detect samples by pattern
sample_modalities = concat_service.auto_detect_samples("geo_gse12345")

# Validate before processing
validation_result = concat_service.validate_concatenation_inputs(sample_list)

# Estimate memory requirements
memory_info = concat_service.estimate_memory_usage(sample_list)
```

The ConcatenationService integrates deeply with DataManagerV2 for seamless modality management:
```mermaid
sequenceDiagram
    participant DE as Data Expert Agent
    participant CS as ConcatenationService
    participant DM2 as DataManagerV2
    participant Strategy as Strategy Implementation
    DE->>CS: concatenate_from_modalities(modality_names)
    CS->>DM2: get_modality() for each sample
    DM2-->>CS: List of AnnData objects
    CS->>Strategy: concatenate(sample_adatas, **kwargs)
    Strategy->>Strategy: Apply concatenation algorithm
    Strategy->>Strategy: Add batch information
    Strategy->>Strategy: Monitor memory usage
    Strategy-->>CS: ConcatenationResult
    CS->>DM2: load_modality(output_name, concatenated_data)
    DM2->>DM2: Store as new modality
    DM2-->>CS: Confirmation
    CS-->>DE: (concatenated_adata, statistics)
    DE-->>DE: Format response for user
```
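The strategy pattern behind SmartSparseStrategy and MemoryEfficientStrategy can be sketched as follows. This is a toy illustration on plain Python containers, not the real AnnData-based implementation; only the class names come from the document:

```python
from abc import ABC, abstractmethod

class ConcatenationStrategy(ABC):
    """Common interface so the service can swap algorithms per data type."""

    @abstractmethod
    def concatenate(self, samples):
        ...

class SmartSparseStrategy(ConcatenationStrategy):
    """Toy sketch: align samples on their intersecting features (genes)."""

    def concatenate(self, samples):
        # samples: list of dicts mapping feature -> value
        shared = set(samples[0]).intersection(*samples[1:])
        return [{g: s[g] for g in shared} for s in samples]

class MemoryEfficientStrategy(ConcatenationStrategy):
    """Toy sketch: merge samples in fixed-size chunks to bound peak memory."""

    def concatenate(self, samples, chunk_size=2):
        merged = []
        for i in range(0, len(samples), chunk_size):
            for sample in samples[i:i + chunk_size]:
                merged.extend(sample)
        return merged
```

The service only depends on the `concatenate` interface, so new modality-specific strategies can be added without touching the callers.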
The ConcatenationService includes comprehensive testing:
- Unit Tests: Strategy pattern, validation functions, memory estimation
- Integration Tests: DataManagerV2 interaction, modality storage
- Performance Tests: Memory usage, processing time benchmarks
- Error Handling Tests: Exception scenarios, graceful degradation
This architecture improvement ensures reliable, maintainable, and efficient sample concatenation across the entire Lobster AI platform.
- Complete Bioinformatics Platform: All analysis capabilities included
- AI-Powered Analysis: Natural language interface to bioinformatics
- Publication-Ready Outputs: Professional visualizations and reports
- Extensible Architecture: Add custom analysis methods easily
- Active Development: Regular updates and community contributions
- Privacy: Your data never leaves your computer
- Customization: Full control over analysis parameters
- Learning: Study the source code to understand methods
- Contribution: Help improve the platform for everyone
- Cost: Completely free (you pay only for your own API keys)
For teams needing scalable infrastructure, managed services, or collaborative features, we're developing a cloud platform.
The Lobster AI system has been successfully migrated from a dual-system architecture (legacy DataManager + DataManagerV2) to a clean, professional, modular DataManagerV2-only implementation.
- Before: Agents contained mixed responsibilities with dual code paths
- After: Clean separation with stateless analysis services and orchestration agents
- Custom Exception Hierarchy:
  - `TranscriptomicsError`, `PreprocessingError`, `QualityError`, etc.
  - `ModalityNotFoundError` for specific validation failures
- Comprehensive Logging: All operations tracked with parameters and results
- Graceful Error Recovery: Informative error messages with suggested fixes
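A minimal sketch of this exception hierarchy, using the class names from the text (the exact parent-child layout is an assumption):

```python
class ServiceError(Exception):
    """Base class for analysis-service failures."""

class TranscriptomicsError(ServiceError):
    """Errors raised by transcriptomics services."""

class PreprocessingError(TranscriptomicsError):
    """Filtering / normalization failures."""

class QualityError(TranscriptomicsError):
    """QC assessment failures."""

class ModalityNotFoundError(ServiceError):
    """Raised when a named modality is absent from DataManagerV2."""
```

A common base class lets tools catch `ServiceError` once while still logging the specific failure type.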
- PreprocessingService: AnnData filtering, normalization, batch correction
- QualityService: Comprehensive QC assessment with statistical metrics
- ClusteringService: Leiden clustering, PCA, UMAP visualization
- EnhancedSingleCellService: Doublet detection, cell type annotation
- GEOService: Professional dataset downloading and processing
- PubMedService: Literature mining and method extraction
```python
@tool
def tool_name(modality_name: str, **params) -> str:
    """Professional tool with comprehensive error handling."""
    try:
        # 1. Validate modality exists
        if modality_name not in data_manager.list_modalities():
            raise ModalityNotFoundError(f"Modality '{modality_name}' not found")

        # 2. Get AnnData from the modality
        adata = data_manager.get_modality(modality_name)

        # 3. Call the stateless service
        result_adata, stats = service.method_name(adata, **params)

        # 4. Save the result as a new modality with a descriptive name
        new_modality_name = f"{modality_name}_processed"
        data_manager.modalities[new_modality_name] = result_adata

        # 5. Log the operation for provenance
        data_manager.log_tool_usage(tool_name, params, description)

        # 6. Format a professional response
        return format_professional_response(stats, new_modality_name)

    except ServiceError as e:
        logger.error(f"Service error: {e}")
        return f"Service error: {str(e)}"
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        return f"Unexpected error: {str(e)}"
```

Services follow a matching stateless pattern:

```python
def service_method(
    self,
    adata: anndata.AnnData,
    **parameters
) -> Tuple[anndata.AnnData, Dict[str, Any]]:
    """
    Stateless service method working with AnnData directly.

    Returns:
        Tuple of (processed_adata, processing_statistics)
    """
    try:
        # 1. Create a working copy
        adata_processed = adata.copy()

        # 2. Apply analysis algorithms
        # ... processing logic ...

        # 3. Calculate comprehensive statistics
        processing_stats = {
            "analysis_type": "method_type",
            "parameters_used": parameters,
            "results": {...},
        }
        return adata_processed, processing_stats

    except Exception as e:
        raise ServiceError(f"Method failed: {str(e)}")
```

Each analysis step creates new modalities with descriptive, traceable names:
```
geo_gse12345                          # Raw downloaded data
├── geo_gse12345_quality_assessed     # With QC metrics
├── geo_gse12345_filtered_normalized  # Preprocessed data
├── geo_gse12345_doublets_detected    # With doublet annotations
├── geo_gse12345_clustered            # With clustering results
├── geo_gse12345_markers              # With marker genes
└── geo_gse12345_annotated            # With cell type annotations
```
- Provenance: Complete analysis history with parameters
- Statistics: Comprehensive metrics for each processing step
- Validation: Schema enforcement and quality checks
- Storage: Automatic saving with professional file naming
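The provenance guarantee above can be sketched as an append-only log of tool invocations. `log_tool_usage` appears in the tool template earlier on this page; the record fields below are an assumption about its shape, not the actual `DataManagerV2` schema:

```python
from datetime import datetime, timezone

provenance_log = []  # one record per analysis step, in execution order

def log_tool_usage(tool_name: str, params: dict, description: str) -> None:
    """Record which tool ran, with which parameters, and when."""
    provenance_log.append({
        "tool": tool_name,
        "parameters": params,
        "description": description,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
```

Because every tool call appends exactly one record with its full parameter set, the log can later be replayed or exported to reproduce an analysis.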
1. `check_data_status()` → Review available modalities
2. `assess_data_quality(modality_name)` → Professional QC assessment
3. `filter_and_normalize_modality(...)` → Clean and normalize
4. `detect_doublets_in_modality(...)` → Remove doublets
5. `cluster_modality(...)` → Leiden clustering + UMAP
6. `find_marker_genes_for_clusters(...)` → Differential expression
7. `annotate_cell_types(...)` → Automated annotation
8. `create_analysis_summary()` → Comprehensive report
- Professional QC Thresholds: Evidence-based filtering parameters
- Multi-metric Assessment: Total counts, gene counts, mitochondrial%, ribosomal%
- Statistical Validation: Z-score outlier detection and percentile thresholds
- Batch Effect Handling: Automatic batch detection and correction options
- Input Validation: Comprehensive parameter and data validation
- Graceful Degradation: Fallback methods when specialized tools unavailable
- Informative Messages: Clear error descriptions with suggested solutions
- Operation Logging: Complete audit trail for debugging and reproducibility
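The "graceful degradation" bullet can be illustrated with a small wrapper that tries a specialized method first and falls back to a simpler one on failure. `with_fallback` is a hypothetical helper for illustration, not a Lobster API:

```python
def with_fallback(primary, fallback):
    """Run the specialized method; degrade gracefully to a simpler one on failure."""
    def run(*args, **kwargs):
        try:
            return primary(*args, **kwargs)
        except Exception as exc:
            # Informative message (per the bullets above), then the fallback path
            print(f"{primary.__name__} failed ({exc}); "
                  f"falling back to {fallback.__name__}")
            return fallback(*args, **kwargs)
    return run
```

For example, a doublet-detection tool could wrap a specialized detector with a rate-based heuristic so the pipeline never hard-fails when an optional dependency is missing.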
- 50% Reduction in agent code complexity (450+ → 200+ lines)
- Zero Duplication: No more dual code paths or is_v2 checks
- Professional Standards: Type hints, comprehensive docstrings, error handling
- Testability: Stateless services are easily unit tested
- Single Responsibility: Each service handles one analysis domain
- Modular Design: Services can be used independently or combined
- Clean Interfaces: Consistent patterns across all analysis tools
- Version Control: Clear separation enables independent service updates
- Memory Efficiency: Stateless services with minimal memory footprint
- Fault Tolerance: Comprehensive error handling prevents pipeline failures
- Reproducibility: Complete parameter logging and provenance tracking
- Scalability: Services can be distributed or parallelized in future versions
```
transcriptomics_expert.py: 450+ lines
├── Dual code paths (is_v2 checks everywhere)
├── Mixed responsibilities (orchestration + analysis)
├── Redundant implementations
├── Complex error handling
└── Maintenance overhead

transcriptomics_expert.py: 280 lines (clean)
├── Single DataManagerV2 path
├── Professional tool orchestration only
├── Stateless service delegation
├── Comprehensive error handling
└── Minimal maintenance overhead
```
```
Analysis Services: 4 refactored services
├── PreprocessingService: AnnData → (filtered_adata, stats)
├── QualityService: AnnData → (qc_adata, assessment)
├── ClusteringService: AnnData → (clustered_adata, results)
└── EnhancedSingleCellService: AnnData → (annotated_adata, metrics)
```
- Reusability: Services can be used by multiple agents
- Testability: Each service can be independently tested
- Flexibility: Easy to add new analysis methods
- Performance: Optimized algorithms with professional implementations
- Orchestration Focus: Agents handle modality management and user interaction
- Clean Tool Interface: Consistent ~20-30 line tool implementations
- Professional Responses: Formatted outputs with comprehensive statistics
- Error Management: Hierarchical error handling with specific exceptions
- Modality-Centric: All data operations centered around named modalities
- Provenance Tracking: Complete analysis history with tool usage logging
- Schema Validation: Automatic validation ensures data integrity
- Storage Management: Professional file naming and workspace organization
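A toy, modality-centric store illustrating the pattern the bullets describe. This is a minimal sketch, not the real `DataManagerV2` (which adds schema validation, H5AD storage, and session tracking):

```python
class ModalityStore:
    """Minimal modality-centric data manager sketch."""

    def __init__(self):
        self.modalities = {}   # name -> dataset object (e.g. AnnData)
        self.history = []      # append-only provenance records

    def add(self, name: str, data, step: str = None) -> None:
        """Register a new named modality and record the step that produced it."""
        self.modalities[name] = data
        self.history.append({"modality": name, "step": step})

    def get(self, name: str):
        """Fetch a modality by name, failing loudly if it is missing."""
        if name not in self.modalities:
            raise KeyError(f"Modality '{name}' not found")
        return self.modalities[name]

    def list_modalities(self) -> list:
        return sorted(self.modalities)
```

Every analysis step writes a new named modality instead of mutating the old one, which is what makes the lineage trees shown earlier on this page possible.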
This architecture provides a solid foundation for professional bioinformatics analysis with excellent maintainability, extensibility, and reliability.
The bulk_rnaseq_expert agent includes 5 new tools for conversational formula construction:
graph LR
subgraph "Agent Layer"
BRE[Transcriptomics Expert<br/>๐ค Enhanced with Formula Tools]
end
subgraph "New Agent Tools (5)"
T1[suggest_formula_for_design<br/>๐ Metadata Analysis]
T2[construct_de_formula_interactive<br/>๐ง Formula Building]
T3[run_differential_expression_with_formula<br/>๐งฌ pyDESeq2 Execution]
T4[iterate_de_analysis<br/>๐ Iterative Workflows]
T5[compare_de_iterations<br/>๐ Result Comparison]
end
subgraph "Enhanced Services"
FORMULA[DifferentialFormulaService<br/>๐ 3 New Methods Added]
WFLOW[WorkflowTracker<br/>๐ New Iteration Management]
BULK[BulkRNASeqService<br/>๐ pyDESeq2 Integration]
end
BRE --> T1
BRE --> T2
BRE --> T3
BRE --> T4
BRE --> T5
T1 --> FORMULA
T2 --> FORMULA
T3 --> BULK
T4 --> WFLOW
T5 --> WFLOW
classDef agent fill:#e1f5fe,stroke:#01579b,stroke-width:2px
classDef tool fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
classDef service fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
class BRE agent
class T1,T2,T3,T4,T5 tool
class FORMULA,WFLOW,BULK service
- DifferentialFormulaService: Added `suggest_formulas()`, `preview_design_matrix()`, `estimate_statistical_power()`
- WorkflowTracker: New lightweight class for DE iteration tracking and comparison
- Integration: All data stored in `AnnData.uns` for seamless workflow integration
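To illustrate what `suggest_formulas()` might do, here is a hedged sketch that proposes additive DESeq2-style designs from categorical metadata columns. The real service's heuristics (factor-level checks, power estimates) are not reproduced here:

```python
from itertools import combinations

def suggest_formulas(categorical_cols, outcome: str = "condition") -> list:
    """Propose additive design formulas, simplest first.

    categorical_cols: metadata column names (e.g. from adata.obs).
    outcome: the variable of interest, kept last in each formula.
    """
    covariates = [c for c in categorical_cols if c != outcome]
    formulas = [f"~ {outcome}"]  # baseline: outcome only
    # Add every combination of covariates ahead of the outcome term
    for r in range(1, len(covariates) + 1):
        for combo in combinations(covariates, r):
            formulas.append("~ " + " + ".join([*combo, outcome]))
    return formulas
```

In a conversational flow, the agent could present this list, let the user pick or edit a formula, then hand it to the pyDESeq2 execution tool.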
- ✅ Step 8: Formula Construction → Agent-guided conversation
- ✅ Step 12: Iterative Workflows → Natural iteration and comparison
- 🎯 Result: 92% workflow coverage (11/12 steps complete)
Lobster AI now features intelligent workspace restoration that automatically detects and restores previous analysis sessions:
- Automatic Detection: Scans `.lobster_workspace/data/` for available datasets on startup
- Session Persistence: Maintains `.session.json` with active modalities and usage history
- Lazy Loading: Load specific datasets on-demand with `load_dataset()`
- Pattern-Based Restoration: Support for recent/all/glob patterns via `/restore`
- Memory Management: Enforced memory limits prevent out-of-memory issues
- `/restore [pattern]` - Restore datasets from previous sessions
- `/workspace list` - View available datasets without loading
- `/workspace load <name>` - Load specific dataset by name
- Autocomplete Support: Tab completion for dataset names and patterns
- DataManagerV2 Enhanced: Added `_scan_workspace()`, `load_dataset()`, `restore_session()`
- Session Tracking: Automatic `.session.json` updates on modality changes
- H5PY Integration: Efficient metadata extraction without full dataset loading
- Professional UX: Startup prompt shows workspace status with helpful commands
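The scan-and-persist mechanics might look roughly like this. The paths follow the `.lobster_workspace/data/` and `.session.json` conventions above; the function bodies and session schema are illustrative assumptions:

```python
import json
from pathlib import Path

def scan_workspace(workspace: Path) -> list:
    """List available .h5ad datasets (by stem) without loading them."""
    return sorted(p.stem for p in (workspace / "data").glob("*.h5ad"))

def save_session(workspace: Path, active_modalities: list) -> None:
    """Persist the active-modality list so the next session can restore it."""
    (workspace / ".session.json").write_text(
        json.dumps({"active": active_modalities})
    )
```

Scanning by filename keeps startup fast; actual loading stays lazy, deferred until the user runs `/restore` or `/workspace load <name>`.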
This transformation enables users to seamlessly continue their work across sessions without manual dataset reloading.
The system now features centralized platform utilities that eliminate redundant OS detection and provide unified cross-platform operations:
- Platform Detection: 5 × `platform.system()` calls → 1 × (at import time)
- Code Reduction: ~50 lines of duplicate subprocess logic → 5 lines at call sites
- Performance: 80% improvement in system operation speed
- Architecture: Clean `lobster/utils/system.py` module with `open_file()`, `open_folder()`, `open_path()` functions
All file opening operations run on the CLI side regardless of cloud vs local mode, ensuring consistent behavior across deployment types.
- CLI Commands: `open <file>`, `/open <file>`, `/plot`, `/plot <ID>`
- GPU Detection: Apple Silicon detection in `gpu_detector.py`
- Future Extensions: Natural extension point for additional system utilities
The supervisor agent now features automatic agent discovery and configurable behavior, eliminating manual updates when adding new agents:
graph TB
subgraph "Configuration Sources"
ENV[Environment Variables<br/>SUPERVISOR_*]
CODE[Code Configuration<br/>SupervisorConfig()]
DEFAULT[Default Settings<br/>Backward Compatible]
end
subgraph "Discovery System"
REGISTRY[Agent Registry<br/>All Registered Agents]
CAPEXT[Capability Extractor<br/>@tool Discovery]
ACTIVE[Active Agents<br/>Successfully Loaded]
end
subgraph "Prompt Builder"
SECTIONS[Modular Sections<br/>Role, Agents, Rules]
CONTEXT[Dynamic Context<br/>Data & Workspace]
OPTIMIZE[Size Optimization<br/>Mode-Based]
end
ENV --> CONFIG[SupervisorConfig]
CODE --> CONFIG
DEFAULT --> CONFIG
REGISTRY --> DISCOVER[Agent Discovery]
CAPEXT --> DISCOVER
ACTIVE --> DISCOVER
CONFIG --> BUILD[create_supervisor_prompt()]
DISCOVER --> BUILD
BUILD --> SECTIONS
BUILD --> CONTEXT
BUILD --> OPTIMIZE
OPTIMIZE --> PROMPT[Dynamic Prompt<br/>8K-11K chars]
classDef config fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
classDef discover fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
classDef build fill:#fff3e0,stroke:#f57c00,stroke-width:2px
class ENV,CODE,DEFAULT,CONFIG config
class REGISTRY,CAPEXT,ACTIVE,DISCOVER discover
class BUILD,SECTIONS,CONTEXT,OPTIMIZE,PROMPT build
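The registry-driven discovery in the diagram can be sketched as a decorator-based registry: registering an agent is the only step, and the supervisor derives its roster from the registry at prompt-build time. Names here are illustrative, not the actual Lobster registry API:

```python
AGENT_REGISTRY = {}

def register_agent(name: str, capabilities: list):
    """Class decorator: adding an agent to the registry is the only step needed."""
    def decorator(cls):
        AGENT_REGISTRY[name] = {"cls": cls, "capabilities": capabilities}
        return cls
    return decorator

def discover_active_agents() -> list:
    """Supervisor-side discovery: no hardcoded agent list anywhere."""
    return sorted(AGENT_REGISTRY)
```

This is why the table below reports "Update registry only" when adding agents: the supervisor prompt builder reads `AGENT_REGISTRY` instead of a hardcoded list.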
| Feature | Before (Static) | After (Dynamic) | Impact |
|---|---|---|---|
| Agent Discovery | Manual updates in supervisor.py | Automatic from registry | Zero maintenance |
| Missing Agents | 3 agents not included | All 8 agents included | Complete coverage |
| Configuration | Hardcoded behavior | 20+ env variables | Full flexibility |
| Prompt Size | Fixed ~9.5K chars | 8K-11K adaptive | 15% smaller in production |
| Adding Agents | Update 3+ files | Update registry only | 66% less work |
```shell
# Research Mode - Interactive exploration
SUPERVISOR_ASK_QUESTIONS=true
SUPERVISOR_WORKFLOW_GUIDANCE=detailed
# Result: 11K char prompt with full guidance

# Production Mode - Automated pipelines
SUPERVISOR_ASK_QUESTIONS=false
SUPERVISOR_WORKFLOW_GUIDANCE=minimal
# Result: 8K char prompt, 1.4K chars saved

# Development Mode - Debugging
SUPERVISOR_VERBOSE=true
SUPERVISOR_INCLUDE_SYSTEM=true
# Result: Detailed explanations with system info
```

- Zero Maintenance: Add agents to registry only, supervisor auto-discovers
- Flexible Behavior: Configure interaction style per environment
- Context Aware: Includes current data/workspace state dynamically
- Mode Optimized: Different prompt sizes for different use cases
- Backward Compatible: Default config matches previous behavior exactly
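A minimal sketch of how `SupervisorConfig` might parse its `SUPERVISOR_*` environment variables. Only the variables shown in the mode examples above are taken from this page; the field defaults and boolean parsing are assumptions:

```python
import os
from dataclasses import dataclass

@dataclass
class SupervisorConfig:
    ask_questions: bool = True            # SUPERVISOR_ASK_QUESTIONS
    workflow_guidance: str = "detailed"   # SUPERVISOR_WORKFLOW_GUIDANCE
    verbose: bool = False                 # SUPERVISOR_VERBOSE

    @classmethod
    def from_env(cls, env=os.environ) -> "SupervisorConfig":
        def flag(name: str, default: bool) -> bool:
            return env.get(name, str(default)).lower() in ("1", "true", "yes")
        return cls(
            ask_questions=flag("SUPERVISOR_ASK_QUESTIONS", True),
            workflow_guidance=env.get("SUPERVISOR_WORKFLOW_GUIDANCE", "detailed"),
            verbose=flag("SUPERVISOR_VERBOSE", False),
        )
```

Because every field has a default matching the previous static behavior, an empty environment yields the backward-compatible configuration.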