35 manual annotation service - the-omics-os/lobster-local GitHub Wiki

Manual Cell Type Annotation Service Documentation

Overview

The Manual Cell Type Annotation Service provides expert-guided cell type annotation capabilities for single-cell RNA-seq data with a color-synchronized Rich terminal interface that matches UMAP plot colors. This addresses Step 7 of the customer workflow: "I assign each cluster to a named cell type or to 'Debris', sometimes collapsing multiple clusters into the same cell type".

⚠️ Important: Marker Gene Template Status

CRITICAL: Built-in marker gene templates are preliminary and not scientifically validated.

Current Limitations

The annotation templates provided by this service use hardcoded marker lists without:

  • Evidence scoring: No AUC, logFC, or specificity metrics
  • Reference validation: Not validated against Azimuth, CellTypist, or Human Cell Atlas
  • Tissue optimization: Not optimized for specific tissue contexts
  • Species separation: Some mouse genes may be present in human templates
  • State handling: Activation/injury markers mixed with baseline identity markers

Specific Known Issues

  • SASP/Senescence detection: Removed in v0.1.0 (not reliable with RNA-seq alone)
  • Tumor cell detection: Uses proliferation markers only - not tumor-specific (use CNV inference instead)
  • Cross-tissue transfer: Markers optimized for one tissue may not work in others

Recommended Approach for Production Analysis

Always prefer custom validated markers:

  1. Provide custom markers specific to your tissue/context from:

    • Published references for your specific tissue
    • Azimuth pretrained models
    • CellTypist curated references
    • scANVI transfer learning
    • Literature-derived markers with evidence scores
  2. Validate annotations manually with known markers for your tissue

  3. Use reference-based tools when possible rather than marker-based annotation

When Using Built-in Templates

If you must use the built-in templates:

  • Acknowledge these limitations in your analysis
  • Manually validate results with known tissue markers
  • Cross-reference with literature and atlases
  • Document which template version was used (v0.1.0 preliminary)

Planned Improvements

  • Integration with Azimuth/CellTypist pretrained models
  • Reference atlas-derived markers with evidence scores
  • UCell/AUCell signature scoring
  • CNV-based tumor/normal classification
  • Cell Ontology (CL ID) annotations
  • Species-specific marker sets

See README.md for full details.

Key Features

🎨 Color-Synchronized Rich Interface

  • Perfect color matching between UMAP plots and terminal interface
  • Visual cluster identification eliminates cognitive load
  • Interactive menus with color-coded cluster displays
  • Real-time progress tracking with Rich components

🧬 Expert-Guided Annotation

  • Manual cluster assignment with biological expertise
  • Cluster collapsing for merging similar cell populations
  • Debris identification with QC-based smart suggestions
  • Undo/redo functionality with full annotation history

πŸ“‹ Annotation Templates

  • Tissue-specific templates for common organs (PBMC, Brain, Lung, Heart, etc.)
  • Marker gene validation for biological consistency
  • Custom template creation for specialized analyses
  • Template-based suggestions with confidence scoring

πŸ’Ύ Data Management

  • Export/import mappings for workflow reproducibility
  • Annotation validation with coverage metrics
  • Integration with DataManagerV2 for seamless workflows
  • Provenance tracking with full audit trails

Architecture

Core Components

Manual Annotation Service
β”œβ”€β”€ ManualAnnotationService     # Main service class
β”œβ”€β”€ ClusterInfo                 # Cluster metadata
β”œβ”€β”€ AnnotationState            # Session state management
β”œβ”€β”€ Rich Terminal Interface    # Color-synchronized UI
└── Integration Layer          # Tools for SingleCell Expert

Data Flow

1. πŸ“Š UMAP Plot Generation
   β”œβ”€β”€ Cluster colors extracted from plot
   └── Color palette stored for synchronization

2. πŸ–₯️ Rich Terminal Interface
   β”œβ”€β”€ Colors synchronized with plot
   β”œβ”€β”€ Interactive cluster selection
   └── Real-time annotation progress

3. πŸ’Ύ Annotation Application
   β”œβ”€β”€ Cell type mappings applied to AnnData
   β”œβ”€β”€ Metadata and provenance stored
   └── New modality created with annotations

Quick Start Guide

Step 1: Launch Interactive Annotation

# From SingleCell Expert Agent
manually_annotate_clusters_interactive(
    modality_name="my_dataset_clustered",
    cluster_col="leiden",
    save_result=True
)

This launches the Rich terminal interface with:

  • Color legend matching your UMAP plot
  • Interactive cluster selection menus
  • Progress tracking and validation
  • Export/import capabilities

Step 2: Rich Terminal Workflow

The interactive interface provides:

🧬 Manual Cell Type Annotation Service

Main Menu:
1. Annotate Clusters    - Assign cell types to clusters
2. Mark Debris         - Identify low-quality clusters
3. Collapse Clusters   - Merge clusters with same cell type
4. View Summary        - Show current annotation status
5. Apply Template      - Use predefined annotation template
6. Undo               - Undo last annotation action
7. Export             - Save annotations to file
8. Import             - Load annotations from file
9. Finish             - Complete annotation session

Step 3: Color-Synchronized Annotation

  • Visual Connection: Cluster colors in terminal exactly match UMAP plot
  • Easy Identification: See Cluster 0 in red on plot β†’ see "Cluster 0" in red in terminal
  • Cognitive Relief: No mental mapping between plot and interface required

Tool Reference

Interactive Annotation Tools

manually_annotate_clusters_interactive

Launch Rich terminal interface for manual annotation with color synchronization.

Parameters:

  • modality_name (str): Name of clustered single-cell modality
  • cluster_col (str): Column containing cluster assignments (default: "leiden")
  • save_result (bool): Whether to save annotated modality (default: True)

Returns: Comprehensive annotation results with color-synchronized interface completion.

manually_annotate_clusters

Directly assign cell types to clusters without interactive interface.

Parameters:

  • modality_name (str): Name of clustered single-cell modality
  • annotations (dict): Dictionary mapping cluster IDs to cell type names
  • cluster_col (str): Column containing cluster assignments (default: "leiden")
  • save_result (bool): Whether to save annotated modality (default: True)

Example:

manually_annotate_clusters(
    modality_name="pbmc_clustered",
    annotations={
        "0": "T cells CD4+",
        "1": "T cells CD8+",
        "2": "B cells",
        "3": "NK cells",
        "4": "Monocytes"
    }
)

Cluster Management Tools

collapse_clusters_to_celltype

Merge multiple clusters into a single cell type annotation.

Parameters:

  • modality_name (str): Name of single-cell modality
  • cluster_list (List[str]): List of cluster IDs to collapse
  • cell_type_name (str): New cell type name for collapsed clusters
  • cluster_col (str): Column containing cluster assignments (default: "leiden")
  • save_result (bool): Whether to save result (default: True)

Example:

# Collapse T cell subclusters
collapse_clusters_to_celltype(
    modality_name="pbmc_clustered",
    cluster_list=["0", "1", "5"],
    cell_type_name="T cells",
    cluster_col="leiden"
)

mark_clusters_as_debris

Mark specified clusters as debris for quality control.

Parameters:

  • modality_name (str): Name of single-cell modality
  • debris_clusters (List[str]): List of cluster IDs to mark as debris
  • remove_debris (bool): Whether to remove debris clusters from data (default: False)
  • cluster_col (str): Column containing cluster assignments (default: "leiden")
  • save_result (bool): Whether to save result (default: True)

Quality Control Tools

suggest_debris_clusters

Get smart suggestions for potential debris clusters based on QC metrics.

Parameters:

  • modality_name (str): Name of single-cell modality
  • min_genes (int): Minimum genes per cell threshold (default: 200)
  • max_mt_percent (float): Maximum mitochondrial percentage (default: 50)
  • min_umi (int): Minimum UMI count threshold (default: 500)
  • cluster_col (str): Column containing cluster assignments (default: "leiden")

Returns: Smart suggestions based on:

  • Low gene count per cluster
  • High mitochondrial gene percentage
  • Low UMI counts
  • Very small cluster sizes (<10 cells)

Template-Based Annotation

apply_annotation_template

Apply predefined tissue-specific annotation template.

Parameters:

  • modality_name (str): Name of single-cell modality
  • tissue_type (str): Type of tissue - Available options:
    • "pbmc" - Peripheral Blood Mononuclear Cells
    • "brain" - Brain tissue (neurons, glia, immune)
    • "lung" - Lung tissue (epithelial, immune, vascular)
    • "heart" - Heart tissue (cardiomyocytes, fibroblasts, vascular)
    • "kidney" - Kidney tissue (tubular, glomerular, vascular)
    • "liver" - Liver tissue (hepatocytes, stellate, immune)
    • "intestine" - Intestinal tissue (epithelial, stem, immune)
    • "skin" - Skin tissue (keratinocytes, melanocytes, immune)
    • "tumor" - Tumor microenvironment (malignant, immune, stromal)
  • cluster_col (str): Column containing cluster assignments (default: "leiden")
  • expression_threshold (float): Minimum expression for marker detection (default: 0.5)
  • save_result (bool): Whether to save annotated modality (default: True)

Example:

# Apply PBMC template for blood sample
apply_annotation_template(
    modality_name="blood_clustered",
    tissue_type="pbmc",
    expression_threshold=0.6
)

Review and Export Tools

review_annotation_assignments

Review current manual annotation assignments with coverage statistics.

export_annotation_mapping

Export annotation mapping for reuse in other analyses.

Parameters:

  • output_filename (str): Output filename (default: "annotation_mapping.json")
  • format (str): Export format - "json" or "csv" (default: "json")

import_annotation_mapping

Import and apply annotation mapping from previous analysis.

Parameters:

  • mapping_file (str): Path to mapping file (JSON format)
  • preview_only (bool): If True, only show what would be applied (default: False)

Rich Terminal Interface Guide

Welcome Screen

🧬 Manual Cell Type Annotation Service

Welcome to the interactive annotation interface! This tool provides:

β€’ Color-synchronized visualization matching your UMAP plot
β€’ Interactive cluster assignment with expert guidance
β€’ Debris identification and quality control
β€’ Annotation templates for common tissue types
β€’ Undo/redo functionality with full history

Current Session:
β€’ Total clusters: 12
β€’ Total cells: 8,543
β€’ Annotated: 3
β€’ Debris: 1

The colors in this terminal exactly match your UMAP plot colors for easy identification.

Color Legend

β”Œβ”€ Cluster Color Legend ─┐
β”‚ Cluster ID β”‚ Color β”‚ Cell Count β”‚ Status    β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ 0          β”‚ ●●●   β”‚ 1,234      β”‚ Annotated β”‚
β”‚ 1          β”‚ ●●●   β”‚ 856        β”‚ Pending   β”‚
β”‚ 2          β”‚ ●●●   β”‚ 445        β”‚ Debris    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Annotation Workflow

1. Cluster Annotation Mode

🎯 Cluster Annotation Mode
Colors match your UMAP plot for easy identification.

β”Œβ”€ Clusters to Annotate ─┐
β”‚ ID β”‚ Color β”‚ Cells β”‚ QC Metrics              β”‚
β”œβ”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ 0  β”‚ ●●●●  β”‚ 1,234 β”‚ Genes: 2,150, MT%: 8.5  β”‚
β”‚ 1  β”‚ ●●●●  β”‚ 856   β”‚ Genes: 1,890, MT%: 12.1 β”‚
β””β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

● Enter cell type for cluster 0: T cells CD4+
βœ… Cluster 0 annotated as 'T cells CD4+'

2. Debris Identification Mode

πŸ—‘οΈ Debris Identification Mode

πŸ’‘ Smart debris suggestions based on QC metrics:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Cluster β”‚ Reason                   β”‚ Cells β”‚ Action          β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ 7       β”‚ Low gene count (145)     β”‚ 23    β”‚ Mark as debris? β”‚
β”‚ 9       β”‚ High MT% (65.2%)         β”‚ 12    β”‚ Mark as debris? β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Apply smart debris suggestions? [y/N]: y

3. Annotation Summary

πŸ“Š Annotation Summary

β”Œβ”€ Overall Statistics ─┐
β”‚ Metric        β”‚ Count β”‚ Percentage β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Total Clustersβ”‚ 10    β”‚ 100%       β”‚
β”‚ Annotated     β”‚ 8     β”‚ 80.0%      β”‚
β”‚ Debris        β”‚ 1     β”‚ 10.0%      β”‚
β”‚ Pending       β”‚ 1     β”‚ 10.0%      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

β”Œβ”€ Cell Type Annotations ─┐
β”‚ Cell Type      β”‚ Clusters β”‚ Total Cells β”‚ Avg Cells/Cluster β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ T cells CD4+   β”‚ 2        β”‚ 2,090       β”‚ 1,045              β”‚
β”‚ T cells CD8+   β”‚ 2        β”‚ 1,456       β”‚ 728                β”‚
β”‚ B cells        β”‚ 1        β”‚ 445         β”‚ 445                β”‚
β”‚ NK cells       β”‚ 1        β”‚ 332         β”‚ 332                β”‚
β”‚ Monocytes      β”‚ 2        β”‚ 1,123       β”‚ 562                β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Programming Interface

Core Service Usage

from lobster.tools.manual_annotation_service import ManualAnnotationService
from rich.console import Console

# Initialize service
console = Console()
service = ManualAnnotationService(console)

# Initialize annotation session
state = service.initialize_annotation_session(
    adata=clustered_data,
    cluster_key='leiden'
)

# Launch interactive interface
cell_type_mapping = service.rich_annotation_interface()

# Apply annotations to data
adata_annotated = service.apply_annotations_to_adata(
    adata=clustered_data,
    cluster_key='leiden',
    cell_type_column='cell_type_manual'
)

Template Service Usage

from lobster.tools.annotation_templates import AnnotationTemplateService, TissueType

# Initialize template service
template_service = AnnotationTemplateService()

# Get available tissue types
tissue_types = template_service.get_all_tissue_types()
print(tissue_types)  # [TissueType.PBMC, TissueType.BRAIN, ...]

# Apply PBMC template
suggestions = template_service.apply_template_to_clusters(
    adata=clustered_data,
    tissue_type=TissueType.PBMC,
    cluster_col='leiden'
)

# Get marker genes for specific cell type
markers = template_service.get_markers_for_cell_type(
    tissue_type=TissueType.PBMC,
    cell_type='T cells CD4+'
)
print(markers)  # ['CD3D', 'CD3E', 'CD4', 'IL7R', 'CCR7', 'LEF1']

Workflow Integration

Standard Single-Cell Pipeline with Manual Annotation

# Step 1-6: Standard single-cell preprocessing and clustering
check_data_status()
assess_data_quality("dataset")
filter_and_normalize_modality("dataset")
detect_doublets_in_modality("dataset_filtered_normalized")
cluster_modality("dataset_filtered_normalized")
find_marker_genes_for_clusters("dataset_clustered")

# Step 7: Manual annotation (THE KEY STEP)
manually_annotate_clusters_interactive("dataset_clustered")

# Step 8: Continue with annotated data
create_pseudobulk_matrix(
    "dataset_manually_annotated",
    sample_col="sample_id",
    celltype_col="cell_type_manual"
)

Annotation Templates Reference

PBMC Template

Cell Types Available:

  • T cells CD4+ (CD3D, CD3E, CD4, IL7R, CCR7, LEF1)
  • T cells CD8+ (CD3D, CD3E, CD8A, CD8B, CCL5, GZMK)
  • T cells regulatory (CD3D, CD3E, CD4, FOXP3, IL2RA, CTLA4)
  • NK cells (GNLY, NKG7, KLRD1, KLRB1, NCAM1, KLRF1)
  • B cells naive (CD19, MS4A1, CD79A, CD79B, IGHD, TCL1A)
  • B cells memory (CD19, MS4A1, CD79A, CD27, CD38, IGHG1)
  • Plasma cells (IGHG1, IGHG2, IGHG3, IGHG4, JCHAIN, XBP1)
  • Monocytes CD14+ (CD14, LYZ, S100A8, S100A9, FCN1, VCAN)
  • Monocytes CD16+ (FCGR3A, MS4A7, LST1, AIF1, SERPINA1)
  • Dendritic cells (FCER1A, CST3, CLEC9A, XCR1, BATF3, IRF8)
  • Platelets (PPBP, PF4, NRGN, GP9, TUBB1, CLU)

Brain Template

Cell Types Available:

  • Excitatory neurons (SLC17A7, CAMK2A, RBFOX3, NEUROD2, NEUROD6, SATB2)
  • Inhibitory neurons (GAD1, GAD2, SLC32A1, PVALB, SST, VIP)
  • Astrocytes (GFAP, AQP4, ALDH1L1, S100B, SOX9, SLC1A3)
  • Oligodendrocytes (MBP, MOG, PLP1, MAG, CNP, MOBP)
  • Oligodendrocyte precursors (PDGFRA, CSPG4, SOX10, OLIG2, OLIG1, NKX2-2)
  • Microglia (CX3CR1, P2RY12, TMEM119, AIF1, CSF1R, TREM2)
  • Endothelial cells (PECAM1, VWF, CDH5, FLT1, CLDN5, PLVAP)
  • Pericytes (PDGFRB, RGS5, ACTA2, CSPG4, ANPEP, MCAM)

Additional Templates

  • Lung: AT1/AT2 cells, Club cells, Ciliated cells, Basal cells, Alveolar macrophages
  • Heart: Cardiomyocytes, Cardiac fibroblasts, Smooth muscle cells, Endothelial cells
  • Kidney: Podocytes, Proximal tubule, Distal tubule, Collecting duct, Loop of Henle
  • Liver: Hepatocytes, Cholangiocytes, Hepatic stellate cells, Kupffer cells
  • Tumor: Tumor cells, Exhausted T cells, TAM M1/M2, Cancer-associated fibroblasts

Best Practices

1. Color Synchronization Workflow

# Ensure UMAP plot is visible before starting annotation
create_umap_plot("dataset_clustered", color_by="leiden")

# Launch annotation with same clustering
manually_annotate_clusters_interactive("dataset_clustered", cluster_col="leiden")

# Colors will automatically match between plot and terminal

2. Quality Control Integration

# Get smart debris suggestions first
suggest_debris_clusters("dataset_clustered", min_genes=300, max_mt_percent=25)

# Apply suggestions or use interactive debris marking
mark_clusters_as_debris("dataset_clustered", debris_clusters=["7", "9"])

# Then proceed with biological annotation
manually_annotate_clusters_interactive("dataset_debris_marked")

3. Template-Guided Annotation

# Start with template suggestions
apply_annotation_template("dataset_clustered", tissue_type="pbmc")

# Review and refine with manual annotation
review_annotation_assignments("dataset_template_pbmc", annotation_col="cell_type_template")

# Use interactive mode for final curation
manually_annotate_clusters_interactive("dataset_template_pbmc")

4. Reproducible Workflows

# Export annotations from reference dataset
export_annotation_mapping("reference_annotated", output_filename="pbmc_reference.json")

# Import and apply to new dataset
import_annotation_mapping("new_dataset_clustered", mapping_file="pbmc_reference.json")

# Review and adjust for dataset-specific differences
review_annotation_assignments("new_dataset_imported_annotations")

Advanced Features

Annotation History and Undo

  • Full history tracking of all annotation actions
  • Undo/redo capability for mistake correction
  • Action timestamps for audit trails
  • Session state preservation throughout workflow

Smart Suggestions

  • QC-based debris detection using statistical thresholds
  • Template-based cell type suggestions with confidence scoring
  • Biological consistency validation against known markers
  • Coverage analysis with annotation completeness metrics

Export Formats

JSON Export Format

{
  "cell_type_mapping": {
    "0": "T cells CD4+",
    "1": "T cells CD8+",
    "2": "B cells"
  },
  "debris_clusters": ["9"],
  "cluster_info": {
    "0": {
      "color": "#1f77b4",
      "cell_count": 1234,
      "assigned_type": "T cells CD4+",
      "is_debris": false,
      "qc_scores": {"mean_genes": 2150, "mean_mt_pct": 8.5}
    }
  },
  "export_timestamp": "2025-09-17T00:30:00"
}

CSV Export Format

cell_type,cell_count,percentage
T cells CD4+,2090,24.5
T cells CD8+,1456,17.1
B cells,445,5.2
NK cells,332,3.9
Monocytes,1123,13.2
Debris,234,2.7
Unassigned,2863,33.5

Performance Considerations

Large Dataset Optimization

  • Efficient cluster sampling for >50k cell datasets
  • Progressive loading of cluster information
  • Color palette caching to avoid recomputation
  • Memory-efficient Rich interface updates

Terminal Compatibility

  • Automatic color support detection via Rich
  • Graceful fallback to text-based interface if needed
  • Cross-platform compatibility (macOS, Linux, Windows)
  • SSH/remote terminal support with color preservation

Integration Points

DataManagerV2 Integration

  • Seamless modality management with provenance tracking
  • Automatic file naming and workspace organization
  • Integration with existing quality metrics and metadata
  • Full compatibility with multi-omics workflows

Existing Lobster Services

  • Quality Service: QC metrics for debris suggestions
  • Clustering Service: Leiden clustering results as input
  • Visualization Service: Color palette extraction and synchronization
  • Enhanced SingleCell Service: Marker gene integration

Success Metrics

βœ… Visual Consistency: Perfect color matching between plot and terminal βœ… Intuitive UX: Immediate visual connection for cluster identification βœ… Fast Implementation: 4-week timeline using existing infrastructure βœ… Professional Polish: Rich terminal with autocomplete, progress tracking βœ… Expert Integration: Seamless workflow for bioinformaticians βœ… Reproducible Results: Export/import for consistent annotations βœ… Quality Control: Smart debris detection with biological validation βœ… Template Support: Tissue-specific annotation guidance

The Manual Cell Type Annotation Service transforms expert-guided annotation from a tedious task into an intuitive, visually-guided workflow while maintaining all the robustness and professional capabilities required for single-cell RNA-seq analysis.