Architecture - Z-M-Huang/openhive GitHub Wiki

OpenHive Architecture

OpenHive is a rules-first agentic system. Behavior lives in markdown rules; infrastructure is minimal TypeScript built on the Vercel AI SDK 6 (ai@^6). Inline AI SDK organization tools act as the control plane, managing a uniform recursive hierarchy of agent teams running as streamText() sessions inside a single Docker container.

Core principle: If behavior can be expressed as a rule, it is a rule -- not code.




Architecture Overview

graph TB
    subgraph container["Single Docker Container"]
        direction TB

        subgraph process["Main Process (TypeScript)"]
            bootstrap["Bootstrap"]
            channels["Channel Adapters<br/>WebSocket / Discord"]
            trustgate["TrustGate<br/>Sender trust evaluation"]
            classifier["TopicClassifier<br/>(0 topics: skip, 1: agent evaluates,<br/>2+: lightweight LLM)"]
            dashboard["Dashboard<br/>Admin views + limited mutations"]
            orgtools["Organization Tools (Inline)<br/>(Control Plane)"]
            skillrepo["Skill Repo Tool<br/>search_skill_repository"]
            hooks["Tool Guards + withAudit() Wrappers"]
            triggers["Trigger Engine"]
            sqlite["SQLite State Store<br/>(15+ tables incl. topics)"]
            memoryIndex["Memory Index<br/>(embedding + keyword search)"]
        end

        subgraph sessions["Agent Sessions (streamText() calls)"]
            main1["Main Agent Session — Topic 1<br/>cwd: .run/teams/main"]
            main2["Main Agent Session — Topic 2<br/>cwd: .run/teams/main"]

            eng["Engineering Session<br/>cwd: .run/teams/engineering<br/>org tools"]

            ops["Operations Session<br/>cwd: .run/teams/operations<br/>org tools"]

            fe["Frontend Session<br/>cwd: .run/teams/frontend<br/>org tools"]
        end

        bootstrap --> orgtools
        bootstrap --> channels
        bootstrap --> triggers
        bootstrap --> hooks

        channels --> trustgate
        trustgate --> classifier
        classifier --> main1
        classifier --> main2
        orgtools --> main1
        orgtools --> main2
        orgtools --> eng
        orgtools --> ops
        orgtools --> fe
        triggers --> orgtools
        orgtools --> sqlite
        skillrepo --> orgtools
        memoryIndex --> sqlite
        dashboard --> sqlite

        main1 -->|"spawn_team / delegate_task"| eng
        main1 -->|"spawn_team / delegate_task"| ops
        eng -->|"spawn_team / delegate_task"| fe
        eng -->|"escalate"| main1
        fe -->|"escalate"| eng
        ops -->|"escalate"| main1
    end

    users["Users"] --> channels
    schedules["Schedules / Keywords"] --> triggers

Key points:

  • Single container -- all teams run as AI SDK streamText() sessions in one process. No Docker-per-team.
  • Inline organization tools -- the control plane. They manage hierarchy, messaging, escalation, task queues, and routing metadata (list_teams), and are defined as AI SDK tool() calls with no HTTP transport.
  • Channel adapters -- the only component unique to the main agent. Everything else is uniform.
  • Trigger Engine -- fires tasks into the task queue on schedule, message, keyword, or window events.
  • TrustGate -- gates all inbound messages by sender trust level. Untrusted or unknown senders are rejected before reaching the classifier.
  • Dashboard -- operational views for admins. Queries SQLite directly. Two mutations only: trigger enable/disable toggle and plugin lifecycle actions (deprecate/remove). See Admin-Dashboard.
  • SQLite -- durable workflow state (task queues, org tree, dedup). Sessions are disposable.
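
The inbound path described above (TrustGate first, then topic classification) can be sketched as a single pure function. This is a minimal illustration, not OpenHive's actual code: the type names, the `routeInbound` function, and the decision labels are all hypothetical, and the only behavior taken from this page is that untrusted or unknown senders are rejected before classification and that the classifier branches on 0, 1, or 2+ topics.

```typescript
// Hypothetical types and names for illustration; not OpenHive's actual API.
type TrustLevel = "trusted" | "untrusted" | "unknown";

interface InboundMessage {
  senderId: string;
  text: string;
}

type RoutingDecision =
  | { kind: "rejected"; reason: string }
  | { kind: "skip_classification" } // 0 topics: nothing to classify against
  | { kind: "agent_evaluates"; topicId: string } // 1 topic: the agent decides
  | { kind: "classify_with_llm"; candidates: string[] }; // 2+ topics: lightweight LLM

function routeInbound(
  msg: InboundMessage,
  trustOf: (senderId: string) => TrustLevel,
  openTopicIds: string[],
): RoutingDecision {
  // TrustGate runs first: only trusted senders reach the classifier.
  const trust = trustOf(msg.senderId);
  if (trust !== "trusted") {
    return { kind: "rejected", reason: `sender ${msg.senderId} is ${trust}` };
  }
  // TopicClassifier branches on how many topics exist.
  const [first, ...rest] = openTopicIds;
  if (first === undefined) return { kind: "skip_classification" };
  if (rest.length === 0) return { kind: "agent_evaluates", topicId: first };
  return { kind: "classify_with_llm", candidates: [first, ...rest] };
}
```

Keeping the trust check ahead of any topic lookup means a rejected message never costs a classifier call.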

Uniform Recursive Design

Every child team node is structurally identical: a streamText() session with inline organization tools, an orchestrator, subagents, skills, and plugins. The main agent differs from child teams in two ways: (1) it has channel adapters (WebSocket, Discord) attached, and (2) it has no subagents — it routes and delegates only (ADR-40).

graph TD
    main["Main Agent<br/>org-tools + channel adapters<br/>(routes only, no subagents)"]
    teamA["Team A<br/>org-tools"]
    teamB["Team B<br/>org-tools"]

    subgraph ops["Operations Team"]
        orch_ops["Orchestrator"]
        sa1["loggly-monitor<br/>(subagent)"]
        sa2["incident-responder<br/>(subagent)"]
        sk1["get-loggly-log.md"]
        sk2["incident-response.md"]
    end

    fe["Frontend<br/>org-tools"]
    be["Backend<br/>org-tools"]

    main --> teamA
    main --> teamB
    teamA --> orch_ops
    orch_ops -->|"invokes"| sa1
    orch_ops -->|"invokes"| sa2
    sa1 -->|"follows"| sk1
    sa2 -->|"follows"| sk2
    teamB --> fe
    teamB --> be

Each child team node can:

  • Delegate down -- send tasks to child teams
  • Escalate up -- send issues to its parent
  • Spawn children -- create new child teams on demand
  • Invoke subagents -- the ONLY way tasks execute within a team. Orchestrators never handle tasks directly (ADR-40).

The main agent can delegate down, escalate to the user, and spawn children — but it has no subagents and performs no direct task execution.

Depth is unlimited. A one-layer deployment (main + workers) and a five-layer deployment use the same code paths.
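
To illustrate why depth does not change the code path, the hierarchy can be modeled as parent links and walked the same way at any level. The sketch below is an assumption-laden illustration: `TeamNode`, `canDelegate`, and `escalationPath` are hypothetical names, and the example tree mirrors the teams in the overview diagram.

```typescript
// Hypothetical model of the org tree; names are illustrative only.
interface TeamNode {
  name: string;
  parent: string | null; // null only for the main agent
}

type OrgTree = Map<string, TeamNode>;

function getTeam(tree: OrgTree, name: string): TeamNode {
  const node = tree.get(name);
  if (node === undefined) throw new Error(`unknown team: ${name}`);
  return node;
}

// Delegation goes down one level: only the direct parent may delegate to a team.
function canDelegate(tree: OrgTree, caller: string, target: string): boolean {
  return getTeam(tree, target).parent === caller;
}

// Escalation goes up one level at a time; the main agent escalates to the user.
function escalationPath(tree: OrgTree, from: string): string[] {
  const path: string[] = [];
  let current = getTeam(tree, from);
  while (current.parent !== null) {
    if (path.includes(current.parent)) throw new Error("cycle in org tree");
    path.push(current.parent);
    current = getTeam(tree, current.parent);
  }
  path.push("user");
  return path;
}

const exampleTree: OrgTree = new Map<string, TeamNode>([
  ["main", { name: "main", parent: null }],
  ["engineering", { name: "engineering", parent: "main" }],
  ["operations", { name: "operations", parent: "main" }],
  ["frontend", { name: "frontend", parent: "engineering" }],
]);
```

With this tree, `escalationPath(exampleTree, "frontend")` walks through engineering and main before reaching the user, and the same loop would handle a deeper tree unchanged.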

User → Fix Flow

For a complete walkthrough of how user requests flow through the full hierarchy (Main → Orchestrator → Subagent → Skill → Plugin), see Scenarios#User → Fix Flow.

Session Lifecycle

Each session initialization follows a fixed order to assemble the AI SDK context:

Tool Assembly Order:

  1. Built-in tools — Read, Write, Edit, Glob, Grep, Bash (always loaded)
  2. Org tools — spawn_team, delegate_task, escalate, etc. (if allowed by allowed_tools)
  3. Trigger tools — create_trigger, enable_trigger, etc. (if allowed)
  4. Browser tools — browser_navigate, browser_screenshot, etc. (if allowed + Playwright available)
  5. Vault tools — vault_set, vault_get, vault_list, vault_delete (if allowed)
  6. Plugin tools — Loaded from .run/teams/{name}/plugins/ only if active skill declares them in ## Required Tools
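
The assembly order above can be sketched as a function that builds the list of tool names for a session. This is only an illustration under stated assumptions: `assembleToolNames` and `SessionToolInputs` are hypothetical, the tool groups are truncated to the names mentioned on this page, and plugin verification is omitted.

```typescript
// Hypothetical sketch; tool groups list only the names mentioned in this page.
interface SessionToolInputs {
  allowedTools: string[]; // the team's allowed_tools
  playwrightAvailable: boolean;
  requiredPluginTools: string[]; // from the active skill's "## Required Tools"; empty if no active skill
}

const BUILT_IN = ["Read", "Write", "Edit", "Glob", "Grep", "Bash"];
const ORG = ["spawn_team", "delegate_task", "escalate"];
const TRIGGER = ["create_trigger", "enable_trigger"];
const BROWSER = ["browser_navigate", "browser_screenshot"];
const VAULT = ["vault_set", "vault_get", "vault_list", "vault_delete"];

function assembleToolNames(team: string, input: SessionToolInputs): string[] {
  const allowed = new Set(input.allowedTools);
  const pick = (names: string[]): string[] => names.filter((n) => allowed.has(n));
  return [
    ...BUILT_IN, // 1. always loaded
    ...pick(ORG), // 2. only if allowed
    ...pick(TRIGGER), // 3. only if allowed
    ...(input.playwrightAvailable ? pick(BROWSER) : []), // 4. allowed + Playwright
    ...pick(VAULT), // 5. only if allowed
    ...input.requiredPluginTools.map((t) => `${team}.${t}`), // 6. declared by the active skill
  ];
}
```

Anything not named in `allowedTools` simply never appears in the result, which is the deny-by-default behavior the page attributes to activeTools.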

Plugin tool loading flow:

flowchart TD
    A[Session creation] --> B[Load built-in tools]
    B --> C[Load org/trigger/vault tools per allowed_tools]
    C --> D{Active skill exists?}
    D -->|No| E[Generic task: no plugin tools]
    D -->|Yes| F[Parse skill's Required Tools section]
    F --> G["Load declared plugins from .run/teams/{name}/plugins/"]
    G --> H[Verify each tool: typecheck + security scan]
    H --> I{All pass?}
    I -->|Yes| J["Namespace tools: {team}.{tool_name}"]
    I -->|No| K[Reject with error guidance]
    J --> L[Merge into activeTools]
    E --> M[activeTools = built-ins + org tools only]
    L --> N[activeTools includes plugin tools]

Namespace isolation (AC-9):

  • Plugin tools are namespaced: {team_name}.{tool_name}
  • loadPluginTools(teamName, requiredTools, allowedTools) returns only tools from that team's plugin directory
  • Cross-team tool shadowing is impossible
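
A small sketch shows why namespacing rules out shadowing. The real `loadPluginTools` signature is only partly visible on this page, so the function below is a hypothetical stand-in (`loadPluginToolsSketch`) that reads from an in-memory registry instead of a plugin directory and skips the allowlist and verification steps.

```typescript
// Hypothetical stand-in for the plugin loader; reads from an in-memory registry.
interface PluginTool {
  team: string; // owning team, i.e. whose plugins/ directory holds the source
  name: string;
  description: string;
}

function loadPluginToolsSketch(
  teamName: string,
  requiredTools: string[],
  registry: PluginTool[],
): Map<string, PluginTool> {
  const loaded = new Map<string, PluginTool>();
  for (const tool of registry) {
    if (tool.team !== teamName) continue; // never read another team's plugins
    if (!requiredTools.includes(tool.name)) continue; // only what the skill declared
    loaded.set(`${teamName}.${tool.name}`, tool); // namespaced key
  }
  return loaded;
}
```

Two teams can both define a tool called `fetch_logs`: each session sees only its own copy, under its own `{team_name}.{tool_name}` key.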

Execution Model

Every team operation is classified as daily-ops (parallel allowed) or org-ops (serialized via a per-team mutex). A team may process up to max_concurrent_daily_ops (default 5) concurrent sessions; org-ops block new daily-ops admission and wait for in-flight daily-ops to drain before running.

Canonical pool + mutex diagram (ADR-41; referenced from ADRs, Organization-Tools, and scenario walkthroughs):

flowchart TD
    Op[Team Operation] --> Class{Classify}
    Class -->|daily-ops| Pool[Concurrent Pool<br/>≤ max_concurrent_daily_ops]
    Class -->|org-ops| Mutex[Per-team Mutex<br/>single-flight]

    Pool --> SA[daily-ops Session A]
    Pool --> SB[daily-ops Session B]
    Pool --> SC[daily-ops Session C]

    Mutex --> SS[org-ops Session<br/>structural change]

    SA -. shared reads / row-level atomic writes .-> DB[(SQLite WAL)]
    SB -. shared reads / row-level atomic writes .-> DB
    SC -. shared reads / row-level atomic writes .-> DB
    SS -. exclusive for this team .-> DB

Concurrency Rule

  • Daily-ops (read-heavy, append-only, per-key mutations): parallel up to max_concurrent_daily_ops (default 5).
  • Org-ops (structural changes, governance mutations): single-flight per-team mutex.
  • Per-key exception: mutable stores like memories use per-subject_key locking so concurrent writes to different keys proceed in parallel while same-key writes serialize.
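
The per-key exception can be illustrated with a keyed lock built from promise chains. This is one possible technique, not necessarily how OpenHive implements it: `KeyedLock` is a hypothetical class, and the key stands in for a memory's subject_key.

```typescript
// Hypothetical per-key lock: same-key tasks run one at a time, different keys run in parallel.
class KeyedLock {
  // For each key, the settled-or-pending tail of the last queued task.
  private readonly tails = new Map<string, Promise<unknown>>();

  run<T>(key: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    // The stored tail never rejects, so the next task always starts after this one settles.
    const result = previous.then(task);
    const tail = result.catch(() => undefined);
    this.tails.set(key, tail);
    // Drop the entry once no later task has been queued behind this one.
    void tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });
    return result;
  }
}
```

A write to key `a` waits only for earlier writes to `a`; a write to key `b` starts immediately.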

Org-ops tool set (single-flight per team): spawn_team, shutdown_team, update_team, modify_subagent, memory-schema edits, register_plugin_tool, update_trigger.

Drain policy: an org-op waits for in-flight daily-ops to finish, blocks new daily-ops admission, then runs. No mid-flight abort.
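
The drain policy can be sketched as a small admission state machine. This is a minimal illustration under assumptions: `TeamAdmission` and its method names are hypothetical, and a real implementation would queue waiting callers rather than return booleans.

```typescript
// Hypothetical admission gate for one team; names are illustrative only.
type OrgOpState = "none" | "draining" | "running";

class TeamAdmission {
  private readonly maxConcurrentDailyOps: number;
  private inFlightDaily = 0;
  private orgOp: OrgOpState = "none";

  constructor(maxConcurrentDailyOps = 5) {
    this.maxConcurrentDailyOps = maxConcurrentDailyOps;
  }

  // A daily-op is admitted only if no org-op is pending or running and the pool has room.
  admitDaily(): boolean {
    if (this.orgOp !== "none") return false;
    if (this.inFlightDaily >= this.maxConcurrentDailyOps) return false;
    this.inFlightDaily += 1;
    return true;
  }

  // Finishing the last in-flight daily-op lets a draining org-op start.
  finishDaily(): void {
    if (this.inFlightDaily === 0) throw new Error("no daily-op in flight");
    this.inFlightDaily -= 1;
    if (this.orgOp === "draining" && this.inFlightDaily === 0) this.orgOp = "running";
  }

  // Takes the per-team mutex; returns false if another org-op already holds it.
  requestOrgOp(): boolean {
    if (this.orgOp !== "none") return false;
    this.orgOp = this.inFlightDaily === 0 ? "running" : "draining";
    return true;
  }

  finishOrgOp(): void {
    if (this.orgOp !== "running") throw new Error("no org-op running");
    this.orgOp = "none";
  }

  snapshot(): { inFlightDaily: number; orgOp: OrgOpState } {
    return { inFlightDaily: this.inFlightDaily, orgOp: this.orgOp };
  }
}
```

In-flight daily-ops are never aborted here: the org-op simply stays in `draining` until the counter reaches zero.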

See Architecture-Decisions#ADR-41 Daily-ops vs Org-ops Concurrency for the full rule and Organization-Tools for per-tool class tagging.

Trigger Activation Modes

Four trigger types fire tasks into the queue. The window trigger (ADR-42) delivers continuous-watch semantics through periodic ticks, memory cursors, and no-op returns, without holding a persistent session.

| Trigger Type | Fires On | Use |
|---|---|---|
| schedule | cron expression | Recurring clock work |
| message | inbound message | Per-message work |
| keyword | keyword in channel | Keyword-gated work |
| window | cron-open / cron-close with tick_interval_ms inside | Continuous watch during a window |
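
The window trigger's "tick, cursor, no-op" pattern can be sketched as a stateless tick handler. The names below (`WatchEvent`, `onWindowTick`, the numeric cursor) are hypothetical; the only assumption carried over from this page is that each tick reads a stored cursor, returns a no-op when nothing is new, and keeps no session alive between ticks.

```typescript
// Hypothetical tick handler: each tick is an independent call driven by a stored cursor.
interface WatchEvent {
  id: number; // monotonically increasing position in the watched source
  payload: string;
}

type TickResult =
  | { kind: "noop"; cursor: number }
  | { kind: "work"; cursor: number; events: WatchEvent[] };

function onWindowTick(cursor: number, source: WatchEvent[]): TickResult {
  // Only events past the cursor are new; filter() copies, so sorting leaves source untouched.
  const fresh = source.filter((e) => e.id > cursor).sort((a, b) => a.id - b.id);
  const last = fresh[fresh.length - 1];
  if (last === undefined) return { kind: "noop", cursor };
  // The caller persists the advanced cursor (for example, in team memory) before the next tick.
  return { kind: "work", cursor: last.id, events: fresh };
}
```

Because all state lives in the cursor, a tick that finds nothing new costs one comparison and returns immediately.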

See Triggers for trigger-engine details and Tool-Guidelines#Activation Decision Framework for when to pick each.


Data Layout

The data model uses three tiers with distinct mutability and ownership profiles.

Tier 1 ships the system rules directory (/app/system-rules/) containing core agent patterns, main agent identity, SDK capabilities reference, sender trust framework, task workflow, tool guidelines, and system-level skills (learning and reflection cycles). These are bundled at image build time and cannot be changed without rebuilding.

Tier 2 includes two Docker volume mounts: /data/config/ (read-only, containing providers.yaml and channels.yaml) and /data/rules/ (writable by admin, containing org-level rule markdown files). The config volume cannot be modified by the container.

Tier 3 is the runtime workspace at .run/ (named Docker volume). It contains the SQLite database (openhive.db), shared cross-team data, automated backups, and per-team directories. Each team directory contains: config.yaml (team manifest), org-rules/ (cascading rules), team-rules/ (team-only rules, including team-context.md), skills/ (procedure definitions), plugins/ (team-local TypeScript tool definitions), and subagents/ (agent identity definitions).

See Team-Configuration#Runtime Directory Structure for the per-team layout.

Docker Volumes Summary

| Mount | Source | Target | Mode |
|---|---|---|---|
| Admin config | ./data/config | /data/config | :ro |
| Admin rules | ./data/rules | /data/rules | rw (admin only) |
| Runtime workspace | openhive-run (named volume) | /app/.run | rw |

Directory Purposes

| Directory | Owner | Purpose |
|---|---|---|
| /app/system-rules/ | Image build | Core ethics and capability rules; immutable at runtime |
| /data/config/ | Admin | Provider profiles and channel config; mounted read-only |
| /data/rules/ | Admin | Org-level rules seeded by admin; not writable by agents |
| .run/openhive.db | Runtime | SQLite: task queues, org tree, routing keywords, trigger dedup, trigger configs, channel interactions (24h retention, descendant-aware history for prompt injection), sender_trust, trust_audit_log, plugin_tools |
| .run/teams/{name}/ | Team (cwd) | Team directory and working directory; contains config, rules, skills, subagents, plugins; tool guards enforce boundaries |
| .run/teams/{name}/org-rules/ | Team + descendants | Rules that cascade to this team's sub-teams |
| .run/teams/{name}/team-rules/ | Team only | Rules for this team; do not cascade |
| .run/teams/{name}/subagents/ | Team | Agent identity definitions (WHO) |
| .run/teams/{name}/skills/ | Team | Reusable procedure definitions (HOW) |
| .run/teams/{name}/plugins/ | Team | Team-local plugin tools (TypeScript tool() definitions) |
| .run/openhive.db (memories table) | Team | Persistent memory entries in SQLite; per-team isolation via data access layer. See Memory-System. |
| .run/shared/ | Any team (opt-in) | Explicitly shared cross-team data |
| .run/backups/ | Runtime | Automated backup snapshots |
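
The workspace boundary that tool guards enforce on `.run/teams/{name}/` can be sketched as a path check. This is an illustrative approximation, not the project's guard code: `isInsideTeamDir` and `normalizeSegments` are hypothetical, the default root `/app/.run` is taken from the volume table, and symlinks and the opt-in `.run/shared/` directory are not handled.

```typescript
// Hypothetical boundary check; resolves "." and ".." lexically and ignores symlinks.
function normalizeSegments(path: string): string[] {
  const out: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      out.pop(); // at the root this is a no-op, so the path cannot climb above "/"
      continue;
    }
    out.push(part);
  }
  return out;
}

function isInsideTeamDir(team: string, requested: string, runRoot = "/app/.run"): boolean {
  const teamDir = `${runRoot}/teams/${team}`;
  // Relative paths are resolved against the team's cwd.
  const absolute = requested.startsWith("/") ? requested : `${teamDir}/${requested}`;
  const base = normalizeSegments(teamDir);
  const target = normalizeSegments(absolute);
  // Compare whole segments so "engineering-x" does not pass as "engineering".
  return base.every((segment, i) => target[i] === segment);
}
```

Comparing normalized segments rather than string prefixes avoids the classic mistake where a sibling directory with a longer name slips through.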

Plugin Tools Table Schema

The plugin_tools table tracks lifecycle state and verification results for team-local plugin tools. Each row is keyed by (team_name, tool_name) and stores: lifecycle status (active, deprecated, failed_verification, removed), source path and SHA-256 hash (for change detection), verification results as JSON (TypeScript, interface, and security scan results), and creation/update/verification timestamps.
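
As a rough illustration of the row shape described above, the sketch below models one plugin_tools row and the hash-based change check. Field names and the `needsReverification` helper are assumptions made for the example; the actual column names are not given on this page.

```typescript
// Hypothetical row shape; column names are assumed, not taken from the schema.
type PluginStatus = "active" | "deprecated" | "failed_verification" | "removed";

interface PluginToolRow {
  teamName: string; // (teamName, toolName) is the key
  toolName: string;
  status: PluginStatus;
  sourcePath: string;
  sourceHash: string; // SHA-256 of the source file, hex-encoded
  verification: { typescript: boolean; interface: boolean; security: boolean };
  createdAt: string;
  updatedAt: string;
  verifiedAt: string | null;
}

// A tool must be re-verified when its source no longer matches the recorded hash.
function needsReverification(row: PluginToolRow, currentHash: string): boolean {
  if (row.status === "removed") return false; // removed tools are never reloaded
  return row.sourceHash !== currentHash;
}
```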


Code Inventory

The initial codebase is estimated at ~2,000-3,000 lines of TypeScript. This is a soft target, not a hard constraint; the codebase will grow as features are added. Everything else is rules (markdown), skills (markdown), config (YAML), and off-the-shelf MCP servers.

| Component | ~Lines | Purpose |
|---|---|---|
| Channel adapters (WebSocket, Discord) | ~500 | Receive messages, progressive response protocol, notification routing |
| sessions/tools/org-tools.ts | ~350 | 13 tools (10 org + 3 trust management) as inline AI SDK tool() definitions (spawn_team, delegate_task, escalate, add_trusted_sender, etc.) |
| sessions/tools/vault-tools.ts | ~100 | 4 vault tools (vault_set, vault_get, vault_list, vault_delete) with write-access separation |
| sessions/tools/trigger-tools.ts | ~180 | 6 trigger tools as inline tool() definitions |
| sessions/tools/browser-tools.ts | ~200 | 8 browser tools calling BrowserRelay directly |
| sessions/tools/web-fetch-tool.ts | ~80 | Lightweight HTTP fetch with SSRF guards |
| sessions/tools/guards.ts | ~60 | Shared guard functions (assertCallerIsParent, assertBrowserEnabled) |
| Trigger Engine | ~250 | Schedule/message/keyword handlers + dedup + rate limiting + circuit breaker |
| Built-in tool guards + audit | ~300 | Workspace boundary enforcement, governance, credential write protection, audit logging (inline in tool definitions) |
| Entry point (entrypoint.ts + index.ts + bootstrap-helpers.ts) | ~300 | Process setup, wire tools, triggers, health checks, start main agent |
| sessions/task-consumer.ts + message-handler.ts | ~250 | Poll task queue, spawn sessions, route notifications, credential redaction, stall detection |
| sessions/ai-engine.ts + provider-registry.ts | ~250 | AI SDK streamText() runner, multi-provider registry |
| sessions/tools/ (6 tools + guards + audit) | ~400 | Built-in Read/Write/Edit/Glob/Grep/Bash tools with inline security guards |
| sessions/skill-loader.ts + subagent-factory.ts + prompt-builder.ts | ~250 | Load skills/*.md, create subagent tool definitions, assemble system prompt |
| State persistence (SQLite + Drizzle ORM) | ~400 | 15+ tables: org_tree, scope_keywords, task_queue, trigger_dedup, log_entries, escalation_correlations, trigger_configs, channel_interactions, topics, memories, memory_chunks, embedding_cache, sender_trust, trust_audit_log, team_vault, plugin_tools (+ FTS5 virtual table) |
| sessions/tools/skill-repo-tool.ts | ~120 | search_skill_repository tool: queries Vercel skills ecosystem (skills.sh) for matching skills (see Skill-Repository) |
| topic-classifier.ts | ~100 | Server-side topic classification for conversation threading (see Conversation-Threading) |
| topic-registry.ts | ~80 | Topic lifecycle management, state transitions, SQLite persistence (see Conversation-Threading) |
| channels/trust-gate.ts | ~180 | TrustGate: evaluates sender trust level, rejects untrusted inbound messages before classification |
| storage/stores/sender-trust-store.ts | ~100 | CRUD operations for the sender_trust table |
| storage/stores/trust-audit-store.ts | ~80 | Append-only writes to the trust_audit_log table |
| config/trust-policy.ts | ~60 | Zod schema + loader for trust policy configuration from channels.yaml |
| Trust tool handlers (3 tools) | ~100 | add_trusted_sender, revoke_sender_trust, list_trusted_senders (inline tool definitions) |
| System rules: sender-trust.md | ~30 | LLM decision framework for sender trust evaluation |
| Dashboard API routes (api/routes.ts) | ~300 | HTTP endpoints for admin dashboard views (read + trigger toggle + plugin lifecycle) |
| Dashboard static files (public/) | | Static HTML/CSS/JS assets for the dashboard UI |
| Credential scrubbing + secrets | ~150 | scrubSecrets() + scrubCredentialsFromContent() in credential-scrubber.ts, SecretString class |
| System rules: tool-guidelines.md | ~80 | LLM decision framework for tool selection (see Tool-Guidelines) |
| System rules: task-workflow.md | ~60 | Structured task lifecycle phases (see Task-Workflow) |
| Shared utilities (types, errors, logging, config) | ~300 | Error handling, structured logging, TypeScript interfaces, YAML config validation |

Key source files

The source tree is rooted at src/ with these major areas:

| Area | Key Files | Purpose |
|---|---|---|
| Entry + bootstrap | entrypoint.ts, index.ts, bootstrap-helpers.ts, health.ts | Process setup, wiring, health checks |
| Channels | channels/router.ts, ws-adapter.ts, discord-adapter.ts, trust-gate.ts | Message routing, adapter protocol, sender trust |
| Sessions | sessions/task-consumer.ts, message-handler.ts, ai-engine.ts, provider-registry.ts | Task polling, session management, AI SDK integration |
| Session support | sessions/subagent-factory.ts, prompt-builder.ts, skill-loader.ts, team-registry.ts, tool-assembler.ts | Subagent creation, prompt assembly, skill/tool loading |
| Tools | sessions/tools/org-tools.ts, vault-tools.ts, trigger-tools.ts, browser-tools.ts, web-fetch-tool.ts, guards.ts, plugin-loader.ts | Inline tool definitions with security guards |
| Built-in tools | sessions/tools/read.ts, write.ts, edit.ts, glob.ts, grep.ts, bash.ts | File and shell tools with boundary enforcement |
| Handlers | handlers/tools/*.ts | Handler logic for spawn-team, delegate-task, trust management, etc. |
| Triggers | triggers/engine.ts, dedup.ts, rate-limiter.ts, handlers/*.ts | Trigger engine, dedup, rate limiting |
| Storage | storage/database.ts, schema.ts, stores/*.ts | SQLite + Drizzle ORM, 15+ tables |
| Config | config/loader.ts, validation.ts, trust-policy.ts | YAML loaders with Zod validation |
| Infrastructure | logging/, secrets/, domain/, recovery/ | Structured logging, credential scrubbing, types, recovery |
| Dashboard | api/routes.ts, public/ | HTTP endpoints (read + trigger toggle + plugin lifecycle), static assets |

Guards are implemented as inline wrappers inside each tool's execute() function. Shared guard functions in guards.ts enforce hierarchy and capability invariants.
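
The inline-guard pattern can be sketched without the AI SDK itself: a guard call sits at the top of the tool's execute function, and an audit wrapper records the outcome either way. Everything here is a hypothetical illustration. The page names `assertCallerIsParent` and `withAudit()`, but their real signatures are not shown, so the ones below are assumed, as are `ToolContext`, `AuditEntry`, and `makeDelegateTask`.

```typescript
// Hypothetical signatures; the real guards.ts and withAudit() may differ.
interface ToolContext {
  caller: string; // team whose session is invoking the tool
  parentOf: (team: string) => string | null;
}

interface AuditEntry {
  tool: string;
  caller: string;
  ok: boolean;
  error?: string;
}

function assertCallerIsParent(ctx: ToolContext, target: string): void {
  if (ctx.parentOf(target) !== ctx.caller) {
    throw new Error(`${ctx.caller} is not the parent of ${target}`);
  }
}

// Wraps an execute function so every call is logged, including guard rejections.
function withAudit<A, R>(
  tool: string,
  ctx: ToolContext,
  log: AuditEntry[],
  execute: (args: A) => Promise<R>,
): (args: A) => Promise<R> {
  return async (args: A): Promise<R> => {
    try {
      const result = await execute(args);
      log.push({ tool, caller: ctx.caller, ok: true });
      return result;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      log.push({ tool, caller: ctx.caller, ok: false, error: message });
      throw err;
    }
  };
}

function makeDelegateTask(ctx: ToolContext, log: AuditEntry[]) {
  return withAudit("delegate_task", ctx, log, async (args: { team: string; task: string }) => {
    assertCallerIsParent(ctx, args.team); // guard runs inside execute()
    return `queued for ${args.team}: ${args.task}`;
  });
}
```

Placing the guard inside execute, rather than in a separate middleware layer, means a tool cannot be invoked in a way that skips its own checks.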


Design Decisions Summary

| Decision | Choice | Rationale |
|---|---|---|
| Container strategy | Single container | AI SDK provides per-session isolation (cwd, tool guards). No Docker-per-team overhead. |
| Hierarchy mechanism | Inline AI SDK organization tools | Inline tool() definitions as control plane manage hierarchy, routing, audit. No HTTP transport. Unlimited depth. |
| Team config format | config.yaml per team | Declarative manifest: scope, tools, MCP servers, provider profile. |
| Tool restrictions | Deny-by-default via activeTools | Teams get explicit tool allowlists. Unlisted tools are excluded from activeTools. |
| Invariant enforcement | Inline tool guards + tool handler validation + audit | Defense in depth. Guards are inside each tool's execute(). No single layer trusted alone. |
| Behavior definition | Markdown rules with cascade | System prompt assembled from rule cascade + skills + memory. |
| Agent hierarchy | Uniform recursive design with main-agent exception (ADR-13, ADR-40) | Child teams are identical: orchestrator → subagents → skills → plugins. Main agent routes only (no subagents). |
| Isolation model | Cooperative within trust boundary | Tool guards enforce cwd boundaries, governance, credential protection. Not OS-level. Explicit trust model. |
| Secret management | api_key in providers.yaml | Providers declare api_key directly in /data/config/providers.yaml. No secret refs, no separate env files. |
| Provider governance | Central approved profiles | Teams select from /data/config/providers.yaml. Multi-provider support via AI SDK registry (Anthropic, OpenAI, etc.). |
| Workflow durability | SQLite for all durable state | Task queues, org tree, dedup state, memory: all in SQLite. AI SDK sessions are disposable. |
| Scheduling | Trigger Engine (not bare cron) | Formal system with schedule/message/keyword types, dedup, rate limiting, circuit breaker. SQLite-backed via trigger_configs table. |
| Session engine | Vercel AI SDK 6 (ai@^6) | Multi-provider support, explicit tool control via tool(), @ai-sdk/mcp for external MCP servers only, subagents via generateText(). |
| Skill discovery | Vercel skills ecosystem integration (ADR-26) | Search skills.sh → download SKILL.md → tailor to OpenHive format → test. See Skill-Repository. |
| Conversation threading | Topic-based parallel sessions (ADR-27) | Server-side classification, per-topic streamText(), WebSocket multiplexing. See Conversation-Threading. |
| Inbound trust model | TrustGate with per-sender policy (ADR-30) | All inbound messages gated by sender trust level before classification. Policy in channels.yaml, state in SQLite. Append-only audit log. |
| Admin dashboard | Operational views with limited mutations (ADR-31) | Static + API dashboard for admins. Queries SQLite directly. Two mutations: trigger toggle, plugin lifecycle. No authentication bypass. |
| Autonomous learning | Scope-based discovery with session corroboration (ADR-33, ADR-35) | Learning topics derived from scope_keywords. Trigger-only control (nightly, active with readiness gates per ADR-35). Cross-domain corroboration replaces static trust tiers. See Self-Evolution. |
| Team data storage | SQLite team_vault with write-access separation | Single credential store, preserves read-only secret invariant, enables generic team state |
| Plugin-first invariant | Skills orchestrate plugins only (ADR-39) | Every external operation must be a registered plugin tool. Skills are pure orchestration: no raw API calls. |
| Subagent-only execution | Orchestrators always delegate to subagents (ADR-40) | All task execution flows through subagents. Learning/reflection at subagent level. Propose+confirm for self-evolution. |