SDK Integration - Z-M-Huang/openhive GitHub Wiki

SDK Integration

Session Engine

OpenHive uses the Vercel AI SDK 6 (ai@^6) as its session engine. Each team session is powered by streamText() with tool-loop support, inline tool definitions, and per-step callbacks.

Each session is created by calling streamText() with:

  • a model resolved from the provider registry
  • a system prompt assembled from the rule cascade plus active skill content
  • the task as the user message
  • a merged tool set (built-ins + org + trigger + browser + web-fetch + vault + subagent + external MCP + plugin tools)
  • an activeTools filter that enforces the deny-by-default allowlist
  • a configurable maxSteps limit (default 50)

Progress is reported via onStepFinish callbacks.

Key components explained:

| Component | Purpose |
| --- | --- |
| streamText() | AI SDK 6 streaming text generation with automatic tool loop |
| activeTools | Deny-by-default enforcement. Only tools matching config.yaml allowed_tools are active. Plugin tools must also be declared by the active skill. Replaces the old canUseTool callback. |
| Built-in tools | read, write, edit, glob, grep, bash. Defined via tool() with inline guards (workspace boundary, governance, credential write protection, audit logging). |
| Skill repo tools | search_skill_repository: searches the Vercel skills ecosystem (skills.sh) for reusable skills. See Skill-Repository. |
| Org tools | buildOrgTools(ctx): 12 organization tools (spawn_team, delegate_task, etc.) defined inline via tool(). See Organization-Tools#Tool Categories. |
| Trigger tools | buildTriggerTools(ctx): 6 trigger management tools defined inline. See Triggers#Trigger Management Tools. |
| Browser tools | buildBrowserTools(ctx): 8 browser automation tools calling BrowserRelay directly. Conditionally included when @playwright/mcp is available. |
| Web fetch tool | buildWebFetchTool(ctx): lightweight HTTP GET/POST without Playwright. Reuses SSRF guards. See Browser-Proxy#Web Fetch Tool. |
| Vault tools | buildVaultTools(ctx): 4 vault tools (vault_set, vault_get, vault_list, vault_delete). Teams read secrets via vault_get; is_secret=1 entries are system-managed (teams cannot write or delete). See Organization-Tools#vault-tools.ts. |
| Subagent tools | Created via AI SDK tool() wrapping generateText(). Each subagent runs with isolated context. |
| Plugin tools | Loaded on demand from .run/teams/{name}/plugins/*.ts when an active skill declares them. A tool is usable only if its plugin_tools metadata row is active, its latest verification passed, and allowed_tools matches its namespaced key. |
| maxSteps | Replaces maxTurns. Controls the maximum number of tool-use steps before the session completes. |

Plugin Tool Loading

Plugin tools are team-local TypeScript tool() definitions stored in .run/teams/{name}/plugins/. Their source code lives on disk, while lifecycle state and verification results are persisted in the SQLite plugin_tools table inside .run/openhive.db. Tools are loaded ad-hoc based on active skill declarations.

Ad-hoc loading model:

  • Full skill markdown is loaded ad-hoc when that skill is activated
  • When a skill is activated, its ## Required Tools section specifies which plugin tools are candidates for loading
  • Plugin source files exist on disk regardless; lifecycle state (active, deprecated, failed_verification) and verification summaries persist in SQLite
  • Final exposure is the intersection of skill-declared tools and allowed_tools, matched against the namespaced runtime key {team_name}.{tool_name}

Loading flow:

  1. Skill activated — The selected skill's markdown is loaded into the prompt
  2. Parse Required Tools — Extract tool names from skill's ## Required Tools section
  3. Lookup persisted metadata — Read the plugin_tools row for each (team_name, tool_name)
  4. Enforce lifecycle state — Skip rows marked deprecated or failed_verification
  5. Load source file — Import .run/teams/{name}/plugins/{tool_name}.ts
  6. Namespace tools — Prefix with team name: {team_name}.{tool_name}
  7. Apply allowlist — Keep only namespaced tools matched by allowed_tools
  8. Merge into activeTools — Add the remaining tools to the session

The plugin loader iterates over the active skill's required tools, checks each tool's metadata in the plugin_tools SQLite table (skipping non-active or failed-verification entries), verifies the source file exists on disk, imports the module, validates the tool interface, namespaces it as {team_name}.{tool_name}, checks against the team's allowed_tools, and returns only the tools that pass all checks.
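The filtering part of the loading flow can be sketched as a pure function. This is a minimal sketch under stated assumptions: the row shape, the `selectPluginTools` name, and the in-memory stand-ins for the SQLite table and the plugins directory are all hypothetical, and the real loader also imports each module and validates its tool interface.

```typescript
// Hypothetical shapes: the real plugin_tools schema and loader signature may differ.
type Lifecycle = "active" | "deprecated" | "failed_verification";

interface PluginToolRow {
  teamName: string;
  toolName: string;
  state: Lifecycle;
  verificationPassed: boolean; // result of the latest verification run
}

interface LoaderInputs {
  teamName: string;
  requiredTools: string[]; // names parsed from the skill's "## Required Tools" section
  rows: PluginToolRow[]; // stand-in for the plugin_tools table
  filesOnDisk: string[]; // stand-in for .run/teams/{name}/plugins/*.ts
  isAllowed: (namespacedKey: string) => boolean; // allowed_tools matcher
}

// Returns the namespaced keys of the plugin tools that pass every check.
function selectPluginTools(input: LoaderInputs): string[] {
  const selected: string[] = [];
  for (const toolName of input.requiredTools) {
    // Steps 3-4: look up metadata and enforce lifecycle state.
    const row = input.rows.find(
      (r) => r.teamName === input.teamName && r.toolName === toolName,
    );
    if (!row || row.state !== "active" || !row.verificationPassed) continue;
    // Step 5: the source file must exist before it can be imported.
    if (!input.filesOnDisk.includes(`${toolName}.ts`)) continue;
    // Steps 6-7: namespace, then apply the allowlist to the namespaced key.
    const key = `${input.teamName}.${toolName}`;
    if (input.isAllowed(key)) selected.push(key);
  }
  return selected;
}
```

Each check only narrows the candidate set, so a tool that fails any one of them never reaches activeTools.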

Namespace isolation (AC-9):

  • Each tool is namespaced: {team_name}.{tool_name}
  • loadPluginTools(teamName) returns only tools from that team's directory
  • No cross-team tool shadowing
  • allowed_tools must match the namespaced runtime key (engineering.deploy_service or engineering.*), not the unnamespaced skill entry (deploy_service)

Built-in tool collision prevention:

Built-in tool names (read, write, edit, glob, grep, bash) are reserved and cannot be used as plugin tool names. The other non-plugin tools (org, trigger, vault) cannot collide with plugin tools, because every plugin tool is prefixed with its team name at load time.
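A minimal sketch of the reserved-name check, assuming an exact, case-sensitive comparison. The function names and the error message are hypothetical; only the list of reserved names comes from this page.

```typescript
// Reserved names, taken from the built-in tool list on this page.
const RESERVED_BUILTIN_NAMES: readonly string[] = [
  "read", "write", "edit", "glob", "grep", "bash",
];

// Hypothetical validator: rejects plugin tools that reuse a built-in name.
function assertPluginNameAllowed(toolName: string): void {
  if (RESERVED_BUILTIN_NAMES.indexOf(toolName) !== -1) {
    throw new Error(`Plugin tool name "${toolName}" is reserved for a built-in tool`);
  }
}

// Plugin tools are exposed under {team_name}.{tool_name}, so they cannot
// shadow org, trigger, or vault tools, which have no team prefix.
function namespacedKey(teamName: string, toolName: string): string {
  assertPluginNameAllowed(toolName);
  return `${teamName}.${toolName}`;
}
```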

Generic task behavior (AC-7):

  • Generic tasks (no active skill) receive the normal non-plugin tool set permitted by allowed_tools — built-in, org, trigger, vault, browser, external MCP, subagent, and similar tools as configured
  • Plugin tools are excluded from activeTools unless a skill declares them

Provider Registry

Provider profiles from providers.yaml are resolved into an AI SDK provider registry using createProviderRegistry(). Each profile's provider field (anthropic or openai) determines which SDK provider factory is used. Models are then resolved via the registry using the {providerName}:{modelId} format. For OpenAI-compatible proxies, the openai provider targets the Chat Completions API.
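The {providerName}:{modelId} format can be illustrated with a small parser. This is a sketch rather than the registry's own code: `formatModelRef` and `parseModelRef` are hypothetical helpers, and splitting at the first colon is an assumption made so that model IDs containing colons survive a round trip.

```typescript
interface ModelRef {
  providerName: string;
  modelId: string;
}

// Builds a registry lookup key such as "anthropic:some-model".
function formatModelRef(ref: ModelRef): string {
  return `${ref.providerName}:${ref.modelId}`;
}

// Splits at the FIRST colon only, so "provider:a:b" yields modelId "a:b".
function parseModelRef(value: string): ModelRef {
  const idx = value.indexOf(":");
  if (idx <= 0 || idx === value.length - 1) {
    throw new Error(`Invalid model reference "${value}", expected providerName:modelId`);
  }
  return { providerName: value.slice(0, idx), modelId: value.slice(idx + 1) };
}
```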

Directory Layout

Three separate root directories govern where configuration, data, and runtime state live:

| Variable | Default | Contents |
| --- | --- | --- |
| systemRulesDir | baked into image | System-level rules (immutable, operator-managed) |
| dataDir | volume mount | Admin org rules (rules/*.md), providers.yaml |
| runDir | .run/ | Per-team runtime dirs (teams/{name}/), SQLite database (openhive.db) |

All team working directories are rooted under {runDir}/teams/{teamName}/.

Key Function Signatures

| Function | Purpose |
| --- | --- |
| handleMessage(msg, deps, opts?) | Assembles session config inline: resolves provider, builds rule cascade + active skill + memory, assembles tools, runs streamText(), returns result |
| resolveProvider(profileName, providers) | Reads provider profile from providers.yaml, returns provider name (for registry lookup), model ID, and secret values for credential scrubbing |
| assembleTools(teamConfig, teamName, deps, ...) | Builds the complete tool set: built-in tools with guards, org tools, trigger tools, browser tools, web-fetch, vault, subagent, external MCP, and plugin tools |

Session configuration is assembled directly within message-handler.ts — there is no separate config builder. The prompt is assembled by prompt-builder.ts from core instructions (including workspace path), tool availability, HTTP rules, rule cascade, skills, memory, and recent conversation history. Vault secrets are never part of prompt assembly — they are accessed at runtime via vault_get only.

Built-in Tools

Six built-in tools replace the Claude Agent SDK's preset tools. Each is defined via AI SDK tool() with inline security guards:

| Tool | Guards |
| --- | --- |
| Read | Workspace boundary (path must be within team cwd or allowed dirs) |
| Write | Workspace boundary + governance (blocks system-rules, admin-rules, config.yaml) + vault is_secret=1 write guard |
| Edit | Workspace boundary + governance + vault is_secret=1 write guard |
| Glob | Workspace boundary |
| Grep | Workspace boundary |
| Bash | Vault secret protection (blocks file writes containing vault secret values) |

All tools are wrapped with withAudit() for structured logging with vault secret scrubbing.
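The wrapping idea can be sketched as a higher-order function around a tool's execute handler. This is an illustrative sketch only: the signature, the `AuditEntry` shape, and the `[REDACTED]` placeholder are assumptions, and the real withAudit() may log more fields or scrub outputs as well.

```typescript
// Hypothetical audit record; the real log schema is not shown on this page.
interface AuditEntry {
  tool: string;
  input: string; // serialized input with secret values removed
  ok: boolean;
}

// Replaces every occurrence of each secret value with a placeholder.
function scrubSecrets(text: string, secrets: string[]): string {
  let out = text;
  for (const secret of secrets) {
    if (secret.length > 0) out = out.split(secret).join("[REDACTED]");
  }
  return out;
}

// Wraps an execute handler so every call is logged, whether it succeeds or throws.
function withAudit<I, O>(
  toolName: string,
  execute: (input: I) => Promise<O>,
  secrets: string[],
  log: (entry: AuditEntry) => void,
): (input: I) => Promise<O> {
  return async (input: I): Promise<O> => {
    const entry: AuditEntry = {
      tool: toolName,
      input: scrubSecrets(JSON.stringify(input), secrets),
      ok: false,
    };
    try {
      const result = await execute(input);
      entry.ok = true;
      return result;
    } finally {
      log(entry); // runs on both success and failure
    }
  };
}
```

Scrubbing happens before the entry is handed to the logger, so a secret value passed as a tool argument never reaches the log sink in this sketch.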

Inline Tool Assembly

Organization, trigger, browser, web-fetch, and vault tools are defined as inline AI SDK tool() definitions, with no HTTP transport or MCP bridge. Each category is built by a dedicated factory function (buildOrgTools, buildTriggerTools, buildBrowserTools, buildWebFetchTool, buildVaultTools) that receives an OrgToolContext constructed from session dependencies (see Organization-Tools#OrgToolContext).

All tool definitions are wrapped with withAudit() for structured logging and credential scrubbing. Within each partition, tools are sorted alphabetically by key for prompt cache stability.

resolveActiveTools() is extracted as a standalone utility. It takes the full set of tool names and the team's allowed_tools config, returning only the tools that match (supports exact names, '*' wildcard, and glob patterns). Plugin tools are matched against their namespaced keys such as engineering.deploy_service or engineering.*.
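A minimal sketch of the matching behavior described above. The signature is an assumption, and the glob support here is limited to `*` (matching any run of characters); the real utility may support richer patterns.

```typescript
// Converts an allowed_tools entry into an anchored regular expression.
// Only "*" is treated as a wildcard; every other character matches literally.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`);
}

// Returns the tool names matched by at least one allowed_tools entry.
// Anything not matched stays inactive (deny by default).
function resolveActiveTools(allToolNames: string[], allowedTools: string[]): string[] {
  const matchers = allowedTools.map(patternToRegExp);
  return allToolNames.filter((name) => matchers.some((re) => re.test(name)));
}
```

With an allowlist of `["read", "engineering.*"]`, this sketch keeps `read` and `engineering.deploy_service` but drops `bash` and any other team's tools. An unnamespaced entry such as `deploy_service` matches nothing, because plugin tools are only visible under their namespaced keys.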

Rule Cascade (4 Levels)

Rules are assembled from most general to most specific, then set as the system prompt. For the full loading order, cascade examples, and rule file format, see Rules-Architecture#Rule Cascade.

After the rule cascade, the activated skill's content (if any) is appended under a --- Skills --- header, and memory content from the SQLite memories table is appended under a --- Memory --- header (structured in blocks: identity, context, lesson, decision). See Memory-System#Injection.
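The ordering described above can be sketched as simple string assembly. The `PromptParts` shape and `assembleSystemPrompt` name are hypothetical, and the blank-line separators are an assumption; only the section order and the two header strings come from this page.

```typescript
interface PromptParts {
  rules: string[]; // rule cascade, most general first
  skill?: string; // activated skill content, if any
  memory?: string; // rendered memory blocks, if any
}

// Appends skill and memory sections after the rule cascade, in that order.
function assembleSystemPrompt(parts: PromptParts): string {
  const sections = parts.rules.slice();
  if (parts.skill) sections.push(`--- Skills ---\n${parts.skill}`);
  if (parts.memory) sections.push(`--- Memory ---\n${parts.memory}`);
  return sections.join("\n\n");
}
```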

Workspace Path Injection

The system prompt includes the team's absolute workspace path (e.g., /app/.run/teams/ops-team/). This tells agents exactly where their files live, preventing them from guessing incorrect paths like /workspace/. The cwd value from context-builder.ts is passed to buildCoreInstructions() in prompt-builder.ts.

Conversation Context Injection

Recent channel interactions are injected into the system prompt after memory. message-handler.ts uses BFS over orgTree.getChildren() to collect the team's ID and all descendant team IDs, then queries InteractionStore.getRecentByChannel(channelId, teamIds, 10) for the last 10 interactions on the originating channel. buildConversationHistorySection() in prompt-builder.ts formats these as timestamped entries showing who said what. This gives agents awareness of prior conversation on the same channel.
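The descendant collection step is a standard breadth-first traversal. The sketch below assumes getChildren() returns child team IDs as strings; the `OrgTree` interface and the function name are hypothetical.

```typescript
// Assumed minimal view of the org tree: children are returned as team IDs.
interface OrgTree {
  getChildren(teamId: string): string[];
}

// Breadth-first walk that returns the root team ID followed by all descendants.
function collectTeamAndDescendants(orgTree: OrgTree, rootId: string): string[] {
  const seen = new Set<string>();
  seen.add(rootId);
  const queue: string[] = [rootId];
  const order: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    order.push(current);
    for (const child of orgTree.getChildren(current)) {
      if (!seen.has(child)) {
        seen.add(child); // guards against revisiting a team
        queue.push(child);
      }
    }
  }
  return order;
}
```

The resulting ID list is what would be passed as the teamIds argument of getRecentByChannel().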

Interactions are stored in the channel_interactions SQLite table with 24-hour retention. Only inbound messages, final outbound replies, and async task notifications are logged — intermediate frames (ack/progress) are not.

Tool Path Resolution

Built-in file tools (Read, Write, Edit) resolve file_path against the team's cwd using resolve(cwd, file_path) before any I/O operation. This ensures relative paths resolve against the team workspace directory, not process.cwd(). The resolved path is then validated by assertInsideBoundary() and assertGovernanceAllowed().
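A dependency-free sketch of resolve-then-validate. `resolvePosix` is a simplified stand-in for Node's path.resolve() that handles POSIX-style absolute workspaces only, and the `assertInsideBoundary` signature shown here is an assumption, not the project's actual one.

```typescript
// Simplified stand-in for path.resolve(cwd, filePath) on POSIX paths.
// Assumes cwd is absolute; collapses "." and ".." segments.
function resolvePosix(cwd: string, filePath: string): string {
  const joined = filePath.startsWith("/") ? filePath : `${cwd}/${filePath}`;
  const segments: string[] = [];
  for (const segment of joined.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") segments.pop();
    else segments.push(segment);
  }
  return "/" + segments.join("/");
}

// Throws unless the resolved path is one of the allowed roots or lies beneath one.
function assertInsideBoundary(resolved: string, allowedRoots: string[]): void {
  const inside = allowedRoots.some((root) => {
    const trimmed = root.replace(/\/+$/, "");
    // Comparing against root + "/" avoids treating /a/team-evil as inside /a/team.
    return resolved === trimmed || resolved.startsWith(trimmed + "/");
  });
  if (!inside) throw new Error(`Path escapes workspace boundary: ${resolved}`);
}
```

Validating the resolved path rather than the raw input is what stops a relative path such as `../other/secret` from slipping past the boundary check.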

Prompt Cache Boundary

The system prompt is split into a static prefix and a dynamic suffix to maximize prompt cache hit rates (see ADR-23 in Architecture-Decisions).

| Segment | Contents | Cache behavior |
| --- | --- | --- |
| Static prefix | System rules + admin org-rules (/data/rules/*.md) + tool usage guide + HTTP rules | Identical across all teams and sessions. Cached once, reused globally. |
| Dynamic suffix | Core instructions (cwd) + tool availability + ancestor org-rules + team org-rules + team-rules + skills + memory + conversation history | Varies per team and per message. Not cached. |

The static prefix (system rules, admin org-rules, tool usage guide, and HTTP rules) is deterministic: it changes only when rule files on disk change. Placing this content first lets subsequent requests reuse cached KV computations for the rules block. Skills, tool availability, and memory are in the dynamic suffix because they change with skill activation and session state. Vault secrets are never part of prompt assembly; they are accessed at runtime via vault_get only.

Tool definitions are sorted alphabetically by key within each partition (org, trigger, browser) to ensure the serialized tool block is identical across sessions, further improving cache stability.
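A minimal sketch of per-partition sorting, assuming tools are held in plain objects keyed by tool name. The helper names are hypothetical, and the trigger tool names used in the note below are placeholders.

```typescript
// Returns a copy of the tool map with keys inserted in alphabetical order.
// String keys keep insertion order, so serialization becomes deterministic.
function sortToolsByKey<T>(tools: Record<string, T>): Record<string, T> {
  const sorted: Record<string, T> = {};
  for (const key of Object.keys(tools).sort()) {
    sorted[key] = tools[key];
  }
  return sorted;
}

// Merges partitions in a fixed order (for example org, trigger, browser),
// sorting within each partition but never across partitions.
function mergePartitions<T>(partitions: Array<Record<string, T>>): Record<string, T> {
  const merged: Record<string, T> = {};
  for (const partition of partitions) {
    const sorted = sortToolsByKey(partition);
    for (const key of Object.keys(sorted)) {
      merged[key] = sorted[key];
    }
  }
  return merged;
}
```

Merging `{ spawn_team, delegate_task }` with `{ update_trigger, create_trigger }` yields the key order delegate_task, spawn_team, create_trigger, update_trigger regardless of how each partition was originally built.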

For the user-facing explanation of prompt caching, see Rules-Architecture#Prompt Cache Boundary.

Per-Topic Sessions

With conversation threading enabled, the main agent can have multiple concurrent streamText() sessions — one per active topic. This relaxes the "one session per team" constraint for the main agent only. Child teams remain unaffected: they have one session per team and receive tasks through the normal queue regardless of which topic initiated the work.

Each topic session shares the same system prompt (rules, skills) but maintains a separate conversation history filtered by topic_id from the channel_interactions store. For the full topic lifecycle and classification details, see Conversation-Threading.

Subagents

Subagent definitions from subagents/*.md are converted into AI SDK tool() definitions. Each subagent tool accepts a task string and wraps a generateText() call with isolated context — the parent session's tool loop calls the subagent tool, which runs its own inner session, and returns only the final result text. Intermediate tool calls within the subagent are not visible to the parent. See Subagents for the definition format.
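The isolation property can be sketched without the AI SDK by injecting the inner session as a function. Everything here is hypothetical scaffolding: `InnerSessionRunner` stands in for the generateText() call, and the interfaces are illustrative rather than the project's actual types.

```typescript
// Stand-in for the result of the subagent's inner generateText() session.
interface InnerSessionResult {
  text: string; // final answer
  toolCalls: string[]; // intermediate activity, never shown to the parent
}

// Stand-in for the inner session: receives only the subagent's own context.
type InnerSessionRunner = (system: string, task: string) => Promise<InnerSessionResult>;

interface SubagentDefinition {
  name: string;
  instructions: string; // content derived from subagents/*.md
}

interface SubagentTool {
  description: string;
  execute: (input: { task: string }) => Promise<string>;
}

// Builds a tool whose execute handler runs an isolated inner session
// and hands back only the final text.
function makeSubagentTool(def: SubagentDefinition, runInner: InnerSessionRunner): SubagentTool {
  return {
    description: `Delegate a task to the ${def.name} subagent`,
    execute: async (input: { task: string }): Promise<string> => {
      // No parent history is passed in: just the instructions and the task.
      const result = await runInner(def.instructions, input.task);
      return result.text;
    },
  };
}
```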

On-Demand Team Spawning

Teams are spawned when work arrives:

  1. delegate_task targets a team via the task queue
  2. TaskConsumer dequeues the task and reads the team's config.yaml
  3. handleMessage() creates a streamText() session with the resolved config
  4. Session processes the task and returns a result
  5. Result is routed back to the originating channel via sourceChannelId

Sessions are disposable. No attempt is made to resume or persist AI SDK sessions. Continuity comes from durable state in SQLite (task queues, org tree, memory entries).


For the on-demand spawning flow, see Scenarios.