Organization Tools - Z-M-Huang/openhive GitHub Wiki

Organization Tools

Organization tools are inline AI SDK tool() definitions that execute in-process alongside the agent session. They are the control plane for team orchestration -- managing the org tree, task routing, escalation, and team lifecycle.

These tools are inline — no serialization, no HTTP transport, no separate process. Each tool's execute() function runs directly in the session's Node.js process with full access to shared stores via closure.

OrgToolContext

Every organization, trigger, browser, and vault tool receives an OrgToolContext object via closure, injected at session creation.

| Field | Type | Purpose |
|---|---|---|
| teamName | string | Identity of the calling team |
| sourceChannelId | string \| null | Originating channel for notification routing. Null for schedule-triggered tasks (non-notifying by default). |
| orgStore | OrgStore | Org tree, scope keywords (SQLite) |
| taskQueueStore | TaskQueueStore | Priority task queue (SQLite) |
| triggerConfigStore | TriggerConfigStore | Trigger definitions (SQLite) |
| triggerEngine | TriggerEngine | Runtime trigger management |
| runDir | string | .run/ directory path |
| browserRelay | BrowserRelay? | Present only if team has browser: config |
| senderTrustStore | SenderTrustStore | Sender trust management (SQLite) |
| vaultStore | VaultStore | Team vault key-value store (SQLite) |

The closure pattern: buildOrgTools(ctx: OrgToolContext) returns a record of tool() definitions. Each tool's execute() body closes over ctx, so it has direct access to stores without any serialization or transport layer. See SDK-Integration#Inline Tool Assembly for how context is constructed per session.
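The closure pattern can be illustrated with a minimal sketch. It does not use the real AI SDK: ToolDef is a local stand-in for a tool() definition, and SketchContext, OrgStoreLike, getChildren, and buildOrgToolsSketch are hypothetical names covering only a small slice of OrgToolContext.

```typescript
// Local stand-in for an AI SDK tool() definition (assumption: the real type comes from the SDK).
interface ToolDef<I, O> {
  description: string;
  execute: (input: I) => Promise<O>;
}

// Hypothetical slice of OrgToolContext: only the fields this sketch touches.
interface OrgStoreLike {
  getChildren(team: string): string[];
}
interface SketchContext {
  teamName: string;
  sourceChannelId: string | null;
  orgStore: OrgStoreLike;
}

// Each execute() closes over ctx, so store access needs no serialization or transport.
function buildOrgToolsSketch(ctx: SketchContext) {
  const list_teams: ToolDef<{ recursive?: boolean }, { teams: string[] }> = {
    description: "List the caller's child teams",
    execute: async () => ({ teams: ctx.orgStore.getChildren(ctx.teamName) }),
  };
  return { list_teams };
}

// One context per session; the tools see that session's stores directly.
const demoCtx: SketchContext = {
  teamName: "main",
  sourceChannelId: null,
  orgStore: { getChildren: (t) => (t === "main" ? ["engineering", "research"] : []) },
};
const demoTools = buildOrgToolsSketch(demoCtx);
```

Because the context is captured at build time, two sessions built from different contexts get tool objects with the same names but different store bindings.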

Inline Tool Pattern

Inline Call Path

The AI SDK calls execute() directly on the tool object, which runs the business logic inline. No serialization, no HTTP roundtrip — a single stack trace from session through tool to store.

Benefits:

  • Latency: eliminates HTTP roundtrip for every tool call
  • Debugging: single stack trace from session through tool to store
  • Type safety: input/output types checked at compile time (Zod schemas)
  • Code reduction: ~597 lines of transport infrastructure removed

Tool Categories

org-tools.ts

Tool Purpose Enforcement
spawn_team(name, config_path?, description?, scope_accepts?, init_context?, credentials?) Create a new team session from config. Optionally passes user context. Credentials are written directly to team_vault with is_secret=1 (system-managed — not through the agent-facing vault_set tool). Child retrieves them with vault_get. Bootstrap task is auto-queued asynchronously (type: bootstrap, priority: critical). Returns { status: 'queued', bootstrap_task_id, message_for_user } immediately — the caller MUST echo message_for_user to the user so they know the team is initialising. Bootstrap creates subagent definitions, skills, plugins, and active learning and reflection triggers per subagent (readiness gates per Architecture-Decisions#ADR-35, deterministic routing per ADR-40). TaskConsumer posts "[{name}] Team bootstrapped and ready." to the originating channel once the bootstrap session completes. Records parent-child in org tree. Validates config. Bootstrap task inherits originating channel.
delegate_task(team, task, priority?, overlap_policy?) Send a task to a child team. overlap_policy controls concurrency when the team already has an active task: skip (drop silently and return status: 'skipped'), replace (cancel the current task and enqueue the new one), confirm (return { requires_confirmation: true, message } so the parent can seek user approval before re-invoking). Response: { status: 'queued' | 'skipped' | 'in_flight', task_id? }. When overlap_policy: 'confirm', no task is enqueued until the parent re-calls with explicit approval. Validates caller is parent. Enqueues in priority queue. Threads originating channel for notification routing.
query_team(team, query) Synchronously query a single child team and return its response. Prefer query_teams when querying 2 or more peers with independent inputs. Validates caller is parent.
query_teams(targets[], default_timeout_ms?) Fan out queries to multiple children in parallel. targets = {team, query, timeout_ms?}[] (cap: 5). Returns {team, ok, result_or_error}[] with partial-result semantics — one child's failure does not fail the whole call. Wall-clock = max(child_duration), not sum(...). Cancellation: if the parent aborts mid-fan-out, outstanding child calls are cancelled and do not mutate children's queues. Validates caller is parent of every target. Each child call counts against that child's max_concurrent_daily_ops (ADR-41).
escalate(message, reason?) Notification only. Send an informational message up the hierarchy to the caller's parent — does NOT create work in the parent's queue. Chain: child -> parent -> main -> user. For work handoff (parent should act), use enqueue_parent_task. Delivers to immediate parent only. Logged with correlation ID.
enqueue_parent_task(task, priority, correlation_id?) Work handoff to parent. Write a task into the immediate parent's task_queue so the parent's orchestrator routes it (ADR-43). Payload carries context only, not subagent directives — the parent still decides routing, preserving ADR-40. correlation_id is an opaque string (engine generates one if omitted). Guard: per-child per-minute rate cap (default 10/min); same correlation_id within 60 s is deduplicated. Validates caller's parent exists.
send_message(target, message) Direct message to parent or child Validates target is caller's direct parent or child.
get_status(team?) Query status of managed teams. Returns { active_daily_ops, saturation, org_op_pending, queue_depth, current_task?, pending_tasks[] } per target (ADR-41). saturation is true when active_daily_ops >= max_concurrent_daily_ops. Returns only the caller's own children.
list_completed_tasks(team?, since?, limit?) Query completed tasks from task_queue. Default: caller's own team, last 7 days, limit 50. Returns: task_id, duration_ms, status (done/failed/cancelled), result_snippet (first 200 chars), created_at. Evidence source for self-reflection sessions. See Architecture-Decisions#ADR-37. Read-only. Validates caller is parent of target team (or own team).
list_teams(recursive?) List child teams with descriptions, scope keywords, and status for routing decisions Returns only the caller's own children (and optionally their descendants). Provides team name, description, keywords, status, queue depth.
update_team(team, scope_add?, scope_remove?) Add or remove scope keywords for a child team. No learning-specific fields — scope_keywords changes affect the next learning cycle automatically. Validates caller is parent. Cannot leave team with zero scope. Returns resulting scope.
shutdown_team(name, cascade?) End a child team's session and clean up all team data Terminates session. Deletes all team-scoped DB rows (including memories, memory_chunks). Removes .run/teams/{name}/ directory. With cascade: true, recursively shuts down all descendants depth-first.
add_trusted_sender(channel_type, sender_id, channel_id?, trust_level?) Grant trust to a sender on a channel. Writes to sender_trust table with granted_by="admin". Validates channel_type exists.
revoke_sender_trust(channel_type, sender_id, channel_id?) Remove trust or explicitly deny a sender. Revoking removes the row; denying sets trust_level="denied". Validates sender exists in DB.
list_trusted_senders(channel_type?, trust_level?) View trust status across channels. Read-only query on sender_trust table. None (read-only).
register_plugin_tool({ tool_name: string, source_code: string }) Register a team-local plugin tool with security scan + interface validation. Returns { success, tool } or { success: false, error }. Security scan (forbidden patterns, secret detection), interface validation (description, parameters, execute), reserved name check.

Note: The trust management tools (add_trusted_sender, revoke_sender_trust, list_trusted_senders) are available to the main agent only. Child teams cannot manage sender trust.

Note: list_completed_tasks returns historical task records for evidence gathering (e.g., self-reflection sessions). For real-time queue state, use get_status.

query_teams — Parallel Fan-out

Canonical fan-out pattern for querying multiple child teams concurrently. Wall-clock time becomes max(child_duration) instead of sum(...), which is the primary fix for the 19-minute serial query_team chain documented in Scenarios#19-Minute Trading Cycle Evidence.

sequenceDiagram
    participant Parent as Parent Team
    participant A as Child A
    participant B as Child B
    participant C as Child C

    Parent->>Parent: query_teams([A, B, C], default_timeout_ms=150000)
    par parallel fan-out
        Parent->>A: query
    and
        Parent->>B: query
    and
        Parent->>C: query
    end
    A-->>Parent: {team:"A", ok:true, result_or_error:"..."}
    B-->>Parent: {team:"B", ok:false, result_or_error:"timeout"}
    C-->>Parent: {team:"C", ok:true, result_or_error:"..."}
    Note over Parent: wall-clock = max(child_duration)<br/>partial-result tolerated

Result shape: each entry is {team: string, ok: boolean, result_or_error: string}. On success ok: true and result_or_error contains the child's response. On failure ok: false and result_or_error contains the error message.

Rules:

  • Fan-out cap: 5 children per call. Higher fan-outs must be split across multiple calls or reconsidered. Calling with 1 target is allowed but query_team (singular) is preferred.
  • Each child call consumes a slot in the target's daily-ops pool (ADR-41). If the target is saturated, it returns a saturation error for that child — siblings continue.
  • Per-child timeout_ms overrides default_timeout_ms. Expired children return {team, ok: false, result_or_error: "timeout"}.
  • Partial-result semantics: orchestrator decides whether to proceed with partial results or retry the failed subset. Documented in Tool-Guidelines#query_teams Partial Failure.
  • Cancellation: parent abort cancels all outstanding child calls; no child queue mutation for cancelled calls.
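The fan-out rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: queryTeamsSketch, withTimeout, and the injected queryChild function are hypothetical, the 150000 ms default is simply the value used in the diagram, and parent-abort cancellation and daily-ops slot accounting are not modeled.

```typescript
interface Target { team: string; query: string; timeout_ms?: number }
interface FanOutResult { team: string; ok: boolean; result_or_error: string }

// Hypothetical per-child call; stands in for the real child query path.
type QueryFn = (team: string, query: string) => Promise<string>;

const FAN_OUT_CAP = 5;

// Rejects with Error("timeout") if p has not settled within ms.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("timeout")), ms);
    p.then(
      (v) => { clearTimeout(timer); resolve(v); },
      (e) => { clearTimeout(timer); reject(e); },
    );
  });
}

async function queryTeamsSketch(
  targets: Target[],
  queryChild: QueryFn,
  defaultTimeoutMs = 150_000,
): Promise<FanOutResult[]> {
  if (targets.length > FAN_OUT_CAP) {
    throw new Error(`fan-out cap is ${FAN_OUT_CAP} targets per call`);
  }
  // Every child settles into its own result entry, so one failure never rejects the whole call.
  return Promise.all(
    targets.map(async ({ team, query, timeout_ms }): Promise<FanOutResult> => {
      try {
        const result = await withTimeout(queryChild(team, query), timeout_ms ?? defaultTimeoutMs);
        return { team, ok: true, result_or_error: result };
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return { team, ok: false, result_or_error: message };
      }
    }),
  );
}
```

All child promises are started before any is awaited, which is what makes the wall-clock time track the slowest child rather than the sum.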

enqueue_parent_task — Work Handoff

Canonical work-handoff pattern from child to immediate parent (ADR-43). Preserves ADR-40 hierarchy: the payload carries context only; the parent's orchestrator still routes.

flowchart LR
    subgraph Child[Child Team - e.g., during window tick]
        CE[Event detected]
        CE --> CC{Signal type?}
    end

    CC -->|informational| Esc[escalate<br/>upward notification<br/>no parent queue insertion]
    CC -->|work handoff| EPT[enqueue_parent_task<br/>priority + context + correlation_id]

    subgraph Parent[Parent Team]
        PQ[task_queue<br/>priority admission - ADR-9]
        PO[Orchestrator<br/>routing decision - ADR-40]
        PS[Subagent executes]
        PQ --> PO
        PO --> PS
    end

    EPT --> PQ
    Esc -. info only .-> PO

Guards:

  • Rate cap: per-child per-minute enqueue cap (default 10 / min).
  • Dedup: same correlation_id within 60 s window is deduplicated.
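A minimal sketch of how the two guards could combine is shown below. EnqueueGuardSketch and its verdict names are hypothetical. The sketch assumes a sliding one-minute window, dedup scoped per child, and dedup checked before the rate cap; the source specifies only the limits themselves.

```typescript
interface GuardOptions {
  maxPerMinute: number;  // documented default: 10
  dedupWindowMs: number; // documented window: 60 s
}

type GuardVerdict = "accepted" | "deduplicated" | "rate_limited";

class EnqueueGuardSketch {
  private opts: GuardOptions;
  private recent = new Map<string, number[]>(); // child -> timestamps (ms) of accepted enqueues
  private seen = new Map<string, number>();     // child|correlation_id -> last accepted timestamp (ms)

  constructor(opts: GuardOptions = { maxPerMinute: 10, dedupWindowMs: 60_000 }) {
    this.opts = opts;
  }

  // `now` is injected (ms) so the guard is deterministic in tests.
  check(child: string, correlationId: string, now: number): GuardVerdict {
    const key = `${child}|${correlationId}`;
    const last = this.seen.get(key);
    if (last !== undefined && now - last < this.opts.dedupWindowMs) return "deduplicated";

    // Sliding one-minute window of accepted enqueues for this child (assumption).
    const recentForChild = (this.recent.get(child) ?? []).filter((t) => now - t < 60_000);
    if (recentForChild.length >= this.opts.maxPerMinute) {
      this.recent.set(child, recentForChild);
      return "rate_limited";
    }
    recentForChild.push(now);
    this.recent.set(child, recentForChild);
    this.seen.set(key, now);
    return "accepted";
  }
}
```

Only accepted enqueues count toward the cap in this sketch, so a burst of duplicates does not consume the child's budget.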

When to choose which:

  • escalate — "I observed X, FYI." Parent decides whether to act.
  • enqueue_parent_task — "I observed X; please do Y. Priority: high." Work enters the parent's queue directly.

vault-tools.ts

Tool Purpose Enforcement
vault_set(key, value) Store a key-value pair in the team vault. Only handles team state — secret rows (is_secret=1) are system-managed (spawn_team, migration, admin API). Scoped to caller's team.
vault_get(key) Retrieve a value from the team vault Secret values scrubbed from audit logs. Returns raw value to caller.
vault_list(prefix?) List keys in the team vault Values omitted for is_secret=1 entries. Returns keys and metadata only for secrets.
vault_delete(key) Remove a key from the team vault Rejects if key has is_secret=1. Scoped to caller's team.
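The enforcement rules in the table can be modeled with a small in-memory sketch. TeamVaultSketch and its method names are hypothetical, the real store is SQLite-backed, and rejecting a vault_set over an existing secret key is an assumption the table does not spell out.

```typescript
interface VaultRow { value: string; is_secret: 0 | 1 }
interface VaultListEntry { key: string; is_secret: 0 | 1; value?: string }

class TeamVaultSketch {
  private rows = new Map<string, VaultRow>();

  // System-side write path (e.g. spawn_team credentials); not exposed as an agent tool.
  systemPutSecret(key: string, value: string): void {
    this.rows.set(key, { value, is_secret: 1 });
  }

  // vault_set: team state only. Rejecting writes over a secret key is an assumption.
  vaultSet(key: string, value: string): void {
    if (this.rows.get(key)?.is_secret === 1) throw new Error(`'${key}' is system-managed`);
    this.rows.set(key, { value, is_secret: 0 });
  }

  // vault_get: the raw value goes back to the caller; log scrubbing happens elsewhere.
  vaultGet(key: string): string | undefined {
    return this.rows.get(key)?.value;
  }

  // vault_list: values are omitted for secret entries.
  vaultList(prefix = ""): VaultListEntry[] {
    const out: VaultListEntry[] = [];
    this.rows.forEach((row, key) => {
      if (!key.startsWith(prefix)) return;
      out.push(row.is_secret === 1 ? { key, is_secret: 1 } : { key, is_secret: 0, value: row.value });
    });
    return out;
  }

  // vault_delete: secret rows cannot be removed by the agent.
  vaultDelete(key: string): boolean {
    if (this.rows.get(key)?.is_secret === 1) throw new Error(`'${key}' is secret and cannot be deleted`);
    return this.rows.delete(key);
  }
}
```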

memory-tools.ts

Tool Purpose Enforcement
memory_save(key, content, type?, supersede_reason?) Save or supersede a memory entry Scoped to caller's team via OrgToolContext.teamName. Requires supersede_reason if active entry with same key exists. See Memory-System#Tool Specification.
memory_delete(key) Soft-delete a memory entry Scoped to caller's team. Sets is_active = 0. Deleted entries excluded from search.
memory_search(query, max_results?) Search team memory using hybrid FTS5 + vector similarity Scoped to caller's team. Returns active + superseded entries.
memory_list(type?) List active memory entries Scoped to caller's team. Optionally filtered by type.

trigger-tools.ts

Tool Purpose Enforcement
create_trigger(team, name, type, config, task, subagent?, skill?, max_turns?, failure_threshold?, overlap_policy?) Create a new trigger in pending state. subagent is optional — if set, orchestrator routes directly; if null, orchestrator decides. skill requires subagent to be set (prevents direct skill addressing per ADR-40). Validates caller is parent. Stores in SQLite trigger_configs table.
enable_trigger(team, trigger_name) Activate a pending or disabled trigger Validates caller is parent. Sets state to active, registers handler. Resets overlap_count to 0.
disable_trigger(team, trigger_name) Deactivate a trigger Validates caller is parent. Sets state to disabled, unregisters handler. Resets overlap_count to 0.
list_triggers(team) List all triggers and their states Returns trigger name, type, state, failure count, overlap policy, overlap count, active task ID.
test_trigger(team, trigger_name, max_turns?, overlap_policy?) Fire a trigger once for testing. overlap_policy controls concurrency when the team is busy: skip (drop silently), replace (cancel current task and enqueue), confirm (return { requires_confirmation: true, message } for parent approval before enqueue). Returns { taskId, status: 'queued' | 'skipped' | 'in_flight' }. When overlap_policy: 'confirm', no task is enqueued until the parent re-calls with explicit approval. Enqueues task without changing trigger state. Does not set active_task_id (not subject to overlap tracking).
update_trigger(team, trigger_name, config?, task?, subagent?, skill?, max_turns?, failure_threshold?, overlap_policy?) Update an existing trigger's config, task, targeting, or settings. subagent and skill follow the same constraint as create_trigger. Validates caller is parent. If trigger is active, handlers are re-registered automatically.

For trigger engine internals (circuit breaker, overlap policy, failure handling), see Triggers.
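The state transitions and the skill/subagent constraint from the table can be sketched as pure functions. TriggerSketch and the three functions below are hypothetical names; handler registration and persistence to trigger_configs are left out.

```typescript
type TriggerState = "pending" | "active" | "disabled";

interface TriggerSketch {
  name: string;
  state: TriggerState;
  subagent: string | null;
  skill: string | null;
  overlap_count: number;
}

// create_trigger: new triggers start in the pending state.
function createTriggerSketch(
  name: string,
  subagent: string | null = null,
  skill: string | null = null,
): TriggerSketch {
  // A skill may only be targeted through a subagent (no direct skill addressing).
  if (skill !== null && subagent === null) {
    throw new Error("skill requires subagent to be set");
  }
  return { name, state: "pending", subagent, skill, overlap_count: 0 };
}

// enable_trigger: pending or disabled -> active, overlap_count reset to 0.
function enableTriggerSketch(t: TriggerSketch): TriggerSketch {
  return { ...t, state: "active", overlap_count: 0 };
}

// disable_trigger: -> disabled, overlap_count reset to 0.
function disableTriggerSketch(t: TriggerSketch): TriggerSketch {
  return { ...t, state: "disabled", overlap_count: 0 };
}
```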

browser-tools.ts

Tool Purpose Enforcement
browser_navigate(url) Navigate to a URL Team must have browser: config. SSRF protection blocks private/reserved IPs. Domain allowlist enforced if allowed_domains specified.
browser_snapshot() Take an accessibility snapshot of the current page Returns YAML accessibility tree with element refs.
browser_screenshot() Take a visual screenshot Returns ImageContent (base64 PNG).
browser_click(element?, ref?) Click an element on the page Uses element text or ref from snapshot.
browser_type(text, element?, ref?) Type text into an element Targets element by text or ref.
browser_go_back() Navigate back in browser history
browser_go_forward() Navigate forward in browser history
browser_close() Close the browser tab

For SSRF protection, domain allowlist, and BrowserRelay lifecycle, see Browser-Proxy.

web-fetch-tool.ts

Tool Purpose Enforcement
web_fetch(url, method?, headers?, body?, rate_limit_key?) Fetch a URL and return the response body. Optional rate_limit_key selects a per-team per-domain token bucket configured in Team-Configuration — useful for window-trigger ticks that scan external sources on every tick. SSRF protection via validateBrowserUrl(). Domain allowlist enforced. Unknown rate_limit_key falls back to no extra throttling (domain allowlist still applies).

For the canonical specification, see Browser-Proxy#Web Fetch Tool.

skill-repo-tool.ts

Tool Purpose Enforcement
search_skill_repository(query, filters?) Search the Vercel skills ecosystem (skills.sh) for skills matching a natural language description. Returns matches with install counts, source reputation, and match scores. Only matches ≥60% are returned. Opt-in via allowed_tools. Read-only (no side effects). Sources pre-defined in container.

For the full adoption flow, format conversion, and trust signals, see Skill-Repository.

Guardrails

| Guardrail | Enforced In | Implementation |
|---|---|---|
| Parent-child validation | assertIsParentOf(orgTree, teamName, team) in execute() body | Every tool that targets a child team checks the org tree before proceeding. Rejects with clear error if caller is not the parent. |
| Workspace boundary | assertInsideBoundary(path, cwd) | File-system tools validate that all paths resolve within the team's workspace. Prevents traversal attacks. |
| SSRF protection | validateBrowserUrl(url, allowedDomains) | Blocks private/reserved IP ranges, validates against domain allowlist. See Browser-Proxy#SSRF Protection. |
| Credential scrubbing | withAudit() wrapper on all tools | Wraps every tool call with audit logging. Scrubs credential values from logs. |
| Deny-by-default | activeTools filter from config.yaml allowed_tools | Only tools explicitly listed in the team's allowed_tools config are made available to the session. Unlisted tools are not registered. |
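Two of these guardrails, parent-child validation and deny-by-default filtering, are simple enough to sketch. The org tree shape (a child-to-parent map) and both function names are assumptions made for illustration; the real assertIsParentOf reads from OrgStore.

```typescript
// Hypothetical org tree shape: child team name -> parent team name.
type OrgTree = Map<string, string>;

// Throws unless `caller` is the direct parent of `target`.
function assertIsParentOfSketch(orgTree: OrgTree, caller: string, target: string): void {
  if (orgTree.get(target) !== caller) {
    throw new Error(`Team '${caller}' is not the parent of '${target}'`);
  }
}

// Deny-by-default: only names listed in allowed_tools survive; an empty list yields no tools.
function filterActiveTools<T>(all: Record<string, T>, allowedTools: string[]): Record<string, T> {
  const allowed = new Set(allowedTools);
  const active: Record<string, T> = {};
  for (const [name, tool] of Object.entries(all)) {
    if (allowed.has(name)) active[name] = tool;
  }
  return active;
}
```

A tool's execute() body would call the parent check first and return the error to the model rather than proceeding, while the filter runs once when the session's tool set is assembled.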

Plugin Tool Guards

Plugin tools (team-local TypeScript tools in .run/teams/{name}/plugins/) require additional security guards beyond the standard guardrails.

Pre-Generation Guards

| Guard | Detection | Enforcement |
|---|---|---|
| Forbidden pattern detection | AST analysis for shell injection, dynamic code evaluation, unsandboxed network requests | Reject tool creation with FORBIDDEN_PATTERN error |
| Secret scanning | Regex + entropy analysis for hardcoded credentials | Reject tool creation with SECRET_DETECTED error |
| Reserved name collision | Check against built-in tool names | Reject with error: "Tool name '{name}' is reserved" |

Post-Creation Verification (AC-8)

Plugin tool verification checks three aspects:

| Check | What It Validates |
|---|---|
| TypeScript | Syntax validity; reports any compilation errors |
| Interface | Presence of description, parameters (Zod schema), and execute function |
| Security | Forbidden patterns, detected secrets, and overall pass/fail |

Runtime Guards

| Guard | Enforcement Point | Error Code |
|---|---|---|
| Filesystem boundary | assertInsideBoundary(runDir, path) in tool execute() | PATH_OUTSIDE_BOUNDARY |
| Audit logging | withAudit() wrapper on plugin tool calls | N/A (logs all invocations) |
| Namespace isolation | loadPluginTools(teamName) returns only team-local tools | N/A (structural) |

Error Message Format

All guard violations produce structured, actionable error messages containing: the error code (e.g., SECURITY VIOLATION: FORBIDDEN_PATTERN), the pattern name, line number, the offending line (redacted), and suggested remediation.

Audit Logging

Every plugin tool invocation logs:

  • SHA-256 hash of tool code
  • Prompt context (truncated)
  • Timestamp
  • Team/user context
  • Result: success/blocked

Communication Flow

sequenceDiagram
    participant User
    participant Main as Main Agent
    participant Eng as Engineering Orchestrator
    participant SA as react-dev Subagent
    participant SK as Skill
    participant PL as Plugin

    User->>Main: "Build a login page"
    Note over Main: Routes only. Identifies engineering team.
    Main->>Eng: delegate_task("engineering", "Build login page")
    Note over Eng: Reads subagent defs → picks react-dev
    Eng->>SA: invoke react-dev subagent
    Note over SA: Context: react-dev.md + skills + task
    SA->>SK: follow build-ui skill steps
    SK->>PL: scaffold_component.ts → generate component
    PL-->>SK: component created
    SK-->>SA: build complete
    SA-->>Eng: "Login page UI complete"
    Eng-->>Main: Result: "Login page complete"
    Main-->>User: "Login page is ready"

LLM-Driven Routing

Routing decisions are made by the LLM (parent agent), not by keyword-matching code. The list_teams tool provides the parent with all the information it needs to choose the right child team for a task.

Workflow

  1. Parent calls list_teams to see its direct children, or list_teams(recursive: true) to include the full subtree
  2. Each entry includes: team name, description, scope keywords (as routing hints and learning domain signal), status, and pending queue depth
  3. The parent LLM reads the team descriptions and keywords, then decides which child best fits the task
  4. Parent calls delegate_task or query_team targeting the chosen child

Scope keywords are stored in the scope_keywords SQLite table and serve a dual purpose. As routing hints, they help the parent agent understand what each team handles. As a learning domain signal, they define the domain the team's autonomous learning system explores. There is no automated keyword-matching gate; the LLM makes the routing decision using the full context of the task and the available teams.

When creating teams without a config_path, providing scope_accepts keywords is recommended so that list_teams can surface useful routing metadata to parent agents and the learning system can derive its search domain.


Concurrency Classification

Every tool is either daily-ops (can run concurrently with other daily-ops on the same team, up to max_concurrent_daily_ops) or org-ops (single-flight per team via the per-team mutex). See Architecture#Execution Model for the canonical pool + mutex diagram and Architecture-Decisions#ADR-41 Daily-ops vs Org-ops Concurrency for the six resource classes.

| Tool | Class | Notes |
|---|---|---|
| delegate_task | daily-ops | Insert is append-only; task consumption is single-threaded per team. |
| query_team, query_teams | daily-ops | Read-heavy peer RPC. Each child call consumes a slot in the target's daily-ops pool. |
| escalate | daily-ops | Append-only notification. |
| enqueue_parent_task | daily-ops | Priority queue insert at parent. |
| send_message | daily-ops | Append-only. |
| get_status, list_teams, list_completed_tasks | daily-ops | Read-only. |
| list_trusted_senders | daily-ops | Read-only. |
| memory_save, memory_delete, memory_search, memory_list | daily-ops | Per-subject_key lock on writes (class 3); FTS index follows (class 4). |
| vault_get | daily-ops | Read-only. |
| vault_set (non-secret), vault_delete (non-secret), vault_list | daily-ops | Version-check upsert (class 3). |
| web_fetch | daily-ops | Network I/O only; per-domain rate buckets optional. |
| browser_* | daily-ops | Subject to team's BrowserRelay session; not a DB write. |
| create_trigger, enable_trigger, disable_trigger, test_trigger, list_triggers | daily-ops | Trigger engine ops; trigger_configs.active_task_id/overlap_count mutations are serialized by the engine itself. |
| search_skill_repository | daily-ops | Read-only. |
| spawn_team, shutdown_team, update_team | org-ops | Structural topology (class 5). |
| update_trigger | org-ops | Structural config change (class 5). |
| vault_set (secret rows, is_secret=1) | org-ops | Governance / security state (class 6). |
| register_plugin_tool | org-ops | Plugin lifecycle (class 6). |
| add_trusted_sender, revoke_sender_trust | org-ops | Security state (class 6). |

Org-ops take the per-team mutex: in-flight daily-ops are allowed to drain, new daily-ops admission is blocked, then the org-op runs single-flight. No mid-flight abort.
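The drain-then-run behaviour can be modeled as a small synchronous state machine. TeamGateSketch is a hypothetical name, and the sketch tracks at most one waiting org-op; how the real mutex queues several org-ops is not described here.

```typescript
class TeamGateSketch {
  private activeDailyOps = 0;
  private orgOpPending = false;
  private orgOpRunning = false;
  private maxConcurrentDailyOps: number;

  constructor(maxConcurrentDailyOps = 5) {
    this.maxConcurrentDailyOps = maxConcurrentDailyOps;
  }

  // Daily-ops are admitted only while no org-op is waiting or running and the pool has room.
  tryAdmitDailyOp(): boolean {
    if (this.orgOpPending || this.orgOpRunning) return false;
    if (this.activeDailyOps >= this.maxConcurrentDailyOps) return false;
    this.activeDailyOps += 1;
    return true;
  }

  finishDailyOp(): void {
    if (this.activeDailyOps > 0) this.activeDailyOps -= 1;
  }

  // Marks an org-op as waiting; from here on new daily-ops are refused.
  requestOrgOp(): void {
    this.orgOpPending = true;
  }

  // The org-op may start only once in-flight daily-ops have drained (no mid-flight abort).
  tryStartOrgOp(): boolean {
    if (!this.orgOpPending || this.orgOpRunning || this.activeDailyOps > 0) return false;
    this.orgOpPending = false;
    this.orgOpRunning = true;
    return true;
  }

  finishOrgOp(): void {
    this.orgOpRunning = false;
  }
}
```

In-flight daily-ops are never interrupted in this model: the org-op simply cannot start until finishDailyOp() has brought the active count to zero.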


Task Queue

Each team has a priority queue, persisted in SQLite, that determines the order in which pending tasks are admitted.

Behavior

  • Daily-ops are concurrent up to max_concurrent_daily_ops (default 5) per team. Org-ops are single-flight per team via the mutex. See Architecture#Execution Model.
  • delegate_task adds tasks to the queue with an optional priority (default: normal)
  • Tasks are ordered by priority level, then FIFO within the same priority
  • No mid-task preemption: only pending tasks can be reordered. Per ADR-9 this ordering applies to org-ops and structural work; daily-ops admission is unordered within the cap by design.
  • get_status returns { active_daily_ops, saturation, org_op_pending, queue_depth, current_task?, pending_tasks[] }
  • Queue state is durable in SQLite -- survives container restarts

Priority Levels

| Priority | Use Case |
|---|---|
| critical | Incidents, security alerts |
| high | User-facing requests, escalations |
| normal | Standard delegated work (default) |
| low | Background maintenance, cleanup |
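The ordering rule (priority level first, then FIFO within a level) can be sketched as a comparator over pending tasks. PendingTask, the seq insert counter, and orderPending are hypothetical; the real queue lives in SQLite.

```typescript
type Priority = "critical" | "high" | "normal" | "low";

// Lower rank is served first.
const PRIORITY_RANK: Record<Priority, number> = { critical: 0, high: 1, normal: 2, low: 3 };

interface PendingTask {
  task_id: string;
  priority: Priority;
  seq: number; // hypothetical monotonic insert counter used for FIFO tie-breaking
}

// Returns a new array: priority level first, then insertion order within the same level.
function orderPending(pending: PendingTask[]): PendingTask[] {
  return pending
    .slice()
    .sort((a, b) => PRIORITY_RANK[a.priority] - PRIORITY_RANK[b.priority] || a.seq - b.seq);
}

const ordered = orderPending([
  { task_id: "t1", priority: "normal", seq: 1 },
  { task_id: "t2", priority: "critical", seq: 2 },
  { task_id: "t3", priority: "normal", seq: 3 },
  { task_id: "t4", priority: "low", seq: 4 },
  { task_id: "t5", priority: "high", seq: 5 },
]).map((t) => t.task_id);
```

Only pending tasks take part in this ordering; a task that has already started is not preempted by a later, higher-priority arrival.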

Scenarios

For end-to-end operational scenarios, see Scenarios.
