Organization Tools - Z-M-Huang/openhive GitHub Wiki
Organization tools are inline AI SDK tool() definitions that execute in-process alongside the agent session. They are the control plane for team orchestration -- managing the org tree, task routing, escalation, and team lifecycle.
These tools are inline — no serialization, no HTTP transport, no separate process. Each tool's execute() function runs directly in the session's Node.js process with full access to shared stores via closure.
Every organization, trigger, browser, and vault tool receives an OrgToolContext object via closure, injected at session creation.
| Field | Type | Purpose |
|---|---|---|
| `teamName` | `string` | Identity of the calling team |
| `sourceChannelId` | `string \| null` | Originating channel for notification routing. Null for schedule-triggered tasks (non-notifying by default). |
| `orgStore` | `OrgStore` | Org tree, scope keywords (SQLite) |
| `taskQueueStore` | `TaskQueueStore` | Priority task queue (SQLite) |
| `triggerConfigStore` | `TriggerConfigStore` | Trigger definitions (SQLite) |
| `triggerEngine` | `TriggerEngine` | Runtime trigger management |
| `runDir` | `string` | `.run/` directory path |
| `browserRelay` | `BrowserRelay?` | Present only if team has `browser:` config |
| `senderTrustStore` | `SenderTrustStore` | Sender trust management (SQLite) |
| `vaultStore` | `VaultStore` | Team vault key-value store (SQLite) |
The closure pattern: buildOrgTools(ctx: OrgToolContext) returns a record of tool() definitions. Each tool's execute() body closes over ctx, so it has direct access to stores without any serialization or transport layer. See SDK-Integration#Inline Tool Assembly for how context is constructed per session.
The AI SDK calls execute() directly on the tool object, which runs the business logic inline. No serialization, no HTTP roundtrip — a single stack trace from session through tool to store.
Benefits:
- Latency: eliminates HTTP roundtrip for every tool call
- Debugging: single stack trace from session through tool to store
- Type safety: input/output types checked at compile time (Zod schemas)
- Code reduction: ~597 lines of transport infrastructure removed
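The closure pattern described above can be sketched in a few lines. This is a minimal illustration, not the project's actual `buildOrgTools`: the names `MiniOrgStore`, `MiniToolContext`, `MiniTool`, and `buildMiniOrgTools` are hypothetical stand-ins, and the real AI SDK `tool()` helper and Zod schemas are deliberately left out so the example stays self-contained.

```typescript
// Hypothetical, trimmed-down context: the real OrgToolContext has more fields.
interface MiniOrgStore {
  childrenOf(team: string): string[];
}

interface MiniToolContext {
  teamName: string;
  orgStore: MiniOrgStore;
}

// Simplified stand-in for an AI SDK tool definition.
interface MiniTool<I, O> {
  description: string;
  execute(input: I): O;
}

// Each execute() body closes over ctx; nothing is serialized or sent over HTTP.
function buildMiniOrgTools(ctx: MiniToolContext) {
  const listTeams: MiniTool<Record<string, never>, { teams: string[] }> = {
    description: "List the caller's direct children",
    execute: () => ({ teams: ctx.orgStore.childrenOf(ctx.teamName) }),
  };
  return { list_teams: listTeams };
}

// Usage: one tool set per session, each bound to its own context.
const demoStore: MiniOrgStore = {
  childrenOf: (team) => (team === "main" ? ["engineering", "research"] : []),
};
const mainTools = buildMiniOrgTools({ teamName: "main", orgStore: demoStore });
const childTools = buildMiniOrgTools({ teamName: "engineering", orgStore: demoStore });
```

Because the context is captured at construction time, two sessions that share a store still see only their own `teamName`.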
| Tool | Purpose | Enforcement |
|---|---|---|
| `spawn_team(name, config_path?, description?, scope_accepts?, init_context?, credentials?)` | Create a new team session from config. Optionally passes user context. Credentials are written directly to `team_vault` with `is_secret=1` (system-managed, not through the agent-facing `vault_set` tool). Child retrieves them with `vault_get`. Bootstrap task is auto-queued asynchronously (type: `bootstrap`, priority: `critical`). Returns `{ status: 'queued', bootstrap_task_id, message_for_user }` immediately; the caller MUST echo `message_for_user` to the user so they know the team is initialising. Bootstrap creates subagent definitions, skills, plugins, and active learning and reflection triggers per subagent (readiness gates per Architecture-Decisions#ADR-35, deterministic routing per ADR-40). TaskConsumer posts "[{name}] Team bootstrapped and ready." to the originating channel once the bootstrap session completes. | Records parent-child in org tree. Validates config. Bootstrap task inherits originating channel. |
| `delegate_task(team, task, priority?, overlap_policy?)` | Send a task to a child team. `overlap_policy` controls concurrency when the team already has an active task: `skip` (drop silently and return `status: 'skipped'`), `replace` (cancel the current task and enqueue the new one), `confirm` (return `{ requires_confirmation: true, message }` so the parent can seek user approval before re-invoking). Response: `{ status: 'queued' \| 'skipped' \| 'in_flight', task_id? }`. When `overlap_policy: 'confirm'`, no task is enqueued until the parent re-calls with explicit approval. | Validates caller is parent. Enqueues in priority queue. Threads originating channel for notification routing. |
| `query_team(team, query)` | Synchronously query a single child team and return its response. Prefer `query_teams` when querying 2 or more peers with independent inputs. | Validates caller is parent. |
| `query_teams(targets[], default_timeout_ms?)` | Fan out queries to multiple children in parallel. `targets = {team, query, timeout_ms?}[]` (cap: 5). Returns `{team, ok, result_or_error}[]` with partial-result semantics: one child's failure does not fail the whole call. Wall-clock = `max(child_duration)`, not `sum(...)`. Cancellation: if the parent aborts mid-fan-out, outstanding child calls are cancelled and do not mutate children's queues. | Validates caller is parent of every target. Each child call counts against that child's `max_concurrent_daily_ops` (ADR-41). |
| `escalate(message, reason?)` | Notification only. Send an informational message up the hierarchy to the caller's parent; does NOT create work in the parent's queue. Chain: child -> parent -> main -> user. For work handoff (parent should act), use `enqueue_parent_task`. | Delivers to immediate parent only. Logged with correlation ID. |
| `enqueue_parent_task(task, priority, correlation_id?)` | Work handoff to parent. Write a task into the immediate parent's `task_queue` so the parent's orchestrator routes it (ADR-43). Payload carries context only, not subagent directives; the parent still decides routing, preserving ADR-40. `correlation_id` is an opaque string (engine generates one if omitted). Guard: per-child per-minute rate cap (default 10/min); same `correlation_id` within 60 s is deduplicated. | Validates caller's parent exists. |
| `send_message(target, message)` | Direct message to parent or child | Validates target is caller's direct parent or child. |
| `get_status(team?)` | Query status of managed teams. Returns `{ active_daily_ops, saturation, org_op_pending, queue_depth, current_task?, pending_tasks[] }` per target (ADR-41). `saturation` is true when `active_daily_ops >= max_concurrent_daily_ops`. | Returns only the caller's own children. |
| `list_completed_tasks(team?, since?, limit?)` | Query completed tasks from `task_queue`. Default: caller's own team, last 7 days, limit 50. Returns: `task_id`, `duration_ms`, `status` (done/failed/cancelled), `result_snippet` (first 200 chars), `created_at`. Evidence source for self-reflection sessions. See Architecture-Decisions#ADR-37. | Read-only. Validates caller is parent of target team (or own team). |
| `list_teams(recursive?)` | List child teams with descriptions, scope keywords, and status for routing decisions | Returns only the caller's own children (and optionally their descendants). Provides team name, description, keywords, status, queue depth. |
| `update_team(team, scope_add?, scope_remove?)` | Add or remove scope keywords for a child team. No learning-specific fields; `scope_keywords` changes affect the next learning cycle automatically. | Validates caller is parent. Cannot leave team with zero scope. Returns resulting scope. |
| `shutdown_team(name, cascade?)` | End a child team's session and clean up all team data | Terminates session. Deletes all team-scoped DB rows (including `memories`, `memory_chunks`). Removes `.run/teams/{name}/` directory. With `cascade: true`, recursively shuts down all descendants depth-first. |
| `add_trusted_sender(channel_type, sender_id, channel_id?, trust_level?)` | Grant trust to a sender on a channel. Writes to `sender_trust` table with `granted_by="admin"`. | Validates `channel_type` exists. |
| `revoke_sender_trust(channel_type, sender_id, channel_id?)` | Remove trust or explicitly deny a sender. Revoking removes the row; denying sets `trust_level="denied"`. | Validates sender exists in DB. |
| `list_trusted_senders(channel_type?, trust_level?)` | View trust status across channels. Read-only query on `sender_trust` table. | None (read-only). |
| `register_plugin_tool({ tool_name: string, source_code: string })` | Register a team-local plugin tool with security scan + interface validation. Returns `{ success, tool }` or `{ success: false, error }`. | Security scan (forbidden patterns, secret detection), interface validation (description, parameters, execute), reserved name check. |
Note: The trust management tools (`add_trusted_sender`, `revoke_sender_trust`, `list_trusted_senders`) are available to the main agent only. Child teams cannot manage sender trust.

Note: `list_completed_tasks` returns historical task records for evidence gathering (e.g., self-reflection sessions). For real-time queue state, use `get_status`.
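The three `overlap_policy` values described for `delegate_task` can be read as a small decision function. The sketch below is an interpretation under stated assumptions: `TeamQueueState`, `resolveOverlap`, and the `approved` flag are hypothetical (the source does not say how explicit approval is passed on the re-call), and the `in_flight` status is not modeled because its exact meaning is not specified here.

```typescript
type OverlapPolicy = "skip" | "replace" | "confirm";

type DelegateResult =
  | { status: "queued"; task_id: string }
  | { status: "skipped" }
  | { requires_confirmation: true; message: string };

// Hypothetical in-memory view of one child team's queue.
interface TeamQueueState {
  activeTaskId: string | null;
  cancelled: string[];
  queued: string[];
}

function resolveOverlap(
  state: TeamQueueState,
  newTaskId: string,
  policy: OverlapPolicy,
  approved = false, // assumption: stands in for "re-call with explicit approval"
): DelegateResult {
  const active = state.activeTaskId;
  if (active === null) {
    // No overlap: the policy is irrelevant.
    state.queued.push(newTaskId);
    return { status: "queued", task_id: newTaskId };
  }
  if (policy === "skip") {
    return { status: "skipped" };
  }
  if (policy === "replace") {
    state.cancelled.push(active);
    state.activeTaskId = null;
    state.queued.push(newTaskId);
    return { status: "queued", task_id: newTaskId };
  }
  // policy === "confirm": nothing is enqueued until approval arrives.
  if (!approved) {
    return { requires_confirmation: true, message: `Team is busy with ${active}; enqueue anyway?` };
  }
  state.queued.push(newTaskId);
  return { status: "queued", task_id: newTaskId };
}
```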
`query_teams` is the canonical fan-out pattern for querying multiple peers concurrently. Wall-clock becomes max(child_duration) instead of sum(...), which is the primary fix for the 19-minute serial `query_team` chain documented in Scenarios#19-Minute Trading Cycle Evidence.
sequenceDiagram
participant Parent as Parent Team
participant A as Child A
participant B as Child B
participant C as Child C
Parent->>Parent: query_teams([A, B, C], default_timeout_ms=150000)
par parallel fan-out
Parent->>A: query
and
Parent->>B: query
and
Parent->>C: query
end
A-->>Parent: {team:"A", ok:true, result_or_error:"..."}
B-->>Parent: {team:"B", ok:false, result_or_error:"timeout"}
C-->>Parent: {team:"C", ok:true, result_or_error:"..."}
Note over Parent: wall-clock = max(child_duration)<br/>partial-result tolerated
Result shape: each entry is {team: string, ok: boolean, result_or_error: string}. On success ok: true and result_or_error contains the child's response. On failure ok: false and result_or_error contains the error message.
Rules:
- Fan-out cap: 5 children per call. Higher fan-outs must be split across multiple calls or reconsidered. Calling with 1 target is allowed but `query_team` (singular) is preferred.
- Each child call consumes a slot in the target's daily-ops pool (ADR-41). If the target is saturated, it returns a `saturation` error for that child; siblings continue.
- Per-child `timeout_ms` overrides `default_timeout_ms`. Expired children return `{ok: false, result_or_error: "timeout"}`.
- Partial-result semantics: orchestrator decides whether to proceed with partial results or retry the failed subset. Documented in Tool-Guidelines#query_teams Partial Failure.
- Cancellation: parent abort cancels all outstanding child calls; no child queue mutation for cancelled calls.
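The fan-out rules above (cap of 5, per-child timeout override, partial results) can be sketched as follows. This is a minimal illustration under assumptions: `queryTeamsSketch`, `withTimeout`, and the `QueryChild` callback are hypothetical names, the way a single child is actually queried is not shown in the source, and parent-abort cancellation and daily-ops slot accounting are not modeled.

```typescript
interface QueryTarget { team: string; query: string; timeout_ms?: number }
interface QueryResult { team: string; ok: boolean; result_or_error: string }

const FAN_OUT_CAP = 5;

// Hypothetical transport for one child query.
type QueryChild = (team: string, query: string) => Promise<string>;

// Reject with "timeout" if the work does not settle within ms.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("timeout")), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

async function queryTeamsSketch(
  targets: QueryTarget[],
  queryChild: QueryChild,
  defaultTimeoutMs: number,
): Promise<QueryResult[]> {
  if (targets.length > FAN_OUT_CAP) {
    throw new Error(`fan-out cap exceeded: at most ${FAN_OUT_CAP} targets per call`);
  }
  // All children start immediately, so wall-clock is max(child_duration).
  return Promise.all(
    targets.map(async ({ team, query, timeout_ms }): Promise<QueryResult> => {
      try {
        const result = await withTimeout(queryChild(team, query), timeout_ms ?? defaultTimeoutMs);
        return { team, ok: true, result_or_error: result };
      } catch (err) {
        // One child's failure becomes its own entry; siblings are unaffected.
        return { team, ok: false, result_or_error: err instanceof Error ? err.message : String(err) };
      }
    }),
  );
}
```

Each child promise catches its own error, so `Promise.all` never rejects on a single child's failure; the caller inspects `ok` per entry and decides whether to retry the failed subset.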
`enqueue_parent_task` is the canonical work-handoff pattern from a child to its immediate parent (ADR-43). It preserves the ADR-40 hierarchy: the payload carries context only; the parent's orchestrator still routes.
flowchart LR
subgraph Child[Child Team - e.g., during window tick]
CE[Event detected]
CE --> CC{Signal type?}
end
CC -->|informational| Esc[escalate<br/>upward notification<br/>no parent queue insertion]
CC -->|work handoff| EPT[enqueue_parent_task<br/>priority + context + correlation_id]
subgraph Parent[Parent Team]
PQ[task_queue<br/>priority admission - ADR-9]
PO[Orchestrator<br/>routing decision - ADR-40]
PS[Subagent executes]
PQ --> PO
PO --> PS
end
EPT --> PQ
Esc -. info only .-> PO
Guards:
- Rate cap: per-child per-minute enqueue cap (default 10 / min).
- Dedup: the same `correlation_id` within a 60 s window is deduplicated.
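The two guards can be sketched as a small in-memory checker. The following is an illustration only: `EnqueueGuard` is a hypothetical name, the sliding-window interpretation of "per minute", the per-child scoping of dedup keys, and the order of the two checks are all assumptions, and a real implementation would presumably persist this state rather than hold it in memory.

```typescript
const RATE_CAP_PER_MINUTE = 10; // default from the text
const MINUTE_MS = 60000;
const DEDUP_WINDOW_MS = 60000;

type GuardDecision = "accepted" | "deduplicated" | "rate_limited";

class EnqueueGuard {
  // child -> timestamps of accepted enqueues in the last minute
  private recent = new Map<string, number[]>();
  // "child:correlation_id" -> timestamp of the last accepted enqueue
  private seen = new Map<string, number>();

  check(child: string, correlationId: string, now: number): GuardDecision {
    const dedupKey = `${child}:${correlationId}`;
    const last = this.seen.get(dedupKey);
    if (last !== undefined && now - last < DEDUP_WINDOW_MS) {
      return "deduplicated";
    }
    // Keep only the enqueues that still fall inside the sliding minute.
    const stamps = (this.recent.get(child) ?? []).filter((t) => now - t < MINUTE_MS);
    if (stamps.length >= RATE_CAP_PER_MINUTE) {
      this.recent.set(child, stamps);
      return "rate_limited";
    }
    stamps.push(now);
    this.recent.set(child, stamps);
    this.seen.set(dedupKey, now);
    return "accepted";
  }
}
```

Passing `now` explicitly keeps the guard deterministic, which makes the window boundaries easy to test.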
When to choose which:
- `escalate`: "I observed X, FYI." Parent decides whether to act.
- `enqueue_parent_task`: "I observed X; please do Y. Priority: high." Work enters the parent's queue directly.
| Tool | Purpose | Enforcement |
|---|---|---|
| `vault_set(key, value)` | Store a key-value pair in the team vault. Only handles team state; secret rows (`is_secret=1`) are system-managed (`spawn_team`, migration, admin API). | Scoped to caller's team. |
| `vault_get(key)` | Retrieve a value from the team vault | Secret values scrubbed from audit logs. Returns raw value to caller. |
| `vault_list(prefix?)` | List keys in the team vault | Values omitted for `is_secret=1` entries. Returns keys and metadata only for secrets. |
| `vault_delete(key)` | Remove a key from the team vault | Rejects if key has `is_secret=1`. Scoped to caller's team. |
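The secret-row rules in the vault table can be illustrated with a small in-memory model. This is a sketch under assumptions, not the SQLite-backed `VaultStore`: `TeamVaultSketch` and `systemPutSecret` are hypothetical names, one instance stands for one team's scope, rejecting an agent-side overwrite of a secret key is an assumption drawn from "secret rows are system-managed", and returning values for non-secret rows in the listing is likewise assumed.

```typescript
interface VaultRow { value: string; isSecret: boolean }
interface VaultListEntry { key: string; is_secret: boolean; value?: string }

class TeamVaultSketch {
  private rows = new Map<string, VaultRow>();

  // System-managed path (e.g. spawn_team credentials); not exposed to the agent.
  systemPutSecret(key: string, value: string): void {
    this.rows.set(key, { value, isSecret: true });
  }

  // Agent-facing vault_set: team state only. Assumption: secret keys cannot be overwritten here.
  vaultSet(key: string, value: string): void {
    if (this.rows.get(key)?.isSecret) {
      throw new Error(`'${key}' is a secret row and is system-managed`);
    }
    this.rows.set(key, { value, isSecret: false });
  }

  // vault_get returns the raw value to the caller, including secrets.
  vaultGet(key: string): string | undefined {
    return this.rows.get(key)?.value;
  }

  // vault_list omits values for secret rows.
  vaultList(prefix = ""): VaultListEntry[] {
    const entries: VaultListEntry[] = [];
    this.rows.forEach((row, key) => {
      if (!key.startsWith(prefix)) return;
      entries.push(row.isSecret ? { key, is_secret: true } : { key, is_secret: false, value: row.value });
    });
    return entries;
  }

  // vault_delete rejects secret rows.
  vaultDelete(key: string): boolean {
    if (this.rows.get(key)?.isSecret) {
      throw new Error(`'${key}' has is_secret=1 and cannot be deleted by the agent`);
    }
    return this.rows.delete(key);
  }
}
```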
| Tool | Purpose | Enforcement |
|---|---|---|
| `memory_save(key, content, type?, supersede_reason?)` | Save or supersede a memory entry | Scoped to caller's team via `OrgToolContext.teamName`. Requires `supersede_reason` if active entry with same key exists. See Memory-System#Tool Specification. |
| `memory_delete(key)` | Soft-delete a memory entry | Scoped to caller's team. Sets `is_active = 0`. Deleted entries excluded from search. |
| `memory_search(query, max_results?)` | Search team memory using hybrid FTS5 + vector similarity | Scoped to caller's team. Returns active + superseded entries. |
| `memory_list(type?)` | List active memory entries | Scoped to caller's team. Optionally filtered by type. |
| Tool | Purpose | Enforcement |
|---|---|---|
| `create_trigger(team, name, type, config, task, subagent?, skill?, max_turns?, failure_threshold?, overlap_policy?)` | Create a new trigger in pending state. `subagent` is optional: if set, orchestrator routes directly; if null, orchestrator decides. `skill` requires `subagent` to be set (prevents direct skill addressing per ADR-40). | Validates caller is parent. Stores in SQLite `trigger_configs` table. |
| `enable_trigger(team, trigger_name)` | Activate a pending or disabled trigger | Validates caller is parent. Sets state to active, registers handler. Resets `overlap_count` to 0. |
| `disable_trigger(team, trigger_name)` | Deactivate a trigger | Validates caller is parent. Sets state to disabled, unregisters handler. Resets `overlap_count` to 0. |
| `list_triggers(team)` | List all triggers and their states | Returns trigger name, type, state, failure count, overlap policy, overlap count, active task ID. |
| `test_trigger(team, trigger_name, max_turns?, overlap_policy?)` | Fire a trigger once for testing. `overlap_policy` controls concurrency when the team is busy: `skip` (drop silently), `replace` (cancel current task and enqueue), `confirm` (return `{ requires_confirmation: true, message }` for parent approval before enqueue). Returns `{ taskId, status: 'queued' \| 'skipped' \| 'in_flight' }`. When `overlap_policy: 'confirm'`, no task is enqueued until the parent re-calls with explicit approval. | Enqueues task without changing trigger state. Does not set `active_task_id` (not subject to overlap tracking). |
| `update_trigger(team, trigger_name, config?, task?, subagent?, skill?, max_turns?, failure_threshold?, overlap_policy?)` | Update an existing trigger's config, task, targeting, or settings. `subagent` and `skill` follow the same constraint as `create_trigger`. | Validates caller is parent. If trigger is active, handlers are re-registered automatically. |
For trigger engine internals (circuit breaker, overlap policy, failure handling), see Triggers.
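The targeting constraint and state transitions from the trigger table can be sketched as plain functions. This is an illustrative model only: `TriggerRecord`, `createTriggerSketch`, and `setTriggerEnabled` are hypothetical names, and handler registration, parent validation, and persistence in `trigger_configs` are left out.

```typescript
type TriggerState = "pending" | "active" | "disabled";

interface TriggerRecord {
  name: string;
  state: TriggerState;
  subagent: string | null;
  skill: string | null;
  overlapCount: number;
}

// New triggers start in the pending state; a skill may only be set together with a subagent.
function createTriggerSketch(
  name: string,
  subagent: string | null = null,
  skill: string | null = null,
): TriggerRecord {
  if (skill !== null && subagent === null) {
    throw new Error("skill requires subagent to be set (no direct skill addressing, ADR-40)");
  }
  return { name, state: "pending", subagent, skill, overlapCount: 0 };
}

// enable_trigger / disable_trigger both reset overlap_count to 0.
function setTriggerEnabled(trigger: TriggerRecord, enabled: boolean): TriggerRecord {
  return { ...trigger, state: enabled ? "active" : "disabled", overlapCount: 0 };
}
```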
| Tool | Purpose | Enforcement |
|---|---|---|
| `browser_navigate(url)` | Navigate to a URL | Team must have `browser:` config. SSRF protection blocks private/reserved IPs. Domain allowlist enforced if `allowed_domains` specified. |
| `browser_snapshot()` | Take an accessibility snapshot of the current page | Returns YAML accessibility tree with element refs. |
| `browser_screenshot()` | Take a visual screenshot | Returns ImageContent (base64 PNG). |
| `browser_click(element?, ref?)` | Click an element on the page | Uses element text or ref from snapshot. |
| `browser_type(text, element?, ref?)` | Type text into an element | Targets element by text or ref. |
| `browser_go_back()` | Navigate back in browser history | |
| `browser_go_forward()` | Navigate forward in browser history | |
| `browser_close()` | Close the browser tab | |
For SSRF protection, domain allowlist, and BrowserRelay lifecycle, see Browser-Proxy.
| Tool | Purpose | Enforcement |
|---|---|---|
| `web_fetch(url, method?, headers?, body?, rate_limit_key?)` | Fetch a URL and return the response body. Optional `rate_limit_key` selects a per-team per-domain token bucket configured in Team-Configuration; this is useful for window-trigger ticks that scan external sources on every tick. | SSRF protection via `validateBrowserUrl()`. Domain allowlist enforced. Unknown `rate_limit_key` falls back to no extra throttling (domain allowlist still applies). |
For the canonical specification, see Browser-Proxy#Web Fetch Tool.
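For readers unfamiliar with token buckets, the `rate_limit_key` behavior can be pictured roughly as below. This is a generic token-bucket sketch, not the project's implementation: `TokenBucket`, `configuredBuckets`, and `admitFetch` are hypothetical names, the capacity and refill rate are invented for illustration, and the actual configuration format lives in Team-Configuration.

```typescript
class TokenBucket {
  private capacity: number;
  private refillPerSec: number;
  private tokens: number;
  private last: number;

  constructor(capacity: number, refillPerSec: number, now: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity;
    this.last = now;
  }

  // Refill according to elapsed time, then try to spend one token.
  tryTake(now: number): boolean {
    const elapsedSec = (now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.last = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Hypothetical configured buckets, keyed by rate_limit_key (values are illustrative).
const configuredBuckets = new Map<string, TokenBucket>([
  ["news-scan", new TokenBucket(2, 1, 0)],
]);

function admitFetch(rateLimitKey: string | undefined, now: number): boolean {
  if (rateLimitKey === undefined) return true;
  const bucket = configuredBuckets.get(rateLimitKey);
  // Unknown key: no extra throttling (the domain allowlist is checked elsewhere).
  return bucket ? bucket.tryTake(now) : true;
}
```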
| Tool | Purpose | Enforcement |
|---|---|---|
| `search_skill_repository(query, filters?)` | Search the Vercel skills ecosystem (skills.sh) for skills matching a natural language description. Returns matches with install counts, source reputation, and match scores. Only matches ≥60% are returned. | Opt-in via `allowed_tools`. Read-only (no side effects). Sources pre-defined in container. |
For the full adoption flow, format conversion, and trust signals, see Skill-Repository.
| Guardrail | Enforced In | Implementation |
|---|---|---|
| Parent-child validation | `assertIsParentOf(orgTree, teamName, team)` in `execute()` body | Every tool that targets a child team checks the org tree before proceeding. Rejects with clear error if caller is not the parent. |
| Workspace boundary | `assertInsideBoundary(path, cwd)` | File-system tools validate that all paths resolve within the team's workspace. Prevents traversal attacks. |
| SSRF protection | `validateBrowserUrl(url, allowedDomains)` | Blocks private/reserved IP ranges, validates against domain allowlist. See Browser-Proxy#SSRF Protection. |
| Credential scrubbing | `withAudit()` wrapper on all tools | Wraps every tool call with audit logging. Scrubs credential values from logs. |
| Deny-by-default | `activeTools` filter from `config.yaml` `allowed_tools` | Only tools explicitly listed in the team's `allowed_tools` config are made available to the session. Unlisted tools are not registered. |
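Two of these guardrails, the workspace boundary check and the deny-by-default filter, are simple enough to sketch. The functions below are illustrative, not the project's `assertInsideBoundary` or `activeTools` code: `assertInsideBoundarySketch` and `filterAllowedTools` are hypothetical names, the argument order is chosen for this example only, and symlink resolution is not handled.

```typescript
import * as path from "node:path";

// Resolve the candidate against the boundary and reject anything that escapes it.
function assertInsideBoundarySketch(candidate: string, boundary: string): string {
  const root = path.resolve(boundary);
  const resolved = path.resolve(root, candidate);
  const rel = path.relative(root, resolved);
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`PATH_OUTSIDE_BOUNDARY: ${candidate}`);
  }
  return resolved;
}

// Deny-by-default: only tools named in allowed_tools survive; unknown names are ignored.
function filterAllowedTools<T>(all: Record<string, T>, allowed: string[]): Record<string, T> {
  const active: Record<string, T> = {};
  for (const name of allowed) {
    if (Object.prototype.hasOwnProperty.call(all, name)) {
      active[name] = all[name] as T;
    }
  }
  return active;
}
```

Checking the relative path rather than a string prefix avoids treating `/tmp/team-evil` as being inside `/tmp/team`.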
Plugin tools (team-local TypeScript tools in .run/teams/{name}/plugins/) require additional security guards beyond the standard guardrails.
| Guard | Detection | Enforcement |
|---|---|---|
| Forbidden pattern detection | AST analysis for shell injection, dynamic code evaluation, unsandboxed network requests | Reject tool creation with FORBIDDEN_PATTERN error |
| Secret scanning | Regex + entropy analysis for hardcoded credentials | Reject tool creation with SECRET_DETECTED error |
| Reserved name collision | Check against built-in tool names | Reject with error: "Tool name '{name}' is reserved" |
Plugin tool verification checks three aspects:
| Check | What It Validates |
|---|---|
| TypeScript | Syntax validity; reports any compilation errors |
| Interface | Presence of description, parameters (Zod schema), and execute function |
| Security | Forbidden patterns, detected secrets, and overall pass/fail |
| Guard | Enforcement Point | Error Code |
|---|---|---|
| Filesystem boundary | `assertInsideBoundary(runDir, path)` in tool `execute()` | `PATH_OUTSIDE_BOUNDARY` |
| Audit logging | `withAudit()` wrapper on plugin tool calls | N/A (logs all invocations) |
| Namespace isolation | `loadPluginTools(teamName)` returns only team-local tools | N/A (structural) |
All guard violations produce structured, actionable error messages containing: the error code (e.g., SECURITY VIOLATION: FORBIDDEN_PATTERN), the pattern name, line number, the offending line (redacted), and suggested remediation.
Every plugin tool invocation logs:
- SHA-256 hash of tool code
- Prompt context (truncated)
- Timestamp
- Team/user context
- Result: success/blocked
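An audit record carrying the fields listed above might be assembled along these lines. This is a sketch, not the `withAudit()` wrapper itself: `PluginAuditRecord`, `buildPluginAuditRecord`, the field names, and the 200-character truncation limit are assumptions, and user context is omitted for brevity. Only Node's built-in `crypto` module is used.

```typescript
import { createHash } from "node:crypto";

interface PluginAuditRecord {
  code_sha256: string;
  prompt_context: string;
  timestamp: string;
  team: string;
  result: "success" | "blocked";
}

function buildPluginAuditRecord(
  sourceCode: string,
  promptContext: string,
  team: string,
  result: "success" | "blocked",
  now: Date = new Date(),
  maxContextChars = 200, // hypothetical truncation limit
): PluginAuditRecord {
  return {
    // Hash the code rather than logging it, so the log identifies the exact version without storing it.
    code_sha256: createHash("sha256").update(sourceCode, "utf8").digest("hex"),
    prompt_context: promptContext.slice(0, maxContextChars),
    timestamp: now.toISOString(),
    team,
    result,
  };
}
```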
sequenceDiagram
participant User
participant Main as Main Agent
participant Eng as Engineering Orchestrator
participant SA as react-dev Subagent
participant SK as Skill
participant PL as Plugin
User->>Main: "Build a login page"
Note over Main: Routes only. Identifies engineering team.
Main->>Eng: delegate_task("engineering", "Build login page")
Note over Eng: Reads subagent defs → picks react-dev
Eng->>SA: invoke react-dev subagent
Note over SA: Context: react-dev.md + skills + task
SA->>SK: follow build-ui skill steps
SK->>PL: scaffold_component.ts → generate component
PL-->>SK: component created
SK-->>SA: build complete
SA-->>Eng: "Login page UI complete"
Eng-->>Main: Result: "Login page complete"
Main-->>User: "Login page is ready"
Routing decisions are made by the LLM (parent agent), not by keyword-matching code. The list_teams tool provides the parent with all the information it needs to choose the right child team for a task.
- Parent calls `list_teams(recursive: true)` to see its children (and optionally the full subtree)
- Each entry includes: team name, description, scope keywords (as routing hints and learning domain signal), status, and pending queue depth
- The parent LLM reads the team descriptions and keywords, then decides which child best fits the task
- Parent calls `delegate_task` or `query_team` targeting the chosen child
Scope keywords are stored in the scope_keywords SQLite table and serve a dual purpose: they are routing hints for the LLM and the learning domain signal for autonomous learning. They help the parent agent understand what each team handles, and they define the domain the team's learning system explores. There is no automated keyword-matching gate; the LLM makes the routing decision using the full context of the task and the available teams.
When creating teams without a config_path, providing scope_accepts keywords is recommended so that list_teams can surface useful routing metadata to parent agents and the learning system can derive its search domain.
Every tool is either daily-ops (can run concurrently with other daily-ops on the same team, up to max_concurrent_daily_ops) or org-ops (single-flight per team via the per-team mutex). See Architecture#Execution Model for the canonical pool + mutex diagram and Architecture-Decisions#ADR-41 Daily-ops vs Org-ops Concurrency for the six resource classes.
| Tool | Class | Notes |
|---|---|---|
| `delegate_task` | daily-ops | Insert is append-only; task consumption is single-threaded per team. |
| `query_team`, `query_teams` | daily-ops | Read-heavy peer RPC. Each child call consumes a slot in the target's daily-ops pool. |
| `escalate` | daily-ops | Append-only notification. |
| `enqueue_parent_task` | daily-ops | Priority queue insert at parent. |
| `send_message` | daily-ops | Append-only. |
| `get_status`, `list_teams`, `list_completed_tasks` | daily-ops | Read-only. |
| `list_trusted_senders` | daily-ops | Read-only. |
| `memory_save`, `memory_delete`, `memory_search`, `memory_list` | daily-ops | Per-`subject_key` lock on writes (class 3); FTS index follows (class 4). |
| `vault_get` | daily-ops | Read-only. |
| `vault_set` (non-secret), `vault_delete` (non-secret), `vault_list` | daily-ops | Version-check upsert (class 3). |
| `web_fetch` | daily-ops | Network I/O only; per-domain rate buckets optional. |
| `browser_*` | daily-ops | Subject to team's BrowserRelay session; not a DB write. |
| `create_trigger`, `enable_trigger`, `disable_trigger`, `test_trigger`, `list_triggers` | daily-ops | Trigger engine ops; `trigger_configs.active_task_id`/`overlap_count` mutations are serialized by the engine itself. |
| `search_skill_repository` | daily-ops | Read-only. |
| `spawn_team`, `shutdown_team`, `update_team` | org-ops | Structural topology (class 5). |
| `update_trigger` | org-ops | Structural config change (class 5). |
| `vault_set` (secret rows, `is_secret=1`) | org-ops | Governance / security state (class 6). |
| `register_plugin_tool` | org-ops | Plugin lifecycle (class 6). |
| `add_trusted_sender`, `revoke_sender_trust` | org-ops | Security state (class 6). |
Org-ops take the per-team mutex: in-flight daily-ops are allowed to drain, new daily-ops admission is blocked, then the org-op runs single-flight. No mid-flight abort.
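The drain-then-run behavior can be modeled as a small admission state machine. The sketch below is a simplified, synchronous illustration: `TeamAdmission` and its method names are hypothetical, only one pending org-op is tracked at a time, and the real pool and mutex are presumably asynchronous. See Architecture#Execution Model for the canonical design.

```typescript
class TeamAdmission {
  private maxConcurrentDailyOps: number;
  private activeDailyOps = 0;
  private orgOpPending = false;
  private orgOpRunning = false;

  constructor(maxConcurrentDailyOps: number) {
    this.maxConcurrentDailyOps = maxConcurrentDailyOps;
  }

  // Daily-ops are admitted up to the cap, unless an org-op is waiting or running.
  tryStartDailyOp(): boolean {
    if (this.orgOpPending || this.orgOpRunning) return false;
    if (this.activeDailyOps >= this.maxConcurrentDailyOps) return false;
    this.activeDailyOps += 1;
    return true;
  }

  finishDailyOp(): void {
    if (this.activeDailyOps > 0) this.activeDailyOps -= 1;
  }

  // Requesting an org-op blocks new daily-op admission immediately.
  requestOrgOp(): void {
    this.orgOpPending = true;
  }

  // The org-op starts only after in-flight daily-ops have drained; nothing is aborted.
  tryStartOrgOp(): boolean {
    if (!this.orgOpPending || this.orgOpRunning || this.activeDailyOps > 0) return false;
    this.orgOpPending = false;
    this.orgOpRunning = true;
    return true;
  }

  finishOrgOp(): void {
    this.orgOpRunning = false;
  }

  status(): { active_daily_ops: number; saturation: boolean; org_op_pending: boolean } {
    return {
      active_daily_ops: this.activeDailyOps,
      saturation: this.activeDailyOps >= this.maxConcurrentDailyOps,
      org_op_pending: this.orgOpPending,
    };
  }
}
```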
Each team has a priority admission-order queue managed in SQLite.
- Daily-ops are concurrent up to `max_concurrent_daily_ops` (default 5) per team. Org-ops are single-flight per team via the mutex. See Architecture#Execution Model.
- `delegate_task` adds tasks to the queue with an optional priority (default: `normal`)
- Tasks are ordered by priority level, then FIFO within the same priority
- No mid-task preemption: only pending tasks can be reordered (ADR-9; applies to org-ops and structural work. Daily-ops admission is unordered within the cap by design)
- `get_status` returns `{ active_daily_ops, saturation, org_op_pending, queue_depth, current_task?, pending_tasks[] }`
- Queue state is durable in SQLite and survives container restarts
| Priority | Use Case |
|---|---|
| `critical` | Incidents, security alerts |
| `high` | User-facing requests, escalations |
| `normal` | Standard delegated work (default) |
| `low` | Background maintenance, cleanup |
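The ordering rule (priority level first, FIFO within a level) amounts to a two-key sort over pending tasks. A minimal sketch, assuming a monotonically increasing `seq` assigned at insert time; `PendingTask`, `PRIORITY_RANK`, and `admissionOrder` are hypothetical names, and the real queue is stored in SQLite rather than sorted in memory.

```typescript
type Priority = "critical" | "high" | "normal" | "low";

// Lower rank is admitted first.
const PRIORITY_RANK: Record<Priority, number> = { critical: 0, high: 1, normal: 2, low: 3 };

interface PendingTask {
  id: string;
  priority: Priority;
  seq: number; // assumption: insertion sequence number, used for FIFO within a priority
}

// Only pending tasks are reordered; a running task is never preempted.
function admissionOrder(pending: PendingTask[]): PendingTask[] {
  return pending
    .slice()
    .sort((a, b) => PRIORITY_RANK[a.priority] - PRIORITY_RANK[b.priority] || a.seq - b.seq);
}
```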
For end-to-end operational scenarios, see Scenarios.