Self Evolution - Z-M-Huang/openhive GitHub Wiki
Agents can propose changes to their own rules, skills, and subagent definitions. This is the mechanism by which the system improves over time without human intervention for every change.
This page covers rule governance and the self-evolution protocol. For the overall rule system, see Rules-Architecture. For skill files, see Skills. For subagent definitions, see Subagents.
Not all rules are equal. Different rule types have different modification policies.
| Rule Type | Who Can Modify | Approval Required |
|---|---|---|
| `/app/system-rules/*.md` | Nobody (baked into Docker image) | Rebuild image required |
| `/data/rules/*.md` | Admin only (volume mount) | Admin approval (out-of-band) |
| `.run/teams/main/org-rules/*.md` | Main agent | Tool guards allow (own org-rules); audited |
| `.run/teams/{name}/org-rules/*.md` | Team orchestrator | Tool guards allow (own org-rules); audited; affects all descendants |
| `.run/teams/{name}/team-rules/*.md` | Team orchestrator | Tool guards allow (own team-rules); audited; team-scoped impact |
| `.run/teams/{name}/skills/*.md` | Team orchestrator | Tool guards allow (own directory); audited |
| `.run/teams/{name}/subagents/*.md` | Team orchestrator | Tool guards allow (own directory); audited |
Rule governance is enforced by inline tool guards (tool-guards.ts, 229 lines) and audit wrappers (tool-audit.ts, 117 lines). When an agent attempts to write to a rule file, the guard:
- Identifies the file type based on its path (global, org-rule, team-rule, skill, subagent).
- Checks authorization. Is this agent allowed to modify this type of file?
- Validates scope. A team can only modify files in its own directories. It cannot modify a parent's org-rules or a sibling's team-rules.
- Logs the change. Every rule modification is logged with the team name, file path, and a summary of the change.
- Blocks or allows the write based on the above checks.
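The guard logic above can be sketched as follows. This is a minimal illustration, not the actual tool-guards.ts API: the function names and the exact path matching are assumptions.

```typescript
// Illustrative sketch of an inline rule-file guard.
// Names and path handling are assumptions, not the real tool-guards.ts code.
type RuleFileType = "global" | "org-rule" | "team-rule" | "skill" | "subagent";

function classifyRuleFile(path: string): RuleFileType | null {
  // Both admin-only tiers are treated as "global" here: agents may write neither.
  if (path.startsWith("/app/system-rules/") || path.startsWith("/data/rules/")) return "global";
  if (/^\.run\/teams\/[^/]+\/org-rules\//.test(path)) return "org-rule";
  if (/^\.run\/teams\/[^/]+\/team-rules\//.test(path)) return "team-rule";
  if (/^\.run\/teams\/[^/]+\/skills\//.test(path)) return "skill";
  if (/^\.run\/teams\/[^/]+\/subagents\//.test(path)) return "subagent";
  return null; // not a governed rule file
}

function assertWriteAllowed(team: string, path: string): void {
  const type = classifyRuleFile(path);
  if (type === null) return; // ordinary file, no governance check
  if (type === "global") {
    throw new Error(`Global rules are admin-only: ${path}`);
  }
  // Scope check: the path must sit under the writing team's own directory.
  if (!path.startsWith(`.run/teams/${team}/`)) {
    throw new Error(`${team} may not modify another team's rules: ${path}`);
  }
  // Audit: every allowed rule write is logged.
  console.log(`[audit] ${team} modified ${type} file ${path}`);
}
```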
- A child team cannot modify its parent's org-rules (those cascade down and affect other teams)
- No team can modify global rules (admin-only)
- A team cannot modify another team's rules (scope boundary)
- All rule modifications are audited
- A team can modify its own `team-rules/`, `skills/`, and `subagents/` (with logging)
- A team can modify its own `org-rules/` (with logging -- this affects descendants)
- Agents can propose modifications through the self-evolution protocol (see below)
Self-evolution begins at team creation. When a team is spawned, it runs a bootstrap task that creates its initial skills, memory, and configuration. This is the "seed" of the evolution cycle — the team starts with a baseline it creates itself, then refines those skills through the detect-propose-validate-apply-monitor cycle as it encounters real tasks.
| Phase | Action |
|---|---|
| Detect | Subagent encounters a problem or inefficiency during normal work |
| Propose | Subagent drafts a specific change: which file, current content, proposed content, and rationale |
| Escalate | Subagent escalates the proposal to the orchestrator for confirmation (ADR-40) |
| Validate | Orchestrator reviews proposal. Tool guards check authorization and scope (see #Governance Validation Layers) |
| Apply | Orchestrator applies the approved change via the standard Edit tool with audit logging. Subagents do NOT apply changes directly. |
| Monitor | Subagent observes outcomes; proposes revert via escalation if results degrade |
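A proposal drafted in the Propose phase might be represented like this. The field names are illustrative: the wiki specifies the proposal's content (file, current content, proposed content, rationale), not its schema.

```typescript
// Illustrative shape of an evolution proposal; the schema is an assumption.
interface EvolutionProposal {
  file: string;            // which file to modify
  currentContent: string;  // the relevant section as it exists today
  proposedContent: string; // the replacement text
  rationale: string;       // why the change is needed
}

// Render a proposal in a diff-like form for the orchestrator to review.
function formatProposal(p: EvolutionProposal): string {
  return [
    `File: ${p.file}`,
    `Rationale: ${p.rationale}`,
    "--- current",
    p.currentContent,
    "+++ proposed",
    p.proposedContent,
  ].join("\n");
}
```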
```mermaid
sequenceDiagram
    participant SA as Subagent
    participant Orch as Orchestrator
    SA->>SA: Detect issue in own skill/subagent/plugin file
    SA->>SA: Draft proposal (file, current, proposed, rationale)
    SA->>Orch: escalate(proposal)
    Note over Orch: Reviews proposal against boundaries + team goals
    alt Approved
        Orch->>Orch: Apply change via Edit tool (governance guards still enforce)
        Orch-->>SA: "Change applied"
    else Rejected
        Orch-->>SA: "Change rejected: [reason]"
    end
```
Subagents can identify issues in their own subagent.md, skills/*.md, and plugins/*.ts files. They propose changes but escalate to the orchestrator for confirmation before any write. The orchestrator applies the change (governance guards still enforce scope/authorization).
1. Detect. The agent notices a problem during normal work. Examples:
- A skill's steps are outdated (a command no longer works)
- A rule conflicts with observed requirements
- A procedure is missing a step that the agent keeps adding manually
2. Propose. The agent formulates a specific change. The proposal includes:
- Which file to modify
- The current content (relevant section)
- The proposed new content
- Why the change is needed
3. Validate. The tool guard checks:
- Is this agent authorized to modify this file type?
- Is the file within this agent's scope?
- Is the change logged for audit?
Org-rules changes (which affect descendants) are allowed for the team's own org-rules directory. Changes to other teams' directories are blocked by assertGovernanceAllowed().
4. Apply. The orchestrator applies the approved change via the standard Edit tool. The withAudit() wrapper logs the modification. Subagents do NOT apply changes directly — they escalate to the orchestrator for confirmation (see #Evolution Flow).
5. Monitor. After applying, the subagent observes whether the change improves outcomes. If results degrade, the subagent can propose a revert via escalation and log what went wrong in its memory.
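The audit step can be pictured as a higher-order wrapper in the spirit of withAudit(). The real tool-audit.ts signature is not shown on this page, so the shape below is an assumption (and synchronous for brevity, where the real tools are presumably async).

```typescript
// Hypothetical sketch of an audit wrapper; not the actual tool-audit.ts API.
// It logs who changed what, and when, before delegating to the wrapped tool.
function withAudit<A extends { path: string }, R>(
  team: string,
  toolName: string,
  execute: (args: A) => R,
): (args: A) => R {
  return (args: A): R => {
    console.log(
      `[audit] team=${team} tool=${toolName} path=${args.path} at=${new Date().toISOString()}`,
    );
    return execute(args);
  };
}
```

Wrapped once at tool-registration time, a pattern like this logs every rule-file write regardless of which agent invoked the tool.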
Skills evolve through the same detect-propose-validate-apply-monitor cycle. An agent using a skill notices a problem (step fails, output is wrong), proposes a fix to the skill file, and the tool guard validates the modification.
Because skills are separate from agent identity, a skill revision benefits all agents that reference it without requiring changes to their subagent definitions.
When revising a skill, search the skill repository for updated or alternative patterns that may have been contributed to the Vercel skills ecosystem since the skill was last modified. See Skill-Repository.
- Agents cannot bypass governance guards. The guards are code-enforced (TypeScript inline in each tool's execute()), not rule-enforced.
- System rules cannot be modified by anyone at runtime (baked into the Docker image). Admin org rules (`/data/rules/`) can only be changed by admins via the volume mount.
- Self-evolution changes are always logged. The audit trail shows who changed what, when, and why.
- If an agent's proposed change is rejected by the governance guard, the agent receives an error explaining why. It can escalate to its parent if it believes the change is necessary.
The following is an example of the sdk-capabilities.md global rule. Its purpose is to document what the session engine provides out of the box, so agents do not reinvent built-in features or request capabilities they already have.
This rule lives at /app/system-rules/sdk-capabilities.md (baked into image) and is loaded into every agent's systemPrompt.
This rule documents the tools and features available through the AI SDK — built-in tools, subagents, organization tools, file system boundaries, session lifecycle, and communication patterns. It prevents agents from reinventing built-in features. For the actual tool definitions and capabilities, see Organization-Tools, Skills, and SDK-Integration.
Autonomous learning is a scheduled self-improvement mechanism. Where the evolution flow above is reactive (a subagent encounters a problem and proposes a fix), autonomous learning is proactive: a subagent periodically seeks out new knowledge relevant to its domain and integrates validated findings into its skills and rules.
Learning runs at the subagent level (ADR-40). The parent creates and manages learning triggers per subagent — subagents cannot create their own learning triggers. Each trigger specifies the target subagent explicitly for deterministic routing. See Triggers for the trigger configuration.
```mermaid
sequenceDiagram
    participant TE as Trigger Engine
    participant TQ as Task Queue
    participant Orch as Team Orchestrator
    participant SA as learner subagent
    participant SK as learning-cycle skill
    Note over TE: 2:00 AM + jitter
    TE->>TQ: delegateTask("ops-team", task, subagent="learner", skill="learning-cycle")
    TQ->>Orch: dequeue
    Orch->>SA: invoke learner (deterministic routing)
    SA->>SK: 6-phase learning cycle
    Note over SK: Journal Read → Topic Analysis →<br/>Web Discovery → Validation →<br/>Storage → Journal Update
    SK-->>SA: cycle complete
    SA-->>Orch: "Learning complete. 2 findings stored."
    alt Significant finding
        SA->>Orch: escalate("Critical CVE found in loggly SDK")
        Orch->>Orch: evaluate significance
    end
```
Main agent has NO learning trigger, NO reflection trigger, NO subagents. It only routes.
Each learning cycle follows a fixed sequence:
| Phase | Purpose |
|---|---|
| Journal Read | Load prior learning state from vault to build on previous cycles |
| Topic Analysis | Identify knowledge gaps from recent tasks and current skills |
| Web Discovery | Search external sources scoped to the subagent's domain |
| Validation | Cross-reference findings against multiple independent sources |
| Storage | Persist validated findings as typed memory entries |
| Journal Update | Record topics explored, findings, deprioritized sources, next priorities |
Duration checkpoints: Elapsed time is checked before each Phase 2 topic iteration and each Phase 3 URL fetch. If max_duration_minutes (default 30) is exceeded, the in-progress operation completes, then the cycle skips to Phase 6 (JOURNAL UPDATE) to persist state and exits gracefully. A session at 29:50 starting a fetch will complete that fetch, write the journal, and then exit — total elapsed time may slightly exceed the budget.
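The checkpoint behavior can be sketched as follows. Function names are illustrative; the point is that the budget is checked before each unit of work, never mid-operation.

```typescript
// Sketch of the duration checkpoint (names are assumptions).
const MAX_DURATION_MINUTES = 30; // the documented max_duration_minutes default

function overBudget(startedAtMs: number, nowMs: number): boolean {
  return (nowMs - startedAtMs) / 60_000 >= MAX_DURATION_MINUTES;
}

// Phase 3 loop: the check runs BEFORE each fetch, so a fetch started at 29:50
// completes even though total elapsed time then exceeds the budget.
async function runDiscovery(
  urls: string[],
  startedAtMs: number,
  fetchOne: (url: string) => Promise<void>,
): Promise<"done" | "budget-exceeded"> {
  for (const url of urls) {
    if (overBudget(startedAtMs, Date.now())) {
      return "budget-exceeded"; // caller skips ahead to Journal Update
    }
    await fetchOne(url); // in-progress operation always completes
  }
  return "done";
}
```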
- **Journal Read** loads the subagent's prior learning context so cycles build on each other rather than repeating work.
- **Topic Analysis** examines recent task history and existing skills to find gaps — what questions came up that the team could not answer confidently?
- **Web Discovery** searches external sources scoped to the subagent's domain.
- **Validation** determines confidence through cross-domain corroboration — findings are cross-referenced against sources from different root domains, with near-duplicate content (mirrors, syndication) counted as a single source. Sources on the deprioritized list are skipped. Confidence maps directly to independent source count: 3+ different root domains → high, 2 → medium, 1 → low.
- **Storage** persists validated findings as memory entries, with storage type determined by corroboration: 2+ independent root domains qualify a finding as a lesson, while single-source findings are stored as reference only.
- **Journal Update** records what was explored and what was found, updates the deprioritized sources list if contradictions were found, prunes expired entries, and sets next priorities.
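The confidence and storage-type mapping can be expressed directly. This is a sketch: `extractRootDomain` here is a simplified stand-in (a real implementation would use a public-suffix list rather than taking the last two host labels).

```typescript
// Confidence from independent-source count, as described above.
// extractRootDomain is a simplified assumption, not production-grade.
function extractRootDomain(url: string): string {
  const host = new URL(url).hostname;
  return host.split(".").slice(-2).join(".");
}

function confidenceFor(sourceUrls: string[]): "high" | "medium" | "low" {
  const roots = new Set(sourceUrls.map(extractRootDomain));
  if (roots.size >= 3) return "high";   // 3+ different root domains
  if (roots.size === 2) return "medium";
  return "low";                          // single source (or mirrors of one)
}

// Storage type: 2+ independent root domains -> lesson, else reference only.
function storageTypeFor(sourceUrls: string[]): "lesson" | "reference" {
  return new Set(sourceUrls.map(extractRootDomain)).size >= 2 ? "lesson" : "reference";
}
```

Note how mirrors collapse: two URLs on `a.example.com` and `b.example.com` share the root domain `example.com` and count as one source.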
Learning progression is tracked in the subagent's vault journal, not in memory. Journal keys are namespaced per team and subagent (e.g., learning:{team}:{subagent}:journal) to make keys self-documenting — even though the team_vault table already scopes by team, the explicit team prefix ensures keys are unambiguous when inspecting the store directly. The vault journal is operational state — it records what topics have been explored, what findings were made, and what the next priorities are. This separation keeps memory focused on knowledge (facts, lessons, references) while the journal handles the learning process itself.
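The key scheme as a tiny helper (the helper name itself is an assumption; only the `learning:{team}:{subagent}:journal` format comes from the page):

```typescript
// Namespaced vault key for a subagent's learning journal.
function journalKey(team: string, subagent: string): string {
  return `learning:${team}:${subagent}:journal`;
}
```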
The journal enables continuity across sessions. When a new learning cycle starts, it reads the journal to understand where the last cycle left off, avoiding redundant exploration and building incrementally on prior work.
Bootstrap creates active learning-cycle-{subagent} triggers per subagent, with readiness gates checked at runtime (see Architecture-Decisions#ADR-35). The per-subagent naming (e.g., learning-cycle-learner) avoids collision with the trigger name uniqueness constraint (unique per team). Subagents cannot create, enable, disable, or modify their own learning triggers. All trigger management flows through the parent:
- `create_trigger(team, "learning-cycle-learner", ..., subagent="learner", skill="learning-cycle")` — creates a learning trigger targeting a specific subagent
- `enable_trigger(team, "learning-cycle-learner")` — activates the nightly learning cycle
- `disable_trigger(team, "learning-cycle-learner")` — stops future firings (current session completes)
This ensures:
- Learning schedules are coordinated across the team hierarchy
- Subagents cannot increase their own resource consumption by adjusting learning frequency
- Parents maintain oversight of what their subagents are learning and how often
If a subagent determines that its learning cycle needs adjustment (different schedule, different focus areas), it uses escalate() to request the change from the orchestrator. The orchestrator evaluates the request and updates the trigger configuration if appropriate.
If the team has an active window trigger (ADR-42) and a learning trigger fires while the window is open, the learning cycle is deferred to the next nightly firing. Rationale:
- The `window` trigger holds the team's on-duty semantics (e.g., market hours). Displacing its tick cadence with a learning cycle would break the "continuous watch" contract.
- Learning is a background activity (nightly, off-hours by default); its findings do not depend on real-time window state.
- The deferral is a no-op run: the learning skill checks `trigger_configs` for an open `window` on the same team and exits gracefully with a `deferred: window open` log entry. No circuit-breaker increment.
If the team has no active window trigger, learning runs as scheduled. Teams whose watch_window overlaps the default 2 AM learning time should either adjust their watch_window or accept that learning will only run on nights when the window is closed.
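The deferral check might look like this. The `trigger_configs` record shape is an assumption; only the deferral semantics come from the page.

```typescript
// Sketch of the window-deferral check (record shape is an assumption).
interface TriggerConfig {
  team: string;
  type: string;        // e.g. "window", "learning-cycle"
  windowOpen?: boolean; // only meaningful for window triggers
}

function shouldDefer(team: string, triggers: TriggerConfig[]): boolean {
  return triggers.some(
    (t) => t.team === team && t.type === "window" && t.windowOpen === true,
  );
}

function runLearningCycle(team: string, triggers: TriggerConfig[]): string {
  if (shouldDefer(team, triggers)) {
    // No-op run: exit gracefully, no circuit-breaker increment.
    return "deferred: window open";
  }
  return "cycle started";
}
```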
When a learning-cycle-{subagent} trigger fires, the skill checks that all 6 tools in the required tool bundle are present in the team's allowed_tools before executing:
- `web_fetch`
- `vault_set`
- `vault_get`
- `memory_save`
- `memory_search`
- `memory_list`
If any tool is missing, the skill logs a warning naming the missing tool(s) and exits without error. The trigger remains enabled — tools may be added to the team later, and the next nightly firing will re-check. This runtime gate means triggers can be created at bootstrap regardless of whether the team has the required tools yet.
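A sketch of this gate, using the tool names from the bundle above (the function names are illustrative):

```typescript
// Runtime tool-bundle gate: all 6 tools must be present, or the skill logs a
// warning naming the missing tools and exits without error.
const REQUIRED_TOOLS = [
  "web_fetch", "vault_set", "vault_get",
  "memory_save", "memory_search", "memory_list",
] as const;

function missingTools(allowedTools: string[]): string[] {
  const allowed = new Set(allowedTools);
  return REQUIRED_TOOLS.filter((t) => !allowed.has(t));
}

function gateLearningCycle(allowedTools: string[]): "run" | "skip" {
  const missing = missingTools(allowedTools);
  if (missing.length > 0) {
    console.warn(`learning-cycle skipped; missing tools: ${missing.join(", ")}`);
    return "skip"; // trigger stays enabled; next nightly firing re-checks
  }
  return "run";
}
```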
Self-reflection is a scheduled introspective mechanism. Where learning looks outward (external knowledge), reflection looks inward: a subagent reviews its own task outcomes to identify and fix systematic inefficiencies. Reflection runs at the subagent level (ADR-40). See Architecture-Decisions#ADR-37.
| Phase | Purpose |
|---|---|
| Journal Read | Load reflection journal from vault |
| Evidence Gather | Query completed tasks via list_completed_tasks for outcome patterns |
| Diagnose | Identify the single highest-impact issue |
| Propose | Draft one skill or rule change (before/after) |
| Apply | Escalate proposal to orchestrator for confirmation, then apply via the evolution flow with governance enforcement |
| Journal Update | Record diagnosis, proposal, outcome, and next focus |
One change per cycle. At most one modification per session. Changes target accuracy or efficiency only. Cooldown: after applying a change, no further reflection-originated changes until the next cycle.
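The one-change constraint as a sketch (the state shape is an assumption; only the at-most-one-modification rule comes from the page):

```typescript
// Cooldown: after one reflection-originated change, further changes are
// blocked until the next cycle resets the state.
interface ReflectionState {
  changeAppliedThisCycle: boolean;
}

function tryApplyChange(state: ReflectionState): ReflectionState {
  if (state.changeAppliedThisCycle) {
    throw new Error("cooldown: at most one reflection change per cycle");
  }
  return { changeAppliedThisCycle: true }; // reset when the next cycle starts
}
```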
Duration budget: Max 15 minutes. Duration checkpoint matches learning — in-progress ops complete, then skip to JOURNAL UPDATE.
Required tools: `vault_get`, `vault_set` (journal); `memory_save`, `memory_search`, `memory_list` (context); `list_completed_tasks` (evidence). If any tool is missing: log a warning and exit. The trigger remains active per the ADR-35 gate pattern. See Triggers#Reflection Trigger.