Triggers - Z-M-Huang/openhive GitHub Wiki
The Trigger Engine manages automated task dispatch via schedules, keyword detection, and message pattern matching. It uses a per-team keyed registry for dynamic add/replace/remove at runtime. Trigger configurations are stored in the SQLite trigger_configs table and managed via inline AI SDK tools.
| Type | Fires When | Example |
|---|---|---|
| `schedule` | Cron expression matches | `"0 9 * * *"` (daily at 9am) |
| `message` | Message matches regex pattern in a channel | Regex on incoming Discord messages |
| `keyword` | Specific keyword detected | `"deploy"`, `"incident"` |
| `window` | Inside a cron-defined window, fires on `tick_interval_ms` cadence | `watch_window: "30 9-16 * * 1-5"`, `tick_interval_ms: 30000` (market hours ticks) |
Triggers are created and managed via 6 inline AI SDK tools (defined in trigger-tools.ts). There is no file-based trigger configuration.
| Tool | Purpose | Who Can Call |
|---|---|---|
| `create_trigger(team, name, type, config, task, subagent?, skill?, max_turns?, failure_threshold?, overlap_policy?)` | Create a new trigger in `pending` state. `subagent` is optional: if set, the orchestrator routes directly; if null, the orchestrator decides. `skill` requires `subagent` to be set (prevents direct skill addressing). | Parent of target team |
| `enable_trigger(team, trigger_name)` | Activate a pending or disabled trigger and register its handler | Parent of target team |
| `disable_trigger(team, trigger_name)` | Deactivate a trigger and unregister its handler | Parent of target team |
| `list_triggers(team)` | List all triggers and their states for a team | Parent of target team |
| `test_trigger(team, trigger_name, max_turns?)` | Fire a trigger once for testing without changing its state. Returns `taskId`. | Parent of target team |
| `update_trigger(team, trigger_name, config?, task?, subagent?, skill?, max_turns?, failure_threshold?, overlap_policy?)` | Update trigger config, task, targeting, or settings without recreating | Parent of target team |
Triggers follow a three-state lifecycle:
stateDiagram-v2
[*] --> pending : create_trigger
pending --> active : enable_trigger
active --> disabled : disable_trigger
disabled --> active : enable_trigger
active --> disabled : circuit breaker (auto)
- `pending`: created but not yet firing. Must be explicitly enabled.
- `active`: registered in the engine and firing on its configured schedule/pattern.
- `disabled`: deactivated. Can be re-enabled with `enable_trigger`.
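The lifecycle above can be written down as a small transition table. This is a minimal sketch rather than the engine's actual code: `LifecycleState`, `LifecycleEvent`, `TRANSITIONS`, and `nextState` are hypothetical names introduced here for illustration, and only the transitions shown in the state diagram are modeled.

```typescript
// Hypothetical names for illustration; not taken from the OpenHive source.
type LifecycleState = "pending" | "active" | "disabled";
type LifecycleEvent = "enable_trigger" | "disable_trigger" | "circuit_breaker";

// Allowed transitions, mirroring the state diagram above.
const TRANSITIONS: Record<LifecycleState, Partial<Record<LifecycleEvent, LifecycleState>>> = {
  pending: { enable_trigger: "active" },
  active: { disable_trigger: "disabled", circuit_breaker: "disabled" },
  disabled: { enable_trigger: "active" },
};

// Returns the next state, or null when the diagram defines no such transition.
function nextState(current: LifecycleState, event: LifecycleEvent): LifecycleState | null {
  return TRANSITIONS[current][event] ?? null;
}
```

Keeping the transitions in data makes it easy to see that `disabled` is reachable only from `active`, and that both a manual disable and the circuit breaker land in the same state.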
Use update_trigger to modify a trigger's config (e.g., cron expression), task text, max_turns, or failure_threshold without disabling and recreating it.
- If the trigger is active, handlers are atomically re-registered via `replaceTeamTriggers`.
- If the trigger is pending or disabled, the update is stored and takes effect on the next `enable_trigger`.
- State and failure counters are NOT affected by updates.
Triggers are persisted in the SQLite trigger_configs table:
| Column | Type | Purpose |
|---|---|---|
| `id` | INTEGER PK | Auto-increment row ID |
| `team` | TEXT | Owning team name |
| `name` | TEXT | Trigger name (unique per team) |
| `type` | TEXT | `schedule`, `keyword`, `message`, or `window` |
| `config` | TEXT (JSON) | Type-specific config (e.g., `{"cron": "0 9 * * *"}`) |
| `task` | TEXT | Task to enqueue when trigger fires |
| `subagent` | TEXT | Optional target subagent name. If set, orchestrator routes directly (deterministic). If null, orchestrator decides via LLM. `skill` requires `subagent` to be set. |
| `skill` | TEXT | Optional skill reference. May only be set when `subagent` is also set (prevents direct skill addressing per ADR-40). |
| `state` | TEXT | `pending`, `active`, or `disabled` |
| `max_turns` | INTEGER | Max SDK turns for the triggered task (default: 100) |
| `failure_threshold` | INTEGER | Consecutive failures before circuit breaker trips (default: 3) |
| `consecutive_failures` | INTEGER | Current failure count |
| `overlap_policy` | TEXT | Instance overlap behavior: `skip-then-replace` (default), `always-skip`, `always-replace`, `allow` |
| `overlap_count` | INTEGER | Consecutive overlap counter (default: 0) |
| `active_task_id` | INTEGER | Soft reference to current task (pending or running) in `task_queue` (nullable) |
| `disabled_reason` | TEXT | Why the trigger was disabled (if applicable) |
| `created_at` | TEXT | ISO timestamp |
| `updated_at` | TEXT | ISO timestamp |
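To make the column defaults and the `skill`/`subagent` constraint concrete, here is a minimal sketch of how a new row might be assembled from `create_trigger` arguments. The names `NewTriggerInput` and `buildTriggerRow` are hypothetical, the initial value of `consecutive_failures` is assumed to be 0, and the real tool may validate and persist differently.

```typescript
// Hypothetical types for illustration; field names follow the table above.
type TriggerKind = "schedule" | "keyword" | "message" | "window";
type OverlapPolicyName = "skip-then-replace" | "always-skip" | "always-replace" | "allow";

interface NewTriggerInput {
  team: string;
  name: string;
  type: TriggerKind;
  config: Record<string, unknown>;
  task: string;
  subagent?: string;
  skill?: string;
  max_turns?: number;
  failure_threshold?: number;
  overlap_policy?: OverlapPolicyName;
}

// Builds a row in pending state, applying the documented defaults.
function buildTriggerRow(input: NewTriggerInput) {
  // Documented rule: skill may only be set when subagent is also set.
  if (input.skill !== undefined && input.subagent === undefined) {
    throw new Error("skill requires subagent to be set");
  }
  return {
    team: input.team,
    name: input.name,
    type: input.type,
    config: JSON.stringify(input.config), // stored as TEXT (JSON)
    task: input.task,
    subagent: input.subagent ?? null,
    skill: input.skill ?? null,
    state: "pending" as const, // new triggers start pending
    max_turns: input.max_turns ?? 100,
    failure_threshold: input.failure_threshold ?? 3,
    consecutive_failures: 0, // assumed initial value
    overlap_policy: input.overlap_policy ?? "skip-then-replace",
    overlap_count: 0,
    active_task_id: null as number | null,
    disabled_reason: null as string | null,
  };
}
```

For example, calling `buildTriggerRow` with only `team`, `name`, `type`, `config`, and `task` yields a pending row with `max_turns` of 100 and the `skip-then-replace` overlap policy.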
At bootstrap, the engine loads active triggers from SQLite:
- `initTriggerEngine()` in `bootstrap-helpers.ts` creates the engine and calls `loadFromStore()`.
- `loadFromStore()` reads all rows from `trigger_configs` where `state = 'active'`.
- Active triggers are grouped by team and registered with their handlers (cron jobs, pattern matchers).
- The engine starts and schedule handlers begin firing.
There is no file scanning at startup. All trigger state lives in SQLite and survives restarts automatically.
- Engine loads active triggers from SQLite at startup via `loadFromStore()`.
- Registers handlers per trigger type: `node-cron` for schedules, pattern matchers for keywords/messages.
- When a trigger fires, the engine calls `delegateTask(team, task, subagent?, skill?)` to enqueue in the SQLite task queue. If the trigger has a `subagent` field, it is passed through to the task.
- The orchestrator dequeues the task. If `subagent` is set, it routes directly to that subagent (deterministic, no LLM cost). If `subagent` is null, the orchestrator reads subagent definitions and selects via LLM reasoning.
- Deduplication is enforced via SQLite: event IDs with TTLs prevent duplicate processing.
- Rate limiting per trigger source prevents runaway execution.
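The deduplication step in the list above can be illustrated with an in-memory stand-in. This is only a sketch of the "event ID with TTL" idea: the engine stores this state in SQLite, and the class name `EventDedupSketch`, its method, and the explicit `nowMs` parameter are assumptions made here so the example stays self-contained and deterministic.

```typescript
// In-memory stand-in for the SQLite-backed dedup table (hypothetical names).
class EventDedupSketch {
  private readonly expiresAt = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns true when the event should be processed, false when it is a
  // duplicate seen within the TTL. The caller supplies the current time.
  shouldProcess(eventId: string, nowMs: number): boolean {
    const expiry = this.expiresAt.get(eventId);
    if (expiry !== undefined && expiry > nowMs) {
      return false; // still inside the TTL window: duplicate
    }
    this.expiresAt.set(eventId, nowMs + this.ttlMs);
    return true;
  }
}
```

Passing the clock in as an argument keeps the sketch testable; a production version would also need to purge expired entries.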
sequenceDiagram
participant TE as Trigger Engine
participant TQ as Task Queue (SQLite)
participant Orch as Team Orchestrator
participant SA as Target Subagent
participant SK as Skill
participant PL as Plugin
TE->>TQ: delegateTask(team, task, subagent="learner", skill="learning-cycle")
TQ->>Orch: dequeue task (includes subagent + skill)
Note over Orch: subagent is pre-determined — no LLM routing needed
Orch->>SA: invoke "learner" subagent with task prompt
Note over SA: Context: learner.md + learning-cycle.md + task prompt
SA->>SK: follow learning-cycle.md steps
SK->>PL: web_fetch, vault_set, memory_save...
PL-->>SK: results
SK-->>SA: cycle complete
SA-->>Orch: result text
- `team` remains required (routing container). `subagent` is a nullable addition; it does NOT replace `team`.
- If `subagent` is set → orchestrator routes directly to that subagent (deterministic, no LLM cost).
- If `subagent` is null → orchestrator reads subagent definitions and decides via LLM reasoning.
- Constraint: `skill` may only be provided when `subagent` is also provided (prevents direct skill addressing per ADR-40).
- Learning/reflection triggers always specify `subagent` (deterministic nightly routing).
- Keyword/message triggers may leave `subagent` null (orchestrator decides based on content).
The engine uses a Map<string, TeamHandlerSet> keyed by team name. This enables:
- `replaceTeamTriggers(team, triggers)`: atomically replace all triggers for a team (stops old schedule handlers, installs new ones, starts them if the engine is running)
- `removeTeamTriggers(team)`: stop and remove all triggers for a team
- `getTeamTriggerCount(team)`: count triggers for a specific team
- Team isolation: removing one team's triggers does not affect another
On shutdown_team, triggers for that team are automatically removed — both the in-memory handlers (via removeTeamTriggers) and the persistent trigger_configs rows in SQLite (via triggerConfigStore.removeByTeam).
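The per-team registry can be sketched as follows. The method names match those listed above, but everything else is an assumption for illustration: the shape of `TeamHandlerSet`, the `StoppableHandler` interface, the `TeamRegistrySketch` class, and the fact that this version takes ready-made handlers rather than trigger configs.

```typescript
// Hypothetical handler shape: anything that can be started and stopped.
interface StoppableHandler {
  start(): void;
  stop(): void;
}

// Assumed shape; the real TeamHandlerSet is not shown in this page.
interface TeamHandlerSet {
  handlers: StoppableHandler[];
}

class TeamRegistrySketch {
  private readonly teams = new Map<string, TeamHandlerSet>();
  private running = false;

  // Starts every registered handler and marks the engine as running.
  start(): void {
    this.running = true;
    this.teams.forEach((set) => set.handlers.forEach((h) => h.start()));
  }

  // Stops the team's old handlers, installs the new ones, and starts them
  // only if the engine is already running.
  replaceTeamTriggers(team: string, handlers: StoppableHandler[]): void {
    this.removeTeamTriggers(team);
    this.teams.set(team, { handlers });
    if (this.running) {
      handlers.forEach((h) => h.start());
    }
  }

  // Stops and removes one team's handlers; other teams are untouched.
  removeTeamTriggers(team: string): void {
    this.teams.get(team)?.handlers.forEach((h) => h.stop());
    this.teams.delete(team);
  }

  getTeamTriggerCount(team: string): number {
    return this.teams.get(team)?.handlers.length ?? 0;
  }
}
```

Because every operation is scoped to a single map key, team isolation falls out of the data structure rather than needing extra checks.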
Each trigger has a configurable failure_threshold (default: 3). When a triggered task fails consecutively, the circuit breaker trips:
- `reportTaskOutcome(team, triggerName, taskId, success)` is called when a triggered task completes. The `taskId` identifies the specific task for overlap state management (see #Overlap State Lifecycle).
- If the task is already marked `cancelled` (by overlap replacement), the outcome is coerced to `cancelled`: no failure count increment, no status overwrite.
- For non-coerced failures: `consecutive_failures` is incremented in `trigger_configs`.
- When `consecutive_failures >= failure_threshold`, the trigger is set to `disabled` state.
- The trigger is unregistered from the engine (stops firing).
- The `onTriggerDeactivated` callback notifies the system (logged as a warning).
- A non-coerced success resets the failure counter to 0.
Disabled triggers can be re-enabled via the enable_trigger tool.
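The counting rules of the circuit breaker can be captured in a pure function. This is a sketch under stated assumptions, not the body of `reportTaskOutcome`: `BreakerRow` and `applyOutcome` are hypothetical names, the `"circuit breaker"` reason string is invented for the example, and handler unregistration and the `onTriggerDeactivated` callback are left out.

```typescript
// Hypothetical subset of a trigger_configs row relevant to the breaker.
interface BreakerRow {
  state: "pending" | "active" | "disabled";
  failure_threshold: number;
  consecutive_failures: number;
  disabled_reason: string | null;
}

// taskAlreadyCancelled models "the task is already marked cancelled in task_queue".
function applyOutcome(row: BreakerRow, success: boolean, taskAlreadyCancelled: boolean): BreakerRow {
  if (taskAlreadyCancelled) {
    return row; // outcome coerced to cancelled: counters are left untouched
  }
  if (success) {
    return { ...row, consecutive_failures: 0 }; // success resets the counter
  }
  const failures = row.consecutive_failures + 1;
  if (failures >= row.failure_threshold) {
    // Breaker trips; the reason text here is an illustrative placeholder.
    return { ...row, consecutive_failures: failures, state: "disabled", disabled_reason: "circuit breaker" };
  }
  return { ...row, consecutive_failures: failures };
}
```

With the default threshold of 3, two failures leave the trigger active, a cancelled outcome in between changes nothing, and the third counted failure disables it.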
When a trigger fires while a previous instance of the same trigger is still running, the engine applies a graduated overlap policy instead of unconditionally enqueuing a new task. This prevents resource waste from duplicate work and handles stuck instances gracefully.
| Policy | Behavior |
|---|---|
| `skip-then-replace` (default) | First overlap: skip firing, alert user. Second consecutive overlap: cancel old instance, start new, alert user. |
| `always-skip` | Every overlap is skipped. The old instance always runs to completion. |
| `always-replace` | Every overlap immediately cancels the old instance and starts a new one. |
| `allow` | No overlap detection. `active_task_id` is not tracked (stays NULL). Pre-ADR-34 behavior. |
The overlap check runs after deduplication and rate limiting but before delegateTask. For non-replacement paths (skip, normal fire), the check-and-act executes within a single SQLite transaction. Replacement is a multi-step sequence (see #Cancellation Mechanism). A skipped overlap consumes the firing event — it is not replayed later.
flowchart TD
A[Trigger fires] --> B{overlap_policy = allow?}
B -->|Yes| C["delegateTask — no overlap check, active_task_id stays NULL"]
B -->|No| D[Read active_task_id from trigger_configs]
D --> E{active_task_id is NULL?}
E -->|Yes| F[No overlap — reset overlap_count to 0]
F --> G["delegateTask → store new task ID in active_task_id"]
E -->|No| H{Referenced task still active in task_queue?}
H -->|"No (done/failed/cancelled)"| I["Stale reference — clear active_task_id, reset overlap_count"]
I --> G
H -->|"Yes (pending or running)"| J{Apply overlap_policy}
J -->|skip-then-replace| K{overlap_count == 0?}
K -->|Yes| L["Increment overlap_count to 1, SKIP firing, send alert"]
K -->|No| M["CANCEL old task, reset overlap_count to 0"]
M --> G
J -->|always-skip| L
J -->|always-replace| M
Active vs. stale. The overlap check treats a referenced task as active if its status is pending or running — both indicate the trigger's previous work is still in the pipeline. Only terminal states (done, failed, cancelled) are treated as stale references. This prevents duplicate enqueuing when a trigger fires twice before the first task starts running.
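The decision flow in the flowchart can be condensed into one pure function. This is a sketch that follows the diagram node by node; `decideOverlap` and its types are hypothetical names, and the SQLite transaction, the alerting, and the actual `delegateTask` call are deliberately omitted.

```typescript
type OverlapPolicy = "skip-then-replace" | "always-skip" | "always-replace" | "allow";
type QueueStatus = "pending" | "running" | "done" | "failed" | "cancelled";
type OverlapAction = "fire-untracked" | "fire" | "skip" | "replace";

interface OverlapInput {
  policy: OverlapPolicy;
  activeTaskStatus: QueueStatus | null; // null when active_task_id is NULL
  overlapCount: number;
}

interface OverlapResult {
  action: OverlapAction;
  overlapCount: number; // value to store back in overlap_count
}

function decideOverlap(input: OverlapInput): OverlapResult {
  const policy = input.policy;
  if (policy === "allow") {
    // No overlap check at all; active_task_id stays NULL.
    return { action: "fire-untracked", overlapCount: input.overlapCount };
  }
  const s = input.activeTaskStatus;
  const stillActive = s === "pending" || s === "running";
  if (!stillActive) {
    // Either no reference or a stale one (done/failed/cancelled).
    return { action: "fire", overlapCount: 0 };
  }
  const skip: OverlapResult = { action: "skip", overlapCount: 1 };
  const replace: OverlapResult = { action: "replace", overlapCount: 0 };
  if (policy === "always-skip") return skip;
  if (policy === "always-replace") return replace;
  // skip-then-replace: skip on the first overlap, replace on the next one.
  return input.overlapCount === 0 ? skip : replace;
}
```

A caller would map `fire` and `replace` to `delegateTask` followed by storing the new task ID, and `skip` to sending the alert without enqueuing anything.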
Overlap cancellation introduces a new terminal task status, cancelled. With it, tasks follow a lifecycle from pending to running, terminating as done, failed, or cancelled.
- `cancelled` is set by the engine when the overlap policy forces replacement of a running task
- `cancelled` tasks are terminal; they are NOT reset to `pending` on restart
- `cancelled` does not count as a failure for circuit breaker purposes (cancellation ≠ failure)
When the overlap policy forces replacement, the engine performs three ordered steps:
1. Mark the old task as `cancelled` in `task_queue` (SQLite). `active_task_id` is not cleared in this step; it still references the old task until step 3 overwrites it.
2. Abort the old session via `session.abort()`. This is an in-memory best-effort operation that terminates the AI SDK `streamText()` call, stopping token generation and tool execution.
3. Enqueue the new task via `delegateTask` and overwrite `active_task_id` with the new task's ID (SQLite).
Steps 1 and 3 are SQLite operations. Step 2 is in-memory. If step 2 fails (abort throws), the old task is already marked cancelled in the DB — the session will be cleaned up on next idle timeout or restart. If step 3 fails (enqueue throws), the old task is cancelled and active_task_id still references it (now terminal) — the next trigger firing will see a stale reference, clear it, and proceed normally.
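The ordering of the three steps and their failure behavior can be sketched with injected dependencies. The names `ReplacementDeps` and `replaceRunningTask` are hypothetical, the four callbacks stand in for the SQLite and session operations, and continuing to step 3 after a failed abort is an assumption of this sketch rather than something the text above states.

```typescript
// Stand-ins for the real operations (all names are illustrative).
interface ReplacementDeps {
  markCancelled(taskId: number): void;   // step 1: SQLite in the real engine
  abortSession(taskId: number): void;    // step 2: in-memory, best-effort
  enqueue(): number;                     // step 3a: delegateTask, returns the new task ID
  setActiveTaskId(taskId: number): void; // step 3b: overwrite active_task_id
}

function replaceRunningTask(oldTaskId: number, deps: ReplacementDeps): number {
  // Step 1: the old task becomes terminal before anything else happens.
  deps.markCancelled(oldTaskId);

  // Step 2: best-effort abort. A failure is tolerated because the old task
  // is already cancelled in the DB and the session is cleaned up later.
  try {
    deps.abortSession(oldTaskId);
  } catch {
    // Assumption: proceed to step 3 regardless.
  }

  // Step 3: if enqueue throws, active_task_id still points at the old
  // (now terminal) task, which the next firing treats as a stale reference.
  const newTaskId = deps.enqueue();
  deps.setActiveTaskId(newTaskId);
  return newTaskId;
}
```

Marking the task cancelled first is what makes the later steps safe to fail: every partial outcome leaves a state that the stale-reference check can recover from.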
Stale outcome guard. Once a task is marked cancelled, any subsequent reportTaskOutcome call for that task ID is coerced to cancelled regardless of the reported outcome. A stale done or failed from a session that completed in the narrow window before session.abort() took effect does not overwrite the cancelled status and does not increment consecutive_failures. This preserves the "cancelled is terminal and not a failure" contract.
Relation to ADR-9. ADR-9 prohibits priority-based preemption of different tasks in the queue. Overlap cancellation is a narrow carve-out: the trigger engine may abort its own trigger's stale session when the overlap policy requires replacement. This is not one task preempting another — it is the engine reclaiming a stuck resource of the same trigger. ADR-9's no-preemption guarantee for cross-task priority ordering remains intact.
- Task completion (`reportTaskOutcome` for done/failed/cancelled): clears `active_task_id` to NULL and resets `overlap_count` to 0 only if `active_task_id` matches the finishing task's ID. If the IDs don't match (replacement already occurred), overlap state is not modified. If the task is already marked `cancelled`, the reported outcome is coerced to `cancelled`: no status overwrite, no `consecutive_failures` increment. Circuit breaker logic applies independently to non-coerced outcomes only.
- Trigger disabled (via `disable_trigger` or circuit breaker): `overlap_count` is reset to 0. `active_task_id` is not cleared; if a task is still running, it completes normally (consistent with learning trigger behavior). The stale reference check at next fire handles expired refs.
- Trigger re-enabled (via `enable_trigger`): `overlap_count` starts at 0. `active_task_id` is not cleared; if a task from before the disable is still running, overlap detection applies correctly on the next firing; if it finished, the stale reference check handles it.
- Restart recovery: all `active_task_id` values are cleared and `overlap_count` reset to 0 (see Durability-Recovery). Sessions are destroyed on restart, so no running instances exist to overlap with.
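The ID-match rule for task completion is small enough to show directly. This is a sketch with hypothetical names (`OverlapTracking`, `onTaskFinished`) covering only the overlap fields; outcome coercion and the circuit breaker are handled elsewhere.

```typescript
// Hypothetical view of the two overlap columns on a trigger row.
interface OverlapTracking {
  active_task_id: number | null;
  overlap_count: number;
}

// Clears overlap state only when the finishing task is the one being tracked.
function onTaskFinished(tracking: OverlapTracking, finishedTaskId: number): OverlapTracking {
  if (tracking.active_task_id !== finishedTaskId) {
    return tracking; // a replacement already took over: leave state alone
  }
  return { active_task_id: null, overlap_count: 0 };
}
```

The guard matters after a replacement: the cancelled task may still report an outcome, and without the ID check it would wipe the reference to the task that replaced it.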
- `test_trigger` does NOT set `active_task_id` and does not participate in overlap tracking. It is a one-off diagnostic firing: it neither triggers overlap detection nor blocks subsequent scheduled firings.
- Recovered tasks (reset to `pending` after restart) are not tracked. Overlap state is cleared on restart, and the first normal firing after restart sets `active_task_id` fresh.
Overlap events generate engine-level system notifications (not LLM-driven decisions). The alert includes: trigger name, team, action taken (skipped or replaced), and how long the old instance has been running.
Routing by trigger type:
- Keyword/message triggers: alert sent to `sourceChannelId` and `topic_id` (if conversation threading is active), consistent with standard trigger notification routing (see Conversation-Threading).
- Schedule triggers: no `sourceChannelId` or `topic_id`; alert delivered via `escalate()` to the parent team, consistent with how the learning cycle handles schedule trigger notifications.
When a trigger fires, the originating channel ID is threaded through to the task queue via sourceChannelId. When conversation threading is active, a topic_id is also carried through for routing notifications to the correct topic (see Conversation-Threading). When the task completes, the notification is sent to the originating channel and topic.
- Keyword/message triggers: notification goes to the channel (and topic, if available) where the triggering message was received.
- Schedule triggers: no originating channel or topic; `sourceChannelId` and `topic_id` are null. Results are stored in the task queue but not pushed to any channel. If the triggered task determines its result warrants attention, it uses `escalate()` to notify the parent team (see learning trigger for the canonical example). This is not an error: schedule triggers are inherently non-notifying unless the agent escalates.
- Tasks without a source channel from non-schedule sources: logged as an error, because keyword and message triggers should always have a `sourceChannelId`.
When a triggered task completes, the LLM decides whether to notify the originating channel. The task prompt includes a notification instruction asking the agent to evaluate whether the result is worth reporting. The agent includes a JSON block in its response:
- `{"notify": true}`: the result is sent to the channel
- `{"notify": false}`: the result is stored in the task queue but not pushed to any channel
Fail-safe: If the LLM response does not contain a valid {"notify": ...} JSON block, the system defaults to delivering the notification. This ensures that malformed or unexpected responses never silently suppress important results.
The JSON block is extracted and stripped from the stored result text, so channel notifications and the task queue contain only the substantive content.
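The extraction and fail-safe behavior described above might look like the following. This is a minimal sketch, assuming the block appears as a flat JSON object in the response text; `NOTIFY_BLOCK` and `extractNotifyDecision` are hypothetical names, and the real parser may accept other layouts (for example a block wrapped in a code fence).

```typescript
interface NotifyDecision {
  notify: boolean;
  text: string; // response with the JSON block stripped
}

// Assumption: the block is a flat object such as {"notify": true}.
const NOTIFY_BLOCK = /\{\s*"notify"\s*:\s*(true|false)\s*\}/;

function extractNotifyDecision(response: string): NotifyDecision {
  const match = NOTIFY_BLOCK.exec(response);
  if (match === null) {
    // Fail-safe: no valid block means the notification is delivered.
    return { notify: true, text: response.trim() };
  }
  return {
    notify: match[1] === "true",
    text: response.replace(NOTIFY_BLOCK, "").trim(),
  };
}
```

Defaulting to `notify: true` on a missing or malformed block errs on the side of noise rather than silence, which matches the fail-safe rule.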
A learning trigger drives the autonomous learning cycle for a specific subagent within a team. Bootstrap creates active learning triggers per subagent, with readiness gates checked at runtime (see Architecture-Decisions#ADR-35). The parent creates and manages learning triggers per subagent — subagents cannot create their own learning triggers. Each trigger specifies the target subagent explicitly for deterministic routing (ADR-40).
Bootstrap creates a learning trigger per subagent, named learning-cycle-{subagent} (e.g., learning-cycle-learner), configured with a nightly cron at 2 AM base time and per-team jitter (0–30 minutes, derived deterministically from a hash of the team name). The per-subagent naming avoids collision with the trigger name uniqueness constraint (unique per team). Main agent has NO learning trigger, NO subagents — it only routes.
Before executing, the learning-cycle skill checks readiness gates (see Architecture-Decisions#ADR-35):
- Tool bundle present: all 6 required tools (`web_fetch`, `vault_set`, `vault_get`, `memory_save`, `memory_search`, `memory_list`) in the team's `allowed_tools`
- Team bootstrapped: `bootstrapped=1` in the org tree (team initialization done)
- Scope keywords present: at least one `scope_keywords` entry exists for topic derivation
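The three gates can be checked with a short function. This is a sketch only: `TeamReadinessSnapshot` and `failedGates` are hypothetical names, and the snapshot shape is an assumption about how the team's `allowed_tools`, `bootstrapped` flag, and `scope_keywords` might be presented to the skill.

```typescript
// The six tools named in the tool-bundle gate.
const REQUIRED_LEARNING_TOOLS = [
  "web_fetch",
  "vault_set",
  "vault_get",
  "memory_save",
  "memory_search",
  "memory_list",
] as const;

// Assumed shape of the data the gates inspect.
interface TeamReadinessSnapshot {
  allowed_tools: string[];
  bootstrapped: number; // 1 when team initialization is done
  scope_keywords: string[];
}

// Returns a description of each failed gate; an empty array means ready.
function failedGates(team: TeamReadinessSnapshot): string[] {
  const failures: string[] = [];
  const missingTools = REQUIRED_LEARNING_TOOLS.filter((t) => team.allowed_tools.indexOf(t) === -1);
  if (missingTools.length > 0) {
    failures.push(`missing tools: ${missingTools.join(", ")}`);
  }
  if (team.bootstrapped !== 1) {
    failures.push("team not bootstrapped");
  }
  if (team.scope_keywords.length === 0) {
    failures.push("no scope_keywords");
  }
  return failures;
}
```

Returning the list of failures instead of a boolean gives the skill something specific to put in its warning log.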
If any gate fails, the skill logs a warning and exits without error. The trigger remains active — gates are re-checked on the next firing.
The learning trigger is created as a nightly trigger (every day at 2 AM base time) with per-team jitter. The jitter is a 0–30 minute offset derived deterministically from a hash of the team name, so each team fires at a stable but slightly different time. This prevents all teams from firing simultaneously and creating a burst of web requests.
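Deterministic per-team jitter can be sketched as below. The page only says the offset comes from "a hash of the team name", so the choice of 32-bit FNV-1a, whole-minute granularity, and the function names `fnv1a32`, `jitterMinutes`, and `nightlyCron` are all assumptions made for illustration.

```typescript
// 32-bit FNV-1a over UTF-16 code units. The hash choice is an assumption.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned
  }
  return hash;
}

// Maps a team name to a stable offset in the range 0..30 minutes.
function jitterMinutes(team: string): number {
  return fnv1a32(team) % 31;
}

// Builds a five-field cron expression for a nightly run at baseHour plus jitter.
function nightlyCron(team: string, baseHour: number): string {
  return `${jitterMinutes(team)} ${baseHour} * * *`;
}
```

Because the offset depends only on the team name, it survives restarts and redeployments without needing to be stored anywhere.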
Parent-only management. The parent creates, enables, disables, and updates learning triggers per subagent. Subagents cannot create or modify their own learning triggers. If a subagent determines that its learning cycle needs adjustment (different schedule, different focus areas), it uses escalate() to request the change from the orchestrator.
- `create_trigger(team, "learning-cycle-learner", ..., subagent="learner", skill="learning-cycle")`: creates a learning trigger targeting a specific subagent
- `enable_trigger(team, "learning-cycle-learner")`: activates the trigger so it fires on the next scheduled time
- `disable_trigger(team, "learning-cycle-learner")`: stops future firings; if a learning session is currently in progress, it completes, but the next scheduled firing does not occur
Non-notifying by default. Schedule triggers have no sourceChannelId, so learning cycle results are not pushed to any channel. Results are stored in the task queue and the vault journal. This prevents routine learning activity from generating noise in communication channels.
Escalation for significant findings. When the learning cycle discovers something that warrants attention (a critical update to a dependency, a security advisory, a significant change in the team's domain), the agent calls escalate() to notify its parent. The parent then decides whether to propagate the finding further or take action. Routine findings are stored silently — only significant discoveries trigger escalation.
The window trigger (ADR-42) delivers continuous-watch semantics — the user-facing experience of "this team is on duty" during a specific window (e.g., market hours). Each window occurrence opens on a cron expression, fires ticks at tick_interval_ms cadence while open, and closes on the exit cron.
Why not one long-lived session? Vercel AI SDK's streamText has no pause/resume primitive, and Anthropic times out idle streams at ~10 minutes — a literal long-lived session is architecturally impossible on our stack. The window trigger instead delivers functional continuity via periodic ticks + memory cursors + no-op returns + window boundaries. See Tool-Guidelines#Why window ticks feel long-running for the full explainer.
stateDiagram-v2
[*] --> WindowClosed
WindowClosed --> WindowOpen : watch_window cron enters
WindowOpen --> TickPending : tick_interval_ms elapsed
TickPending --> SkipTick : prior tick still running<br/>(overlap_policy applies)
SkipTick --> WindowOpen
TickPending --> DispatchTick : spawn fresh session<br/>(disposable per ADR-10)
DispatchTick --> WindowOpen
WindowOpen --> WindowClosed : watch_window cron exits
WindowClosed --> [*]
Stored in trigger_configs.config (JSON):
| Field | Purpose |
|---|---|
| `watch_window` | cron expression defining when polling is active |
| `tick_interval_ms` | cadence within the window (default 30000) |
| `max_tokens_per_window` | hard cap on total token consumption per window occurrence |
| `max_ticks_per_window` | hard cap on number of ticks per window occurrence |
| `overlap_policy` | reuses the existing trigger overlap policy; applies when a prior tick is still running when the next tick fires |
Each tick spawns a fresh disposable session per Architecture-Decisions#ADR-10. Tick idempotency is the team's responsibility — subagents persist progress keys in memory (e.g., last_scan_cursor, last_event_id) so repeat ticks do not duplicate work. The trigger engine does not inject cursor state; it is a subagent-prompt responsibility (see Subagents#Window-Trigger Subagents).
- A tick in flight when the window closes completes; the engine does not start new ticks past the close.
- DST transitions and holiday calendars are plugin concerns; `watch_window` expressions are evaluated in the server's timezone (see #Timezone Handling).
- `max_ticks_per_window` and `max_tokens_per_window` are hard kill switches enforced by the engine.
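The boundary and kill-switch rules above amount to a guard evaluated before each tick. This is a sketch with hypothetical names (`WindowLimits`, `WindowUsage`, `mayDispatchTick`); it assumes usage counters are kept per window occurrence and that a cap blocks further ticks once usage reaches it.

```typescript
// Caps taken from the window trigger config.
interface WindowLimits {
  max_ticks_per_window: number;
  max_tokens_per_window: number;
}

// Assumed per-occurrence counters maintained by the engine.
interface WindowUsage {
  windowOpen: boolean;
  ticks: number;
  tokens: number;
}

// Returns true when a new tick may be dispatched.
function mayDispatchTick(limits: WindowLimits, usage: WindowUsage): boolean {
  if (!usage.windowOpen) return false; // no new ticks past the close
  if (usage.ticks >= limits.max_ticks_per_window) return false; // tick cap reached
  if (usage.tokens >= limits.max_tokens_per_window) return false; // token cap reached
  return true;
}
```

Note that the guard only prevents new ticks; a tick already in flight when the window closes is allowed to finish, as stated above.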
A window-tick subagent MUST return a structured no-op marker — canonical shape { action: "noop", reason: string } — when its scan finds nothing actionable. The engine treats a no-op return as success with empty output: no downstream notification, no parent-queue insertion, no memory mutation beyond the cursor update. See Tool-Guidelines#No-op Tick Contract.
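A type guard for the canonical no-op shape might look like this. The shape `{ action: "noop", reason: string }` comes from the contract above; the names `NoopResult` and `isNoopResult` are hypothetical, and how the engine actually parses tick output is not specified here.

```typescript
// Canonical no-op marker from the contract above.
interface NoopResult {
  action: "noop";
  reason: string;
}

// Narrows an unknown tick result to the no-op shape.
function isNoopResult(value: unknown): value is NoopResult {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return candidate["action"] === "noop" && typeof candidate["reason"] === "string";
}
```

A caller could use the guard to short-circuit before any notification or parent-queue logic runs, for example `if (isNoopResult(result)) return;`.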
Window triggers reuse the existing overlap_policy values (skip-then-replace, always-skip, always-replace, allow). Because ticks are intended to be short and idempotent, always-skip is the recommended default: the next tick re-reads the cursor and catches up. Teams that need single-flight-per-tick semantics can use skip-then-replace as documented in #Instance Overlap Policy.
If the tick detects an event warranting the parent's attention, it uses enqueue_parent_task (ADR-43) — not escalate. See Organization-Tools#enqueue_parent_task.
Scheduling remains unified under the Trigger Engine (ADR-7). The window type is an additional handler inside the engine, not a parallel scheduler.
A reflection trigger drives the self-reflection cycle for a specific subagent within a team. Bootstrap creates active reflection triggers per subagent. See Architecture-Decisions#ADR-37, Architecture-Decisions#ADR-40.
Bootstrap creates a reflection trigger per subagent, named reflection-cycle-{subagent} (e.g., reflection-cycle-learner), configured with a nightly cron at 3 AM base time (one hour after learning) with the same per-team jitter pattern. Main agent has NO reflection trigger.
The reflection trigger is created as a nightly trigger (every day at 3 AM base time) with per-team jitter, one hour after the learning trigger. Readiness gates apply identically to the learning trigger (tool bundle + bootstrapped=1). The reflection skill requires: vault_get, vault_set, memory_save, memory_search, memory_list, list_completed_tasks. Max duration: 15 minutes. See Self-Evolution#Self-Reflection.
The dead-letter-scan schedule trigger was removed in Architecture-Decisions#ADR-38. Task stall detection is now an engine-level infrastructure check in task-consumer.ts, not a trigger. See Durability-Recovery#Stall Detection.
All cron expressions use the server's local timezone. The default is America/New_York. To change the timezone, set the TZ environment variable before starting the container. Trigger jitter offsets are added on top of the timezone-adjusted cron time. See Architecture-Decisions#ADR-18.