Sigil Harvesting: Developer Reference

Development guide for sigilharvest.lic (v2.0.0)
Tests: spec/sigilharvest_spec.rb (129 examples)
Analyzer: sigilharvest_analyzer.rb

1. Quick Reference

Files

File	Purpose
`sigilharvest.lic`	Script source, v2.0.0
`sigilharvest_analyzer.rb`	Log analyzer (parses banners + per-sigil summaries)
`spec/sigilharvest_spec.rb`	RSpec test suite (129 examples, current with v2.0.0)
`session_splitter.rb`	Log session extractor (handles split sessions across files)
`spec/spec_helper.rb`	Shared test mocks (Lich, XMLData, Script, etc.)
`data/sigils.yaml`	Room lists by `[City][SigilType][Season]`

Commands

# Run tests
bundle exec rspec spec/sigilharvest_spec.rb

# Lint after changes
rubocop sigilharvest.lic
rubocop spec/sigilharvest_spec.rb

# Extract sessions from game logs (handles split sessions)
ruby session_splitter.rb --version X.Y.Z path/to/*.log
ruby session_splitter.rb --dry-run --version X.Y.Z path/to/*.log

Version

Always bump VERSION in sigilharvest.lic:6 when changing the script. Current: VERSION = '2.0.0'

Script Invocation

;sigilharvest <city> <sigil> <precision> [minutes] [debug]
;sigilharvest Shard permutation 90 60 debug

2. Code Architecture

Class: `SigilHarvest`

Single class. initialize (line 8) sets up all state and runs the main loop. Tests bypass initialize entirely using SigilHarvest.allocate + instance_variable_set.

Method Map

Method	Line	Visibility	Purpose
`initialize`	8	public	Setup state, parse args, start main loop
`find_sigils(city, sigil)`	89	public	Outer loop: iterate rooms, call `harvest_sigil`
`harvest_sigil(sigil)`	114	public	Find sigil in room, run improvement loop
`check_sigil(sigil)`	162	public	Verify sigil type matches target
`improve_sigil(precision)`	178	public	Core algorithm: one iteration of action selection
`sigil_info(command)`	296	public	Send `perc sigil <cmd>`, parse response via serial `waitfor`
`scribe_sigils`	371	public	Scribe sigil onto scroll, manage inventory
`get_season`	392	public	Query game for current season
`get_techniques`	397	private	Detect active harvesting techniques
`get_scrolls`	416	public	Buy blank scrolls if below stock level
`log_startup_banner`	457	private	Print session config + techniques
`log_sigil_summary(sigil, result)`	475	private	Log per-sigil result line (with total elapsed)
`log_exit_summary`	484	private	Print session statistics
`format_techniques(techniques)`	513	private	Format technique array for display
`elapsed_minutes`	519	private	Minutes since session start
`sigil_elapsed_minutes`	523	private	Minutes since current sigil started
`time_expired?`	527	private	True when elapsed >= time limit
`contest_stat_for(resource)`	532	private	Map resource name to level ivar
`precision_action_viable?(action, contest_stat, precision)`	544	private	Viability gate for precision actions
`select_repair_action(action, ...)`	559	private	Check if action qualifies as resource repair; yields if yes

Call Flow

initialize
  → get_techniques()         # detect Inspired/Enlightened
  → log_startup_banner()
  → find_sigils(city, sigil)
      → harvest_sigil(sigil)
          → sigil_info('improve')     # first call: parse initial menu
          → improve_sigil(precision)   # loop: returns true to continue, false to stop
              → precision_action_viable?()
              → select_repair_action()
              → sigil_info(verb)       # execute chosen action
              → scribe_sigils()        # if target reached
          → log_sigil_summary(sigil, result)
  → log_exit_summary()

Instance Variables (State)

Variable	Type	Default	Set By	Purpose
`@sigil_precision`	Integer	0	`sigil_info`	Current sigil precision (0-100)
`@sigil_clarity`	Integer	0	`sigil_info`	Current sigil clarity (0-100)
`@danger_lvl`	Integer	0	`sigil_info`	Danger meter (0-20 stars)
`@sanity_lvl`	Integer	15	`sigil_info`	Sanity resource (0-20 stars)
`@resolve_lvl`	Integer	15	`sigil_info`	Resolve resource (0-20 stars)
`@focus_lvl`	Integer	15	`sigil_info`	Focus resource (0-20 stars)
`@num_iterations`	Integer	0	`harvest_sigil`	Iterations used this sigil (cap: 15)
`@num_aspect_repairs`	Integer	0	`improve_sigil`	Repair actions taken this sigil
`@sigil_improvement`	Array[Hash]	[]	`sigil_info`	Current action menu (3-8 actions)
`@sigil_count`	Integer	0	`harvest_sigil`	Total sigils encountered
`@sigil_results`	Array[Hash]	[]	`log_sigil_summary`	Per-sigil outcome records
`@scribed_in_session`	Boolean	false	`find_sigils`	True after first successful scribe
`@rooms_visited`	Integer	0	`find_sigils`	Rooms traversed
`@start_time`	Time	Time.now	`initialize`	Session start
`@sigil_start_time`	Time	Time.now	`harvest_sigil`	Current sigil start
`@time_limit`	Integer	30	`initialize`	Minutes before auto-stop

Action Hash Structure

Each entry in @sigil_improvement is a Hash:

{
  "difficulty" => Integer,  # 1-5 (trivial..formidable)
  "resource"   => String,   # "sanity" | "resolve" | "focus"
  "impact"     => Integer,  # 1-3 (taxing..destroying)
  "verb"       => String,   # game verb to execute (e.g., "analyze", "study")
  "target"     => String,   # "sigil" (improve) | "your" (repair)
  "aspect"     => String,   # "precision" | "quality" | resource name (repair)
  "risk"       => Integer   # difficulty + impact
}

3. Core Algorithm: `improve_sigil` (line 178)

Called once per iteration. Returns true to continue loop, false to stop.

v1.5.0 algorithm = v1.2.0 algorithm (reverted from v1.4.1, see §9 for why). Key differences from v1.4.1: no resource floor check, no high_target mode, allows formidable actions, risk-based action selection. v1.5.0 also dropped the move budget check; it was restored in v1.5.2 (see EXP-1 in §13).

Decision Tree

improve_sigil(precision)
│
├─ Phase 1: Pre-scan (lines 187-203)
│  Scan @sigil_improvement for precision actions with tight margin (stat - diff < 2)
│  and difficulty >= 3. Store as best_repair_aspect / second_best_repair_aspect.
│  Purpose: identify which resource to repair proactively.
│
├─ Phase 2: Select action (lines 205-244)
│  For each action in @sigil_improvement:
│  │
│  ├─ Precision action? (aspect == "precision")
│  │  ├─ precision_action_viable?() → check margin, accept formidable
│  │  └─ Selection priority (lines 217-229):
│  │     ├─ First viable action → stored unconditionally
│  │     ├─ precision < (target - 20) → prefer LOWEST risk (far from goal)
│  │     └─ precision >= (target - 20) → prefer HIGHEST risk (close to goal)
│  │
│  └─ Repair candidate? → select_repair_action() (lines 234-243)
│
├─ Phase 3: Bail-out checks (lines 248-288)
│  ├─ Scribe check (line 251): precision >= target, OR near cap + within 5
│  ├─ Iteration cap (line 261): iterations >= 15
│  ├─ Resource exhaustion (line 270): (san+res+foc)*2.25 + prec < target-5
│  └─ Move budget (line 283): (14-iters)*13 < (target-prec-5), when prec <= 80
│
├─ Phase 4: Execute or refresh (lines 277-291)
│  ├─ Repair override (lines 278-284): use repair if no precision action found
│  └─ Execute action or refresh (lines 287-291)
│
└─ return true (continue loop)

Selection Priority Matrix

Given two precision actions that both pass viability:

Condition	Prefers	Rationale
`precision < target - 20`	Lowest risk	Far from goal, conserve resources
`precision >= target - 20`	Highest risk	Close to goal, sprint to finish

Note: No danger-based switching. No high_target mode. This simpler strategy outperforms the v1.4.1 approach in production data (see §9).
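
A minimal sketch of the preference in the matrix above, assuming the candidate list already contains only actions that passed the viability filter. The method name pick_precision_action and the two-item menu are hypothetical; the script does this inline in improve_sigil rather than through a helper like this.

```ruby
# Sketch only: choose between viable precision actions by risk.
# `viable` is assumed to hold action hashes that already passed viability.
def pick_precision_action(viable, current_precision, target)
  return nil if viable.empty?

  if current_precision < target - 20
    viable.min_by { |a| a['risk'] }   # far from goal: conserve resources
  else
    viable.max_by { |a| a['risk'] }   # close to goal: sprint to finish
  end
end

# Hypothetical menu with one low-risk and one high-risk action.
menu = [
  { 'verb' => 'study',   'risk' => 3 },
  { 'verb' => 'analyze', 'risk' => 6 }
]
puts pick_precision_action(menu, 40, 90)['verb']  # prints study
puts pick_precision_action(menu, 75, 90)['verb']  # prints analyze
```

With a target of 90 the switch happens at precision 70: below it the lower-risk action wins, at or above it the higher-risk one does.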

4. Key Formulas

Viability Filter (line 544)

def precision_action_viable?(action, contest_stat, precision)
  difficulty = action['difficulty'].to_i
  margin = contest_stat - difficulty

  # Path 1: comfortable margin (>1) — accept any difficulty including trivial
  return true if margin > 1

  # Path 2: any margin (>0) AND challenging+ difficulty
  return true if margin > 0 && difficulty > 2

  false
end

Key properties:

Allows formidable (difficulty=5) — unlike v1.4.1 which blocked them.
Accepts trivial actions (since v1.5.4/EXP-2) — any action with margin > 1 is viable.
Path 2 still requires challenging+ for tight margins.
precision parameter retained for API stability but not currently used by either path.
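
To make the two paths concrete, here is a standalone copy of the method listed above, run against a few difficulty/stat pairs. The pairs are illustrative values chosen to hit each branch, not data from the logs.

```ruby
# Standalone copy of the method above so it can run outside the class.
# The precision argument is accepted but unused, as noted in the text.
def precision_action_viable?(action, contest_stat, _precision)
  difficulty = action['difficulty'].to_i
  margin = contest_stat - difficulty
  return true if margin > 1                     # Path 1: comfortable margin
  return true if margin > 0 && difficulty > 2   # Path 2: tight margin, challenging+

  false
end

# [difficulty, contest_stat] pairs chosen to exercise each path.
[[1, 5], [4, 5], [2, 3], [5, 5], [5, 7]].each do |difficulty, stat|
  ok = precision_action_viable?({ 'difficulty' => difficulty }, stat, 90)
  puts "difficulty=#{difficulty} stat=#{stat} viable=#{ok}"
end
```

The five cases come out true, true, false, false, true: a trivial action with margin 4 passes Path 1, a difficult action with margin 1 passes Path 2, a straightforward action with margin 1 fails both, and a formidable action needs a stat of at least 6 to pass.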

Bail-Out Formulas

Check	Formula	Applies When
Scribe	`precision >= target` OR `(iters >= 15 OR (iters == 14 AND no action)) AND precision >= target - 5`	Always
Iteration cap	`iterations >= 15`	Always
Resource exhaustion	`(san + res + foc) * 2.25 + precision < target - 5`	`precision <= 80`
Move budget	`(14 - iterations) * 13 < (target - precision - 5)`	`precision <= 80`

Removing the move budget was tested in v1.5.0 (EXP-1) and made results worse: mishap rate rose from 49.9% to 74.8%, and minutes per 80+ sigil rose from 58 to 114. The check was restored in v1.5.2. It acts as a protective early bail-out: sigils falling behind pace are cut loose before they accumulate danger and mishap. The multiplier was later tightened from * 15 to * 13 in v1.5.8/EXP-5 (kept). See EXP-1 in §13 for full data.
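
The two `precision <= 80` checks from the table can be sketched as pure predicates. The method names are hypothetical and the script evaluates these inline; the formulas are copied from the table as written.

```ruby
# Sketch of the resource exhaustion check (formula from the table above).
def resource_exhausted?(sanity, resolve, focus, precision, target)
  precision <= 80 && (sanity + resolve + focus) * 2.25 + precision < target - 5
end

# Sketch of the move budget check (formula from the table above).
def move_budget_exceeded?(iterations, precision, target)
  precision <= 80 && (14 - iterations) * 13 < (target - precision - 5)
end

puts move_budget_exceeded?(10, 30, 90)     # prints true  (52 < 55)
puts move_budget_exceeded?(10, 40, 90)     # prints false (52 < 45 fails)
puts resource_exhausted?(3, 3, 3, 40, 90)  # prints true  (60.25 < 85)
```

At iteration 10 with target 90, the move budget allows at most 52 more precision, so a sigil at 30 is cut loose while one at 40 continues.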

Repair Qualification (line 559)

def select_repair_action(action, contest_stat, precision, repair_target, current_repair)
  return unless action['difficulty'].to_i <= 3
  return unless repair_target.key?("difficulty")
  return unless (contest_stat - action['difficulty'].to_i) >= 2
  return unless @sigil_precision >= (precision - 15)
  return unless action['aspect'] == repair_target['resource']
  # ...
end

Repair is only considered when:

Action difficulty <= 3 (not difficult/formidable)
A repair target was identified in Phase 1
Comfortable margin (>= 2)
Close to target (within 15)
Action's aspect matches the repair target's resource

5. Minigame Mechanics

Game Flow

perc sigil — search for sigils (repeats until found)
perc sigil improve — begin improvement / reroll action menu
perc sigil <VERB> — execute chosen action
scribe sigil — scribe when target precision reached

Resources (0-20 stars each)

Resource	Start	Direction	Role
Danger	0	Increases	Mishap probability. Rises ~1-2 per 3 iterations
Sanity	15	Decreases	Consumed by sanity-cost actions
Resolve	15	Decreases	Consumed by resolve-cost actions
Focus	15	Decreases	Consumed by focus-cost actions

Resources are parsed by counting * in game output: "***-----" → 3
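
A minimal sketch of the star-counting rule, assuming the meter arrives as a string of `*` and `-` characters. The helper name stars_in is hypothetical, not a method in the script.

```ruby
# Hypothetical helper: count filled stars in a resource meter string.
def stars_in(meter)
  meter.count('*')
end

puts stars_in('***-----')  # prints 3
```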

Action Properties

Property	Values	Parsed From
Difficulty	trivial(1), straightforward(2), challenging(3), difficult(4), formidable(5)	1st word
Resource	sanity, resolve, focus	2nd word
Impact	taxing(1), disrupting(2), destroying(3)	3rd word
Verb	game-specific (e.g., FORM, METHOD, STUDY)	4th word
Target	"your" (repair) or "sigil" (improve)	5th word
Aspect	precision, quality, or resource name	6th word

Risk = difficulty + impact. Impact also equals resource drain in stars.
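
The table maps onto the action hash roughly as follows. This is a sketch under two assumptions: that the six properties arrive as six whitespace-separated words in the order listed, and that the verb is lowercased to match the hash examples in §2. The name parse_action and the sample line are hypothetical; the real parsing lives in sigil_info.

```ruby
# Word-to-number tables taken from the Action Properties table.
DIFFICULTY = { 'trivial' => 1, 'straightforward' => 2, 'challenging' => 3,
               'difficult' => 4, 'formidable' => 5 }.freeze
IMPACT = { 'taxing' => 1, 'disrupting' => 2, 'destroying' => 3 }.freeze

# Hypothetical parser: assumes six whitespace-separated words in table order.
def parse_action(words_line)
  difficulty, resource, impact, verb, target, aspect = words_line.split
  d = DIFFICULTY.fetch(difficulty)
  i = IMPACT.fetch(impact)
  {
    'difficulty' => d, 'resource' => resource, 'impact' => i,
    'verb' => verb.downcase,   # assumption: lowercased for the action hash
    'target' => target, 'aspect' => aspect,
    'risk' => d + i            # Risk = difficulty + impact
  }
end

action = parse_action('difficult sanity disrupting STUDY sigil precision')
puts action['risk']  # prints 6
```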

Precision Gains (Empirical, N=142)

Difficulty	Avg Gain	Min	Max	Zero-Gain Rate
trivial(1)	2.0	0	3	15.4%
straightforward(2)	3.9	0	6	13.5%
challenging(3)	9.6	0	14	3.2%
difficult(4)	13.2	11	16	0.0%
formidable(5)	N/A	N/A	N/A	(never taken)

Critical finding: Gains are constant regardless of current precision level. A difficult action at precision 10 gains the same ~13 as at precision 50.

Mishap System

Mishaps end improvement prematurely. They are stochastic and occur at all danger levels (observed at danger 0 through 11). Higher danger raises mishap probability, but no danger level is safe.

Type	Pattern	Effect
Stumble	"About the area you wander"	Sigil lost
Lose Track	"You lose track of your surroundings"	Sigil lost
Sneeze	"A sudden sneeze"	Sigil lost
Chills	"Chills creep down your spine"	Sigil lost
Resource Collapse	"Your resolve/sanity/focus collapses"	Sigil lost + stun
Other-planar	"rouse the attention of some other-planar entity"	Action fails, 0 gain (non-terminal)
Vanished	"The sigil has vanished"	Sigil despawned (non-algorithmic)
Combat	"You are too distracted"	Combat interrupted session

Resource collapse is a distinct failure mode — it happens when an individual resource is critically depleted, even at danger 0.

Starting Precision

Sigils spawn with random starting precision (observed range: 1-15, roughly uniform). Skip filters: target >= 80 → skip if prec < 10; target >= 65 → skip if prec < 5.
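
The skip filter can be sketched as a small predicate. The name skip_sigil? is hypothetical, and the sketch assumes targets below 65 never skip, which the text does not state either way.

```ruby
# Sketch of the starting-precision skip filter described above.
def skip_sigil?(target, starting_precision)
  return starting_precision < 10 if target >= 80
  return starting_precision < 5 if target >= 65

  false  # assumption: no skip filter below target 65
end

puts skip_sigil?(90, 8)   # prints true
puts skip_sigil?(90, 12)  # prints false
puts skip_sigil?(70, 4)   # prints true
puts skip_sigil?(50, 1)   # prints false
```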

Iteration Budget

Hard cap: 15 iterations. Each action OR refresh costs 1 iteration. Typical productive path: ~9-10 actions + ~5-6 refreshes.

Path to 90 (Theoretical)

Starting at precision 12 (typical when filtering >= 10), need ~78 gain. At 13.2 avg per difficult action: 6 difficult actions required. With refreshes, that's ~12 iterations. Achievable but requires favorable RNG.

Iter 1:  improve (reroll) → get menu with difficult action
Iter 2:  difficult action → precision=25 (+13)
Iter 3:  improve (reroll)
Iter 4:  difficult action → precision=38 (+13)
Iter 5:  repair (restore resource)
Iter 6:  difficult action → precision=51 (+13)
Iter 7:  improve (reroll)
Iter 8:  difficult action → precision=64 (+13)
Iter 9:  repair
Iter 10: difficult action → precision=77 (+13)
Iter 11: improve (reroll)
Iter 12: difficult action → precision=90 (+13) → SCRIBE

90+ is a viable target. The math works — 6 difficult actions fit within the 15-iteration cap with room for refreshes and repairs. Success rate is governed by game randomness (action menu RNG, mishap rolls), not by an algorithmic ceiling. The script's job is to maximize the probability by making optimal decisions with whatever actions the game offers.
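
The arithmetic behind the six-action figure, as a short worked example. The 13.2 average comes from the Precision Gains table; the starting precision of 12 is the example value used above.

```ruby
# Worked arithmetic for the theoretical path to 90.
start_precision = 12
target = 90
avg_gain_difficult = 13.2

needed_gain = target - start_precision
difficult_actions = (needed_gain / avg_gain_difficult).ceil

puts needed_gain        # prints 78
puts difficult_actions  # prints 6
```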

6. Known Weaknesses & Improvement Opportunities

6A. High Refresh Rate (37% of iterations wasted)

Problem: The viability filter rejects actions too aggressively, causing refreshes that yield zero precision. A trivial action (+2) is always better than a refresh (+0).

Location: precision_action_viable? (line 544), and the fallback chain.

Improvement: Consider loosening the viability filter for low-difficulty actions.

6B. Low-Value Actions Not Compared to Repair Value

Problem: When only trivial/straightforward precision actions are available, the script takes them rather than repairing a resource to enable a difficult action next turn. Taking +2 now is always worse than enabling +13 next turn.

Improvement: Compare expected value: selected_action.difficulty * ~3.3 vs best_repair_aspect.difficulty * ~3.3 (next turn). If repair enables a substantially better action, prefer repair.

6C. No Composite Resource Health Check

Problem: The script checks danger level but not the minimum across individual resources. Observed: Sigil #84 collapsed at danger=0 because one resource was critically depleted.

Improvement: Add a guard: if [sanity, resolve, focus].min <= 1 → bail out or avoid actions consuming the depleted resource.

6D. Danger Thresholds May Be Misdirected

Problem: Mishaps occurred at danger 0-11 (not just high danger). Conservative play at high danger doesn't reliably prevent mishaps.

Data: Mishaps observed at danger 0-3 (3 events), 5 (3), 7 (3), 11 (3).

Improvement: Consider whether conservative/aggressive modes should be driven by remaining iterations and distance to target rather than danger level alone.

6E. Skip Filter Overhead

The starting-precision skip filter (< 10 for target >= 80) is mathematically correct — a sigil starting at precision 3 cannot reach 90 in 15 iterations. The 61% skip rate is an inherent property of the game's random starting precision distribution, not an algorithm deficiency. The script correctly discards unwinnable sigils early.

7. Testing Guide

Test Setup Pattern

Tests bypass initialize (which calls game APIs) using allocate:

obj = SigilHarvest.allocate
obj.instance_variable_set(:@sigil_precision, 50)
obj.instance_variable_set(:@danger_lvl, 5)
# ... set all required ivars

The helper build_sigilharvest (spec line 241) handles all default ivars. Override only what your test needs:

let(:obj) { build_sigilharvest }

before do
  allow(obj).to receive(:sigil_info).and_return(false)
  allow(obj).to receive(:scribe_sigils)
end

it 'does something' do
  obj.instance_variable_set(:@sigil_precision, 80)
  obj.instance_variable_set(:@danger_lvl, 7)
  obj.instance_variable_set(:@sigil_improvement, [action1, action2])
  obj.send(:improve_sigil, 90)
  expect(obj).to have_received(:sigil_info).with('analyze')
end

Default Resource Levels in `build_sigilharvest`

The helper sets resources to 5 (not the game's starting 15):

obj.instance_variable_set(:@sanity_lvl, 5)
obj.instance_variable_set(:@resolve_lvl, 5)
obj.instance_variable_set(:@focus_lvl, 5)

This is low. Always set explicit resource levels in tests that involve action selection.

Building Actions

action = build_improvement(
  "difficulty" => 4,
  "resource"   => "sanity",
  "impact"     => 2,
  "verb"       => "analyze",
  "target"     => "sigil",
  "aspect"     => "precision",
  "risk"       => 6
)

Defaults: difficulty=3, resource="sanity", impact=2, verb="analyze", target="sigil", aspect="precision", risk=5.

Mock Modules

Tests define these mock modules at top level (not via spec_helper):

Module	Mocks	Key Global
`DRC`	`message`, `bput`	`$sigil_messages`, `$sigil_bput_log/responses`
`DRCA`	`do_buffs`	`$sigil_actions`
`DRCI`	`stow_hands`, `get_item?`, `stow_item?`, `count_item_parts`	`$sigil_actions`, `$sigil_scroll_count`
`DRCC`	`get_crafting_item`, `stow_crafting_item`	`$sigil_actions`
`DRCT`	`walk_to`, `order_item`	`$sigil_walks`, `$sigil_actions`
`DRCM`	`ensure_copper_on_hand`	`$sigil_actions`
`DRStats`	`trader?`, `circle`	Internal `@trader`, `@circle`
`Flags`	`add`, `delete`, `reset`, `[]`, `[]=`	Internal `@flags`
`Room`	`current`, `[]`	Internal `@current_id`

reset_test_state! (spec line 301) clears all globals before each test.

Script Loading

source = File.read(SIGILHARVEST_SOURCE_PATH)
source = source.sub(/\A=begin.*?=end\s*/m, '')           # strip doc block
source = source.sub(/^before_dying do.*?end\s*SigilHarvest\.new\s*\z/m, '')  # strip entry point
eval(source, TOPLEVEL_BINDING, SIGILHARVEST_SOURCE_PATH, 1)

The =begin/=end block and the final SigilHarvest.new + before_dying are stripped so the class is loaded without executing.

Test Compatibility

At the v1.5.0 revert, the existing tests (151) were written for v1.4.1 and failed against v1.5.0. v1.4.1-specific behaviors they covered included: formidable blocking, resource floor check, high_target mode, danger_threshold switching, scribe at iteration 8, skip threshold 13, batch get capture. All of these were removed/reverted in v1.5.0, so the tests had to be rewritten to match the v1.2.0 algorithm. The present suite (129 examples) is current with v2.0.0.

8. Log Analysis Infrastructure

Analyzer: `sigilharvest_analyzer.rb`

Parses structured log output from sigilharvest sessions. Works with v1.2.0+ log format.

Key data structures:

SessionInfo — metadata from the startup banner (version, city, sigil, precision target, techniques)
SigilRun — per-sigil outcome record with fields:
- number, sigil_type, result, target_precision, final_precision
- starting_precision, iterations, final_danger, room, elapsed_minutes
- precision_history, actions_taken, refresh_count, repair_count
- failed_action_count, mishap_type, stop_reason, danger_history
- resource_snapshots, session_index, session_elapsed_minutes

Log format parsed:

== SigilHarvest v1.5.0 ==
[Sigil #1] type=permutation result=mishap precision=42/90 iterations=8 danger=7 room=3 elapsed=2.1m total=5.3m
== End SigilHarvest v1.5.0 ==

The total= field (session elapsed at sigil completion) is optional for backward compatibility.
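
One way to parse the per-sigil summary line with an optional total= field is a regex with named captures. This is a sketch written against the sample line above; SUMMARY_LINE is a hypothetical constant and is not the pattern the analyzer actually uses.

```ruby
# Hypothetical pattern for the per-sigil summary line; total= is optional.
SUMMARY_LINE = /
  \[Sigil\ \#(?<number>\d+)\]\s+
  type=(?<type>\S+)\s+
  result=(?<result>\S+)\s+
  precision=(?<final>\d+)\/(?<target>\d+)\s+
  iterations=(?<iterations>\d+)\s+
  danger=(?<danger>\d+)\s+
  room=(?<room>\d+)\s+
  elapsed=(?<elapsed>[\d.]+)m
  (?:\s+total=(?<total>[\d.]+)m)?
/x

line = '[Sigil #1] type=permutation result=mishap precision=42/90 ' \
       'iterations=8 danger=7 room=3 elapsed=2.1m total=5.3m'
m = SUMMARY_LINE.match(line)
puts m[:final].to_i  # prints 42
puts m[:total].to_f  # prints 5.3

# Older logs without total= still match; the capture is nil.
old = SUMMARY_LINE.match(line.sub(' total=5.3m', ''))
puts old[:total].inspect  # prints nil
```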

Log File Collection Procedure

Game logs are stored at: /Users/grocha/angua/lich-5-mine/logs/DR-<CharName>/<year>/<month>/

Steps to collect session logs:

Identify which characters ran sessions — the user provides character names, start times, and whether sessions span log file boundaries (log rotation at midnight or size threshold).
Handle multi-file sessions — some sessions span two log files (e.g., started in 2026-02-01-0627.log, continued in 2026-02-01-0712.log). These must be concatenated before analysis: cat file1.log file2.log > CharName_HHMM.log

Copy to version-specific directory — store extracted logs under ~/SH_logs/<version>/:

~/SH_logs/v1.2.0/   — 10 v1.2.0 session logs
~/SH_logs/v1.4.1/   — 10 v1.4.1 session logs
~/SH_logs/v1.5.3/   — 10 v1.5.3 baseline session logs
~/SH_logs/v1.5.4/   — 9 v1.5.4 EXP-2 session logs
~/SH_logs/v1.5.5/   — 10 v1.5.5 EXP-3 session logs
~/SH_logs/v1.5.6/   — 10 v1.5.6 EXP-4 session logs
~/SH_logs/v1.5.7/   — 10 v1.5.7 baseline restore session logs
~/SH_logs/v1.5.8/   — 10 v1.5.8 EXP-5 session logs
~/SH_logs/v1.5.9/   — 10 v1.5.9 Illuminated technique test logs (4 truncated)
~/SH_logs/permutation/ — 23 v1.3.x baseline logs

Naming convention: CharName_HHMM.log (start time of session, 24h format).
Verify completeness — each log file should contain both == SigilHarvest v<X> == (banner) and == End SigilHarvest v<X> == (exit summary). If the exit summary is missing, the session was interrupted.

Run analyzer — write a Ruby script that iterates over the log directory and calls the analyzer's parse methods. Example pattern:

require_relative 'sigilharvest_analyzer'
Dir.glob('/Users/grocha/SH_logs/v1.5.0/*.log').each do |f|
  analyzer = SigilHarvestAnalyzer.new
  analyzer.parse_file(f)
  # ... aggregate results
end
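
The completeness check from the steps above can be sketched as a predicate over the log text. The names BANNER_RE, EXIT_RE, and complete_session? are hypothetical, and the sample strings are made up; the patterns are unanchored on the assumption that log lines may carry a prefix.

```ruby
# Sketch: a session is complete when both the banner and exit marker appear.
BANNER_RE = /== SigilHarvest v\S+ ==/
EXIT_RE   = /== End SigilHarvest v\S+ ==/

def complete_session?(log_text)
  log_text.match?(BANNER_RE) && log_text.match?(EXIT_RE)
end

complete    = "== SigilHarvest v1.5.0 ==\nsome output\n== End SigilHarvest v1.5.0 ==\n"
interrupted = "== SigilHarvest v1.5.0 ==\nsome output\n"
puts complete_session?(complete)     # prints true
puts complete_session?(interrupted)  # prints false
```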

Gotchas:

find command can hang/timeout on large log directory trees. Use explicit ls with known directory paths instead.
Some sessions show techniques as ["Inspired", "and Enlightened"] — the "and" isn't stripped by the split regex. Cosmetic only; does not affect analysis.
The session_index field tracks which session a run belongs to within a multi-session file. When analyzing across files, track the filename alongside each run for per-character breakdown.

9. Observed Session Statistics

v1.2.0 vs v1.4.1 Head-to-Head (2026-02-01)

Both groups: Shard / permutation / target=90 / 60 minutes / Inspired+Enlightened techniques. 10 sessions each, across 10 different characters.

Metric	v1.2.0 (10 sessions)	v1.4.1 (10 sessions)
Sigils worked	339	194
Avg precision	54.3	47.8
Max precision	86	86
Scribed (>=90)	1	0
Sigils >= 80	9 (2.7%)	2 (1.0%)
Mishap rate	49.9%	76.8%
Avg danger at mishap	8.2	12.1
Minutes per 80+ sigil	67	300

v1.2.0 matches or outperforms v1.4.1 on every metric (max precision ties at 86).

Key Findings

v1.4.1's filtering over-restricted action selection. The resource floor check, formidable blocking, and high_target mode collectively forced more refreshes and lower precision outcomes. The simpler v1.2.0 selection strategy works better.
v1.4.1 pushes danger to ceiling then mishaps. Danger-at-mishap distribution: v1.4.1 clusters at 17-18 (mean 12.1), while v1.2.0 keeps danger distributed (mean 8.2) and reaches higher precision before mishapping.
Move budget check terminated 37.5% of v1.2.0 sigils at avg precision 55.3. These sigils had remaining iterations that could have gained more precision, so the check was removed in v1.5.0. EXP-1 then showed the removal was harmful, and the check was restored in v1.5.2 (see §13).
Mishaps are stochastic at all danger levels. Conservative danger-based strategy switching provides less value than expected. Aggressive play that reaches high precision quickly (before mishap occurs) appears more effective.

v1.3.2 Baseline (23 sessions, earlier data)

Metric	Value
Minutes per 80+ sigil	~241
Compared to v1.2.0 (75 min reported, 67 min measured)	3.2x regression

10. Version History

Version	Algorithm	Key Changes
v1.2.0	Original "best"	Risk-based selection, allows formidable, serial waitfor, move budget check
v1.3.2	Modified	Various changes from upstream — 3.2x regression vs v1.2.0
v1.4.0	Redesigned	Formidable blocking, resource floor, high_target mode, scribe at iter 8, skip threshold 13, no move budget
v1.4.1	Patched v1.4.0	Added sigil_vanished/combat_distracted stop reasons, per-sigil elapsed time
v1.5.0	Reverted to v1.2.0	v1.2.0 algorithm + v1.4.1 logging infrastructure + move budget removed
v1.5.1	Patch	Added `validate_tools` pre-flight check (burin/bag/settings + inventory)
v1.5.2	Baseline candidate	Restored move budget check (v1.5.0 data proved removal harmful)
v1.5.3	Clean baseline	Version tick only — separates clean runs from v1.5.2 burin-retry noise
v1.5.4	EXP-2 (kept)	Accept trivial actions (Path 1 change) + burin validate_tools fix
v1.5.5	EXP-3 (reverted)	Prefer repair over trivial/straightforward precision actions
v1.5.6	EXP-4 (reverted)	Composite resource health guard (skip actions on near-depleted resources)
v1.5.7	Baseline restore	EXP-4 reverted, back to v1.5.4 algorithm
v1.5.8	EXP-5 (kept)	Tighten move budget formula (15→13 precision/move)
v1.5.9	Technique test	Illuminated Sigil Comprehension enabled (no algorithm change)
v1.5.10	EXP-6	Fix difficulty ordering, filter ACTION verb
v1.5.11	EXP-10+11 (reverted)	Raise skip threshold < 13 + velocity bail-out < 4/iter after 5
v1.5.12	EXP-10 (kept)	Skip threshold < 13 (standalone retest)
v1.5.13	EXP-7 (kept)	Difficulty-based action selection (decouple risk from cost)
v1.5.14	EXP-12 (reverted)	Loosen viability margin (accept margin=0 for challenging+)
v1.5.15	EXP-9 (kept)	Recalibrate resource exhaustion coefficient (2.25→1.75)
v1.5.16	EXP-12r (reverted)	Loosen viability margin (retest with corrected baseline)
v1.5.17	TECH-AWK (kept)	Awakened Sigil Comprehension — >=80 improvement observed, mechanism unknown
v1.5.18	EXP-13 (reverted)	Remove iteration cap + move budget; resource-only bail-out; skip <15 for target 90
v1.5.19	EXP-14 (reverted)	Equalize action costs per Urbaj (all cost labels = 1) — mishap rate +11.4pp vs EXP-13 base
v1.5.20	Baseline restore	Revert to v1.5.17 algorithm + C1 fix (`@actually_scribed`). Missing EXP-9 resource check.
v1.5.21	Corrected baseline	Restores EXP-9 resource exhaustion check. Baseline confirmed: all metrics match v1.5.17.
v1.5.22	EXP-15 (reverted)	Align move budget max with iteration cap (14→15). Mishap +11.7pp (Z=2.67), 0 scribes, >=80 7→6.
v1.5.23	EXP-16 (reverted)	Tighten resource exhaustion coefficient (1.75→1.5). Mathematically impossible at target 90.
v1.5.24	EXP-14r (kept)	Cost equalization clean retest ({1,2,3}→{1,1,1}). Neutral (mishap -1.6pp, n.s.).
v1.5.25	EXP-17 (kept)	Resource-aware tiebreaker on tied difficulty+cost. Prefer most-available resource.
v1.5.26	D7 fix (infra)	Unconditional repair logging. Repairs=0 in 2606 iters. Phase 3/4 CLOSED.
v1.5.27	EXP-18 (kept)	Min difficulty threshold. Skip trivial, refresh for better menu. Avg gain 7.18→8.54.
v2.0.0	Release	v1.5.27 promoted to v2.0.0 for upstream PR. 22 experiments validated, 100% decision agreement.

Infrastructure (all versions v1.5.0+)

Common infrastructure from v1.4.1, present in all v1.5.x versions:

Time-based sessions with @time_limit, time_expired?
Structured logging: log_startup_banner, log_sigil_summary, log_exit_summary
Per-sigil timing (elapsed= and total= in summary line)
Technique detection via get_techniques / format_techniques
Analyzer-compatible output format
Pre-flight tool validation (validate_tools, added in v1.5.1)

Infrastructure Changes (v1.5.18+)

Scribe counting (v1.5.18): scribe_sigils now counts individual scribes and logs "Scribes: N". Analyzer parses this via SCRIBES_COUNT regex and populates scribe_count field on SigilRun. Falls back to counting raw SCRIBE_SUCCESS lines for older logs. Enables per-sigil scribe yield tracking for Awakened analysis.
Terminology rename (v1.5.18→v1.5.19): scroll_count → scribe_count across all files (sigilharvest.lic, sigilharvest_analyzer.rb, reanalyze_all.rb). "Scrolls scribed" → "Scribes" in log output. Aligns terminology with game mechanics (scribing, not scrolling).
Banner cleanup (v1.5.18): Removed belt line from startup banner.
Cross-version analysis (v1.5.17+): reanalyze_all.rb uses all_session_runs(version) to handle merged log files with multiple sessions of the same version (e.g., batch 1 + batch 2 in v1.5.17). Tracks scribe yield metrics: total scribes, avg scribes/sigil, scribes/session.
Session splitter (v1.5.18+): session_splitter.rb extracts SigilHarvest sessions from full game logs. Handles the key problem of sessions split across two log files from the same character (reconnects mid-session). Groups files by character, sorts chronologically, reads as a virtual concatenation so split sessions are seamlessly joined. Usage: ruby session_splitter.rb --version 1.5.18 path1.log path2.log ... Output goes to ~/SH_logs/vX.Y.Z/DR-CharName_timestamp.log. Options: --dry-run, --output DIR, --keep-temp, --version VER.

11. Game Context

Seasonal / City Factors

Sigils appear based on city, sigil type, and season. Room lists loaded from data/sigils.yaml keyed by [City][SigilType][Season].

Cities: Crossing, Riverhaven, Shard. Seasons: spring, summer, autumn, winter.
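
A minimal sketch of the `[City][SigilType][Season]` lookup, using an inline stand-in for data/sigils.yaml. The room numbers are made up for illustration, and the real file may carry more structure than this; only the three-level nesting comes from the text.

```ruby
require 'yaml'

# Inline stand-in for data/sigils.yaml; room numbers are hypothetical.
SAMPLE_YAML = <<~YAML
  Shard:
    permutation:
      winter: [101, 102, 103]
      summer: [201, 202]
YAML

rooms_by_city = YAML.safe_load(SAMPLE_YAML)
rooms = rooms_by_city.dig('Shard', 'permutation', 'winter')
puts rooms.inspect  # prints [101, 102, 103]

# A missing city, type, or season yields nil rather than raising.
puts rooms_by_city.dig('Crossing', 'permutation', 'winter').inspect  # prints nil
```

Using dig keeps a missing combination from raising, so the caller can report "no rooms for this city/sigil/season" instead of crashing.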

Scroll Management

Stacks of 25. Auto-buys when below stock level (default 25).
Prices: Crossing=125kr, Riverhaven=100lr, Shard=90dok.

Trader Luck Mechanic

For Trader guild at circle 65+: speculate luck on first iteration when starting precision >= 14 (line 422). May improve RNG outcomes.

Precision / Clarity Flavor Text

Precision	Description	Clarity	Description
0-29	broad strokes	85-89	exquisite
30-49	thick strands	90-94	flawless
50-69	many fibers	95-97	flawless
70-89	thin lines	98-99	immaculate
90+	(scribing target)

Mishap Patterns (for `@mishaps` regex)

@mishaps = /Chills creep down your spine|About the area you wander|A sudden sneeze|You lose track|You prepare yourself for continued exertion|You are too distracted/

12. Development Practices

Version Bumping

Every change to sigilharvest.lic must include a version bump to the VERSION constant (line 6). The analyzer parses this from log banners, so version changes are how we distinguish data collected under different script behavior.

Patch bump (e.g., 1.5.0 → 1.5.1): Bug fixes, new metadata/logging, minor tuning of thresholds or constants, adding new banner fields.

Minor bump (e.g., 1.5.x → 1.6.0): Algorithm changes that affect sigil outcomes (action selection logic, danger thresholds, skip filters, resource management), new game command integrations, structural refactors.

Major bump (e.g., 1.x → 2.0.0): Fundamental redesign of the improvement loop, breaking changes to log format that require analyzer updates, new operating modes.

Data-Driven Development

All algorithm changes require empirical validation:

Run 10 sessions (same params, same techniques, Inspired+Enlightened) with the change
Analyze with sigilharvest_analyzer.rb — filter by version when log files contain multiple sessions
Compare head-to-head against current baseline
Key metric: minutes per 80+ precision sigil (lower is better)
Supporting metrics: mishap rate, avg precision, sigils worked per session
Only one algorithm change per version — isolate variables
If worse: revert the change, restore baseline, document the result
If better: the new version becomes the baseline for the next experiment

Analysis Script Notes

When writing analysis scripts, use these patterns (learned from prior bugs):

Filter by version: parser.sessions.each_with_index → skip sessions where session.version != target
Classify skipped: iterations == 0 (not result == 'SKIPPED' — parser sets result to "FAILED" for all)
Classify mishaps: stop_reason == :mishap (not result == 'mishap')
Per-character breakdown: extract character name from filename, track file alongside each run
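
The classification rules above can be sketched as follows. Run is a hypothetical stand-in for the analyzer's per-sigil record with only the fields the rules need, and :iteration_cap is a made-up stop reason used to show the fall-through case.

```ruby
# Hypothetical stand-in for the analyzer's per-sigil record.
Run = Struct.new(:iterations, :stop_reason, :result, keyword_init: true)

# Classify by iterations and stop_reason, never by the result string,
# since the parser sets result to "FAILED" for all of these.
def classify(run)
  return :skipped if run.iterations.zero?
  return :mishap if run.stop_reason == :mishap

  :other
end

runs = [
  Run.new(iterations: 0,  stop_reason: nil,            result: 'FAILED'),
  Run.new(iterations: 8,  stop_reason: :mishap,        result: 'FAILED'),
  Run.new(iterations: 15, stop_reason: :iteration_cap, result: 'FAILED')
]
puts runs.map { |r| classify(r) }.tally.inspect
```

All three sample runs share result "FAILED", which is why the result field cannot separate skipped sigils from mishaps.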

13. Experimental Testing Plan

Protocol

Test params: Shard / permutation / target=90 / 60 minutes / Inspired+Enlightened
Sample size: 10 sessions per experiment (9 minimum if a character has a config issue)
Initial baseline: v1.5.2 (= v1.2.0 algorithm + v1.4.1 infrastructure + tools check). Each kept experiment becomes the baseline for the next one.
Procedure: One algorithm change per version. Run 10 sessions. Analyze. Decide keep/revert.
Log storage: ~/SH_logs/v<version>/

Current Baseline: v1.5.4 (EXP-2)

v1.2.0 algorithm with EXP-2 (accept trivial actions), move budget, v1.4.1 logging infrastructure, pre-flight tool validation with multi-attempt burin resolution.

v1.5.4 measured performance (9 sessions, 391 worked, 561 skipped):

Avg precision: 52.4 | Max: 85 | Scribed: 1
Sigils >= 80: 5 (1.3%) | Mishap rate: 55.0%
Avg danger at mishap: 8.5 | Total minutes: 546
Worked/session: 43.4 | >=80/session: 0.6
Min per 80+ sigil: 109

v1.5.3 previous baseline (10 sessions, 335 worked, 495 skipped):

Avg precision: 53.5 | Max: 90 | Scribed: 2
Sigils >= 80: 4 (1.2%) | Mishap rate: 52.8%
Worked/session: 33.5 | >=80/session: 0.4
Min per 80+ sigil: 137

Note: Min-per-80+ has high variance at these sample sizes (~1-3% of sigils reach 80). Per-session normalized metrics (worked/session, >=80/session) are more stable.
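
The derived v1.5.4 figures above follow directly from the raw counts. A short worked example, using only numbers stated in the summary:

```ruby
# Raw counts from the v1.5.4 summary above.
sessions = 9
worked = 391
sigils_80_plus = 5
total_minutes = 546

worked_per_session = (worked.to_f / sessions).round(1)
hits_per_session   = (sigils_80_plus.to_f / sessions).round(1)
hit_rate_pct       = (100.0 * sigils_80_plus / worked).round(1)
min_per_80_plus    = (total_minutes.to_f / sigils_80_plus).round

puts worked_per_session  # prints 43.4
puts hits_per_session    # prints 0.6
puts hit_rate_pct        # prints 1.3
puts min_per_80_plus     # prints 109
```

With only five sigils at 80+, one more or one fewer moves min-per-80+ to 91 or 137, which illustrates why that metric is noisy at this sample size.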

Completed Experiments

EXP-1: Remove move budget check (v1.5.0)

Hypothesis: The move budget formula (14 - iters) * 15 < (target - prec - 5) bails out too early. 37.5% of v1.2.0 sigils hit moves_exhausted at avg precision 55.3. Removing it lets those sigils play their full iterations.
Change: Deleted the move budget check entirely.
Result: WORSE. Mishap rate jumped 49.9% → 74.8%. Min per 80+ sigil: 58 → 114. Without the early bail-out, doomed sigils kept playing, accumulated danger, and mishapped. The move budget was protective, not wasteful.
Action: REVERTED in v1.5.2. Move budget restored.

Metric	v1.2.0 baseline	v1.5.0 (no budget)	Delta
Worked	339	333	-2%
Avg precision	54.3	54.6	+0.6%
Sigils >= 80	9 (2.7%)	5 (1.5%)	-44%
Mishap rate	49.9%	74.8%	+50%
Min per 80+	58	114	+97%

EXP-2: Accept trivial actions when margin > 1 (v1.5.4)

Hypothesis: 37% of iterations are refreshes (zero precision gain). The viability filter rejects trivial actions (difficulty=1) unless danger > 17 or within 5 of target. A trivial action (+2 avg) is always better than a refresh (+0).
Change: In precision_action_viable?, Path 1 simplified from margin > 1 && (difficulty > 1 || @danger_lvl > 17 || @sigil_precision >= (precision - 5)) to just margin > 1. Also includes multi-attempt burin resolution fix in validate_tools.
Sessions: 9 (Barrask, Byd, Fidon, Jazriel, Kythkani, Mahtra, Nelis, Refia, Throve)
Logs: ~/SH_logs/v1.5.4/

Metric	v1.5.3 baseline	v1.5.4 (EXP-2)	Delta
Sessions	10	9	-1
Worked	335	391	+56
Worked/session	33.5	43.4	+29.6%
Avg precision	53.5	52.4	-1.1
Max precision	90	85	-5
Avg iterations	10.8	10.7	-0.1
Sigils >= 80	4 (1.2%)	5 (1.3%)	+1
>=80/session	0.4	0.6	+50%
Mishap rate	52.8%	55.0%	+2.2pp
Avg danger@mishap	8.0	8.5	+0.5
Min per 80+	137	109	-20%
Scribed (>=90)	2	1	-1

Verdict: KEEP — throughput up ~30%, efficiency up ~20%, quality flat within noise. The extra sigils worked per session and the improved min-per-80+ more than compensate for the marginal precision delta (-1.1), which is within statistical variance.
Action: v1.5.4 becomes new baseline for EXP-3.
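
The Path 1 change in EXP-2 can be sketched as two predicates. This is a standalone illustration of the two conditions quoted in the Change line, not the body of `precision_action_viable?`; instance variables are passed as keyword arguments so it runs outside the script.

```ruby
# Path 1 before EXP-2: trivial actions (difficulty 1) pass only at high danger
# or when the sigil is within 5 of the target precision.
def path1_viable_before?(margin:, difficulty:, danger_lvl:, sigil_precision:, precision:)
  margin > 1 && (difficulty > 1 || danger_lvl > 17 || sigil_precision >= (precision - 5))
end

# Path 1 after EXP-2: any action with margin > 1 passes.
def path1_viable_after?(margin:)
  margin > 1
end

# A trivial action early in a sigil: rejected before, accepted after.
early_trivial = { margin: 3, difficulty: 1, danger_lvl: 5, sigil_precision: 40, precision: 90 }
puts path1_viable_before?(**early_trivial)
puts path1_viable_after?(margin: early_trivial[:margin])
```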

EXP-3: Prefer repair over trivial/straightforward precision actions (v1.5.5)

Hypothesis: Taking a trivial (+2) or straightforward (+4) action when a repair could enable a difficult (+13) action next turn is suboptimal. Expected value of repair → difficult is ~13 over 2 turns (6.5/turn) vs trivial's 2/turn.
Change: Between Phase 2 and Phase 3, if selected precision action has difficulty <= 2 and a repair action is available that would enable a harder action, prefer the repair. Same guards as Phase 4: repair budget (< 2 without override), danger <= 18.
Sessions: 10 (Barrask, Byd, Fidon, Gnarta, Jazriel, Kythkani, Mahtra, Nelis, Refia, Throve)
Logs: ~/SH_logs/v1.5.5/

Metric	v1.5.4 baseline	v1.5.5 (EXP-3)	Delta
Sessions	9	10	+1
Worked	391	436	+45
Worked/session	43.4	43.6	+0.2
Avg precision	52.4	51.9	-0.5
Max precision	85	85	0
Avg iterations	10.7	10.8	+0.1
Sigils >= 80	5 (1.3%)	9 (2.1%)	+4
>=80/session	0.6	0.9	+0.3
Mishap rate	55.0%	49.1%	-5.9pp
Avg danger@mishap	8.5	8.3	-0.2
Min per 80+	109	121	+12
Scribed (>=90)	1	0	-1

Verdict: REVERT — Mishap rate improvement (-5.9pp) is borderline significant (z≈1.7, p≈0.09). However, the resource_exhausted stop reason jumped from 2 to 8, and min-per-80+ regressed from 109 to 121. The fundamental assumption was flawed: a repair doesn't guarantee the same difficult action next turn because menus are re-rolled each iteration.
Action: REVERTED in v1.5.6. EXP-3 code removed, baseline restored to v1.5.4.
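
The significance figure in the EXP-3 verdict is consistent with a pooled two-proportion z-test. A sketch of that check follows; the method name is hypothetical, and the mishap counts are reconstructed from the rounded rates (55.0% of 391 is about 215, 49.1% of 436 is about 214), so treat them as approximations rather than logged values.

```ruby
# Two-sided two-proportion z-test using the pooled standard error.
def two_proportion_z(hits_a, n_a, hits_b, n_b)
  p_a = hits_a.to_f / n_a
  p_b = hits_b.to_f / n_b
  pooled = (hits_a + hits_b).to_f / (n_a + n_b)
  std_err = Math.sqrt(pooled * (1 - pooled) * (1.0 / n_a + 1.0 / n_b))
  z = (p_a - p_b) / std_err
  p_value = Math.erfc(z.abs / Math.sqrt(2)) # two-sided normal tail
  [z, p_value]
end

# Reconstructed counts (assumption): v1.5.4 215/391 vs v1.5.5 214/436.
z, p_value = two_proportion_z(215, 391, 214, 436)
puts format('z=%.2f p=%.3f', z, p_value)
```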

EXP-4: Composite resource health guard (v1.5.6)

Hypothesis: Resource collapse (a single resource hitting 0) causes mishaps even at danger 0. Adding a guard that skips actions consuming a near-depleted resource (<=1 star) could prevent these collapses.
Change: In Phase 2 action selection, add next if contest_stat <= 1 before considering any action.
Sessions: 10 (Barrask, Byd, Fidon, Gnarta, Jazriel, Kythkani, Mahtra, Nelis, Refia, Throve)
Logs: ~/SH_logs/v1.5.6/

Metric	v1.5.4 baseline	v1.5.6 (EXP-4)	Delta
Sessions	9	10	+1
Worked	391	346	-45
Worked/session	43.4	34.6	-20%
Avg precision	52.4	51.1	-1.3
Max precision	85	84	-1
Avg iterations	10.7	10.9	+0.2
Sigils >= 80	5 (1.3%)	2 (0.6%)	-3
>=80/session	0.6	0.2	-67%
Mishap rate	55.0%	47.4%	-7.6pp
Avg danger@mishap	8.5	8.1	-0.4
Min per 80+	109	302	+177%
Scribed (>=90)	1	0	-1

Verdict: REVERT — Mishap rate improved (-7.6pp) but at severe cost. Throughput collapsed (-20% worked/session), quality collapsed (-67% >=80/session), min-per-80+ nearly tripled (109→302). The guard blocks too many actions, forcing iterations into refreshes (zero precision gain). moves_exhausted rose (133→142) despite fewer total sigils. The existing resource floor (contest_stat <= risk) already handles this more surgically.
Action: REVERTED in v1.5.7. EXP-4 code removed, baseline restored to v1.5.4.

EXP-5: Tighten move budget formula (v1.5.8)

Hypothesis: The current formula uses 15 precision/move which is optimistic (difficult actions average 13.2). The 37.5% bail-out rate at avg precision 55.3 suggests the formula is roughly calibrated, but tightening to 13 precision/move might bail out slightly earlier on truly hopeless sigils, saving time for new sigils.
Change: (14 - @num_iterations) * 13 < (precision - @sigil_precision - 5)
Sessions: 10 (Barrask, Byd, Fidon, Gnarta, Jazriel, Kythkani, Mahtra, Nelis, Refia, Throve)
Logs: ~/SH_logs/v1.5.8/

Metric	v1.5.4 baseline	v1.5.8 (EXP-5)	Delta
Sessions	9	10	+1
Worked	391	473	+82
Worked/session	43.4	47.3	+9%
Avg precision	52.4	51.6	-0.8
Max precision	85	88	+3
Avg iterations	10.7	10.7	0.0
Sigils >= 80	5 (1.3%)	7 (1.5%)	+2
>=80/session	0.6	0.7	+0.1
Mishap rate	55.0%	47.4%	-7.6pp
Avg danger@mishap	8.5	8.1	-0.4
Min per 80+	109	87	-20%
Scribed (>=90)	1	2	+1

Verdict: KEEP — Mishap rate dropped 7.6pp (55.0% → 47.4%) as tighter budget bails out earlier, converting would-be mishaps into moves_exhausted exits (133→189). Throughput up 9% (47.3 worked/session). Efficiency improved 20% (87 min per 80+ sigil vs 109). Top-end quality slightly improved (max 88 vs 85, 2 scribes vs 1). Avg precision delta (-0.8) within noise — expected since bailing earlier on low-potential sigils lowers the average. No regression on any key metric.
Action: KEPT. v1.5.8 becomes new baseline for technique tests.
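
The move budget check from EXP-1 and EXP-5 can be sketched as a single predicate with the per-move coefficient exposed. The method name and keyword are hypothetical; the formula is the one quoted in the Change line, with instance variables passed as arguments.

```ruby
# True when the remaining moves cannot plausibly close the gap to the target.
# per_move is the assumed precision gained per remaining move:
# 15 in the original formula, 13 after EXP-5.
def move_budget_exhausted?(num_iterations, sigil_precision, precision, per_move: 13)
  (14 - num_iterations) * per_move < (precision - sigil_precision - 5)
end

# Iteration 10, precision 30, target 90: 4 moves left, gap of 55 after the 5-point allowance.
puts move_budget_exhausted?(10, 30, 90, per_move: 15) # 60 < 55 is false, keep playing
puts move_budget_exhausted?(10, 30, 90, per_move: 13) # 52 < 55 is true, bail out
```

The example shows the kind of borderline sigil that the tighter coefficient now abandons one step earlier.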

In-Progress Experiments

EXP-13 (v1.5.18) — complete, REVERT. 11 sessions analyzed. Removed iteration cap + move budget, resource-only bail-out, skip <15 for target 90. All key metrics regressed: worked/session 25.2→9.2, >=80/session 0.64→0.18, scribes 3→0, mishap rate 42%→63.4%. The aggressive skip threshold (<15) eliminated too many sigils, and removing the iteration cap increased mishap exposure without producing higher precision outcomes. Post-simulation analysis (SIM-7) additionally proved skip <15 is catastrophic: ALL 3 v1.5.17 scribes started at precision 13. See full results and simulation analysis below.

EXP-14 (v1.5.19) — complete, REVERT. 11 sessions analyzed. Equalized action costs (all cost labels = 1) per Urbaj's observation that difficulty determines resource cost. Ran on EXP-13 code base (inherits removed iteration cap, resource-only bail-out, skip <15). Results vs v1.5.18 (isolating cost equalization): mishap rate 63.4%→74.8% (+11.4pp), 0 real scribes (1 C1 fake), >=80/session 0.18→0.27 (noise). Cost equalization removed the disincentive for dangerous actions, causing more mishaps without compensating gains. See full results below.

v1.5.20 — complete, partial baseline. 11 sessions. Reverted EXP-13+14 to v1.5.17 algorithm. C1 bug fixed (@actually_scribed flag). D7 analyzer fix deployed. However, EXP-9 resource exhaustion check was accidentally dropped during revert — 0% resource_exhausted vs 12.1% in v1.5.17 baseline. Mishap rate 46.9% (vs 42.0% baseline), moves_exhausted 46.1% (vs 37.7% — absorbed the missing resource exits). 1 real scribe (Mahtra, prec=88, 2 scrolls). C1 fix validated: zero fake SCRIBEDs. D7 fix validated: code correct but 0 repairs in sample.

v1.5.21 — complete, BASELINE CONFIRMED. 11 sessions. Restores EXP-9 resource exhaustion check. All metrics match v1.5.17 within normal variance:

Worked/session: 23.8 (vs 25.2) — within variance
>=80/session: 0.64 (vs 0.64) — exact match
Mishap rate: 40.8% (vs 42.0%) — match
resource_exhausted: 13.4% (vs 12.1%) — restored (was 0% in v1.5.20)
moves_exhausted: 39.7% (vs 37.7%) — match
0 scribes in 262 worked (expected ~0.5 at baseline rate — within Poisson variance)
7 sigils >=80: 4 stopped by moves_exhausted at 80-84, 3 by mishap at 82-87
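
The "within Poisson variance" remark on the zero-scribe result can be checked directly. A minimal sketch, assuming scribes per experiment are Poisson-distributed with the expected count of ~0.5 quoted above; the method name is hypothetical.

```ruby
# Poisson probability of observing exactly k events when `expected` are expected.
def poisson_pmf(k, expected)
  Math.exp(-expected) * expected**k / (1..k).reduce(1, :*)
end

# Probability of zero scribes when ~0.5 are expected.
puts poisson_pmf(0, 0.5).round(3)
```

Under that assumption, zero scribes is the single most likely outcome (about 61%), so the result carries no evidence of a regression.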

EXP-15 (v1.5.22) — complete, REVERT. 11 sessions. Aligned move budget max with iteration cap (14→15). Mechanically worked: moves_exhausted 39.7%→18.1%, 4 sigils reached iteration_cap (3 at precision 83, gap=2 from scribe). But cost was severe: mishap rate 40.8%→52.5% (+11.7pp, Z=2.67, p<0.01), sigil_vanished 6.1%→12.4%. The ~57 freed sigils mostly ended in mishaps (29) or vanishes (16). >=80 count 7→6 (noise). 0 scribes. Key insight: the old formula's off-by-one was functioning as a safety guardrail. Extending sigils into the high-danger zone costs more than it gains.

v1.5.26 — complete, Phase 3 CLOSED. 11 sessions, 255 worked sigils. D7 fix validated: repair logging is unconditional, but zero repairs in 2606 iterations. The repair code path never triggers because difficulty-first selection (v1.5.20+) always finds a viable precision action. Historical v1.5.17 data (5 repairs) shows repairs are actively harmful (60% mishap rate). Phases 3 and 4 (repair experiments) are closed — repairs are a non-factor. Confirmed neutral vs v1.5.25: mishap +8.1pp (n.s. p=0.06), 1 real scribe (Fidon, 3 scrolls).

v1.5.27 — complete, KEPT. EXP-18: minimum difficulty threshold. Skip trivial-difficulty precision actions, refresh for better menu. Avg gain 7.18→8.54 (+1.36, exceeded v1.2.0's 8.39). Trivial-range gains 25.8%→3.6%. Effective gain/iter 3.67→3.75. 60+ rate +6.9pp. Resource exhausted halved (11.8%→5.9%). Most effective change since EXP-6.

Status: v2.0.0 released. Promoted to upstream PR as sigilharvest_overhaul.

Gain distribution now matches v1.2.0 — gain optimization lever exhausted.
Remaining levers: mishap reduction (1.3x at -50%), moves_exhausted optimization.
Retrospective simulation: 100% decision agreement (15/15). Net: v1.2.0 2.9% → v2.0.0 3.0%.

EXP-6: Difficulty fix + ACTION filter (v1.5.10) — Completed, KEPT

Sessions: 10 (all complete 60min)
Logs: ~/SH_logs/v1.5.10/

Metric	v1.5.8 (baseline)	v1.5.10 (EXP-6)	Delta
Worked	473	432	-41
Avg precision	51.6	50.8	-0.8
Max precision	88	85	-3
Sigils >= 80	7 (1.5%)	3 (0.7%)	-0.8pp
Mishap rate	47.4%	37.5%	-9.9pp
moves_exhausted	189	224	+35
Min per 80+	87	202	+115
Refresh rate	40.6%	44.2%	+3.6pp

EXP-6 verification:
- ACTION verb usage: v1.5.8=292, v1.5.10=0. Filter working perfectly.
- Difficulty ordering confirmed: median gains trivial(2) < straightforward(5) < formidable(7) < challenging(9) < difficult(13).
- Per-difficulty gains unchanged — fixes affect selection, not outcomes per action.
Verdict: KEEP. Both bug fixes confirmed working. Mishap rate dropped 9.9pp (largest single-experiment improvement). Sigils that previously mishapped now exhaust move budget instead. 80+ count dip (7→3) is statistically insignificant at these sample sizes. The fixes are objectively correct and required for all downstream experiments.

EXP-10+11: Skip threshold + velocity bail-out (v1.5.11) — Completed, REVERTED

Sessions: 10 (all complete 60min)
Logs: ~/SH_logs/v1.5.11/

Metric	v1.5.10 (baseline)	v1.5.11 (EXP-10+11)	Delta
Worked	432	298	-134
Skipped	700	1092	+392
Skip rate	61.8%	78.6%	+16.8pp
Avg precision	50.8	38.2	-12.6
Max precision	85	82	-3
Avg iterations	10.8	7.0	-3.8
Scribed (>=90)	1	0	-1
Sigils >= 80	3 (0.7%)	1 (0.3%)	-2
Mishap rate	37.5%	13.4%	-24.1pp
>=80/session	0.3	0.1	-0.2
Min per 80+	202	602	+400

EXP-10 (skip threshold): 1035 triggers. Zero false positives on baseline data. Mechanically correct, but effect swamped by EXP-11.
EXP-11 (velocity bail-out): 220 bail-outs out of 298 worked sigils (74%). The simulation predicted ~3.7% (16/432). Root cause: the simulation only checked velocity at iteration 5, while the live code checked at every iteration >= 5. A sigil that passes at iter 5 can dip below 4.0/iter at iters 6-12 and trigger a late bail-out. The sigils with an unknown stop reason (221, avg_prec=31.8) are the velocity bail-outs.
Low mishap rate is misleading: Sigils are bailed before reaching high enough danger/precision to mishap. Not a real safety improvement.
Verdict: REVERTED. EXP-11 catastrophically over-triggered due to flawed simulation methodology. EXP-10 (skip threshold alone) remains viable for standalone testing. EXP-11 killed — continuous velocity check is fundamentally broken. A single-check-at-iter-5 variant could be revisited but the effect size is small (16/432 = 3.7%, all avg final 38) and would need fresh simulation.
Action: Reverted to v1.5.10 algorithm. Version bumped to v1.5.12. 121 tests passing.
Lesson: Always simulate the exact check logic (every-iteration vs single-check).
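
The gap between the simulated and live EXP-11 checks can be reproduced with a small sketch. It assumes velocity means total precision gained divided by iterations played, which this section does not define, and the trace is invented for illustration; all method names are hypothetical.

```ruby
VELOCITY_FLOOR = 4.0 # precision per iteration, the bail-out floor quoted above

# Velocity after `iter` iterations (assumed definition: total gain / iterations).
def velocity(start_precision, precisions, iter)
  (precisions[iter - 1] - start_precision).to_f / iter
end

# Simulation logic: check once, at iteration 5 only.
def bails_single_check?(start_precision, precisions)
  precisions.size >= 5 && velocity(start_precision, precisions, 5) < VELOCITY_FLOOR
end

# Live v1.5.11 logic: check at every iteration >= 5.
def bails_continuous?(start_precision, precisions)
  (5..precisions.size).any? { |iter| velocity(start_precision, precisions, iter) < VELOCITY_FLOOR }
end

# Hypothetical trace: precision after each iteration, starting from 15.
trace = [20, 26, 30, 33, 36, 36, 36, 49]
puts bails_single_check?(15, trace) # 4.2/iter at iteration 5, passes
puts bails_continuous?(15, trace)   # 3.5/iter at iteration 6, bails
```

Two refresh iterations in a row are enough to trip the continuous check, even though the same sigil passes the single check the simulation modeled.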

EXP-10: Skip threshold < 13 standalone (v1.5.12) — Completed, KEPT

Background: EXP-10 was bundled with EXP-11 in v1.5.11 but EXP-11 catastrophically over-triggered, swamping EXP-10's effect. Retested standalone after reverting EXP-11.
Change: Raise skip threshold for target >= 80 from < 10 to < 13.
Sessions: 10 (Barrask, Byd, Fidon, Gnarta, Jazriel, Kythkani, Mahtra, Nelis, Refia, Throve)
Logs: ~/SH_logs/v1.5.12/

Metric	v1.5.10 (baseline)	v1.5.12 (EXP-10)	Delta
Sessions	10	10	0
Worked	432	326	-106
Skipped	700	1169	+469
Skip rate	61.8%	78.2%	+16.4pp
Avg precision	50.8	48.3	-2.5
Max precision	85	84	-1
Sigils >= 80	3 (0.7%)	3 (0.9%)	0
>=80/session	0.3	0.3	0.0
Mishap rate	37.5%	37.1%	-0.4pp
Min per 80+	202	201	-1
Scribed (>=90)	1	0	-1
Encountered/session	113.2	149.5	+36.3

EXP-10 verification:
- Skip triggers: 875 "below 13" in v1.5.12 vs 593 "below 10" in baseline. Working correctly.
- False positives: 0 of 229 baseline sigils starting at 10-12 reached 80+. Zero FPs confirmed across all cumulative data (1050+ eligible sigils).
- Time saved: 229 baseline sigils × 10.8 avg iters = ~2464 iterations eliminated per 10 sessions.
- Encountered +36 more sigils per session from faster skipping.
Avg precision drop (-2.5): Unexpected but explained by session variance. The 13-14 band dropped from 51.1 to 47.0 (removing 10-12 starters should have raised the average). Natural character/session variation, not a threshold effect.
Verdict: KEEP. Neutral on the key metric (>=80/session = 0.3 in both). Mechanically correct with zero false positives across all data ever collected. Eliminates ~246 wasted iterations per session (~2464 per 10 sessions) on provably unproductive sigils. Safe, conservative filter.
Action: KEPT. v1.5.12 becomes new baseline. EXP-7 staged as v1.5.13.
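
The EXP-10 rule and its false-positive check can be sketched as follows. Method names and the run hashes are hypothetical; the sketch models only the target >= 80 branch described in the Change line and never skips lower targets, which is an assumption rather than documented behavior.

```ruby
# Skip rule for high targets: threshold was 10 before EXP-10 and 13 after.
def skip_sigil?(starting_precision, target, threshold: 13)
  target >= 80 && starting_precision < threshold
end

# Counts baseline runs in the newly skipped band that still reached 80+.
# Each run is a hypothetical hash with :start and :final precision.
def false_positives(runs, band: 10..12, success: 80)
  eligible = runs.select { |run| band.cover?(run[:start]) }
  { eligible: eligible.size, reached: eligible.count { |run| run[:final] >= success } }
end

sample_runs = [{ start: 11, final: 62 }, { start: 12, final: 45 }, { start: 14, final: 83 }]
puts false_positives(sample_runs)
```

A `reached` count of zero over the full baseline is what justifies raising the threshold: no sigil the new rule would skip ever got to 80+.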

EXP-7: Difficulty-based action selection (v1.5.13) — Completed, KEPT (neutral)

Background: EXP-6 fixed difficulty ordering; EXP-7 replaces risk-based action selection with difficulty-first, cost-as-tiebreaker. Data shows gain determined entirely by difficulty (trivial=2.3 to difficult=13.3), cost has zero correlation.
Change: Phase 2 action selection: prefer highest difficulty, break ties by lowest impact.
Sessions: 10 (Barrask, Fidon, Throve + 7 others)
Logs: ~/SH_logs/v1.5.13/

Metric	v1.5.12 (baseline)	v1.5.13 (EXP-7)	Delta
Sessions	10	10	0
Worked	326	244	-82
Avg precision	48.3	53.2	+4.9
Avg iterations	9.7	11.0	+1.3
Sigils >= 80	3 (0.9%)	2 (0.8%)	-1
>=80/session	0.3	0.2	-0.1
Mishap rate	37.1%	41.0%	+3.9pp
Scribed	0	1	+1

Note: v1.5.13 worked count corrected from 327→244 by session-filtered re-analysis (Feb 2026). The v1.5.13 logs contained sessions from v1.5.11/v1.5.12/v1.5.13; only v1.5.13 sessions are now counted. v1.5.12 baseline (326) was already correct.

Key finding — viability filter is the binding constraint: Difficulty distribution did NOT shift between versions (~20% each difficulty in both). The viability filter typically leaves only 1 precision action viable per iteration, making selection preference irrelevant. The algorithm change is correct (better heuristic), but the practical effect is masked by the filter bottleneck.
Worked count drop (-82): v1.5.13 worked fewer sigils due to session variance (fewer encountered: 130.7 vs 149.5/session). The avg precision increase (+4.9) and higher avg iterations (11.0 vs 9.7) are consistent with spending more time per sigil.
Avg precision artifact: 65 "unknown" stop reasons in v1.5.12 (avg_prec=33.2) disappeared in v1.5.13 (parser improvement). Adjusting for these, the real v1.5.12 avg was ~52.1, making the actual delta ~+1.1 (within noise).
Verdict: KEEP (neutral). Correct heuristic with no downside. Practical effect masked by viability filter constraint. The important outcome is the architectural insight: the filter, not selection, controls outcomes. This redirects optimization to EXP-12 (viability loosening).
Action: KEPT. v1.5.13 becomes new baseline.
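
The avg precision adjustment described for EXP-7 is a mean with one subgroup removed. A short worked sketch using the figures above (326 worked at 48.3, minus 65 unknown-stop sigils at 33.2); the method name is hypothetical.

```ruby
# Mean of a group after removing a subgroup with a known count and mean.
def mean_excluding(total_n, total_mean, sub_n, sub_mean)
  (total_n * total_mean - sub_n * sub_mean) / (total_n - sub_n).to_f
end

adjusted = mean_excluding(326, 48.3, 65, 33.2)
puts adjusted.round(1)          # adjusted v1.5.12 average
puts (53.2 - adjusted).round(1) # remaining delta vs v1.5.13
```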

Viability Filter Analysis (post-EXP-7 investigation)

EXP-7 revealed the viability filter as the binding constraint. Full analysis across 8,165 iterations (v1.5.10 + v1.5.12 + v1.5.13):

Menu composition: 93.5% of menus have 1+ precision actions (6.5% have none — game constraint). 69.4% have 2+ precision actions. Menus are NOT the bottleneck.
Viability filter acceptance: 91.9% of precision actions pass viability (using post-action resource values). Only 1,408 rejections across 8,165 iterations.
IMPORTANT timing bias: The analysis used POST-action resource values. For IMPROVE iterations, these are inflated (IMPROVE restores resources). The actual script checks viability with PRE-action (depleted) values. This means the 91.9% acceptance rate overstates reality. The true acceptance rate during refresh iterations is much lower.
Refresh rate breakdown (44.5% = 3,630 / 8,165):
- ~530 (14.6%): Menu had zero precision actions (game constraint, unfixable)
- ~3,100 (85.4%): Menu had precision, but viability rejected all (filter constraint)
- The viability filter IS the primary bottleneck, not a separate "recover mode."
Rejection reasons (of 1,408 post-action rejections):
- 87.1% margin <= 0 (stat too low for difficulty)
- 12.9% margin=1 with trivial/straightforward (filter disallows low-difficulty at tight margin)
Counterfactual scenarios (biased low due to timing issue):
- Scenario A (accept margin=1 for all difficulties): converts 106 refreshes, +0.5 prec/sigil
- Scenario B (accept margin=0 for challenging+): converts 163 refreshes, +1.8 prec/sigil
- Scenario C (both): converts 262 refreshes, -3.3pp refresh rate
- Real impact likely 2-3x these estimates after correcting for timing bias.
Implication: EXP-12 (loosen viability to margin >= 0 for challenging+) is the next highest-leverage experiment. EXP-8 (repair window) dropped — repairs are too rare (0-4 per 10 sessions) and the viability filter is the real bottleneck.

EXP-12: Loosen viability margin (v1.5.14) — Completed, REVERTED

Change: precision_action_viable? Path 2: margin > 0 → margin >= 0 for challenging+. Accepts actions where stat == difficulty for formidable/challenging/difficult.
Sessions: 10 (Throve/Refia/Byd split)
Logs: ~/SH_logs/v1.5.14/

Layer 2 metrics (raw log parsing, reliable):

Metric	v1.5.13 (baseline)	v1.5.14 (EXP-12)	Delta
Refresh rate	63.8%	62.9%	-0.9pp
Gain/action	7.86	8.21	+0.35
Mishap/iter	3.84%	4.25%	+0.41pp
Difficulty shift	—	+3pp difficult, +1.4pp challenging, -2.6pp formidable	—

Layer 1 metrics (session-filtered, corrected Feb 2026):

Metric	v1.5.13 (baseline)	v1.5.14 (EXP-12)	Delta
Sessions	10	10	0
Worked	244	233	-11
Scribed	1	2	+1
>=80	2	6	+4
>=80/session	0.2	0.6	+0.4
Avg precision	53.2	54.6	+1.4
Mishap rate	41.0%	47.6%	+6.6pp
Avg iterations	11.0	10.8	-0.2

The previous analysis used an overcounted v1.5.13 baseline (593 worked, counting all versions in the directory). The corrected analysis filters to v1.5.13 sessions only (244 worked).

Analysis (corrected): Session-filtered Layer 1 reveals a stronger positive signal than originally assessed. >=80 tripled (2→6), >=80/session tripled (0.2→0.6), scribes doubled (1→2), avg precision up +1.4. Layer 2 confirms per-iteration metrics are near-flat (refresh rate -0.9pp, gain/action +0.35, mishap/iter +0.41pp). The per-sigil mishap rate increased +6.6pp (41.0→47.6%), but this was partially masked in the original analysis by the inflated baseline denominator (593 worked → artificial 39.3% mishap rate).
Revert decision context: The revert was made based on the overcounted analysis showing +8.3pp mishap with only modest positive signals. With corrected data, the >=80 improvement is substantial (+4 sigils, 3x improvement) and the mishap delta is +6.6pp. Consider re-testing EXP-12 with corrected measurement infrastructure.
Verdict: REVERTED (decision made with overcounted data). Corrected analysis suggests the experiment may warrant re-testing.
Action: REVERTED. v1.5.13 remains baseline. EXP-9 staged as v1.5.15.

EXP-12 Retest: Loosen viability margin (v1.5.16) — Completed, REVERTED

Change: Identical to v1.5.14: precision_action_viable? Path 2: margin > 0 → margin >= 0 for challenging+. Re-test with session-filtered baseline after corrected analysis suggested the original positive signal (>=80 tripled) may have warranted keeping.
Baseline: v1.5.15 (includes EXP-9 resource exhaust coeff 1.75)
Sessions: 10
Logs: ~/SH_logs/v1.5.16/

Metric	v1.5.15 (baseline)	v1.5.16 (EXP-12r)	Delta
Sessions	10	10	0
Worked	265	271	+6
Skipped	1025	1104	+79
Scribed	1	0	-1
>=80	1	0	-1
>=80/session	0.1	0.0	-0.1
Avg precision	51.2	51.6	+0.4
Max precision	85	79	-6
Avg iterations	10.4	10.4	0.0
Mishaps	114	129	+15
Mishap rate	43.0%	47.6%	+4.6pp

Stop reasons:

Reason	v1.5.15	v1.5.16	Delta
mishap	114	129	+15
moves_exhausted	107	92	-15
resource_exhausted	26	32	+6
scribed	1	0	-1
sigil_vanished	17	18	+1

Analysis: The retest against the correct baseline (v1.5.15, which includes EXP-9) confirms the original revert decision. The positive signal seen in v1.5.14 (>=80 tripled 2→6 vs v1.5.13) does not reproduce against v1.5.15: zero sigils reached 80+, max precision dropped to 79, and mishap rate increased +4.6pp (43.0→47.6%). The relaxed viability margin allows marginal actions that produce more mishaps without compensating precision gains.
Verdict: REVERTED. Confirmed harmful. The original v1.5.14 positive signal was likely noise or an artifact of the v1.5.13 baseline lacking EXP-9's resource exhaustion change.
Action: REVERTED. v1.5.15 remains current baseline for future experiments.

Technique Test: Illuminated Sigil Comprehension (v1.5.9) — Completed

Background: Per Elanthipedia, all Sigil Comprehension technique bonuses are "globally disabled." There are 4 technique levels: Inspired, Enlightened, Illuminated, Awakened. Inspired and Enlightened have been enabled throughout all experiments (base effects only, bonuses disabled). Illuminated and Awakened are listed as "NOT enabled" on the wiki.
Goal: Determine if enabling Illuminated Sigil Comprehension has any measurable effect on sigil harvesting outcomes.
Change: Version tick only (v1.5.8 → v1.5.9). No algorithm change. All characters trained Illuminated Sigil Comprehension before running.
Sessions: 20 (10 original + 10 additional)
Logs: ~/SH_logs/v1.5.9/ (20 files: *_1353.log + *_1507.log)

Metric	v1.5.8 (10 sess)	v1.5.9 (20 sess)	Delta
Worked	473	790	+317
Worked/session	47.3	39.5	-7.8
Avg precision	51.6	52.0	+0.4
Max precision	88	89	+1
Avg iterations	10.7	10.7	0.0
Scribed (>=90)	2	5	+3
Sigils >= 80	7 (1.5%)	11 (1.4%)	-0.1pp
Mishap rate	47.4%	45.4%	-2.0pp
Min per 80+	87	107	+20

Verdict: No effect. Illuminated Sigil Comprehension is confirmed disabled, as the wiki states. Doubled sample size (20 sessions, 790 worked sigils) confirms all key metrics within noise of v1.5.8. No systematic shift attributable to the technique.

Session-Filtering Fix (Feb 2026)

Log files contain output from entire game sessions, which may include multiple SigilHarvest invocations across different versions. For example, the v1.5.13 logs contain sessions from v1.5.11, v1.5.12, and v1.5.13. Analysis scripts must filter to only the correct version's session per file.

Fix applied: Added last_session_runs(version) helper to LogParser. All analysis scripts updated to use session filtering. Re-analysis of all experiments with corrected methodology.

Validated (numbers unchanged): EXP-6, EXP-10+11, EXP-10, EXP-5 — earlier experiments used analysis scripts that already had session filtering (or had clean single-version logs).

Corrected: EXP-7 test (v1.5.13 worked: 327→244), EXP-12 baseline+test (v1.5.13 593→244, v1.5.14 364→233), EXP-9 baseline (v1.5.13 593→244). The overcounting originated from flat_map(&:sigil_runs) without version filtering, counting all sessions in the log file regardless of version.

Impact on decisions: EXP-12 revert decision was made with inflated baseline (593 worked, artificial 39.3% mishap rate). Corrected data showed >=80 tripled (2→6) with +6.6pp mishap (41.0→47.6%), suggesting a possible positive signal. Re-test completed (v1.5.16): the positive signal did not reproduce against the correct baseline (v1.5.15). Zero sigils reached 80+, max precision dropped to 79, mishap rate +4.6pp. Original revert decision confirmed.

Queued Experiments

Ordered by expected impact and dependency chain. One experiment per version, no bundling.

Version	Experiment	Description	Status
v1.5.10	EXP-6 (kept)	Fix difficulty ordering + filter ACTION verb	Complete
v1.5.11	EXP-10+11 (reverted)	Skip threshold + velocity bail-out (bundled)	Complete — over-triggered
v1.5.12	EXP-10 (kept)	Skip threshold < 13 (standalone)	Complete
v1.5.13	EXP-7 (kept)	Difficulty-based action selection (decouple risk)	Complete
v1.5.14	EXP-12 (reverted)	Loosen viability margin (accept margin=0 for challenging+)	Complete — revert confirmed by v1.5.16 retest
v1.5.15	EXP-9 (kept)	Recalibrate resource exhaustion coefficient (2.25→1.75)	Complete — KEPT (neutral)
v1.5.16	EXP-12 retest (reverted)	Loosen viability margin (re-test with session-filtered baseline)	Complete — confirmed harmful
v1.5.17	Awakened technique (kept)	Confirm if technique is active (target 90 test)	Complete — >=80 improvement observed (22 sessions), mechanism unknown
v1.5.18	EXP-13 (revert)	Remove iteration cap + move budget check (resource-only bail-out)	Complete — all metrics regressed (0 scribes, mishap 63%, worked/sess 9.2)
v1.5.19	EXP-14 (revert)	Equalize action costs (Urbaj)	Complete — REVERT, but confounded (tested on EXP-13 broken base, not v1.5.17). Clean retest completed as v1.5.24 (KEPT).
v1.5.20	Baseline restore	Revert EXP-13+14, add C1 fix	Complete — missing EXP-9 resource check (0% resource_exhausted vs 12.1% baseline). C1 fix validated.
v1.5.21	Corrected baseline	Restore EXP-9 resource check	Complete — baseline confirmed, all metrics match v1.5.17
v1.5.22	EXP-15 (revert)	Align move budget max with iteration cap (14→15)	Complete — REVERT. Mishap +11.7pp, >=80 7→6, 0 scribes
v1.5.23	EXP-16 (revert)	Tighten resource exhaustion coefficient (1.75→1.5)	Complete — REVERT. Coefficient mathematically impossible: need prec>=18 at max resources. 0 worked, 100% skip.
v1.5.24	EXP-14 retest (kept)	Equalize action costs — clean standalone test ({1,2,3}→{1,1,1}) vs v1.5.21 baseline	Complete — KEPT. Neutral: mishap -1.6pp (n.s.), 2 real scribes, original "harmful" verdict was confounded
v1.5.25	EXP-17 (kept)	Resource-aware tiebreaker — when 2+ actions share highest difficulty, prefer action draining most-available resource	Complete — KEPT (neutral). Mishap -5.3pp (n.s. p=0.21), resource_exhausted -1.6pp, mechanically coherent stop-reason shift
v1.5.26	D7 fix (infrastructure)	Make repair logging unconditional — repairs confirmed non-existent	Complete — Phase 3 CLOSED
v1.5.27	EXP-18 (kept)	Minimum difficulty threshold — skip trivial (difficulty=1) precision actions, refresh for better menu	Complete — KEPT. Avg gain +1.36 (7.18→8.54), trivial-range 25.8%→3.6%, 60+ rate +6.9pp

EXP-6: Fix difficulty ordering + filter ACTION verb (v1.5.10)

Hypothesis: Two confirmed bugs compound to reduce precision gains.
1. Difficulty ordering bug: formidable is ranked 5 (highest) when it should be 3. Measured median gain: formidable=6, challenging=8, difficult=12. The algorithm selects formidable over difficult when close to target (precision >= 70), losing ~6 median precision per affected iteration at the most critical stage. Confirmed consistent across all v1.5.2–v1.5.8 data (8-9% of iterations affected per version).
2. ACTION verb bug: Per Elanthipedia, with ACTION "there is a good chance nothing will happen but the danger level will rise." Confirmed: ACTION has a 21% zero-gain rate vs 0.0% for all other verbs. That is ~550 ACTION executions per 10 sessions, of which ~120 are completely wasted (zero gain + danger increase). No other verb ever produces zero gain.
Changes:
- @action_difficulty: formidable => 3, challenging => 4, difficult => 5
- Skip actions where verb == "ACTION" during action selection
Risk: Low. Both are bug fixes backed by empirical data. Combined because neither is an algorithm hypothesis — they're corrections to known-wrong behavior.
Expected impact: Better end-game precision (difficult selected over formidable when close to target), ~120 fewer wasted iterations per 10 sessions, lower cumulative danger.
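
The two EXP-6 corrections can be sketched together. The constant and method below are illustrative, not the script's own definitions: the formidable, challenging, and difficult ranks come from the Changes list, while the trivial and straightforward ranks are taken from the difficulty values cited elsewhere in this guide. The menu hashes are hypothetical.

```ruby
# Corrected difficulty ranking: formidable is 3, not 5.
ACTION_DIFFICULTY = {
  'trivial' => 1, 'straightforward' => 2,
  'formidable' => 3, 'challenging' => 4, 'difficult' => 5
}.freeze

# Drop the ACTION verb before selection; it is the only verb with zero-gain outcomes.
def selectable_actions(menu)
  menu.reject { |action| action[:verb] == 'ACTION' }
end

# Hypothetical menu entries with a verb and a difficulty label.
menu = [
  { verb: 'ACTION', label: 'difficult' },
  { verb: 'STUDY', label: 'formidable' },
  { verb: 'TRACE', label: 'difficult' }
]
best = selectable_actions(menu).max_by { |action| ACTION_DIFFICULTY.fetch(action[:label]) }
puts best[:verb]
```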

EXP-7: Difficulty-based action selection (decouple risk) — v1.5.13, Completed, KEPT (neutral)

Hypothesis: The current risk = difficulty + cost composite conflates reward potential with resource drain. The algorithm picks ~20% each difficulty level regardless of distance. Decoupling reveals: gain is determined entirely by difficulty (trivial=2.3 to difficult=13.3), cost has zero correlation with gain (taxing=7.0, disrupting=6.8, destroying=7.0). This holds across all distances and all difficulty x cost combinations.
Calibration data (v1.5.10, 1971 actions post-difficulty-fix):
- Gain by difficulty: trivial=2.3, straightforward=4.5, formidable=6.8, challenging=9.6, difficult=13.3
- Gain by cost: taxing=7.0, disrupting=6.8, destroying=7.0 (no signal)
- Current selection: ~20% each difficulty (nearly uniform, ineffective)
- Theoretical uplift: +6.27 gain/iter if always picking difficult (13.3 vs 7.0 current avg)
- Over 10 iterations: +62.7 precision (obviously bounded by resource constraints)
Change: Replace risk-based comparison in Phase 2 action selection:
- Always prefer highest difficulty (maximize precision gain per iteration)
- Break ties by lowest cost/impact (conserve resources when gain is equal)
```
# Before (risk composite):
if far_from_target: prefer lowest risk
if close_to_target: prefer highest risk
# After (EXP-7):
prefer highest difficulty, then lowest impact as tiebreaker
```
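
The "after" rule in the pseudocode above maps onto a single `max_by` call. A minimal sketch, assuming each viable action is a hash with numeric `:difficulty` (1-5) and `:impact` (1-3) keys; those key names and the method name are hypothetical.

```ruby
# Prefer the highest difficulty; among equals, prefer the lowest impact.
def select_precision_action(viable_actions)
  viable_actions.max_by { |action| [action[:difficulty], -action[:impact]] }
end

viable = [
  { name: 'a', difficulty: 3, impact: 1 },
  { name: 'b', difficulty: 5, impact: 3 },
  { name: 'c', difficulty: 5, impact: 2 }
]
puts select_precision_action(viable)[:name]
```

Negating the impact lets one array comparison express both the primary key and the tiebreaker, and an empty list returns `nil`, which the caller can treat as "refresh".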
Risk: Low-medium. Changes the core selection heuristic. Data strongly supports the change across all 1971 observed actions. EXP-6 difficulty fix must be in place (it is).
Depends on: EXP-6 (satisfied)
Status: Completed. Staged in v1.5.13 code (123 tests passing), run after the v1.5.12 analysis, and KEPT (neutral).

EXP-12: Loosen viability margin for challenging+ (v1.5.14, REVERTED)

Hypothesis: The viability filter is the primary bottleneck controlling the 44.5% refresh rate. Currently, Path 2 requires margin > 0 (stat > difficulty) for challenging+ actions. Loosening to margin >= 0 (stat >= difficulty) extends the productive phase by 1 resource point, allowing precision actions when resources are at the difficulty threshold instead of forcing IMPROVE. EXP-7's viability analysis showed ~85% of refreshes occur when the menu has precision actions that the filter rejects — this is the filter, not menu RNG.
Change: In precision_action_viable?, Path 2:
```
# Before:
return true if margin > 0 && difficulty > 2
# After:
return true if margin >= 0 && difficulty > 2
```
One-character change: > to >= in the margin comparison. Accepts margin=0 (stat == difficulty) for formidable, challenging, and difficult actions.
What this means practically:
- Formidable (difficulty=3): viable at stat >= 3 (was >= 4)
- Challenging (difficulty=4): viable at stat >= 4 (was >= 5)
- Difficult (difficulty=5): viable at stat >= 5 (was >= 6)
- Trivial/straightforward: unchanged (still require margin > 1, i.e., stat >= difficulty + 2)
Risk: Low-medium. Accepting margin=0 reduces the safety buffer.
Depends on: EXP-7 (satisfied)
Result (v1.5.14): Modest positive signals (refresh -0.9pp, gain +0.35/action, +3pp shift toward difficult actions) but a per-sigil mishap rate of 47.6%. Originally reverted because an overcounted baseline made the gains look only modest. Session-filtered re-analysis then showed the >=80 count had tripled (2→6), so the change was re-tested as v1.5.16 with corrected measurement infrastructure.
Status: Re-staged as v1.5.16 (125 tests passing). Identical code change to v1.5.14.
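
To make the stat thresholds listed above easy to check, here is a small sketch of the two viability paths. The method name `viable_sketch?` and the `loosened` flag are illustrative assumptions; the real `precision_action_viable?` may take different arguments.

```ruby
# Sketch of the viability rule described above (not the script's signature).
# margin = stat - difficulty
def viable_sketch?(stat, difficulty, loosened: false)
  margin = stat - difficulty
  return true if margin > 1            # Path 1: accepted at any difficulty
  return false unless difficulty > 2   # Path 2 only covers difficulty 3 and up

  loosened ? margin >= 0 : margin > 0  # EXP-12 changes > to >=
end

(3..5).each do |difficulty|
  before = (1..10).find { |stat| viable_sketch?(stat, difficulty) }
  after  = (1..10).find { |stat| viable_sketch?(stat, difficulty, loosened: true) }
  puts "difficulty=#{difficulty}: viable at stat >= #{after} (was >= #{before})"
end
```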

EXP-8: Tune repair eligibility window

Hypothesis: Repairs currently only trigger when @sigil_precision >= (precision - 15) (within 15 of target). After the difficulty fix, repairs target the resource consumed by difficult actions (reward 12-15) instead of formidable (reward 4-9). The window could be expanded to start repairs earlier (enabling more difficult actions sooner) or tightened to reserve iterations for direct precision work.
Change: Adjust the precision - 15 threshold in select_repair_action. Test values: precision - 20 (wider) or precision - 10 (narrower).
Risk: Low. Only affects when repairs are attempted, not core precision selection.
Depends on: EXP-6
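
A minimal sketch of the eligibility window, assuming the check is a plain comparison as described in the hypothesis. The method name and the `window` keyword are hypothetical; `select_repair_action` itself is not reproduced here.

```ruby
# Repairs are eligible once precision is within `window` of the target.
def repair_window_open?(sigil_precision, target, window: 15)
  sigil_precision >= target - window
end

[10, 15, 20].each do |window|
  puts "window=#{window}: repairs eligible from precision #{90 - window} (target 90)"
end
```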

EXP-9: Recalibrate resource exhaustion coefficient (v1.5.15) — Completed, KEPT (neutral)

Hypothesis: The resource exhaustion check uses (san + res + foc) * 2.25 + precision < target - 5. The 2.25 coefficient assumes each resource star is worth ~2.25 precision.
Calibration data (v1.5.10, 762 actions with resource consumption data):
- Actual overall gain/star: 1.60 (current coefficient 2.25 is at P90)
- By difficulty: trivial=1.16, straightforward=1.48, formidable=1.55, challenging=1.70, difficult=1.72
- By cost: taxing=1.61, disrupting=1.56, destroying=1.62 (no significant variation)
- Distribution: P25=1.17, P50=1.50, P75=2.00, P90=2.25
- Current 2.25 is extremely optimistic — only 10% of iterations achieve this rate
Change: Lower coefficient from 2.25 to 1.75. This is between P50 (1.50) and P75 (2.00), and aligns with the difficult-action gain/star of 1.72 (which dominates under EXP-7's difficulty-first selection). At 1.75, the check exits sigils where even median-to-good performance per remaining star can't reach the target. At 2.25, it only exited when P90+ performance couldn't reach target — far too optimistic.
Risk: Low. Only affects bail-out timing. More sigils exit earlier (redirecting time to fresh sigils), potentially lower avg precision but higher throughput to 80+.
Depends on: EXP-7 (satisfied) — coefficient calibrated to post-EXP-7 difficulty preference.
Sessions: 10 (Barrask, Byd, Fidon, Gnarta, Jazriel, Kythkani, Mahtra, Nelis, Refia, Throve)
Logs: ~/SH_logs/v1.5.15/

Layer 1 metrics (session-filtered):

Metric	v1.5.13 (baseline)	v1.5.15 (EXP-9)	Delta
Sessions	10	10	0
Worked	244	265	+21
Scribed	1	1	0
>=80	2	1	-1
>=80/session	0.2	0.1	-0.1
Avg precision	53.2	51.2	-2.0
Mishap rate	41.0%	43.0%	+2.0pp
Avg iterations	11.0	10.4	-0.6

Layer 2 metrics (raw log parsing):

Metric	v1.5.13 (baseline)	v1.5.15 (EXP-9)	Delta
Refresh rate	63.8%	63.1%	-0.7pp
Gain/action	7.13	7.09	-0.04
Mishap/iter	1.62%	2.09%	+0.47pp

EXP-9 specific metrics:

Metric	v1.5.13	v1.5.15	Delta
Resource exhaustion exits	1	26	+25
moves_exhausted	120	107	-13
Available stars median	109	92.5	-16.5

Primary mechanism confirmed: Resource exhaustion exits increased from 1→26 (+25). The stop-reason shift (moves_exhausted -13, resource_exhausted +25) shows the tighter coefficient is catching sigils that would have exhausted moves anyway.
Per-iteration metrics flat: Layer 2 confirms the coefficient change doesn't affect per-iteration behavior — refresh rate, gain/action, mishap/iter are all within noise. EXP-9 only changes when to give up, not how to play.
Outcome neutral: >=80 slightly down (2→1), avg precision -2.0, but these are within sample variance for 10 sessions. The +2.0pp mishap rate is noise (100→114 out of ~2700 iterations). No regression strong enough to justify revert.
Verdict: KEEP (neutral). Correct heuristic — aligns coefficient with observed gain/star distribution. No measurable benefit yet but no regression. The mechanism is sound (exits happening where predicted) and enables future experiments to build on a more realistic resource model.
Action: KEPT. v1.5.15 becomes new baseline.

EXP-10+11: Skip threshold + velocity bail-out (v1.5.11, bundled, TESTED and REVERTED)

Hypothesis: Sigils starting at precision 10-12 reach 80+ at only 0.45-0.61% rate vs 1.01-3.38% for starting precision 13+. These low-start sigils consume iterations (avg 9.7 per sigil) but almost never succeed. Skipping them frees those iterations for fresh sigils with higher expected value. Across 4095 worked sigils (v1.5.2-v1.5.9), raising the skip threshold from < 10 to < 13 yields an estimated net +15.6 additional 80+ sigils from redirected time.

Change: In improve_sigil (line 324), raise the skip threshold for target >= 80 from < 10 to < 13:

```
# Before:
if @args.precision.to_i >= 80 && @sigil_precision < 10
# After:
if @args.precision.to_i >= 80 && @sigil_precision < 13
```

Data basis: Starting precision distribution and outcomes (all versions pooled):
- Start 10: 1040 sigils, avg final 40.1, 0.58% reach 80+
- Start 11: 880 sigils, avg final 41.5, 0.45% reach 80+
- Start 12: 621 sigils, avg final 44.3, 0.61% reach 80+
- Start 13: 396 sigils, avg final 45.4, 1.01% reach 80+
- Start 14: 254 sigils, avg final 49.8, 1.57% reach 80+
- Breakpoint at 12→13 is consistent across all 8 versions individually.
Risk: Low. Only affects which sigils are attempted, not the algorithm itself. Lost 80+ sigils (those starting 10-12 that would have made it) are offset ~3:1 by new 80+ sigils from the saved iterations.
Depends on: EXP-6 (so difficulty fix is in place; the data holds regardless, but testing should be sequential)
Bundling rationale: EXP-10 and EXP-11 are bundled because they affect orthogonal code paths (start-of-sigil skip vs mid-run bail-out), neither changes the core algorithm, and both had zero false positives across 4097 sigils. Combined simulation: net +21.0.
EXP-11 component — Precision velocity bail-out: After 5 iterations, sigils with average gain per iteration < 4 have a 0.0% rate of reaching 80+ (0 out of 1399 across all v1.5.2-v1.5.9 data). These "slow grinder" sigils start above the skip threshold but never gain momentum. Code adds @start_precision tracking and a velocity check after the move budget check:
```
if @num_iterations >= 5 && @start_precision
  velocity = (@sigil_precision - @start_precision).to_f / @num_iterations
  if velocity < 4.0
    return false
  end
end
```
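
The block above runs on every iteration from 5 onward, whereas the status note below says the simulation checked once at iteration 5. The following sketch contrasts the two; `velocity_bail?` and its `mode` switch are hypothetical names, not part of the script.

```ruby
# Returns true when the sigil should be abandoned for low precision velocity.
# mode: :continuous mirrors the block above (every iteration >= min_iters);
# mode: :single checks only at iteration == min_iters.
def velocity_bail?(iteration, start_precision, current_precision,
                   min_iters: 5, min_velocity: 4.0, mode: :continuous)
  return false if start_precision.nil? || iteration < min_iters
  return false if mode == :single && iteration != min_iters

  velocity = (current_precision - start_precision).to_f / iteration
  velocity < min_velocity
end

# A sigil that passes at iteration 5 but slows down by iteration 8.
puts velocity_bail?(5, 13, 35)                 # 22 / 5 = 4.4
puts velocity_bail?(8, 13, 43)                 # 30 / 8 = 3.75
puts velocity_bail?(8, 13, 43, mode: :single)  # not evaluated after iteration 5
```

Under continuous checking the second call bails on a sigil the single-check rule would have kept.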
Combined data basis (4097 sigils, v1.5.2-v1.5.9):
- EXP-10 alone: skip 1990, lose 11 80+, save 19333 iters, net +15.6
- EXP-11 alone: bail 1399, lose 0 80+, save 6933 iters, net +9.5
- Combined: skip+bail 2784, lose 11 80+, save 23298 iters, net +21.0
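
The source does not state how the "net" figures were computed. One reading that reproduces all three rows is net = saved iterations × yield − lost 80+ sigils, with a yield of roughly 0.001375 80+ sigils per redirected iteration. That yield is back-solved from the rows above and should be treated as an assumption, not a documented parameter.

```ruby
# ASSUMPTION: yield of redirected iterations, back-solved from the rows above.
# The source does not state this rate.
ASSUMED_YIELD_PER_ITER = 0.001375

def net_80_plus(saved_iters, lost, yield_per_iter = ASSUMED_YIELD_PER_ITER)
  (saved_iters * yield_per_iter - lost).round(1)
end

puts net_80_plus(19_333, 11)  # EXP-10 alone
puts net_80_plus(6_933, 0)    # EXP-11 alone
puts net_80_plus(23_298, 11)  # combined
```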
Status: TESTED and REVERTED. Velocity bail-out (EXP-11) over-triggered at 74% of worked sigils due to continuous checking (every iter >= 5) vs simulation's single check at iter 5. EXP-10 (skip threshold) mechanically correct but untested in isolation. EXP-11 moved to killed ideas. See results above.

External Feedback: Game Mechanic Corrections (Feb 2026)

Feedback from experienced player (Urbaj) identified three incorrect assumptions in Matt's original script (our starting point). All three claims validated against our existing calibration data (762 actions from EXP-9, 566 worked sigils from v1.5.15+v1.5.17).

1. No hard iteration cap in the game. The game allows unlimited PERC SIGIL IMPROVE iterations as long as you have resources. The script's hard cap of 15 iterations (line 271) and move budget check (line 291, which uses 14 - iterations as remaining moves) are artificial limits. Validation: only 10/566 sigils (1.8%) reach iteration 14-15, but the move budget check stops 215/566 (38%) of worked sigils. 59 of those 215 had precision >= 60, and 22 had >= 70. These are sigils still climbing that get cut off by an assumption that doesn't match the game.

2. Action cost labels describe WHICH resource, not HOW MUCH. The labels "destroying"/"disrupting"/"taxing" are static descriptors for which resource is consumed: sanity→destroying, focus→disrupting, resolve→taxing. They do NOT predict how much resource is consumed. The current @action_cost = { "taxing" => 1, "disrupting" => 2, "destroying" => 3 } is wrong. Validation: gain/star by cost label is flat (taxing=1.61, disrupting=1.56, destroying=1.62). If destroying consumed 3x more than taxing, gain/star for destroying would be ~2.33 and taxing would be ~7.0.

3. Difficulty is the sole predictor for both gain AND resource consumption. Both precision gain and resource consumption are determined by difficulty, not cost label. Validation from EXP-7/EXP-9 calibration data:

Gain by difficulty: trivial=2.3, straightforward=4.5, formidable=6.8, challenging=9.6, difficult=13.3 (strong monotonic signal)
Gain by cost: taxing=7.0, disrupting=6.8, destroying=7.0 (no signal)
Gain/star by difficulty: 1.16→1.72 (higher difficulty = more efficient per star)
Gain/star by cost: 1.61, 1.56, 1.62 (no signal)

Higher difficulty actions are actually MORE resource-efficient (gain/star increases with difficulty), meaning EXP-7's preference for highest difficulty is even more correct than originally justified — it maximizes both precision gain and resource efficiency.
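
The argument in claim 2 can be checked with a few lines of arithmetic: if the `{1,2,3}` weights reflected stars consumed, gain per star would fall sharply from taxing to destroying. The sketch below uses only figures quoted above; the variable names are illustrative.

```ruby
gain_by_cost  = { 'taxing' => 7.0, 'disrupting' => 6.8, 'destroying' => 7.0 }
assumed_cost  = { 'taxing' => 1, 'disrupting' => 2, 'destroying' => 3 }
observed_rate = { 'taxing' => 1.61, 'disrupting' => 1.56, 'destroying' => 1.62 }

# Gain/star each label would show if the weights were real star costs.
predicted_rate = gain_by_cost.map { |label, gain| [label, (gain / assumed_cost[label]).round(2)] }.to_h

predicted_rate.each do |label, rate|
  puts "#{label}: predicted #{rate} gain/star, observed #{observed_rate[label]}"
end

spread = (observed_rate.values.max - observed_rate.values.min).round(2)
puts "observed spread across labels: #{spread}"
```

The predicted rates span 7.0 down to 2.33, while the observed rates differ by only 0.06.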

EXP-13: Remove iteration cap + move budget check

Hypothesis: The hard iteration cap of 15 and the move budget check (which assumes 14 max useful iterations) are artificial limits not imposed by the game. 38% of worked sigils are stopped by these iteration-based limits. Removing them and relying solely on the resource exhaustion check (EXP-9) allows sigils with remaining resources to continue gaining precision. The resource exhaustion check directly measures whether remaining resources can reach the target — it doesn't need an iteration count proxy.
Changes:
1. Remove hard cap at iteration 15 (line 271-274)
2. Remove move budget check (14 - @num_iterations) * 13 < ... (line 291-295)
3. Adjust scribe-near-cap logic (line 261): remove @num_iterations >= 15 condition, keep scribing at target or target-5 based on resource exhaustion proximity
What the resource exhaustion check already handles: (san + res + foc) * 1.75 + precision < target - 5 exits when remaining resources can't plausibly reach target. This is a direct measurement, not an iteration-count proxy.
Data basis: 181/566 (32%) stopped by move budget. Monte Carlo projection (5000 sims per sigil using observed gain/productive-rate/mishap-rate distributions):
- 37 of 181 (20%) have >50% probability of reaching 85 with more iterations
- 15 of 181 (8%) have >50% probability of reaching 90
- Sigils at 70+ with 10+ remaining stars have 55-72% chance of reaching 90
- No infinite loop risk: 76% of refreshes produce viable action next iteration; max observed refresh streak was 6. Resource drain provides natural termination.
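
A projection of this kind can be sketched as follows. This is not the original simulation: every parameter name and default here is an assumption, and the sketch deliberately uses a uniform per-iteration mishap probability, the simplification the post-mortem below identifies as a weakness.

```ruby
# Estimate the probability that a sigil reaches `target` before a mishap or
# running out of stars. All parameters are illustrative assumptions.
def reach_probability(precision:, stars:, target:, gains:, productive_rate:,
                      mishap_rate:, stars_per_action: 4, max_iters: 50,
                      sims: 5000, rng: Random.new(42))
  successes = sims.times.count do
    prec = precision
    left = stars
    reached = prec >= target
    max_iters.times do
      break if reached
      break if left < stars_per_action        # resources gone
      break if rng.rand < mishap_rate         # uniform mishap chance per iteration
      next unless rng.rand < productive_rate  # otherwise a refresh: no stars spent

      prec += gains.sample(random: rng)       # draw from observed gains
      left -= stars_per_action
      reached = prec >= target
    end
    reached
  end
  successes.to_f / sims
end

p reach_probability(precision: 70, stars: 12, target: 90, gains: [5, 7, 9, 13],
                    productive_rate: 0.6, mishap_rate: 0.05)
```

Replacing the constant `mishap_rate` with a function of iteration count or danger would be one way to test the compounding-risk explanation.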
Additional change: Raise skip threshold to <15 for target 90. Analysis shows start=13 (209 sigils, max=83, 0 scribes) and start=14 (162 sigils, max=78, 0 scribes) never reach 85. Only start=15 has scribe potential (1 scribe, 6 >=80 in 194 sigils). Saves 3,851 iterations with zero lost scribes.
Risk: Medium. Without iteration limits, sigils burn through more resources per run. Fewer sigils attempted per session (each takes longer). Net effect depends on whether extended sigils convert to 80+ at a higher rate than fresh sigils would. Resource exhaustion check provides the safety net. No loop risk — confirmed by refresh analysis.
Depends on: EXP-9 (satisfied — resource exhaustion check in place)
Expected impact: HIGHEST of any remaining experiment. 32% of sigils currently stopped may continue. Simulation projects 15-37 additional scribes per 566 worked. Combined with skip threshold (fewer wasted attempts), net throughput should increase.
Sessions: 11 (all complete 60min, 11 characters)
Logs: ~/SH_logs/v1.5.18/ (extracted via session_splitter.rb; 4 split sessions)

Metric	v1.5.17 (22 sess)	v1.5.18 (11 sess)	Delta
Worked	555	101	-454
Worked/session	25.2	9.2	-16.0
Skipped	2249	1428	-821
Scribed	3	0	-3
Scribes/session	0.14	0.0	-0.14
>=80	14	2	-12
>=80/session	0.64	0.18	-0.46
Avg precision	52.8	55.0	+2.2
Max precision	88	80	-8
Avg iterations	10.5	11.0	+0.5
Max iterations	15	18	+3
Iters > 15	0	1	+1
Mishap rate	42.0%	63.4%	+21.4pp

Stop reason	v1.5.17	v1.5.18	Delta
mishap	233 (42%)	64 (63%)	-169
moves_exhausted	209 (38%)	0 (0%)	-209
resource_exhausted	67 (12%)	17 (17%)	-50
scribed	3	0	-3
sigil_vanished	41	20	-21

Per-character breakdown (all 11 characters, 0 scribes universally):

Character	Sigils	Worked	Skipped	Mishap%
Barrask	142	8	134	62.5%
Byd	143	8	135	75.0%
Christus	141	8	133	50.0%
Fidon	139	10	129	70.0%
Gnarta	146	8	138	50.0%
Jazriel	143	7	136	57.1%
Kythkani	133	9	124	66.7%
Mahtra	137	9	128	55.6%
Nelis	133	11	122	63.6%
Refia	133	12	121	75.0%
Throve	139	11	128	63.6%

Note: ~92-93% skip rate across all characters (vs ~80% in v1.5.17 baseline). Mishap rate ranges 50-75% per character (vs 42% baseline), with no character showing improvement.

Analysis: Every key metric regressed. Two compounding problems:
1. Skip threshold <15 for target 90 — Eliminated 63% of worked sigils (25.2→9.2 per session). The Monte Carlo projection was correct that start <15 rarely reaches 90, but the throughput cost is devastating: far fewer sigils attempted means far fewer chances at any high-precision outcome.
2. Removed iteration cap — Sigils now run up to 18 iterations, but the extra iterations mostly produce mishaps. Mishap rate jumped 42%→63.4%. The simulation's projection of 15-37 additional scribes did not materialize — extended sigils hit mishaps before converting. Max precision actually dropped (88→80). The slight avg precision increase (+2.2) is a selection artifact: only high-starting sigils (15+) are worked, so the floor is higher. But this doesn't compensate for the catastrophic loss of throughput and high-precision outcomes.
Post-mortem: The Monte Carlo model overestimated scribe potential because it assumed uniform mishap probability per iteration. In practice, mishap risk likely compounds as danger accumulates over extended runs. This experiment bundled THREE changes (remove iteration cap, remove move budget, raise skip threshold to <15 for target 90), violating the one-change-per-version protocol. It is impossible to isolate which change caused which portion of the regression. Worse, EXP-14 was then tested on top of this broken base, making its results confounded as well (see EXP-14 confounding note).
Verdict: REVERT. Zero scribes, 63% mishap rate, 0.18 >=80/session. All changes from EXP-13 must be reverted. The individual components (skip threshold alone, cap removal alone) could be re-tested as separate experiments if desired.
Action: REVERT. v1.5.17 remains baseline. EXP-14 (v1.5.19, cost equalization) also REVERT — see EXP-14 results below.

EXP-14: Equalize action costs (v1.5.19) — Complete, REVERT

Hypothesis: The @action_cost mapping is wrong (Urbaj's claim 2). Confirmed with 100% correlation from 2,416 action iterations: destroying=sanity (771/772), disrupting=focus (843/844), taxing=resolve (800/800). Each consumes ~4.2 stars of exactly one resource. The labels are static resource descriptors, not cost predictors. Difficulty is the sole independent variable for both gain and resource consumption.
Change: Equalize @action_cost from { taxing: 1, disrupting: 2, destroying: 3 } to { taxing: 1, disrupting: 1, destroying: 1 }. This is the minimal, isolated change.
Downstream effects:
- impact field is now always 1 (no cost differentiation between actions)
- risk composite = difficulty + 1 (same ordering as difficulty alone)
- EXP-7 tie-breaking at equal difficulty becomes a no-op (first encountered wins)
- Repair selection (line 671) becomes purely difficulty-based
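
The tie-breaking effect can be illustrated with a small selection routine that uses a strict comparison, so the first action seen wins any tie. The hash layout and the method name are assumptions for the example, not the script's data structures.

```ruby
WEIGHTED_COSTS  = { 'taxing' => 1, 'disrupting' => 2, 'destroying' => 3 }.freeze
EQUALIZED_COSTS = { 'taxing' => 1, 'disrupting' => 1, 'destroying' => 1 }.freeze

# Highest difficulty first, then lowest cost. The strict comparison keeps the
# earlier action whenever both keys are equal.
def select_action(actions, cost_table)
  actions.reduce do |best, candidate|
    best_key      = [best[:difficulty], -cost_table.fetch(best[:cost])]
    candidate_key = [candidate[:difficulty], -cost_table.fetch(candidate[:cost])]
    (candidate_key <=> best_key) == 1 ? candidate : best
  end
end

menu = [
  { verb: 'first',  difficulty: 4, cost: 'destroying' },
  { verb: 'second', difficulty: 4, cost: 'taxing' }
]
puts select_action(menu, WEIGHTED_COSTS)[:verb]   # cost breaks the tie
puts select_action(menu, EQUALIZED_COSTS)[:verb]  # tie falls to menu order
```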
What was NOT changed (deferred to future experiments if warranted):
- The resource-aware tiebreaker concept (prefer action consuming most-available resource) was considered but deferred. One change per version. The viability filter typically leaves only 1 option per iteration anyway, making tiebreakers rarely exercised.
- The risk composite still computed as difficulty + cost but with equal costs it equals difficulty + 1, which preserves correct ordering without a code change.
Risk: Low. Viability filter typically leaves only 1 option per iteration.
Depends on: EXP-7 (satisfied)
Expected impact: LOW. Correct in principle but rarely exercised in practice.
Tests: 131 examples, 0 failures. Updated @action_cost setup, build_improvement defaults, 12 fixture impact/risk values, reworked tie-breaking test to verify first-encountered behavior when costs are equal.
Sessions: 11 (all complete 60min, 11 characters)
Logs: ~/SH_logs/v1.5.19/ (extracted via session_splitter.rb; 4 split sessions)
Note: v1.5.19 inherits ALL EXP-13 changes (removed iteration cap, resource-only bail-out, skip <15 for target 90). Since EXP-13 is REVERT, these results reflect both the catastrophic EXP-13 base AND the cost equalization. Compare vs v1.5.18 to isolate EXP-14's effect, and vs v1.5.17 for true baseline.

Cross-version comparison (11 log files each; v1.5.17 figures are per-file, see the note below the table):

Metric	v1.5.17 (baseline)	v1.5.18 (EXP-13)	v1.5.19 (EXP-14)	EXP-14 delta
Worked	555	101	103	+2
Worked/session	50.5	9.2	9.4	+0.2
Skipped	2249	1428	1429	+1
Real scribes	1	0	0	0
C1 fake scribes	2	0	1	+1
>=80	14	2	3	+1
>=80/session	1.27	0.18	0.27	+0.09
Avg precision	52.8	55.0	52.8	-2.2
Mishap rate	42.0%	63.4%	74.8%	+11.4pp

Note on v1.5.17 session count: The v1.5.17 directory has 11 files containing 22 sessions (2 batches merged). Per-file metrics show 50.5 worked/file but the per-session baseline used for Z-score calculations elsewhere uses 22 sessions (25.2 worked/session). This comparison uses per-file numbers for apples-to-apples vs the 11-session EXP-14 data.
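
The two normalizations in the note differ only in the divisor. A short sketch using the v1.5.17 totals from the tables (variable names are illustrative):

```ruby
worked = 555
ge80   = 14

per_file    = { worked: (worked / 11.0).round(1), ge80: (ge80 / 11.0).round(2) }
per_session = { worked: (worked / 22.0).round(1), ge80: (ge80 / 22.0).round(2) }

puts "per file (11):    #{per_file}"
puts "per session (22): #{per_session}"
```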

Stop reason	v1.5.17	v1.5.18	v1.5.19	EXP-14 delta
mishap	233 (42%)	64 (63%)	77 (75%)	+13
resource_exhausted	67 (12%)	17 (17%)	6 (6%)	-11
sigil_vanished	41 (7%)	20 (20%)	19 (18%)	-1
scribed	3	0	1 (C1 fake)	+1
moves_exhausted	209 (38%)	0	0	0

The C1 fake scribe: Mahtra Sigil#127, precision 87, scribe_count=nil. Reached 87 but did not actually scribe (mishap or vanish), classified as SCRIBED by C1 bug.

>=80 sigils detail (all 3 ended in failure):

Character	Sigil#	Start	Final	Iters	Stop
Barrask	#97	15	84	18	mishap
Gnarta	#103	15	84	12	sigil_vanished
Mahtra	#127	15	87	13	C1 fake "scribed"

All worked sigils start at exactly precision 15 (the skip <15 threshold from EXP-13).

Analysis: Cost equalization appeared to make things WORSE, not neutral:
1. Mishap rate 74.8% — highest of any version. The cost equalization removed the penalty for "destroying" actions (formerly cost=3). With all costs = 1, the algorithm no longer discriminates against resource-intensive actions, allowing more aggressive action selection. The result: more mishaps without compensating gains.
2. Resource exhaustion dropped (17→6) — fewer sigils run out of resources because they mishap before reaching resource exhaustion. This is not an improvement.
3. No precision improvement: avg precision 52.8 (= v1.5.17 baseline), >=80 count 3 vs v1.5.18's 2 (noise range, all failed anyway).
4. The hypothesis was correct but the effect is harmful: Urbaj's observation that cost labels are resource descriptors (not cost predictors) is confirmed. But the old cost weighting {1,2,3} provided an accidental benefit — it penalized "destroying" actions, which happen to consume sanity. This implicit conservation was better than no conservation.
Post-mortem: The expected "LOW impact" assessment was wrong. While the viability filter usually leaves 1 option, equalizing costs affects the RISK composite (difficulty + cost) used in action selection. With costs equalized, RISK = difficulty + 1 for all actions, making the algorithm select purely by difficulty. In cases where multiple actions have the same difficulty, the tiebreaker changes. More importantly, the repair selection logic (line 671) becomes purely difficulty-based, potentially accepting riskier repair attempts.
Verdict: REVERT (but see confounding note below).

⚠ CONFOUNDING NOTE (Feb 2026 retrospective)

EXP-14 was tested on EXP-13's broken code base (removed iteration cap, removed move budget, skip <15 for target 90). It was never tested against the real v1.5.17 baseline. This means the +11.4pp mishap increase attributed to cost equalization is confounded with EXP-13's already-catastrophic 63.4% mishap rate. The conclusion that cost equalization is "harmful" is not supported by clean data.

Why this matters:

EXP-13 removed the iteration cap, so sigils ran 15-18 iterations where mishap probability compounds. Cost equalization on extended runs has a different effect than cost equalization on capped runs.

The "accidental benefit" theory (point 4 above, implicit conservation of sanity) is unvalidated post-hoc reasoning. The viability filter "typically leaves only 1 option per iteration", so a rarely exercised tiebreaker cannot plausibly cause a +11.4pp mishap increase on baseline code where runs are capped at 15 iterations.

Urbaj's data is validated by ours (100% correlation, 2,416 actions). The {1,2,3} mapping is provably wrong. It deserves a standalone test against the confirmed v1.5.21 baseline.

Action: Schedule v1.5.24 as a clean standalone cost equalization test ({1,2,3} → {1,1,1}) against v1.5.21 baseline. See queued experiments.

v1.5.20 Baseline Restore — Complete (partial)

Changes: Reverted EXP-13+14 to v1.5.17 algorithm. C1 fix (@actually_scribed flag). D7 analyzer fix (repair_count increment). Version 1.5.20.
Sessions: 11 (all complete 60min, 11 characters)
Logs: ~/SH_logs/v1.5.20/

v1.5.20 vs v1.5.17 Comparison (per-session rates where applicable; v1.5.17 pooled over 22 sessions, v1.5.20 over 11):

Metric	v1.5.17 (22 sess)	v1.5.20 (11 sess)	Delta
Worked/session	25.2	23.5	-1.8
Skip rate	80.2%	79.8%	-0.4pp
Real scribes	1	1	0
>=80/session	0.64	0.27	-0.36
Avg precision	52.8	53.5	+0.7
Max precision	88	88	0
Avg iterations	10.5	10.8	+0.3
Mishap rate	42.0%	46.9%	+4.9pp

Stop Reasons (key finding — resource_exhausted = 0%):

Reason	v1.5.17	v1.5.20	Delta
mishap	42.0%	46.9%	+4.9pp
moves_exhausted	37.7%	46.1%	+8.4pp
resource_exhausted	12.1%	0.0%	-12.1pp
sigil_vanished	7.4%	6.6%	-0.8pp

Bug discovered: EXP-9 resource exhaustion check ((san+res+foc)*1.75 + prec < target-5) was accidentally omitted during the EXP-13→v1.5.20 revert. The check lives in sigil_info (after resource parsing), not in the Phase 3 bail-out block that was restored. The 12.1% of sigils that should exit via resource_exhausted instead continued to moves_exhausted (+8.4pp) or mishap (+4.9pp).

Bug fix validations:

C1: 1 SCRIBED result, 1 real (scribe_count=2). Zero fakes. VALIDATED.
D7: Code correct, but 0 repairs observed in this batch (repairs are rare).

>=80 detail (3 sigils):

Barrask Sigil#67: start=14, final=84, iters=12, stop=mishap
Mahtra Sigil#33: start=14, final=88, iters=15, stop=scribed (2 scrolls) — REAL scribe
Mahtra Sigil#62: start=15, final=81, iters=12, stop=mishap

Verdict: Partial baseline. Core algorithm matches v1.5.17 but missing resource check inflates mishap rate by ~5pp and moves_exhausted by ~8pp. Fixed in v1.5.21.

v1.5.21 Corrected Baseline — Complete, BASELINE CONFIRMED

Changes: Restores EXP-9 resource exhaustion check. No other changes. Algorithm identical to v1.5.17 + C1 fix + D7 analyzer fix.
Sessions: 11 (all complete 60min, 11 characters)
Logs: ~/SH_logs/v1.5.21/

v1.5.21 vs v1.5.17 vs v1.5.20 Comparison:

Metric	v1.5.17 (22 sess)	v1.5.20 (11 sess)	v1.5.21 (11 sess)	21 vs 17
Worked/session	25.2	23.5	23.8	-1.4
Skip rate	80.2%	79.8%	79.6%	-0.6pp
Real scribes	1	1	0	-1
>=80	14	3	7
>=80/session	0.64	0.27	0.64	0.0
Avg precision	52.8	53.5	51.6	-1.2
Max precision	88	88	87	-1
Avg iterations	10.5	10.8	10.2	-0.3
Mishap rate	42.0%	46.9%	40.8%	-1.2pp

Stop Reasons (resource_exhausted restored):

Reason	v1.5.17	v1.5.20	v1.5.21	21 vs 17
mishap	42.0%	46.9%	40.8%	-1.2pp
moves_exhausted	37.7%	46.1%	39.7%	+2.0pp
resource_exhausted	12.1%	0.0%	13.4%	+1.3pp
sigil_vanished	7.4%	6.6%	6.1%	-1.3pp

>=80 detail (7 sigils — 4 budget-stopped, 3 mishapped):

Character	Sigil	Start	Final	Iters	Danger	Stop
Barrask	#60	13	83	14	18	moves_exhausted
Fidon	#23	13	84	14	18	moves_exhausted
Gnarta	#112	14	80	14	18	moves_exhausted
Mahtra	#102	15	82	14	18	moves_exhausted
Jazriel	#37	14	87	12	17	mishap
Kythkani	#12	15	84	10	7	mishap
Throve	#27	15	83	10	7	mishap

Key observation: 4 of 7 >=80 sigils were stopped by the move budget at iteration 14 with 1 iteration remaining before the cap. At precision 80-84 (gap of 1-5 to scribe threshold of 85), a single additional iteration at avg 7.3 gain would likely scribe them. This is the strongest signal yet for the "loosen move budget" experiment.
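
The gap arithmetic for the four budget-stopped sigils can be checked directly from the table. This is only a comparison against the quoted average gain; it says nothing about mishap risk on the extra iteration.

```ruby
# Final precision of the four sigils stopped by the move budget (from the table).
budget_stopped = {
  'Barrask #60' => 83, 'Fidon #23' => 84, 'Gnarta #112' => 80, 'Mahtra #102' => 82
}
scribe_threshold = 85
avg_gain = 7.3

gaps = budget_stopped.transform_values { |final| scribe_threshold - final }
gaps.each do |sigil, gap|
  puts "#{sigil}: gap #{gap}, covered by one average iteration: #{gap <= avg_gain}"
end
```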

Verdict: BASELINE CONFIRMED. All metrics match v1.5.17 within normal variance. v1.5.21 is the corrected baseline with working C1/D7/EXP-9 instrumentation.

Assumption Audit: Comprehensive Data Analysis (Feb 2026)

Systematic audit of all script assumptions using 4,608 iterations from 566 worked sigils (v1.5.15 + v1.5.17 combined). Script: scratchpad/assumption_audit.rb.

Finding 1: Cost label → resource mapping is 100% confirmed

Cost Label	Consumes	Hit Rate	Avg Stars Consumed
destroying	sanity	771/772 (100%)	-4.32
disrupting	focus	843/844 (100%)	-4.23
taxing	resolve	800/800 (100%)	-4.15

Each action consumes ~4.2 stars of exactly one resource. The @action_cost mapping {taxing:1, disrupting:2, destroying:3} is provably wrong. Urbaj's claim confirmed with perfect correlation. Refreshes consume zero resources but increase danger by ~1.0.
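
The hit rates and the ~4.2 average can be recomputed from the table rows. The hash layout below is an assumption for illustration; the counts are the ones shown above.

```ruby
observations = {
  'destroying' => { resource: 'sanity',  hits: 771, total: 772, avg_stars: 4.32 },
  'disrupting' => { resource: 'focus',   hits: 843, total: 844, avg_stars: 4.23 },
  'taxing'     => { resource: 'resolve', hits: 800, total: 800, avg_stars: 4.15 }
}

total_actions = observations.values.sum { |o| o[:total] }
total_hits    = observations.values.sum { |o| o[:hits] }
overall_rate  = (100.0 * total_hits / total_actions).round(2)
mean_stars    = (observations.values.sum { |o| o[:avg_stars] } / observations.size).round(2)

puts "#{total_hits}/#{total_actions} actions match the mapping (#{overall_rate}%)"
puts "mean stars consumed per action: #{mean_stars}"
```

Strictly, two of the 2,416 actions did not match, so the table's 100% figures are rounded.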

Finding 2: Clarity is a degrading hidden variable

Clarity decreases during nearly every sigil (474/485, 97.7%) and never increases
Per-iteration: 38.3% of iterations decrease clarity, 0% increase it, mean -1.17
Refreshes degrade clarity 25x faster than actions (-2.52 vs -0.1 per iteration)
Weak positive correlation: clarity 70-79 → 6.7 avg gain; clarity 90-99 → 7.3 avg gain
Starting clarity: range 88-99, mean 96.3 (no signal from starting clarity binning)
Implication: Refreshes have a hidden cost — they degrade clarity much faster than actions. This reinforces the value of minimizing refreshes (already our strategy). Unclear if clarity directly affects game outcomes or is cosmetic.

Finding 3: Resource level directly affects gain per iteration

Total Stars	N	Avg Gain	Median	Zero%
0-5	3	5.0	4	33.3%
11-15	17	4.8	5	5.9%
16-20	92	6.9	6	0.0%
21-30	626	6.5	5	0.0%
31+	1831	7.5	7	0.1%

Per-resource: consuming a resource at level 3-5 gives ~3-5 gain; at level 10+ gives ~7-8 gain. Danger shows no signal (gain flat across 0-17). Resource depletion doesn't just limit iterations — it reduces per-iteration effectiveness. The resource exhaustion coefficient (1.75) may understate the impact because it doesn't account for diminishing returns at low resource levels.

Finding 4: Skip threshold should be 15 for target 90 (INVALIDATED — see SIM-7 below)

Start Precision	Count	Avg Final	Max	>=80	>=85	Scribed
13	209	50.9	83	1	0	0
14	162	51.3	78	0	0	0
15	194	53.5	85	6	1	1

All 566 worked sigils start at 13-15 (current threshold skips <13). Start=13 and start=14 never reach 85 in 371 attempts. Raising threshold to <15 saves 3,851 iterations (371 sigils × ~10.4 avg iters) with zero lost scribes. Only start=15 has any scribe potential. This should be part of EXP-13 or a standalone micro-experiment.

CORRECTION (Feb 2026): This analysis used combined v1.5.15+v1.5.17 data (566 worked). The v1.5.15 data was collected WITHOUT Awakened technique. SIM-7 (below), using v1.5.17 data only (555 worked, WITH Awakened), shows ALL 3 scribes started at precision 13. Awakened provides enough of a boost that start=13 CAN reach 90. The pre-Awakened data diluted this effect in the combined dataset. Skip <15 is DEAD for post-Awakened testing. The current skip <13 threshold is correct.

Finding 5: Iteration cap prevents 15-37 potential scribes per 566 worked

181 sigils (32% of worked) were stopped by the move budget check. Monte Carlo projection (5,000 simulations each, using observed gain distribution):

37 of 181 (20%) have >50% probability of reaching 85 with more iterations
15 of 181 (8%) have >50% probability of reaching 90
Several sigils at 70+ with 10-15 remaining stars have 55-72% chance of reaching 90
Confirms EXP-13 (remove cap) as highest-impact change

Finding 6: No infinite loop risk without cap

76% of refreshes produce a viable action on the next iteration
Refresh streaks: mean 1.2, max 6. Only 3.9% are 3+ consecutive, 0.2% are 5+.
Resource drain provides natural termination: refreshes cost 0 resources but +1 danger, and resource exhaustion check catches depleted sigils
Safe to remove iteration cap with resource-based exit as primary guard
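
Refresh streak statistics like those above can be derived from a per-sigil sequence of step types. The sketch assumes each iteration has already been labeled `:action` or `:refresh`; that labeling and the method name are hypothetical, not analyzer output.

```ruby
# Lengths of consecutive :refresh runs in an iteration sequence.
def refresh_streaks(steps)
  steps.chunk_while { |a, b| a == b }
       .select { |run| run.first == :refresh }
       .map(&:size)
end

steps = %i[action refresh refresh action refresh action refresh refresh refresh]
streaks = refresh_streaks(steps)

puts "streaks: #{streaks.inspect}"
puts "mean: #{streaks.sum.to_f / streaks.size}, max: #{streaks.max}"
puts "share with 3+ refreshes: #{(100.0 * streaks.count { |s| s >= 3 } / streaks.size).round(1)}%"
```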

Finding 7: <= 80 guard — minor impact at current volumes

6 sigils reached 80-84 but couldn't scribe (dead ends). However, all 6 had viable resource projections when crossing 80 — they died to mishaps (4/6) or iteration cap. The resource check with <= 80 guard wouldn't have caught any of them earlier. 11 wasted iterations above 80 with zero gain. Low priority fix — mishaps, not the guard, are the primary cause of dead ends at 80+.

Queue updates from audit (updated post-EXP-13 results):

Priority	Change	Experiment	Status
1	Remove iteration cap + move budget	EXP-13	REVERTED — mishap rate 63.4%, 0 scribes
2	Raise skip threshold to <15 for target 90	Was in EXP-13	REVERTED as bundle — data valid but must be isolated
3	Equalize costs	EXP-14	REVERTED — mishap rate 74.8%, 0 real scribes
4	Resource-aware action selection (prefer full resources)	Future	Untested
5	Monitor clarity degradation	Observational	Finding 2 above — refreshes degrade 25x faster

Code-Level Assumption Audit (v1.5.19, Feb 2026)

Systematic line-by-line review of every hardcoded value, threshold, and decision point in the algorithm. Each entry identifies the assumption, its test status, and whether it can be isolated for experimentation.

A. Magic Numbers & Thresholds

ID	Line	Value	Assumption	Status
A1	264	`1.75`	Resource stars → precision conversion coefficient	EXP-9 (kept). Sub-assumption: all 3 resources are fungible — untested
A2	162,270,283	`target - 5`	Minimum useful scribe precision / bail-out margin	Game mechanic? Untested whether -3 or -7 is better
A3	323	`< 15`	Skip threshold for target 90	Data-confirmed (0 scribes from start<15). Part of EXP-13 revert — retest standalone
A4	328	`< 13`	Skip threshold for target 80	EXP-10 (kept)
A5	290	`< 2`	Max 2 aspect repairs per sigil	Original design. Never tested.
A6	669	`precision - 15`	Only repair when within 15 of target	Never tested.
A7	666	`<= 3`	Only trivial/straightforward/formidable repairs	Never tested.
A8	668	`>= 2`	Repair margin requirement (stricter than precision's > 0)	Never tested.
A9	291	`<= 18`	Don't repair when danger > 18	Near max, rarely reached
A10	379	`>= 14`	Trader luck threshold (guild-specific)	Never tested. Hard to isolate
A11	46	`1,2,3,4,5`	Difficulty ordinal values	EXP-6 (confirmed)
A12	43	`1,1,1`	Cost labels equalized	EXP-14 (reverted). Data-confirmed but harmful — old {1,2,3} provided useful implicit diversification

B. Algorithm Decision Logic

ID	Line	Logic	Assumption	Status
B1	648-661	`margin > 1` any, `margin > 0` for challenging+	Viability cutoffs	EXP-12 (margin>=0 reverted twice, +6.6pp mishap)
B2	233	Prefer highest difficulty	Max difficulty → max gain	EXP-7 (kept, confirmed)
B3	226	Skip ACTION verb	ACTION has 24.8% zero-gain rate	EXP-6 (confirmed)
B4	200-213	Repair when `stat - difficulty < 2 AND difficulty >= 3`	Pre-scan threshold for repair candidates	Untested — the difficulty >= 3 filter means we never repair for trivial/straightforward
B5	301-303	Refresh when no action available	Only alternative is quitting the sigil	Game mechanic — but refreshes have hidden clarity cost (Finding 2)

C. Known Bug

ID	Line	Bug	Impact
C1	162	SCRIBED misclassification	ACTIVE — 50% of all "SCRIBED" results across all versions are fakes. After loop exit (including mishaps), any sigil with precision >= target-5 is classified as SCRIBED even if no scribing occurred. Confirmed from raw logs: 10 of 20 "SCRIBED" results have zero "You carefully scribe" game messages. Additionally, the analyzer (line 566) inherits the bug by assigning `stop_reason=:scribed` based on the script's result field. CRITICAL FIX: (1) Script: track `@actually_scribed` flag, (2) Analyzer: use `scribe_count > 0` not result field.

D. Untested Game Mechanics

ID	Question	Current Position	Measurable From Logs?
D1	Does danger affect mishap probability?	Data says no (uniform 0-18)	Yes — measured in Finding 3
D2	Does clarity affect precision gain?	Weak signal (+0.6 from 70-79 to 90-99)	Yes — measured in Finding 2
D3	Resource consumption per difficulty	~4.2 stars per action (Finding 1)	Partially — need delta analysis per difficulty level
D4	Do refreshes cost resources?	Finding 1 says 0 resources, +1 danger	Yes — measured
D5	Is there a game iteration soft cap?	No evidence (max seen: 18)	Observational only
D6	Does repair restore the target resource?	Game text implies yes	Measurable from resource snapshots
D7	`repair_count` tracking	Initialized but never incremented in analyzer	Code gap — parser never counts repairs

E. Isolatable Experiments by Priority (post-EXP-13 revert)

Original priority list — superseded by simulation results below.

Priority	What to test	Which assumption	How to isolate
1	~~Skip threshold <15 standalone~~	A3	KILLED by SIM-7 — all scribes start at 13
2	~~Repair count limit (remove cap of 2)~~	A5	BLOCKED — SIM-4 inconclusive, need D7 fix first
3	~~Repair proximity (15 → 20 or remove)~~	A6	KILLED by SIM-8 — no sigils exhaust near target
4	Resource-specific projection (not sum-all)	A1 sub	SIM-3 validated fungibility — deprioritized
5	Fix SCRIBED misclassification	C1	Add `@actually_scribed` flag — promoted to Phase 0
6	Resource consumption per difficulty level	D3	SIM-2 validated flat rate — resolved, no experiment
NEW	Resource bail-out threshold	A1	SIM-3 finding: 38.5 precision wasted per sigil

See "Simulation-Based Testing Chronology" below for the updated experiment sequence.

Simulation-Based Hypothesis Validation (Feb 2026)

Eight simulations were run against v1.5.17 data (22 sessions, 555 worked sigils, 3 scribed) to classify each hypothesis from the Code-Level Assumption Audit as one of:

Validated by logs — answered from existing data, no live experiment needed
Killed by simulation — simulation shows the hypothesis has no impact
Informs experiment — simulation guides how to design a live experiment
Inconclusive — data gap prevents reliable simulation

Script: scratchpad/assumption_simulations.rb

SIM-1: SCRIBED Misclassification Bug (C1) — CORRECTED: BUG IS ACTIVE

Metric	SIM-1 result	Corrected (raw log audit)
Total SCRIBED results	3	3
True scribes (scribe_count > 0)	3	1 (Fidon, 4 scrolls)
Misclassified (scribe_count = 0)	0	2 (Byd, Refia)

CORRECTION: SIM-1 reported 0 misclassifications because the analyzer's determine_stop_reason (line 566) assigns :scribed for ANY result=SCRIBED, inheriting the script's buggy classification. The simulation checked stop_reason != :scribed — which can never detect C1 because the analyzer trusts the script's result field.

A raw log audit using scribe_count (from actual "You carefully scribe" messages) reveals 2 of 3 v1.5.17 "SCRIBED" results are C1 fakes:

Byd #96: precision 85, "Sigil harvesting failed" (mishap), 0 scrolls produced

Refia #84: precision 88, "all traces of the sigil have vanished", 0 scrolls

Fidon #37: precision 86, "Final precision: 86, scribing", 4 scrolls (REAL)

Across ALL versions: 20 result=SCRIBED, 10 real, 10 C1 fakes. 50% misclassification rate. The bug is NOT dormant — it actively inflates scribe counts.

Classification: CRITICAL bug fix — actively corrupting data. Phase 0 priority.
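The analyzer-side half of the fix (count real scribe messages instead of trusting the result field) can be sketched as below. This is a minimal sketch, assuming each sigil's raw log lines are available as an array of strings. `scribe_count`, `classify_result`, and the `:c1_fake` / `:not_scribed` symbols are hypothetical names for illustration, not the analyzer's actual API. The only string taken from the logs is "You carefully scribe".

```ruby
# Hypothetical helper: decide whether a sigil was really scribed by counting
# the game's own confirmation messages rather than trusting the script's
# result field.
SCRIBE_MESSAGE = 'You carefully scribe'

def scribe_count(log_lines)
  log_lines.count { |line| line.include?(SCRIBE_MESSAGE) }
end

def classify_result(script_result, log_lines)
  return :scribed if scribe_count(log_lines).positive?
  # The script reported SCRIBED but the game never confirmed it: a C1 fake.
  return :c1_fake if script_result == 'SCRIBED'

  :not_scribed
end
```

With this shape, a sigil such as Byd #96 (result SCRIBED, zero scribe messages) would land in the fake bucket instead of inflating the scribe count.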

SIM-2: Resource Consumption Per Difficulty (D3)

Iteration Range	N	Avg cost/iter	Total avg consumed
1-5	22	2.99	12.9
6-10	184	2.26	19.3
11-14	347	2.14	25.5
15	2	2.67	40.0

Resource consumption rate is roughly flat at ~2.1-3.0 stars/iter regardless of how deep into a sigil we are. Higher initial rate (1-5 iters) likely reflects higher starting resources enabling more expensive actions early. The per-action cost of ~4.2 stars (Finding 1) is consistent.

Classification: Validated by logs — no experiment needed.

SIM-3: Resource Fungibility (A1)

Resource	Avg at exit	Median	Zero%
sanity	7.4	7	1.1%
resolve	7.2	7	2.0%
focus	7.4	7	1.1%

Imbalanced exits (one resource=0, another>=3): 22 (4.0%)
Balanced exits: 533
Avg remaining stars at exit: 22.0
22.0 × 1.75 = 38.5 projected precision wasted per sigil

Resources deplete EVENLY, confirming the sum-all projection is valid. But the MAJOR finding is that sigils exit with 22 stars remaining on average. This means the bail-out formula (resource projection coefficient 1.75, line 264) is TOO AGGRESSIVE — it triggers the resource exhaustion exit while significant resources remain, leaving 38.5 projected precision on the table per sigil.

Classification: Validated (fungible) + MAJOR FINDING — bail-out aggressiveness is the highest-impact tuning target. See Testing Chronology Phase 2.
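The two SIM-3 measurements, the imbalanced-exit test and the wasted-precision projection, reduce to a few lines of arithmetic. The sketch below assumes resource levels are plain numbers of stars. The method names are hypothetical and are not taken from scratchpad/assumption_simulations.rb.

```ruby
COEFFICIENT = 1.75 # resource stars to precision, as used by the bail-out check

# True when one resource is fully drained while another still has 3+ stars.
def imbalanced_exit?(sanity, resolve, focus)
  levels = [sanity, resolve, focus]
  levels.any?(&:zero?) && levels.any? { |lvl| lvl >= 3 }
end

# Precision that the projection formula says the remaining stars could buy.
def projected_precision_left(sanity, resolve, focus, coefficient: COEFFICIENT)
  (sanity + resolve + focus) * coefficient
end

# Average exit levels from the table above: 7.4 + 7.2 + 7.4 = 22.0 stars.
puts projected_precision_left(7.4, 7.2, 7.4).round(1)
```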

SIM-4: Repair Cap (A5)

The repair_count field is never populated in the analyzer (code gap D7). The simulation estimated repairs by counting action menu items where aspect = resource name. This methodology is FLAWED: it counts all menu items offered, not algorithm-selected repairs. Results (0-38 "repairs" per sigil, 99.8% "at cap") are misleading.

Classification: Inconclusive — must fix D7 first, collect 10+ sessions with real repair_count tracking, then re-simulate.

SIM-5: Fate of High-Precision Sigils (A2/C1)

Sigil	Start	Peak	Final	Result	Iters	Danger
#96	13	85	85	SCRIBED	12	17
#37	13	86	86	SCRIBED	14	18
#84	13	88	88	SCRIBED	13	18

3 sigils ever reached precision 85+ in 555 worked. The analyzer reports all 3 as "scribed" but raw log audit (SIM-1 correction) reveals only 1 actually scribed:

Sigil	Start	Peak	Final	Real?	Actual outcome
#96 (Byd)	13	85	85	FAKE	Mishap at 85 ("Sigil harvesting failed")
#37 (Fidon)	13	86	86	REAL	Scribed, 4 scrolls produced
#84 (Refia)	13	88	88	FAKE	Sigil vanished at 88

2 of 3 sigils that reached 85+ were LOST to mishap/vanish. Only 1 successfully scribed. The C1 bug window is NOT narrow — it's hitting 67% of 85+ sigils in our data.

All 3 started at precision 13 with 12-14 iterations and danger 17-18 at peak.

Classification: Partially validated, C1 impact severe — reaching 85+ does not guarantee scribing. The high danger (17-18) at 85+ means significant mishap risk remains.

SIM-6: Refresh Cost (D4)

Metric	With refreshes (N=339)	Without refreshes (N=216)
Avg refreshes per sigil	1.6	0
Resource cost/iter	2.08 stars	2.44 stars
Danger/iter	1.08	0.77

Refreshes consume 0 resources (lower per-iter cost because refresh iterations don't drain resources) but add danger (+0.31 danger/iter compared to non-refresh sigils). This confirms Finding 2 (refreshes have hidden cost through clarity/danger accumulation). No algorithmic change indicated — we already minimize refreshes.

Classification: Validated — no experiment needed.

SIM-7: Skip Threshold Sensitivity (A3) — CRITICAL FINDING

Threshold	Work	Skip	>=80	Scribes	Lost >=80	Lost scribes	Iters saved
Skip <12	555	0	14	3	0	0	0
Skip <13	555	0	14	3	0	0	0
Skip <14	342	213	8	0	6	3	2259
Skip <15	183	372	7	0	7	3	3899
Skip <16	1	554	0	0	14	3	5818

ALL 3 scribes started at precision 13. Any skip threshold stricter than <13 (that is, <14 or higher) eliminates ALL scribes from the v1.5.17 dataset. Skip <14 also loses 6 of the 14 >=80 sigils. Skip <15 (the threshold from EXP-13) loses 7 of the 14 >=80 sigils AND all 3 scribes.

This INVALIDATES Finding 4 for post-Awakened testing. The original analysis used combined v1.5.15+v1.5.17 data where the pre-Awakened v1.5.15 data showed 0 scribes from start=13. With Awakened active, the precision boost is sufficient for start=13 sigils to reach 90. The current skip <13 threshold is correct and MUST NOT be raised.

Classification: KILLED — skip <15 hypothesis is dead. Current <13 is optimal.
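A threshold sweep of the kind tabulated above can be sketched as follows. This is an illustration only: the `Sigil` struct and `sweep_skip_threshold` are hypothetical names, and the four sample records are made up to exercise the code, not taken from the v1.5.17 logs.

```ruby
# Hypothetical record: starting precision, peak precision, and whether the
# sigil was really scribed.
Sigil = Struct.new(:start, :peak, :scribed)

# For a "skip if start < threshold" rule, report what would have been worked
# and what would have been lost.
def sweep_skip_threshold(sigils, threshold)
  worked, skipped = sigils.partition { |s| s.start >= threshold }
  {
    threshold: threshold,
    worked: worked.size,
    skipped: skipped.size,
    lost_80_plus: skipped.count { |s| s.peak >= 80 },
    lost_scribes: skipped.count(&:scribed)
  }
end

# Made-up sample data for illustration.
sample = [
  Sigil.new(13, 86, true),
  Sigil.new(13, 62, false),
  Sigil.new(14, 81, false),
  Sigil.new(16, 55, false)
]
(12..16).each { |t| p sweep_skip_threshold(sample, t) }
```

Running the sweep before changing a threshold makes the "lost scribes" column visible up front, which is the check that would have flagged skip <15.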

SIM-8: Repair Proximity Threshold (A6)

Distance from target	Count
0-5 (near target)	0
6-15 (in repair range)	0
16-30 (outside repair range)	2
31+ (far from target)	65

All 67 resource-exhausted sigils were 16+ precision from target. Zero were in the 6-15 range where the repair proximity threshold operates. The threshold is irrelevant because resource exhaustion only hits sigils far from target — sigils near target have been efficiently progressing and don't exhaust resources.

Classification: KILLED — widening/removing threshold has zero impact.

Hypothesis Classification Summary

ID	Hypothesis	SIM	Classification	Action
C1	SCRIBED misclassification bug	SIM-1, SIM-5	ACTIVE — 50% misclass rate	CRITICAL bug fix (Phase 0)
D3	Resource consumption varies by difficulty	SIM-2	Flat ~2.1-3.0 stars/iter	Validated — no experiment
A1	Resources are fungible (sum-all valid)	SIM-3	Even depletion (4% imbalanced)	Validated — no experiment
A1-sub	Bail-out threshold too aggressive	SIM-3	38.5 precision wasted/sigil	Experiment (Phase 2)
A5	Repair cap of 2 is binding	SIM-4	Estimation method flawed	Inconclusive — fix D7 first
A2	target-5 scribe margin	SIM-5	85+ does NOT guarantee scribe (1/3 real)	Needs investigation
D4	Refreshes cost resources	SIM-6	0 resources, +0.31 danger/iter	Validated — no experiment
A3	Skip <15 for target 90	SIM-7	All 3 scribes start at 13	KILLED — stay at <13
A6	Repair proximity threshold matters	SIM-8	0 sigils exhaust near target	KILLED — no experiment

Score: 5 validated by logs, 2 killed by simulation, 1 informs experiment, 1 inconclusive.

Simulation-Based Testing Chronology (Feb 2026)

Ordered experiment sequence informed by simulation results. Each phase depends on the previous phase being complete.

Phase 0: Infrastructure & Bug Fixes — DONE (v1.5.20 + v1.5.21)

Item	What	Status
Revert EXP-13	Restore v1.5.17 algorithm as baseline	Done (v1.5.20)
Fix C1	Add `@actually_scribed` flag	Done (v1.5.20) — validated: 0 fake SCRIBEDs
Fix D7	Increment `repair_count` in analyzer parser	Done (v1.5.20) — code correct, 0 repairs in sample
Fix EXP-9 omission	Restore resource exhaustion check	Done (v1.5.21) — was accidentally dropped in v1.5.20

v1.5.20 deployed and tested (11 sessions). Discovered missing EXP-9 resource check (0% resource_exhausted vs 12.1% baseline). Fixed in v1.5.21.

Phase 1: Complete EXP-14 Analysis — DONE, REVERT

Collected 11 v1.5.19 sessions (all 11 characters)
Ran on EXP-13 code base (not rebased — both EXP-13 and EXP-14 now REVERT)
Results: mishap rate 74.8% (+11.4pp vs EXP-13 base), 0 real scribes, no metric improvement
Cost equalization removed cost penalty for dangerous actions → more mishaps
See EXP-14 detailed results above

Phase 1.5: Corrected Baseline (v1.5.21) — DONE, CONFIRMED

11 sessions, all metrics match v1.5.17 within normal variance
>=80/session: 0.64 (exact match), mishap: 40.8% (vs 42.0%), resource_exhausted: 13.4% (restored)
Key finding: 4 of 7 >=80 sigils stopped by move budget at 80-84 with 1 iter remaining
C1 validated (0 fakes), D7 code correct (0 repairs observed — genuinely rare)
Corrected baseline established. Ready for Phase 2.

Phase 2a: EXP-15 (v1.5.22) — DONE, REVERT

Change: (14 - @num_iterations) → (15 - @num_iterations) in move budget formula.

Metric	v1.5.21 (baseline)	v1.5.22 (EXP-15)	Delta
Worked	262	259	-3
>=80	7 (0.64/sess)	6 (0.55/sess)	-1
Scribes	0	0	0
Mishap rate	40.8%	52.5%	+11.7pp (Z=2.67)
moves_exhausted	104 (39.7%)	47 (18.1%)	-21.6pp
iteration_cap	0 (0%)	4 (1.5%)	+1.5pp
resource_exhausted	35 (13.4%)	37 (14.3%)	+0.9pp
sigil_vanished	16 (6.1%)	32 (12.4%)	+6.3pp

The formula change freed ~57 sigils from budget exits. Of those: ~29 mishapped, ~16 vanished, 4 reached iteration cap (3 at precision 83, gap=2 from scribe). The old off-by-one was functioning as a safety guardrail — extending sigils costs more mishaps than it gains.

moves_exhausted distribution shifted rightward by 1 iteration (each cohort got 1 more iter):

v1.5.21: iter 10(5), 11(28), 12(42), 13(23), 14(6)
v1.5.22: iter 12(8), 13(23), 14(16)

Lesson: Don't extend sigils deeper into the danger zone. The bottleneck is mishap rate at high iterations (24% at iter 11, 20% at iter 12), not the budget formula.

Phase 2b: EXP-16 (v1.5.23) — DONE, REVERT

Change: resource exhaustion coefficient 1.75 → 1.5 in sigil_info.

Results: Total wipeout — 0 worked sigils, 1718 skipped (100%), 0 scribes.

The coefficient 1.5 is mathematically impossible for target 90:

Max starting resources: 15 + 15 + 15 = 45 stars
Available at coeff 1.5: 45 × 1.5 + precision = 67.5 + precision
Threshold: target - 5 = 85
Need: 67.5 + precision ≥ 85 → precision ≥ 18 required
Starting precision is almost never 18+, so ALL sigils bail on iteration 0

At coeff 1.75: 45 × 1.75 + 13 = 91.75 ≥ 85 — passes fine. Minimum viable coefficient: (85 - 13) / 45 = 1.6

Of 1718 total sigils: 1388 skipped by "below 13" threshold, 330 passed it but immediately hit the resource exhaustion exit. The resource check is evaluated on iteration 0 with full resources — at 1.5, even full resources + precision 13 gives only 80.5, below the 85 threshold.

Post-mortem: This was a calculation error in experiment design. The coefficient determines the minimum starting precision at full resources. The relationship should have been checked: (target - 5 - min_starting_precision) / max_resources = (85 - 13) / 45 = 1.6. Any coefficient below 1.6 makes it impossible for precision-13 sigils (the skip threshold) to even start. The 1.75 coefficient already provides minimal headroom (91.75 vs 85 threshold). Future coefficient experiments should target 1.65-1.70 range, not below 1.6.

Lesson: Always verify the boundary condition: can a sigil at the skip threshold (precision 13) with max resources (45 stars) pass the resource check? If not, the coefficient is too low.
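The boundary condition from this post-mortem can be checked mechanically before any coefficient experiment. The sketch below assumes the check has the form stated above (stars × coefficient + precision compared against target - 5). The constant and method names are hypothetical and are not taken from sigilharvest.lic.

```ruby
MAX_STARS = 45     # 15 + 15 + 15, from the text above
SCRIBE_MARGIN = 5  # the threshold is target - 5

# Mirrors the check as described: bail when projected precision < target - 5.
def passes_resource_check?(coefficient, precision, stars: MAX_STARS, target: 90)
  stars * coefficient + precision >= target - SCRIBE_MARGIN
end

# Smallest coefficient at which a sigil at the skip threshold can start
# with full resources.
def minimum_viable_coefficient(skip_threshold:, target: 90, stars: MAX_STARS)
  (target - SCRIBE_MARGIN - skip_threshold).to_f / stars
end

puts passes_resource_check?(1.75, 13) # full resources, precision 13
puts passes_resource_check?(1.5, 13)  # the EXP-16 setting
puts minimum_viable_coefficient(skip_threshold: 13)
```

A guard like this, run once against the proposed coefficient and the current skip threshold, would have caught the EXP-16 wipeout before deployment.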

Phase 2c: EXP-14 Retest — Clean Cost Equalization (v1.5.24) — DONE, KEPT

Change: @action_cost from { taxing: 1, disrupting: 2, destroying: 3 } to { taxing: 1, disrupting: 1, destroying: 1 }. Tested against confirmed v1.5.21 baseline.

Metric	v1.5.21 (baseline)	v1.5.24 (retest)	Delta
Worked	262	263	+1
Worked/session	23.8	23.9	+0.1
Scribed (real)	0	2	+2
>=80	7	3	-4 (Fisher p=0.22, n.s.)
Mishap rate	40.8%	39.2%	-1.6pp (Z=0.39, p=0.70, n.s.)
moves_exhausted	104 (39.7%)	98 (37.3%)	-2.4pp
resource_exhausted	35 (13.4%)	32 (12.2%)	-1.2pp
sigil_vanished	16 (6.1%)	28 (10.6%)	+4.5pp (Z=1.88, p=0.06, marginal)
Avg gain/iter	7.2	7.3	+0.1
Danger at mishap	7.8	7.1	-0.7

Scribes (both C1-validated, 4 scrolls each):

Barrask #87: prec 92/90, 11 iters, danger 11, start 15 (efficient — low danger)
Refia #62: prec 86/90, 15 iters, danger 18, start 14

>=80 detail (3 total, 2 scribed):

Barrask #87: start 15 → 92, scribed (4 scrolls)
Refia #62: start 14 → 86, scribed (4 scrolls)
Refia #92: start 14 → 80, mishap at danger 11

Key findings:

Mishap rate UNCHANGED (Z=0.39, p=0.70). The confounded EXP-14 showed +11.4pp. The clean test shows -1.6pp (noise). The original "harmful" conclusion was wrong. The "accidental diversification" theory is refuted — equalizing costs has no measurable effect on mishap rate when tested against baseline code.
2 real scribes in 11 sessions — best single-test result since Awakened technique. Small counts (not statistically significant), but directionally positive.
sigil_vanished marginally up (p=0.06). Monitor but no action needed — not significant at p<0.05, and no code change would explain this (only cost tiebreaking changed).
>=80 down 7→3 but not significant (Fisher p=0.22). Notably, 2/3 >=80 sigils scribed (67% conversion) vs 0/7 in baseline (0% conversion).
Code is now correct: @action_cost accurately reflects that each label describes WHICH resource, not HOW MUCH. The {1,2,3} mapping was provably wrong.

Verdict: KEPT. Cost equalization is neutral. v1.5.24 becomes the new baseline.

Phase 2d: EXP-17 — Resource-Aware Tiebreaker (v1.5.25) — KEPT

Experiment selection analysis (v1.5.24 data, 2,570 iterations with parsed actions):

Four candidate experiments were evaluated for v1.5.25:

Option	Change	Mechanism	Effect size	Risk
A: Resource-aware tiebreaker	When 2+ actions share highest difficulty, prefer action draining most-available resource	Preserves scarce resources, extending productive iterations	9.4% of iterations (241/2,570 ties)	Low — only changes tiebreaking
B: Resource coefficient 1.75→1.65	Lower bail-out threshold	Retains ~1 more sigil per session	~4.5 precision points headroom	Low but tiny effect
C: Danger-aware throttling	Reduce difficulty when danger is high	Reduce mishap rate	68% of mishaps at danger <10 — weak signal	Medium — requires model of mishap function
D: Move budget 13→11	Lower precision/move coefficient	Bail fewer sigils	Wrong direction — bails MORE sigils	N/A — excluded

Why Option A (resource-aware tiebreaker):

Measurable frequency: Fires in 9.4% of iterations (241 ties out of 2,570). That is ~22 tiebreaker decisions per session, or 241 across the 11-session v1.5.24 dataset, which is enough to detect an effect.
100% heterogeneous cost profiles: Every observed tie involves actions that drain DIFFERENT resources (e.g., one taxing/resolve, one destroying/sanity). This means every tie offers a real choice — the tiebreaker always has a meaningful preference to express.
Tie distribution (balanced across all resource pairs):
- taxing/destroying (resolve vs sanity): ~33%
- disrupting/taxing (focus vs resolve): ~33%
- disrupting/destroying (focus vs sanity): ~33%
- 3-way ties: <1%
Direct mechanism: Resource conservation extends the productive phase. When resources are asymmetric (e.g., sanity=12, focus=5, resolve=8), draining the abundant resource (sanity) instead of the scarce one (focus) avoids hitting the resource exhaustion bail-out prematurely. The bail-out check uses (sanity + resolve + focus) * 1.75 + precision < 85, so preserving total resource pool matters.
Low risk: Only fires when two actions are already tied on difficulty (same expected gain) and cost (same impact weight). The change never overrides the primary selection criterion (highest difficulty) or the secondary (lowest cost). It only resolves what was previously an arbitrary first-encountered-wins tie.

Implementation (lines 243-258 of sigilharvest.lic):

The existing action selection has two levels:

Prefer highest difficulty (EXP-7, determines precision gain)
Break ties by lowest cost/impact (conserve resources)

With cost equalization (EXP-14 retest, all costs = 1), level 2 never fires. EXP-17 adds a third level: when difficulty AND cost are tied, prefer the action whose resource label corresponds to the highest current resource level (contest_stat_for).

# Level 3 tiebreaker (EXP-17):
elsif x['difficulty'] == sigil_action['difficulty'] && x['impact'] == sigil_action['impact']
  if contest_stat_for(x['resource']) > contest_stat_for(sigil_action['resource'])
    sigil_action = x
  end

Resource mapping: contest_stat_for('sanity') → @sanity_lvl, 'resolve' → @resolve_lvl, 'focus' → @focus_lvl. These are already parsed from the game's star display each iteration.
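The three selection levels can also be expressed as a single sort key, which makes the priority order easy to test in isolation. This is a standalone sketch, not the script's code: `pick_action` and the `'verb'` key are hypothetical, the `'difficulty'`, `'impact'`, and `'resource'` keys follow the excerpt above, and a plain hash of levels stands in for `contest_stat_for`. Exact three-way ties may resolve differently from the script's first-encountered-wins loop.

```ruby
# Pick an action by (1) highest difficulty, (2) lowest impact, and
# (3) highest current level of the resource the action drains.
def pick_action(actions, resource_levels)
  actions.max_by do |action|
    [
      action['difficulty'],
      -action['impact'],
      resource_levels.fetch(action['resource'], 0)
    ]
  end
end

# Asymmetric resources, as in the example above.
levels = { 'sanity' => 12, 'focus' => 5, 'resolve' => 8 }
menu = [
  { 'verb' => 'first',  'difficulty' => 4, 'impact' => 1, 'resource' => 'focus' },
  { 'verb' => 'second', 'difficulty' => 4, 'impact' => 1, 'resource' => 'sanity' },
  { 'verb' => 'third',  'difficulty' => 3, 'impact' => 1, 'resource' => 'resolve' }
]
puts pick_action(menu, levels)['verb']
```

In this menu the two difficulty-4 actions tie on difficulty and impact, so the third level decides in favor of the action draining sanity, the most abundant resource.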

Why not the other options:

B (coefficient): At 1.75, the formula gives 45 × 1.75 + 13 = 91.75 vs threshold 85. Changing to 1.65 gives 45 × 1.65 + 13 = 87.25, which is 4.5 points lower and leaves only 2.25 points of headroom above the threshold. The effect is too small to measure reliably in 11 sessions.
C (danger-aware): 68% of mishaps occur at danger <10, suggesting danger doesn't strongly predict mishap probability. Without a validated mishap model, any throttling rule is speculative. Needs more data analysis before experimenting.
D (move budget): Lowering the coefficient from 13 to 11 means the formula bails MORE sigils (declares them hopeless earlier). This shrinks the candidate pool — wrong direction.

EXP-17 Results (12 sessions, 11 characters, Shard/permutation/target=90/60min):

Sessions: 12 (11 complete, 1 incomplete — Kythkani fragment, 322 lines)
Logs: ~/SH_logs/v1.5.25/
Baseline: v1.5.24 (EXP-14 retest, 11 sessions)
C1 audit: 1 real scribe (Byd, 4 scrolls, precision 87). No C1 misclassifications.

Metric	v1.5.24 (baseline)	v1.5.25 (EXP-17)	Delta
Sessions	11	12	+1
Worked	263	254	-9
Scribed	2	1	-1
Mishap rate (per sigil)	39.2%	33.9%	-5.3pp
Mishap rate (per iter)	3.8%	3.2%	-0.6pp
Avg gain/productive iter	7.3	7.1	-0.2
Avg iters/sigil	10.3	10.6	+0.3
Refresh rate	9.5%	8.9%	-0.6pp
Failed actions	116	136	+20

Stop Reason	v1.5.24	v1.5.25	Delta
moves_exhausted	98 (37.3%)	117 (46.1%)	+8.8pp
mishap	103 (39.2%)	86 (33.9%)	-5.3pp
resource_exhausted	32 (12.2%)	27 (10.6%)	-1.6pp
sigil_vanished	28 (10.6%)	23 (9.1%)	-1.5pp
scribed	2 (0.8%)	1 (0.4%)	-0.4pp

Analysis:

Mishap rate directionally improved (39.2%→33.9% per sigil, 3.8%→3.2% per iter). Two-proportion z-test: z = -1.25, p ≈ 0.21 — not statistically significant at p<0.05. Consistent with hypothesis but insufficient sample size to confirm.
Stop-reason shift is mechanically coherent: fewer resource-exhausted exits (32→27) and fewer mishaps (103→86), with more moves_exhausted exits (98→117). Sigils survive longer (avg iters 10.3→10.6), dying to the move budget instead of resource depletion or mishaps. This is exactly what resource conservation should produce.
Scribe count (2→1) is in the noise. We've seen 0-2 real scribes per 11-session run consistently across all versions. Not a meaningful signal.
Failed actions increased (116→136). The tiebreaker may sometimes pick an action whose resource is abundant but has a higher failure rate. Worth monitoring but not alarming at this sample size.
Precision gain marginally lower (7.3→7.1). Expected — the tiebreaker resolves ties that were previously arbitrary, sometimes choosing a slightly different action. The tradeoff is resource conservation vs marginal per-iteration gain.
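The two-proportion z-test quoted in the analysis can be reproduced with a short sketch. It assumes the standard pooled-variance form of the test. The method name is hypothetical. The counts (103 of 263 mishaps in v1.5.24, 86 of 254 in v1.5.25) come from the stop-reason table above.

```ruby
# Pooled two-proportion z-test. Returns [z, two-sided p-value].
def two_proportion_z(hits_a, n_a, hits_b, n_b)
  p_a = hits_a.to_f / n_a
  p_b = hits_b.to_f / n_b
  pooled = (hits_a + hits_b).to_f / (n_a + n_b)
  se = Math.sqrt(pooled * (1 - pooled) * (1.0 / n_a + 1.0 / n_b))
  z = (p_b - p_a) / se
  # Two-sided p-value from the normal distribution: erfc(|z| / sqrt(2)).
  p_value = Math.erfc(z.abs / Math.sqrt(2))
  [z, p_value]
end

z, p_value = two_proportion_z(103, 263, 86, 254)
puts format('z = %.2f, p = %.2f', z, p_value)
```

The same helper applies to the other mishap-rate comparisons in this document whenever both counts and sample sizes are known.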

Verdict: KEPT. The tiebreaker is a low-risk third-level selection rule that fires in ~9.4% of iterations. No degradation in any critical metric. Directional improvement in mishap rate and resource exhaustion. Mechanically coherent stop-reason shift. The change is too small to achieve significance in 12 sessions, but there is no signal of harm and the mechanism is sound. v1.5.25 becomes the new baseline.

v1.2.0 Baseline Comparison (Head-to-Head)

To validate the cumulative effect of all changes since the original script, an instrumented v1.2.0 baseline was created (sigilharvest-v120-baseline.lic) and run for 11 sessions under identical conditions (Shard/permutation/target=90/60min/Inspired+Enlightened+Illuminated+Awakened).

What the v1.2.0 baseline includes (instrumentation only, no algorithm impact):

C1 fix (@actually_scribed flag) — required for accurate scribe counting
resolve_burin/get_burin/stow_burin — infrastructure parity
Difficulty fix (formidable=3, challenging=4, difficult=5) — confirmed bug fix
Cost equalization ({1,2,3}→{1,1,1}) — confirmed correct mapping

What the v1.2.0 baseline retains (original algorithm):

Risk-based action selection (low risk far from target, high risk near target)
ACTION verb accepted (not filtered)
Skip threshold <10 (not <13)
No iteration cap (removed per Urbaj's correction)
Original bail-out coefficients (2.25/15) with <=80 guards
No resource-aware tiebreaker

Logs: ~/SH_logs/v1.2.0/DR-*.log (11 sessions, Feb 4 2026)

Core Metrics

Metric	v1.2.0 (original algo)	v1.5.25 (current)	Delta
Sessions	11	12	+1
Total sigils found	983	1,284	+301
Worked	414	254	-160
Skipped	569 (57.9%)	1,030 (80.2%)	+22.3pp
Scribed	0	1	+1
>=80 precision	7 (1.7%)	6 (2.4%)	+0.7pp
Avg precision	54	51	-3
Best precision	86	87	+1

Efficiency

Metric	v1.2.0	v1.5.25	Delta
Avg iters/sigil	10.9	10.6	-0.3
Avg gain/productive iter	8.4	7.1	-1.3
Refresh rate	13.9%	8.9%	-5.0pp
Productivity rate	45.1%	50.7%	+5.6pp
Failed actions	172	136	-36
Repairs detected	2	0	-2

Mishap & Stop Reasons

Metric	v1.2.0	v1.5.25	Delta
Mishap rate (per sigil)	50.7%	33.9%	-16.8pp
Mishap rate (per iter)	4.6%	3.2%	-1.4pp
Danger at mishap (avg)	8.6	8.1	-0.5

Stop Reason	v1.2.0	v1.5.25
mishap	210 (50.7%)	86 (33.9%)
moves_exhausted	167 (40.3%)	117 (46.1%)
sigil_vanished	35 (8.5%)	23 (9.1%)
resource_exhausted	2 (0.5%)	27 (10.6%)
scribed	0	1 (0.4%)

Analysis

v1.5.25 wins on quality, v1.2.0 wins on quantity. v1.2.0 works 63% more sigils (414 vs 254) because skip<10 attempts everything starting at precision 10+. But v1.5.25's skip<13 filters low-value sigils, yielding a higher 80+ rate (2.4% vs 1.7%) and the only scribe. v1.5.25 also finds more total sigils per session (107 vs 89) because it moves through rooms faster.
Mishap rate is the dominant difference — 50.7% vs 33.9% per sigil, 4.6% vs 3.2% per iter. The v1.2.0 risk-based selection exposes sigils to more danger: picking low-difficulty actions early wastes iterations without reducing danger, then switching to high-difficulty late increases exposure at peak danger. v1.5.25's always-highest strategy is more efficient.
v1.2.0 gets higher per-action gain (8.4 vs 7.1) but wastes more on refreshes (13.9% vs 8.9%). The risk-based selection picks high-difficulty near target (gain 13+) but low-difficulty far from target (gain 2-5), producing more refreshes when low-risk actions don't yield viable follow-ups. v1.5.25's constant difficulty preference is more consistent with fewer wasted iterations.
Resource exhaustion check validates EXP-9. v1.2.0 uses the original 2.25 coefficient — only 2 exits (0.5%). v1.5.25's 1.75 coefficient catches 27 sigils (10.6%) that would burn resources without reaching target.
C1 audit: The only SCRIBED result found in the v1.2.0 directory was a C1 fake from an old file (Saelia, Feb 1). Our 11 new sessions produced 0 real scribes.

Change Validation Summary

Every change from v1.2.0 to v1.5.25 is empirically validated:

Change	Version	Mechanism	Measured Effect
Difficulty fix	v1.5.10	Correct formidable ranking	Eliminates wrong-action near target
ACTION verb filter	v1.5.10	Skip zero-gain verb	-120 wasted iters/10 sessions
Skip <13	v1.5.12	Filter low-value sigils	+0.7pp 80+ rate, +301 sigils found
Difficulty-first selection	v1.5.13	Always pick highest gain	-5.0pp refresh rate, +5.6pp productivity
Resource coeff 2.25→1.75	v1.5.15	Earlier bail-out on hopeless	Catches 10.6% vs 0.5% resource exits
Iteration cap	v1.5.17	Limit mishap exposure	-16.8pp mishap rate
C1 fix	v1.5.20	Accurate scribe classification	44% of old SCRIBEDs were fakes
Cost equalization	v1.5.24	Correct resource mapping	Neutral (labels != amount)
Resource tiebreaker	v1.5.25	Preserve scarce resources	Directional mishap improvement

Verdict: The original algorithm works harder but less efficiently. v1.5.25 works smarter — fewer sigils attempted, but each one has better odds, lower mishap exposure, and more accurate instrumentation.

D7 Fix: Repair Logging (v1.5.26) — Infrastructure — PHASE 3 CLOSED

Bug: The DRC.message('Executing aspect repair') log line was gated behind if @debug (line 321). Since production runs don't use debug mode, the analyzer's REPAIR_ACTION pattern (line 171 of sigilharvest_analyzer.rb) never matched anything. Result: repair_count was always 0, making Phase 3 repair analysis impossible.

Fix: Removed if @debug from the repair log message. One-line change. The analyzer already has the detection code — it just never found the pattern in non-debug logs.

v1.5.26 Results (11 sessions, 255 worked sigils, 2606 iterations):

Metric	v1.5.26	v1.5.25 (baseline)	Delta
Worked	255	254	+1
Scribed	1 (Fidon, 3 scrolls, prec=93)	1 (Byd, 4 scrolls, prec=87)	—
Mishap/sigil	42.0%	33.9%	+8.1pp (p=0.06, n.s.)
Mishap/iter	4.1%	3.2%	+0.9pp
Productivity	51.1%	50.7%	+0.4pp
Avg gain	7.18	7.09	+0.09
Reached 80+	2.0%	2.4%	-0.4pp
Repairs	0	0	0

No algorithm change, so the mishap uptick is most plausibly run-to-run variance (z=1.88, p=0.06). 3 combat_distracted exits (new stop reason, enemies in sigil rooms).

D7 Validation: Repairs are non-existent. Zero repairs in 255 worked sigils (2606 iterations). Grep confirms zero "Executing aspect repair" messages across all v1.5.26 logs.

Why repairs don't trigger: The repair path requires !sigil_action.key?("difficulty") — meaning no precision action is available. With difficulty-first selection (v1.5.20+), the algorithm virtually always finds a viable precision action. The 224 refreshes (8.6%) happen at the execution level (game RNG), not the selection level.

Historical comparison: v1.5.17 (risk-based selection, debug mode) triggered 5 repairs across ~300 worked sigils (~1.7% of sigils). All occurred at high precision (75-81) in late iterations (10-13). Results:

2/5 succeeded: recovered a resource, no precision change
3/5 caused mishaps: sigil destroyed (60% mishap rate on repairs)

Phase 3: CLOSED. Repairs are a non-factor with the current algorithm. They don't happen, and when they did (v1.5.17), they were actively harmful (60% mishap rate). No experiment needed.

Phase 4: CLOSED. The repair difficulty filter (requires difficulty >= 3) is moot — loosening it would allow more repairs, but repairs themselves are counterproductive.

Gain Optimization Analysis (Post-Phase 3)

With repairs closed, the next question: where do we get precision gains? Monte Carlo simulation (100k sigils per scenario) comparing gain and mishap levers.

Why v1.2.0 has higher avg gain (8.39 vs 7.18):

The gain-per-difficulty-level is identical between versions (trivial=2-3, difficult=13-14). The difference is in how often each difficulty level is selected:

Gain range (est. difficulty)	v1.2.0	v1.5.26	Delta
1-3 (trivial)	4.5%	25.8%	+21.3pp
4-5 (straightforward)	25.2%	19.3%	-5.9pp
6-8 (formidable)	26.2%	19.1%	-7.1pp
9-11 (challenging)	18.5%	13.7%	-4.8pp
12-16 (difficult)	25.6%	22.1%	-3.5pp

v1.5.26 produces 5.7x more trivial-range gains. Both algorithms pick "highest difficulty available," but v1.5.26's skip<13 threshold works more low-starting-precision sigils where the game may offer weaker action menus. The effective gain per iteration (avg gain × productivity) is similar: v1.2.0 = 3.78, v1.5.26 = 3.67. v1.2.0's higher per-action gain is partially offset by lower productivity (45.1% vs 51.1%).
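The effective-gain figure is the product of the two reported metrics, as this small sketch shows. The method name is hypothetical. The inputs are the average gains (8.39, 7.18) and productivity rates (45.1%, 51.1%) quoted above.

```ruby
# Effective gain per iteration = average gain on productive iterations
# multiplied by the share of iterations that were productive.
def effective_gain(avg_gain, productivity)
  (avg_gain * productivity).round(2)
end

puts effective_gain(8.39, 0.451) # v1.2.0
puts effective_gain(7.18, 0.511) # v1.5.26
```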

Gain optimization — scribe rate by avg gain:

Scenario	Scribe%	>=80%	Multiplier
Current v1.5.26 (7.2)	5.9%	8.6%	1.0x
+0.5 gain (7.7)	8.4%	11.8%	1.4x
+1.0 gain (8.2)	11.9%	15.9%	2.0x
v1.2.0 actual gains (8.4)	12.4%	16.9%	2.1x
+1.5 gain (8.7)	15.0%	19.4%	2.6x
+2.0 gain (9.2)	18.7%	23.5%	3.2x

Each +1.0 avg gain → ~2.0x scribe rate improvement.

Mishap reduction — scribe rate by mishap rate:

Scenario	Scribe%	>=80%	Multiplier
Current (4.1%/iter)	6.1%	8.9%	1.0x
-25% mishaps (3.1%/iter)	7.0%	10.0%	1.15x
-50% mishaps (2.1%/iter)	7.8%	11.2%	1.28x
-75% mishaps (1.0%/iter)	9.0%	12.9%	1.48x
No mishaps (0%/iter)	10.3%	14.6%	1.69x

Even eliminating ALL mishaps gives only 1.69x. Halving mishaps gives 1.28x.

Combined analysis:

Scenario	Scribe%	Multiplier
Baseline	5.8%	1.0x
Gain +1.0 alone	11.7%	2.0x
Mishap -50% alone	7.8%	1.3x
Both: gain+1.0 & mishap-50%	15.3%	2.7x (super-additive)
v1.2.0 gains & no mishaps	20.8%	3.6x

Conclusions:

Gain optimization is ~3.0x more impactful than mishap reduction
They stack super-additively (combined 2.7x vs additive 2.4x)
Priority: gain optimization first, mishap reduction second
Recovering v1.2.0's gain level (+1.2) without its mishap penalty is the ideal target
The mishap rate difference between v1.2.0 and v1.5.26 is small (4.6% vs 4.1%/iter) — the gain gap is not caused by risk tolerance, it's caused by action menu composition
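For readers who want to experiment with the two levers, the general shape of such a Monte Carlo run can be sketched as below. This is a toy model, not the simulation that produced the tables above. Every parameter is an assumption chosen for illustration: one representative gain per difficulty level drawn uniformly, a fixed 51% productivity, start precision 13, a 14-iteration budget, and scribing at 85. It will not reproduce the reported percentages.

```ruby
# Toy Monte Carlo of sigil outcomes. All parameters are illustrative
# assumptions, not values from the real simulation.
GAINS = [2, 5, 7, 10, 13].freeze # one representative gain per difficulty level

# Simulate one sigil. Returns true if it reaches the scribe threshold.
def simulate_sigil(rng, gain_shift:, mishap_rate:, productivity: 0.51,
                   start: 13, scribe_at: 85, max_iters: 14)
  precision = start
  max_iters.times do
    return false if rng.rand < mishap_rate # a mishap destroys the sigil
    next unless rng.rand < productivity    # unproductive iteration, no gain

    precision += GAINS.sample(random: rng) + gain_shift
    return true if precision >= scribe_at
  end
  false
end

# Fraction of simulated sigils that scribe under the given parameters.
def scribe_rate(trials:, seed: 1, **params)
  rng = Random.new(seed)
  trials.times.count { simulate_sigil(rng, **params) }.to_f / trials
end

baseline = scribe_rate(trials: 10_000, gain_shift: 0, mishap_rate: 0.041)
more_gain = scribe_rate(trials: 10_000, gain_shift: 1, mishap_rate: 0.041)
fewer_mishaps = scribe_rate(trials: 10_000, gain_shift: 0, mishap_rate: 0.0205)
puts format('baseline %.3f, +1.0 gain %.3f, -50%% mishaps %.3f',
            baseline, more_gain, fewer_mishaps)
```

Fixing the seed keeps scenario comparisons repeatable. Swapping the uniform `GAINS` draw for the measured gain distribution would be the first step toward a more faithful model.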

EXP-18: Minimum difficulty threshold (v1.5.27) — Complete, KEPT

Hypothesis: Trivial-difficulty (1) precision actions produce avg gain of 2.3, far below the 6.8-13.3 for formidable-difficult. v1.5.26 gets 25.8% trivial-range gains vs v1.2.0's 4.5%. Skipping trivial actions and refreshing for a better menu is +EV: the probability of getting a non-trivial action next iteration is ~74%, and the expected gain from that (74% * 8.0 = 5.9) greatly exceeds the trivial gain (2.3).
Change: Add return false if difficulty < 2 at the top of precision_action_viable? (line 693). When the highest-difficulty action in the menu is trivial, the algorithm will refresh (analyze the sigil) instead of taking the trivial action.
Sessions: 11 (all characters, Shard, permutation, target=90, 60min)
Logs: ~/SH_logs/v1.5.27/
Baseline: v1.5.26 (11 sessions)
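
The change and its expected-value argument can be sketched as follows. This is a minimal illustration, not the script's actual method body: the action hash shape (a `:difficulty` key), the `MIN_DIFFICULTY` constant, and the placeholder `true` return are assumptions. Only the `return false if difficulty < 2` guard comes from the change description.

```ruby
# Assumed constant name; the source only states the literal threshold 2.
MIN_DIFFICULTY = 2

# Hypothetical stand-in for precision_action_viable?. The real method runs
# further viability checks that are not reproduced here.
def precision_action_viable?(action)
  difficulty = action[:difficulty]
  return false if difficulty < MIN_DIFFICULTY # EXP-18 guard: reject trivial actions

  true # placeholder for the remaining viability checks
end

# Expected-value argument from the hypothesis: refresh now and take a
# non-trivial action next iteration, versus taking the trivial action now.
P_NON_TRIVIAL    = 0.74 # chance the next menu offers a non-trivial action
AVG_NON_TRIVIAL  = 8.0  # avg gain of a non-trivial action
AVG_TRIVIAL_GAIN = 2.3  # avg gain of a trivial action
EV_REFRESH       = P_NON_TRIVIAL * AVG_NON_TRIVIAL

puts precision_action_viable?({ difficulty: 1 }) # trivial: rejected
puts precision_action_viable?({ difficulty: 3 }) # non-trivial: passes the guard
puts format('EV(refresh) = %.1f vs trivial gain = %.1f', EV_REFRESH, AVG_TRIVIAL_GAIN)
```

Placing the guard first means a trivial-only menu fails viability for every precision action, which is what sends the algorithm down the refresh path.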

Results (11 sessions, 253 worked sigils, 2686 iterations):

Metric	v1.5.27	v1.5.26	Delta	v1.2.0
Worked	253	255	-2	414
Avg gain	8.54	7.18	+1.36	8.39
Productivity	43.9%	51.1%	-7.2pp	45.1%
Effective gain/iter	3.75	3.67	+0.08	3.78
Mishap/sigil	38.7%	42.0%	-3.3pp (n.s. p=0.46)	50.7%
Mishap/iter	3.65%	4.11%	-0.46pp	4.64%
Reached 60+	32.0%	25.1%	+6.9pp	33.3%
Reached 70+	10.7%	7.8%	+2.9pp	10.6%
Reached 80+	2.0%	2.0%	0.0pp	1.7%
Resource exhausted	5.9%	11.8%	-5.9pp	0.5%
Refresh rate	17.2%	8.6%	+8.6pp	13.9%
Repairs	3	0	+3	2
Scribed	1 (Barrask, 4 scrolls, prec=92)	1 (Fidon, 3 scrolls, prec=93)	—	0

Gain distribution by range:

Gain range	v1.5.27	v1.5.26	v1.2.0
Trivial (1-3)	3.6%	25.8%	4.5%
Straightfwd (4-5)	24.9%	19.3%	25.2%
Formidable (6-8)	24.9%	19.1%	26.2%
Challenging (9-11)	20.0%	13.7%	18.5%
Difficult (12+)	26.6%	22.1%	25.6%
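
A distribution like the one above can be computed by bucketing each observed gain. The sketch below assumes the bucket boundaries from the "Gain range" column; the helper names and the sample gains are made up for illustration and are not taken from sigilharvest_analyzer.rb or any log.

```ruby
# Bucket boundaries copied from the "Gain range" column.
GAIN_RANGES = {
  'Trivial' => 1..3,
  'Straightfwd' => 4..5,
  'Formidable' => 6..8,
  'Challenging' => 9..11,
  'Difficult' => 12..Float::INFINITY
}.freeze

# Return the bucket label for a single precision gain, or nil if out of range.
def gain_range(gain)
  GAIN_RANGES.each { |label, range| return label if range.cover?(gain) }
  nil
end

# Percentage of gains per bucket, rounded to one decimal place.
def gain_distribution(gains)
  counts = gains.map { |g| gain_range(g) }.compact.tally
  total = counts.values.sum
  return {} if total.zero?

  GAIN_RANGES.keys.to_h do |label|
    [label, (100.0 * counts.fetch(label, 0) / total).round(1)]
  end
end

# Made-up sample gains, for illustration only.
SAMPLE_GAINS = [2, 5, 7, 7, 10, 13, 14, 4].freeze
gain_distribution(SAMPLE_GAINS).each { |label, pct| puts format('%-12s %5.1f%%', label, pct) }
```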

Stop reasons:

Reason	v1.5.27	v1.5.26	Delta
moves_exhausted	45.8%	39.2%	+6.6pp
mishap	38.7%	42.0%	-3.3pp
sigil_vanished	9.1%	5.5%	+3.6pp
resource_exhausted	5.9%	11.8%	-5.9pp
scribed	0.4%	0.4%	0.0pp

Analysis:

Gain distribution transformed: Trivial-range gains dropped from 25.8% to 3.6%, almost exactly matching v1.2.0's 4.5%. All other brackets rebalanced proportionally. The v1.5.27 gain distribution is now essentially identical to v1.2.0's.
Avg gain exceeded v1.2.0: 8.54 vs 8.39. The trivial filter plus difficulty-first selection produces slightly higher gains than v1.2.0's risk-based selection because difficulty-first more reliably picks the highest-difficulty action when one is available.
Effective gain/iter improved: Despite 7.2pp lower productivity (more refreshes), the +1.36 avg gain more than compensates. Effective gain: 3.75 vs 3.67 (+0.08). Now nearly matches v1.2.0's 3.78.
Resource exhaustion halved: 11.8% → 5.9%. Refreshes cost 0 resources, so more refreshing = less resource drain per iteration. This is a significant structural improvement — fewer sigils bail out due to resource depletion.
60+ rate +6.9pp: More sigils reaching high-precision tiers (32.0% vs 25.1%). Now close to v1.2.0's 33.3%.
3 repairs detected: The trivial filter creates conditions where no precision action passes viability, allowing the repair path to trigger. First repairs in v1.5.20+. Confirms D7 fix is working and repairs can happen when the filter is stricter.
Min gain = 3: Confirms trivial-difficulty (gain 2-3) actions are being filtered. The remaining gain=3 entries are likely straightforward actions rolling low.
moves_exhausted +6.6pp: More sigils reaching the move budget limit at higher avg precision (55.6 vs 53.3). These are sigils that got further but couldn't finish.

Verdict: KEEP. Most measurably effective change since EXP-6 (difficulty fix). Achieved exactly what the gain optimization analysis predicted: recovered v1.2.0's gain level while maintaining lower mishap rate. The gain distribution is now structurally optimal.

Killed / No Experiment Needed

Hypothesis	Why killed	SIM
Skip <15 for target 90 (A3)	ALL 3 scribes start at 13; any raise above <13 loses all scribes	SIM-7
Repair proximity threshold (A6)	0 sigils exhaust resources near target; threshold never matters	SIM-8
Resource fungibility test (A1)	Even depletion confirmed (4% imbalanced); sum-all is valid	SIM-3
Resource consumption by difficulty (D3)	Flat ~2.1-3.0 stars/iter; no variation to exploit	SIM-2
Refresh cost experiment (D4)	0 resources, +0.31 danger/iter; already minimize refreshes	SIM-6

Ideas evaluated and killed (deep exploration pass #2, v1.5.2-v1.5.9)

The following ideas were simulated against 4097 worked sigils / 40018 iterations and found to be neutral or harmful:

Scribe at target-5 from iteration 8+: Only 1 of 4097 sigils ever peaked at 85+ and then fell below (Jazriel #15, peak=85 at iter 13, mishapped to final=3). The scenario this addresses is vanishingly rare. No expected benefit.
Consecutive refresh limit: All thresholds (2-5 max streak) had strongly negative net impact (-8.7 to -26.6). Refresh streaks don't predict failure — they're temporary bad menu RNG, and sigils recover from them.
Hard iteration cap reduction: Any cap below 13 costs more 80+ sigils than it creates (net -1.8 to -21.3). Current effective cap of ~14-15 is already optimal. Note (Feb 2026): External feedback indicates the game has no hard iteration cap. EXP-13 tests REMOVING the cap entirely (the opposite direction). This killed idea tested LOWERING the cap — still correct that lower caps are harmful.
Single-resource floor bail-out: Catastrophically negative at all thresholds (net -36 to -46). Individual resource depletion doesn't predict failure — the game uses different resources for different actions.
Quality actions as refresh fallback (from pass #1): Quality actions cost resources but give zero precision gain; refreshes cost nothing. Strictly worse.
Total resource bail-out threshold (from pass #1): 60% of affected sigils still improve after hitting low resources. Hard cutoff harms more sigils than it helps.

Technique Test: Awakened Sigil Comprehension (v1.5.17) — CONFIRMED ACTIVE

Background: Awakened requires Illuminated as prerequisite. Wiki says all technique bonuses are globally disabled; Illuminated confirmed no effect in v1.5.9. Tested last to keep a clean isolation — the per-difficulty median gain analysis (consistent across all v1.5.2–v1.5.9 data) provides a technique-sensitive metric unaffected by algorithm changes.
Wiki description: "Enchanters will find Awakened Sigil Comprehension allows the scribing of many more sigils from a single perception, vastly simplifying the harvesting process." This implies the technique increases the number of SCROLLS producible per scribed sigil, not precision gains.
Change: Version tick only (v1.5.15 → v1.5.17). No algorithm change. All characters trained Awakened Sigil Comprehension before running.
Baseline: v1.5.15 (same algorithm, without Awakened)
Depends on: All algorithm experiments completed first.
Sessions: 22 total (11 characters × 2 batches). Batch 1: 11 sessions. Batch 2: 11 sessions. Characters: Barrask, Byd, Christus, Fidon, Gnarta, Jazriel, Kythkani, Mahtra, Nelis, Refia, Throve.
Logs: ~/SH_logs/v1.5.17/ (7 merged files for split reconnections + 4 single files)

Raw results (22 sessions combined):

Metric	v1.5.15 (10 sess)	v1.5.17 (22 sess)	Delta
Worked	265	555	+290
result=SCRIBED (reported)	1	3	+2
Real scribes (log audit)	0	1	+1
C1 misclassifications	1	2	+1
>=80	1	14	+13
>=80/session	0.10	0.64	+0.54
Avg precision	51.2	52.8	+1.6
Max precision	85	88	+3
Mishap rate	43.0%	42.0%	-1.0pp
resource_exhausted	26	67	+41

SCRIBE COUNT CORRECTION (C1 bug audit): Raw log audit checking for actual "You carefully scribe" game messages reveals the reported scribe counts are inflated by the C1 misclassification bug (line 162):

v1.5.15: 1 reported SCRIBED — Throve #112 (precision 85, mishapped, "Sigil harvesting failed"). 0 real scribes, 1 C1 fake.

v1.5.17: 3 reported SCRIBED:

Byd #96 (precision 85, mishapped, "Sigil harvesting failed") — C1 FAKE

Fidon #37 (precision 86, "Final precision: 86, scribing", 4 scrolls) — REAL

Refia #84 (precision 88, sigil vanished) — C1 FAKE

Corrected: 0 real scribes (v1.5.15) → 1 real scribe (v1.5.17). The "Scribed: 1 → 3" delta in the original analysis was entirely a C1 artifact.
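
The audit rule (a reported scribe only counts if the game's scribe message appears in the raw log) can be sketched as below. The helper names and the data shape are assumptions, and the sample lines are abbreviated placeholders rather than verbatim log excerpts.

```ruby
# Confirmation text the audit looks for, as quoted above.
SCRIBE_MESSAGE = 'You carefully scribe'

# Hypothetical helper: `lines` is the slice of raw log lines for one sigil.
def real_scribe?(lines)
  lines.any? { |line| line.include?(SCRIBE_MESSAGE) }
end

# Split reported scribes into [real, fake] based on the raw log evidence.
def audit_scribes(reported)
  reported.partition { |entry| real_scribe?(entry[:lines]) }
end

# Placeholder entries modelled on the three v1.5.17 reports.
REPORTED_SCRIBES = [
  { sigil: 'Byd #96',   lines: ['Sigil harvesting failed'] },
  { sigil: 'Fidon #37', lines: ['Final precision: 86, scribing', 'You carefully scribe'] },
  { sigil: 'Refia #84', lines: [] }
].freeze

real, fake = audit_scribes(REPORTED_SCRIBES)
puts "real: #{real.map { |e| e[:sigil] }.join(', ')}"
puts "fake: #{fake.map { |e| e[:sigil] }.join(', ')}"
```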

Scrolls-per-scribe analysis (the metric the wiki implies Awakened affects):

The game mechanic: after scribing, the game says "Remnants of the sigil pattern linger, allowing for additional scribing" — each "Remnants" message enables one more scribe attempt. The last scribe does NOT produce a "Remnants" message.
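
Under that mechanic, the scroll count for a scribed sigil is the number of "Remnants" messages plus one. A minimal sketch, assuming the input lines belong to a single sigil that was actually scribed; the helper name is hypothetical:

```ruby
# Message text quoted in the mechanic description above.
REMNANTS_MESSAGE = 'Remnants of the sigil pattern linger'

# Hypothetical helper. Assumes `lines` covers one sigil that was scribed at
# least once: every scribe except the last is followed by a Remnants message.
def scrolls_from_remnants(lines)
  lines.count { |line| line.include?(REMNANTS_MESSAGE) } + 1
end

# Three Remnants messages imply four scrolls.
SAMPLE_REMNANT_LINES = Array.new(3) do
  'Remnants of the sigil pattern linger, allowing for additional scribing'
end.freeze

puts scrolls_from_remnants(SAMPLE_REMNANT_LINES)
```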

Version	Character	Precision	Scrolls	Awakened?
v1.5.3	Mahtra	90	2	No
v1.5.7	Kythkani	90	4	No
v1.5.8	Throve	88	4	No
v1.5.9	Barrask	85	3	No
v1.5.9	Gnarta	89	2	No
v1.5.9	Jazriel	88	2	No
v1.5.14	Kythkani	86	3	No
v1.5.17	Fidon	86	4	Yes

Pre-Awakened (7 events): mean 2.86 scrolls, median 3, range 2-4
Post-Awakened (1 event): 4 scrolls

Insufficient data to determine whether Awakened increases scrolls-per-scribe. The post-Awakened sample has N=1, and 4 scrolls already occurred in 2 of 7 pre-Awakened events. More real scribe events are needed to measure this.
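
The pre-Awakened summary statistics can be reproduced from the scroll counts in the table. The helper names below are illustrative:

```ruby
# Scroll counts of the seven pre-Awakened scribe events, in table order.
PRE_AWAKENED_SCROLLS = [2, 4, 4, 3, 2, 2, 3].freeze

def mean(values)
  values.sum.to_f / values.size
end

def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

low, high = PRE_AWAKENED_SCROLLS.minmax
puts format('mean %.2f, median %s, range %d-%d',
            mean(PRE_AWAKENED_SCROLLS), median(PRE_AWAKENED_SCROLLS), low, high)
# prints "mean 2.86, median 3, range 2-4"
```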

Batch consistency:

Metric	Batch 1 (11 sess)	Batch 2 (11 sess)
Worked	301	254
>=80	6 (2.0%)	8 (3.1%)
>=80/session	0.55	0.73

Statistical significance (>=80 metric — NOT affected by C1 correction):

Baseline >=80 rate: 1/265 = 0.38%
Test >=80 rate: 14/555 = 2.52% (6.7x improvement)
Two-proportion Z-test: Z = 2.14 (p ≈ 0.016, one-tailed)
Poisson model (treating baseline rate as known): Z ≈ 8.2 — but this overstates confidence by not accounting for uncertainty in the baseline rate estimate
Both batches individually above baseline; batch 2 slightly stronger
The 14 >=80 sigils include the 2 C1 fakes (precision 85, 88) — they DID reach >=80 precision, they just didn't successfully scribe. The metric is valid.
Caveat: The improvement is statistically significant but the mechanism is unknown. Gain distributions, starting precisions, iteration counts, and work rates are all identical between v1.5.15 and v1.5.17. See "What Awakened actually does" section below.
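
Both test statistics can be recomputed from the raw counts. The sketch below assumes the pooled-variance form of the two-proportion test and a normal approximation for the Poisson model; the source does not state which variants were used, and the method names are illustrative. With only one baseline hit, the normal approximation is rough.

```ruby
# Counts from the raw results table: sigils reaching >=80 out of worked sigils.
BASE_HITS = 1
BASE_N    = 265
TEST_HITS = 14
TEST_N    = 555

# Pooled two-proportion z statistic.
def two_proportion_z(hits1, n1, hits2, n2)
  p1 = hits1.to_f / n1
  p2 = hits2.to_f / n2
  pooled = (hits1 + hits2).to_f / (n1 + n2)
  se = Math.sqrt(pooled * (1 - pooled) * (1.0 / n1 + 1.0 / n2))
  (p2 - p1) / se
end

# Upper-tail probability of the standard normal distribution.
def one_tailed_p(z)
  0.5 * Math.erfc(z / Math.sqrt(2))
end

# Normal approximation to a Poisson count, treating the baseline rate as exact.
def poisson_z(hits1, n1, hits2, n2)
  expected = n2 * hits1.to_f / n1
  (hits2 - expected) / Math.sqrt(expected)
end

z = two_proportion_z(BASE_HITS, BASE_N, TEST_HITS, TEST_N)
puts format('two-proportion Z = %.2f, one-tailed p = %.3f', z, one_tailed_p(z))
puts format('Poisson-model Z  = %.1f', poisson_z(BASE_HITS, BASE_N, TEST_HITS, TEST_N))
```

The Poisson figure is larger because it ignores the sampling uncertainty in the baseline rate, which rests on a single hit.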

Decision: KEEP Awakened technique on all characters — but mechanism is UNCERTAIN.

Despite no algorithm changes between v1.5.15 and v1.5.17, the >=80 rate improved from 0.38% to 2.52%. However, the wiki describes Awakened as a scrolls-per-scribe effect ("allows the scribing of many more sigils from a single perception"), which does NOT predict precision improvement. Detailed comparison shows gain distributions, starting precisions, iteration counts, and work rates are all identical. The mechanism of the >=80 improvement is unexplained: it could be from Awakened (undocumented effect) or from an uncontrolled confound (different test dates, server conditions). The two-proportion Z-test gives Z=2.14 (p≈0.016, one-tailed), significant but not overwhelming. All future experiments should continue with Awakened trained (no downside risk).

What Awakened actually does — mechanism unknown:

The wiki says Awakened "allows the scribing of many more sigils from a single perception". This describes a scrolls-per-scribe effect (more copies from each scribed sigil), NOT a precision improvement. Yet we observe more sigils reaching 80+ with Awakened trained.

Detailed mechanism analysis (comparing v1.5.15 vs v1.5.17 directly):

Gain-per-action distribution: IDENTICAL. v1.5.15 avg=8.2, v1.5.17 avg=8.4. Same bimodal shape (peaks at 2-3 and 13-15). Awakened does NOT boost gain per action.
Starting precision distribution: IDENTICAL. Both avg=14.0, median=14. Same proportions across buckets. Awakened does NOT change starting positions.
Iteration counts: IDENTICAL. Both avg ~10.4-10.5, median 11, same distribution. Awakened does NOT grant more iterations.
Work rate: IDENTICAL. 20.5% vs 19.8%. Same skip threshold, same behavior.
Mishap rate by bracket: Similar overall (~42%), but v1.5.17 has slightly HIGHER mishap rates at 60-79 precision (51-53% vs 39-44%). Not a protective effect.
>=80 rate: 0.38% → 2.52%. This is the ONLY metric that differs.

The attribution to Awakened was based on process-of-elimination reasoning: "no algorithm changed between v1.5.15 and v1.5.17, so the improvement must be from training Awakened." However, the wiki description does not predict this effect, and no per-action metric shows any change. Possible explanations:

Awakened has an undocumented effect we can't measure at the per-action level
Confound: sessions were run on different dates — server conditions, seasonal effects, or undocumented game patches could contribute
Statistical power: the two-proportion Z-test gives Z=2.14 (p≈0.016), significant but not overwhelming. The much larger Poisson-model figure (Z≈8.2 in the significance list above, Z=8.5 in the earlier analysis) may overstate confidence.

Status: KEEP Awakened trained (no downside). Attribution UNCERTAIN — correlation observed but mechanism unexplained by wiki description or per-action data. Continue collecting scribe data to test the wiki's scrolls-per-scribe claim.

Experiment Log Template

When running a new experiment, record results here:

#### EXP-N: <name> (v<version>)
- Sessions: <count> (<list of characters>)
- Logs: ~/SH_logs/v<version>/

| Metric | Baseline | This exp | Delta |
|--------|----------|----------|-------|
| Worked |          |          |       |
| Avg precision |   |          |       |
| Sigils >= 80 |    |          |       |
| Mishap rate |     |          |       |
| Min per 80+ |     |          |       |

- **Verdict**: KEEP / REVERT
- **Action**: <what was done>

Sigil Harvesting minigame testing - elanthia-online/dr-scripts GitHub Wiki

