VALIDATION_TESTING - mark-ik/graphshell GitHub Wiki
Purpose: Collect headed-manual and extended automated validation items that have not yet been executed or verified. Extracted from archived plan docs and ongoing plans so completed plans can be closed cleanly.
Note: Automated unit tests belong in their respective plan or source files. This file collects validation requiring headed execution or an extended integration harness.
Many validation items below should now be verified live using the Diagnostic Inspector.
-
Toggle:
Ctrl+Shift+DorF12, or via Command PaletteOpen Diagnostic Pane. -
Intents Tab: Verify intent emission, ordering, and
LifecycleCause. - Compositor Tab: Verify tile hierarchy, active rects, and webview mapping status (crucial for layout/rendering bugs).
- Engine Tab: Monitor channel latency and throughput during performance tests.
Automation Strategy:
The diagnostic system exposes structured state (DiagnosticsState) that can be queried in integration tests.
-
State Snapshots: Use
snapshot_json_value()to assert on the full system state (tile hierarchy, active intents) without screen scraping. -
Event Stream: Subscribe to the global diagnostic channel to assert on event ordering (e.g.,
url_changedbeforehistory_changed).
Automated validation run (2026-02-22):
-
cargo test --features diagnostics desktop::diagnostics::tests::percentile_95_uses_upper_percentile_rank -- --nocapture- Result: pass (1/1); confirms percentile helper behavior used by Engine/SVG p95 labels.
-
cargo test --features diagnostics desktop::diagnostics::tests::tick_drain_respects_10hz_interval_gate -- --nocapture- Result: pass (1/1); confirms diagnostics drain gate respects 10Hz interval.
-
cargo check --release --message-format short- Result: pass; release/default path builds with diagnostics included in default desktop build.
-
cargo test --features diagnostics desktop::diagnostics::tests::snapshot_json -- --nocapture- Result: pass (2/2); validates diagnostics snapshot JSON core-section presence and channel aggregate parity with in-memory diagnostics state.
-
cargo test desktop::persistence_ops::tests::test_frame_bundle_serialization_excludes_diagnostics_payload -- --nocapture- Result: pass (1/1); verifies frame/session bundle JSON payload excludes diagnostics
runtime sections (
diagnostic_graph/channels/spans/event_ring/recent_intents).
- Result: pass (1/1); verifies frame/session bundle JSON payload excludes diagnostics
runtime sections (
-
cargo test semantic_event_pipeline::tests::test_graph_intents_and_responsive_emits_semantic_pipeline_trace_marker -- --nocapture- Result: pass (1/1); confirms
tracing-testlog capture and semantic pipeline marker emission in diagnostics-enabled default desktop build.
- Result: pass (1/1); confirms
-
cargo test diagnostics_json_snapshot_shape_is_stable -- --nocapture- Result: pass (1/1); validates snapshot JSON shape stability.
-
cargo test diagnostics_svg_snapshot_shape_is_stable -- --nocapture- Result: pass (1/1); validates snapshot SVG payload shape and key sections.
-
cargo test edge_metric_respects_selected_percentile -- --nocapture- Result: pass (1/1); verifies Engine edge metrics follow selected percentile policy.
-
cargo test edge_metric_bottleneck_threshold_is_configurable -- --nocapture- Result: pass (1/1); verifies bottleneck highlighting threshold behavior.
-
cargo test proptest_tick_drain_aggregation_matches_event_stream -- --nocapture- Result: pass (1/1); property test validates aggregate counters against event stream semantics.
-
cargo test test_frame_bundle_payload_stays_clean_after_restart -- --nocapture- Result: pass (1/1); restart-style tempfile check confirms diagnostics payload remains ephemeral.
-
cargo test diagnostics_svg_snapshot_shape_is_stable -- --nocapture- Result: pass (1/1); confirms Engine SVG topology shape includes Servo bridge node/edge
wiring (
Servo Runtimepath) after Servo diagnostics integration.
- Result: pass (1/1); confirms Engine SVG topology shape includes Servo bridge node/edge
wiring (
-
cargo check --release --message-format short- Result: pass; release build path compiles with diagnostics defaults and Servo bridge wiring.
-
cargo run --release- Result: preflight attempted from terminal session; build/start path is reachable but headed
runtime verification (Engine tab +
servo.*activity visibility and shortcut interaction) still requires an interactive/manual run on desktop UI.
- Result: preflight attempted from terminal session; build/start path is reachable but headed
runtime verification (Engine tab +
- Diagnostics export evidence (headed run):
- User-captured file:
C:\Users\mark_\AppData\Roaming\graphshell\graphs\diagnostics_exports\diagnostics-1771804408.json - Result: exported snapshot includes non-zero Servo bridge channels
(
servo.delegate.*,servo.graph_event.*,servo.event_loop.spin), confirming Servo runtime path is persisted in JSON snapshot output.
- User-captured file:
- Headed perf validation: diagnostics tick rate vs egui responsiveness
- Result: user-confirmed on 2026-02-22 that diagnostics updates remained rate-limited (β€10 Hz behavior) and main egui FPS/responsiveness did not show degradation.
- Headed Engine visibility validation: Servo-originated activity in Diagnostics > Engine
- Result: user-confirmed on 2026-02-22 that Engine tab displays live Servo-originated channels/spans while loading and navigating real pages.
- Headed Engine-tab Servo visibility validation
- Result: user-confirmed on 2026-02-22 that Engine tab shows Servo-originated activity while loading/navigating real pages.
Source: implementation_strategy/SUBSYSTEM_ACCESSIBILITY.md (Β§6.1, Β§8 Phase 1)
Scope guard (Phase 1 only):
- Validate the WebView accessibility bridge baseline (
notify_accessibility_tree_update-> bridge injection). - Do not include Graph Reader/Map/Room mode validation in this slice.
Run these as the P10.d baseline harness set:
-
cargo test --lib shell::desktop::ui::gui::accessibility_bridge_tests::inject_webview_a11y_updates_drains_pending_map -- --exact- Result: pass (1/1); confirms pending WebView accessibility updates are consumed each frame and do not accumulate.
-
cargo test --lib shell::desktop::ui::gui::accessibility_bridge_tests::webview_a11y_graft_plan_includes_injectable_nodes_and_root -- --exact- Result: pass (1/1); confirms bridge planning yields an injectable node set + stable root in the supported baseline scenario.
-
cargo test --lib shell::desktop::ui::gui::accessibility_bridge_tests::webview_a11y_graft_plan_tracks_role_conversion_fallbacks -- --exact- Result: pass (1/1); confirms degraded conversion fallback remains explicit/observable when role compatibility falls back.
-
cargo check -q --lib- Result: pass (warnings only); confirms bridge validation/harness code compiles in library path.
This checklist is the required cross-contributor handoff script for P10.c/P10.d split work.
Preconditions
- Build/run Graphshell with normal desktop runtime (
cargo run -p graphshell --bin graphshell -- -M https://example.com). - Ensure one screen reader is active on platform under test:
- Windows: NVDA
- Linux: Orca
- macOS: VoiceOver (if available for contributor)
- Open at least one node pane with live embedded web content.
Checks
- Bridge baseline readability
- Action: move accessibility focus into the active embedded web content region.
- Expected: screen reader announces an embedded web content/document region (derived from focused or labeled nodes when present).
- Focused-label preference
- Action: navigate to a clearly labeled element (e.g., heading/link) in embedded content.
- Expected: announcement prefers focused element label over generic fallback wording.
- Frame-to-frame continuity
- Action: trigger content updates (scroll/navigation within the same pane) and continue reading.
- Expected: reading remains stable across updates; no repeated loss of document context.
- Degraded fallback remains announced
- Action: open sparse/minimal content where explicit labels may be absent.
- Expected: reader still announces a usable embedded web-content fallback label rather than silence.
Include the following in the issue/PR handoff comment:
- Platform + screen reader used.
- URLs/content types tested.
- Pass/fail per checklist item above.
- Any observed degraded behavior with concise reproduction notes.
- Automated harness command output summary (the four commands above).
Source: implementation_strategy/2026-02-24_performance_tuning_plan.md (Phase 1 + Validation)
Scope guard (P10.b only):
- Validate culling effectiveness and benchmark instrumentation.
- Do not change culling policy semantics in this slice.
-
cargo test -q viewport_culling_metrics_reduce_visible_set_and_submission_units -- --nocapture- Result: pass (1/1); confirms visible-set/submitted-set shrink relative to full graph and reduced submission work units.
-
cargo test -q viewport_culling_benchmark_reports_lower_prep_time_than_full_rebuild -- --nocapture- Result: pass (1/1); confirms benchmark instrumentation reports lower rebuild-prep time for culled graph vs full graph on dense synthetic fixture.
- Run Graphshell against the same dense graph fixture and camera framing for each comparison run.
- Keep diagnostics sampling bounded (default diagnostics cadence in the Diagnostic Inspector) to avoid observer-effect distortion.
- Capture both runs using this pair:
- Baseline: viewport culling disabled for the test window.
- Candidate: viewport culling enabled with identical fixture/camera path.
- Capture and compare:
- visible/submitted node counts from culling metrics path,
- prep/rebuild timing trend (full vs culled path),
- frame-time trend from diagnostics view over the same interaction script.
Use this exact structure in handoff comments for cross-contributor comparability:
- Fixture ID / node+edge count:
- Camera framing / zoom / pan seed:
- Interaction script duration:
- Baseline (
culling=off):- visible/submitted nodes:
- full/culled submission units:
- prep timing summary:
- frame-time summary:
- Candidate (
culling=on):- visible/submitted nodes:
- full/culled submission units:
- prep timing summary:
- frame-time summary:
- Delta summary:
- visible-set reduction direction (expected: down)
- submission-unit reduction direction (expected: down)
- frame-time impact direction (expected: down or neutral)
Evidence to capture in handoff comment
- Fixture size and viewport framing used.
- Commands + pass/fail for automated checks.
- Full vs culled timing summary and observed frame-time delta direction.
Source: implementation_strategy/2026-02-22_workbench_workspace_manifest_persistence_plan.md + implementation_strategy/2026-02-22_workbench_tab_semantics_overlay_and_promotion_plan.md
Context: Items 7, 10, and 11 now have automated coverage. The remaining routing/membership checks require headed-manual validation.
Start command (PowerShell):
$env:RUST_LOG="graphshell=debug"; cargo run -p graphshell --bin graphshell -- -M https://example.comBaseline setup (once):
-
Create at least 4 nodes (
A,B,C,D) by opening distinct URLs. -
Save frame
workspace-alphacontainingAandB. -
Save frame
workspace-betacontainingAandC. -
Save frame
workspace-singlecontaining onlyD. -
Return to a different layout so restore behavior is visible.
-
Single-membership routed open
- Action: double-click node
D. - Expected:
workspace-singlerestores directly; no synthesized fallback frame behavior. - Run note (2026-02-19): Passed.
- Action: double-click node
-
Multi-membership default recency + explicit chooser
- Action A: restore
workspace-beta, then leave it; double-click nodeA. - Expected A: default routed open restores
workspace-beta(most recent). - Action B: open node
Acontext menu/radial and chooseChoose Frame..., selectworkspace-alpha. - Expected B:
workspace-alpharestores. - Run note (2026-02-19): Passed (A and B).
- Action A: restore
-
Zero-membership open remains in current frame context
- Action: create a new node
Eand do not save any frame that contains it; openE. - Expected: opens in current frame context (tab open), no named frame is created implicitly.
- Run note (2026-02-19): Core behavior passed (no named frame auto-created). Prompt gating behavior changed twice (sticky, then suppressed). Follow-up fixes landed: routed frame-open no longer clears frame prompt state before switch handling;
SetNodePositionno longer flags unsaved state; session-autosave path no longer raises modal prompt (prompting is reserved for explicit frame-switch flows). Re-validation reported no prompt at all in expected switch case; follow-up fix pending and requires headed re-validation.
- Action: create a new node
-
Open with Neighbors synthesis cap
- Action: choose a hub node with many direct neighbors; run
Open with Neighbors. - Expected: synthesized frame contains hub + direct neighbors, capped at 12 opened tiles.
- Run note (2026-02-19): Passed (12-tile cap observed).
- Action: choose a hub node with many direct neighbors; run
-
Frame delete removes routing target immediately
- Action: delete
workspace-beta, then open nodeAvia default route and viaChoose Frame.... - Expected:
workspace-betais never selected/returned; chooser does not list it. - Run note (2026-02-19): Passed.
- Action: delete
-
Startup membership scan before first render
- Action: restart app.
- Expected: membership badges/tooltips (
N) are correct immediately on first render, without needing a manual frame switch. - Run note (2026-02-19): Passed.
-
Empty-restore fallback warning path
- Action: trigger restore of a named frame snapshot that prunes to empty after stale-key cleanup.
- Expected: app logs fallback warning and opens target node in current frame instead of failing.
- Note: this case may require a stale-layout fixture or manual store edit to force an empty post-prune restore.
- Run note (2026-02-19): Not yet reproduced.
-
Radial pair-edge parity with keyboard
G- Action: with a valid pair context, trigger pair edge creation from radial menu (
Edge -> Pair) and compare with keyboardG. - Expected: radial path creates the same
UserGroupededge as keyboard command. - Run note (2026-02-19): Reported mismatch (
Gworks, radial pair did not). Follow-up fixes landed: pair context honors explicit node context target; graph interactions are disabled while command menu is open; right-click now opens a compact node context menu (radial remains keyboard/F3 path), avoiding immediate close on right-button release. Re-validation passed via current context-menu path.
- Action: with a valid pair context, trigger pair edge creation from radial menu (
-
Node context menu hierarchy
- Action: right-click a node in graph view.
- Expected: context menu presents grouped hierarchy (
Frame,Edge,Node) via submenu-style entries, and closes onEscor action selection. - Keyboard: while open,
Left/Rightswitches group,Up/Downmoves action focus (includingClose),Enterexecutes highlighted action or closes whenCloseis focused. - Run note (2026-02-19): Passed; follow-up polish applied to rely on arrow-key navigation only and include keyboard-selectable close.
-
Add node tab to existing frame
- Action: right-click node
E->Frame->Add To Frame...and pickworkspace-alpha. - Expected:
workspace-alphasnapshot now includesEas a tab on next restore/load, andEframe badge count increments accordingly. - Run note (2026-02-19): Passed; wording/label clarity for frame-route action remains a UX follow-up.
- Unified context-aware pin control
- Action A: with graph/frame focus (no active pane focus), use Persistence Hub
Pin Frame; mutate layout and verify highlight toggles off; restore usingLoad Pin... -> Frame Pinand verify previous layout restores. - Action B: with an active pane focus, use Persistence Hub
Pin Pane; mutate layout and verify highlight toggles off; restore usingLoad Pin... -> Pane Pinand verify previous layout restores. - Expected: single pin control adapts to focus context (
FramevsPane) and renders active state only when current layout matches saved pin snapshot. - Run note (2026-02-19): Workspace pin flow passed. Pane pin restore semantics remain unclear in graph-only context, and pane pin persistence/visibility after view switching needs follow-up.
Source: implementation_strategy/2026-02-23_graph_interaction_consistency_plan.md + implementation_strategy/2026-02-20_edge_traversal_impl_plan.md
Context: Automated coverage exists for input/intent/app semantics. Headed-manual validation is required for end-to-end viewport behavior and interaction feel.
Keyboard model update:
-
Ckeyboard fit shortcut is retired. -
Zis the single smart-fit key:-
2+selected nodes: fit selected bounds. -
0or1selected node: fit full graph.
-
-
Keyboard zoom in/out/reset
- Action: in graph view (no text field focused), press
+a few times,-a few times, then0. - Expected: zoom increases, decreases, then resets to
1.0xin overlay.
- Action: in graph view (no text field focused), press
-
Keyboard zoom blocked during text entry
- Action: focus URL bar (
Ctrl+L), then press+,-,0,Z. - Expected: graph zoom/fit does not trigger while text input owns keyboard focus.
- Action: focus URL bar (
-
Smart-fit: zero selection
- Action: clear selection and press
Z. - Expected: full-graph fit.
- Action: clear selection and press
-
Smart-fit: single selection
- Action: select exactly one node and press
Z. - Expected: full-graph fit (not single-node framing).
- Action: select exactly one node and press
-
Smart-fit: multi-selection
- Action: select two or more distant nodes and press
Z. - Expected: viewport fits selected-node AABB with padding.
- Action: select two or more distant nodes and press
-
Wheel/trackpad zoom feel
- Action: use Ctrl+wheel or Ctrl+trackpad pinch/scroll.
- Expected: zoom responsiveness feels slightly faster than previous
0.01setting and remains controllable.
-
Zoom bounds
- Action: repeatedly zoom out, then repeatedly zoom in.
- Expected: zoom clamps at configured bounds; no runaway scale.
-
Interaction regression sweep
- Action: after zoom/fit operations, drag/select nodes and open nodes from graph.
- Expected: graph interactions remain stable (no stuck input state).
Source: Active UX polish + follow-up implementation sessions
Context: These checks were identified during rapid implementation and need explicit headed verification before closing the UX-polish tranche.
-
Wheel vs trackpad tuning target
- Action: validate
Ctrl+wheeland precision trackpad pinch/scroll on at least one wheel mouse and one trackpad. - Expected: zoom feels controllable on both devices with no over/under-shoot bias.
- Action: validate
-
Tab-click warm -> active routing
- Action: from workspace tabs, click a warm node tab and then navigate.
- Expected: node reliably promotes to active and page loads in the expected pane context.
-
Lasso reliability under dense node overlap
- Action: generate a dense graph cluster and perform repeated right-drag lasso sweeps.
- Expected: lasso selection set is deterministic and matches visual rectangle containment.
-
Omnibar non-
@ordering policy- Action: test non-
@query flow for: local tabs present, connected matches present, provider fallback, and global fallback. - Expected: ordering follows configured settings (scope + non-
@order preset) and non-local caps are respected.
- Action: test non-
-
Omnibar provider failure-state visibility
- Action: disable network or force provider error responses while typing non-
@and@g/@b/@dqueries. - Expected: dropdown surfaces clear provider status (loading / error) without blocking input or crashing.
- Action: disable network or force provider error responses while typing non-
Source: cargo check -p graphshell --lib warning sweep + code scan (2026-02-20)
Context: Keep these visible so transitional code paths do not silently rot. Prioritize items that indicate dormant feature producers/consumers and integration seams that may need revival.
-
Test-only helpers leaking into non-test builds
- Files:
ports/graphshell/desktop/gui.rs,ports/graphshell/desktop/persistence_ops.rs - Signal: unused import/function warnings for test wrappers and helpers.
- Validation expectation: move wrappers/imports behind
#[cfg(test)]and confirm no behavior change.
- Files:
-
Dormant intent producers vs handled intent variants
- File:
ports/graphshell/app.rs - Variants:
CreateUserGroupedEdgeFromPrimarySelection,WebViewScrollChanged,SetNodeFormDraft - Signal: variants are handled but currently not constructed in lib build.
- Validation expectation: either wire active producers in runtime flows or explicitly mark as deferred bridge paths in plan/docs.
- File:
-
Potentially stale GUI bridge helpers
- File:
ports/graphshell/desktop/gui.rs - Symbols:
active_tile_webview_id, semantic-event conversion helper wrappers. - Signal: dead-code warnings suggest old integration seams.
- Validation expectation: remove if obsolete, or reattach to intended flow and verify usage under headed run.
- File:
-
Scaffold marker still present in tile path
- File:
ports/graphshell/desktop/tile_kind.rs - Signal: explicit scaffold/dead-code allowance comment.
- Validation expectation: confirm this is still intentional in current phase, or retire the allowance when tile flow is fully active.
- File:
-
Graph model dead fields/methods that may reflect deferred features
- Files:
ports/graphshell/graph/mod.rs,ports/graphshell/graph/egui_adapter.rs - Symbols:
Node.velocity,get_node_by_id,has_edge_between,EguiGraphState::from_graph. - Signal: dead-code warnings; may indicate partially implemented or superseded paths.
- Validation expectation: decide revive/remove/annotate; if revive, add explicit validation scenarios tied to the owning plan.
- Files:
-
Recovery TODO/FIXME that can impact UX and correctness
- Files:
ports/graphshell/desktop/dialog.rs,ports/graphshell/desktop/headed_window.rs,ports/graphshell/panic_hook.rs,ports/graphshell/lib.rs,ports/graphshell/prefs.rs - Signal: active TODO/FIXME markers in runtime code.
- Validation expectation: each marker either receives a tracked owner/plan link or a scoped deferral note with revalidation trigger.
- Files:
Method: Use Diagnostic Inspector > Intents Tab to observe WebView* intents in real-time.
Automation Status: Ready.
Strategy: Capture DiagnosticEvent stream during navigation. Assert url_changed vs history_changed ordering and payload content programmatically.
Finding: During back/forward navigation, Servo fires url_changed before history_changed
reflects the new stack position β and the URL in url_changed is the source URL, not the
destination. Trace evidence (all events at t_ms=131, same frame):
seq=9: url_changed β ?step=2 (pushState arrives)
seq=10: history_changed len=3 current=2 (stack confirmed at step=2)
seq=11: url_changed β ?step=2 β spurious? fires before back completes
seq=12: history_changed len=3 current=1 β back navigation; now at step=1, but url above said step=2
seq=13: url_changed β ?step=2 β fires before forward completes
seq=14: history_changed len=3 current=2 β forward; back at step=2
Going back from ?step=2 to ?step=1: url_changed(?step=2) fires first, then
history_changed(current=1). sync_webviews_to_graph() reads url_changed to detect new
navigations and create nodes/edges β it will see ?step=2 as an apparent navigation event even
though the back transition is to ?step=1.
Risk: spurious node creation or misclassified edge type (Hyperlink instead of History)
for back/forward transitions that emit url_changed for a URL that already exists.
- No spurious node on back navigation: browse to A β B β C, go back to B β confirm only
nodes A, B, C exist (no duplicate C created during the back transition's
url_changed). - Back/forward edge type: back/forward transitions are classified as
Historyedges, notHyperlinkedges, even whenurl_changedfires with the source URL beforehistory_changed. - Burst scenario β node count invariant: run
scenario_back_forward_burst.html(pushState Γ3, back Γ1, forward Γ1) β confirm final node count is 4 (base + step=0,1,2), not 5 or 6 from spurious url_changed events. - Burst scenario β no duplicate edges: same run β confirm no duplicate
HistoryorHyperlinkedges exist between the burst nodes after the sequence completes.
Source: archive_docs/checkpoint_2026-02-19/2026-02-17_f1_multi_pane_validation_checklist.md
Context: F1 automated tests pass; headed split/grouping trigger validation not yet recorded.
Validation Aid: Use Diagnostic Inspector > Compositor Tab to verify tile hierarchy structure.
Automation Status: Ready.
Strategy: Inspect DiagnosticsState.compositor_state.frames after action. Assert hierarchy contains expected split/tab structure.
-
Split trigger creates edge
- Select node A, then
Shift+Double-clicknode B. - Expected: exactly one
UserGroupededge AβB is created.
- Select node A, then
-
Drag-into-same-tab-group trigger
- Open nodes A and B in separate detail panes.
- Drag one pane into the same
Tabscontainer as the other. - Expected: exactly one
UserGroupededge created (no duplicates).
-
Explicit group-with-focused action
- Focus node A (pane A active).
- Run the "Group with Focused" command on node B (command palette or
Gkey). - Expected: exactly one
UserGroupededge AβB created.
-
No-trigger paths
- Switch pane focus, switch tabs, navigate URLs normally.
- Expected: no new
UserGroupededges created from those actions alone.
Source: 2026-02-18_edge_operations_and_cmd_palette_plan.md (active plan)
Context: Scope implementation complete; headed validation not yet executed.
Automation Status: Ready.
Strategy: Inspect DiagnosticsState.intents to verify OpenNodeFrameRouted vs CreateNodeAtUrl intent emission with correct scope.
- Graph mode, Enter cycling: type
@term, press Enter repeatedly β active match cycles through all results and wraps at end. - Detail mode, multi-pane: type
@term, press Enter β each press focuses/opens the matched node in the correct pane/tab context. -
@t termin detail mode: only currently open tab/pane-backed nodes are cycled. -
@T termin detail mode: active and saved frame tab matches cycled deterministically. -
@n termin graph mode: only active-graph-context matches are cycled. -
@N termin graph mode: active + saved graph matches cycled deterministically. -
@e termin graph mode: only active-graph searchable edge matches cycled and selected. -
@E termin graph mode: active + saved-graph searchable edge matches cycled deterministically. - Clear query after cycling: counter/session reset, normal URL submit behavior resumes.
- Mode switch during active query: graph β detail β no panic, no stale focus target, no incorrect pane navigation.
Source: archive_docs/checkpoint_2026-02-19/2026-02-18_f6_explicit_targeting_plan.md
Context: Exit criteria met (two focused unit tests pass). Optional extended harness tests not yet written.
- End-to-end wrapper dispatch: call
load_uri_for_webview,go_back_for_webview, and at least two input_for_webviewmethods from a simulated host-caller harness. Verify: correct webview receives each command, no fallback warning logged, other active webview unaffected. - Multi-webview isolation: with two active webviews, route input events to each via
_for_webview. Confirm no cross-contamination between webviews. - Fallback warning rate-limit: configure
preferred_input_webview_idto returnNone. Confirm deprecation warning emitted once and rate-limited on repeated calls within the cooldown window.
Source: Spec in 2026-02-18_edge_operations_and_cmd_palette_plan.md Β§Global Undo/Redo Boundary
Context: Undo/redo is functional but has no dedicated plan doc, no feature entry in IMPLEMENTATION_ROADMAP, and no recorded validation against the grouping/exclusion rules in the spec. A feature entry and these tests should be added when the roadmap is next updated.
- Single graph mutation: add a node β undo β node is removed; redo β node reappears.
- Multi-intent command as one undo step: run
ConnectBothDirections(creates two edges) β undo β both edges removed in one step (not two separate undos). - Workspace/graph restore is atomic: load a named workspace snapshot β undo β prior workspace layout is restored as one transaction.
- Webview navigation is not undoable: navigate a webview to a new URL via link click β undo β URL change is NOT reverted (only explicit user commands are undoable).
- Physics simulation is not undoable: run physics until nodes settle β undo β node positions are NOT reverted; only explicit layout commands (e.g.
Fit) are undoable. - Undo survives mode switch: perform an action in graph mode β switch to detail mode β undo β action is reversed (undo stack persists across view modes).
- Persistence parity after undo/redo cycle: create
UserGroupededge β undo removes it β redo re-adds it β restart app β edge state reflects the final post-redo state in the persisted log.
Source: archive_docs/checkpoint_2026-02-16/2026-02-15_navigation_control_plane_plan.md (Β§Required Validation Artifacts)
Context: Archived plan called for deterministic integration validation around targeted navigation. These checks remain relevant for current omnibar + tile-focused routing behavior.
Validation Aid: Use Diagnostic Inspector > Intents Tab to confirm OpenNodeFrameRouted vs CreateNodeAtUrl intent emission.
Automation Status: Ready.
Strategy: Assert DiagnosticsState.intents contains expected intent variants with correct target keys.
- Omnibar URL submit in graph mode: submit a URL from graph mode and verify deterministic target behavior (new/opened node, correct selection/focus outcome).
- Omnibar URL submit in detail mode: with two panes, focus pane A then pane B and submit distinct URLs; verify each submit targets the focused pane only.
-
@queryno-match feedback: enter a query with no matches and verify UI feedback is explicit, stable, and does not trigger unintended navigation. - Back/Forward/Reload target isolation: with distinct histories in two panes, invoke controls and verify only the focused pane is affected.
- Mode and focus regression pass: switch graph/detail modes, switch active tiles/tabs, and repeat submit/navigation actions; verify no stale target leakage.
Source: archive_docs/checkpoint_2026-02-16/2026-02-12_physics_selection_plan.md (Β§Validation Tests)
Context: These behaviors still exist and are user-visible; archived validation list remains useful as manual/integration guardrails.
- Physics toggle responsiveness:
Tkey reliably pauses/resumes layout updates with no stuck intermediate state. - Physics panel parameter effect: changing force parameters in panel produces observable layout response without instability/panic.
- Pinned-node invariance under motion: pinned nodes remain fixed while surrounding unpinned nodes continue to move.
- Selection durability through interactions: selection state remains coherent across drag, focus changes, and node delete operations.
- Mid-size convergence sanity: graph with ~50 nodes reaches a visually stable layout in a reasonable interval on baseline hardware.