2026 02 23_graph_interaction_consistency_plan - mark-ik/graphshell GitHub Wiki

Graph Interaction Consistency Plan (2026-02-23)

Status: Closed / Archived 2026-04-01 (historical execution record only)

Supersedes: Prior ad-hoc zoom/scroll patches in render/mod.rs; absorbed remaining items from 2026-02-19_graph_ux_polish_plan.md §1.4 (scroll zoom speed) and the "smart fit" + "no-ctrl scroll" feature targets.

Canonical authority now lives in:

  • ../aspect_input/input_interaction_spec.md: input routing, hover-vs-focus guardrails, and Escape semantics
  • graph_node_edge_interaction_spec.md: graph-surface camera, selection, and multi-select semantics
  • layout_behaviors_and_physics_spec.md and 2026-02-24_layout_behaviors_plan.md: viewport gravity and physics follow-on work
  • ../aspect_command/command_surface_interaction_spec.md and ../aspect_control/2026-02-24_control_ui_ux_plan.md: command-surface and secondary-input-surface routing
  • ../workbench/workbench_frame_tile_interaction_spec.md and ../../TERMINOLOGY.md: split/container/tab-group semantics

Closure Summary

  • Wheel Zoom, durable camera-command routing, and startup camera-fit behavior landed through the current graph camera path.
  • Shared input and dismissal policy moved into the canonical Input, Focus, and Command specs.
  • Split/container labels and workbench terminology moved into the workbench/tile and terminology authorities.
  • The only still-live follow-on from this plan, viewport-relative gravity behavior, moved into the layout/physics docs.
  • The historical plan text is retained below as an implementation record only; the unchecked tasks are not active authority.

Problem Statement

Three categories of UX inconsistency:

  1. Graph navigation is unreliable. Scroll-to-zoom without Ctrl doesn't work. Camera Fit / Focus Selection (Z/C keys) doesn't fire. Startup zoom has no visible effect. Multiple iterations have failed because the root cause (input ownership and event routing) was never addressed; patches targeted render-time helpers that execute too late or against stale state.

  2. Tile tree operations are semantically under-explained. "Horizontal" and "Vertical" appear in tab strips because they are real Container::Linear nodes in the tile tree. They can be useful (they expose split structure), but today they lack context and naming guidance, so users interpret them as bugs. The relationship between Graph/WebView panes, container nodes, and Workbench structure is still opaque.

  3. Nodes drift off-screen. A single node with no edges has no mutual-stabilizing forces. The center gravity locus is fixed at graph-space origin (0,0), not at the viewport center. After panning, gravity pulls nodes away from where the user is looking, and a lone node floats off the visible area.


Interaction Model Defaults (Revised)

Use conventional defaults first; keep alternatives as configuration.

This document no longer treats a single focus-routing rule as a hard invariant. The default interaction model should prioritize what users generally expect from pane-based interfaces, while exposing alternative routing policies as configuration options.

Canonical override note:

  • Semantic focus ownership is governed by ../subsystem_focus/focus_and_region_navigation_spec.md.
  • Where this plan previously implied hover-driven semantic retargeting, that behavior is superseded.
  • Hover is pointer-targeting input only; semantic keyboard/camera command targeting changes only via explicit activation and router-owned handoff.

Default policy targets:

  • Hovering a pane makes it the active pointer/scroll target. No click required.
  • Scrolling routes to the currently hovered pane: graph panes zoom, webview panes scroll page content. No modifier key required by default.
  • Keyboard and camera/navigation commands target the semantic focus owner chosen by the focus router, not merely the hovered region.
  • Pointer hover alone must not retarget semantic keyboard/camera ownership; explicit pointer activation (for example click/tap) or explicit region-navigation commands are required.
  • scroll_zoom_requires_ctrl remains an explicit opt-out for users who prefer Ctrl-to-zoom conventions.
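
The split between pointer-targeted and focus-targeted input in the defaults above can be pictured as a small routing function. This is a minimal sketch, not the router the canonical specs define: `PaneId`, `InputKind`, and `route_target` are hypothetical names introduced only for illustration.

```rust
/// Hypothetical pane identifier; the real app would use its own tile/pane ID type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PaneId(pub u32);

/// The two input classes the default policy distinguishes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum InputKind {
    /// Wheel / trackpad scroll: follows the pointer.
    Scroll,
    /// Keyboard and camera/navigation commands: follow semantic focus.
    Command,
}

/// Returns the pane that should receive an input under the default policy.
/// Hover never changes the command target; only the focus owner does.
pub fn route_target(
    kind: InputKind,
    hovered: Option<PaneId>,
    focus_owner: Option<PaneId>,
) -> Option<PaneId> {
    match kind {
        InputKind::Scroll => hovered,
        InputKind::Command => focus_owner,
    }
}
```

With the pointer over pane 2 while pane 1 holds semantic focus, scroll goes to pane 2 and a Camera Fit keypress still goes to pane 1.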

Design rule:

  • Prefer mainstream, predictable defaults first.
  • Preserve configuration hooks so alternate focus/input-routing behaviors can be offered later without re-architecting the input path.

Focus-routing guardrail:

  • Any optional alternative input-routing policy must remain compliant with the semantic-owner invariants in focus_and_region_navigation_spec.md.

Terminology Corrections

Per the TERMINOLOGY.md living document, the following renames apply in code comments, logs, and UI strings:

| Old term | Canonical term | Notes |
|---|---|---|
| "fit to screen" | Camera Fit | Fits viewport to node bounds. Avoids confusion with display/fullscreen. |
| "zoom to selected" | Focus Selection | Fits viewport to selected-node bounds. |
| "scroll zoom" | Wheel Zoom | Covers mouse wheel, trackpad scroll, and smooth-scroll. |
| "Horizontal" / "Vertical" (tile containers) | Split | User-facing label for Container::Linear. Internal code may keep Linear. |
| "graph_surface_focused" | Graph Pane Focused | Aligns with Pane terminology. |

Action: Update TERMINOLOGY.md with Camera Fit, Focus Selection, Wheel Zoom, Split.


Root Cause Analysis

Why scroll-to-zoom without Ctrl doesn't work

egui_graphs SettingsNavigation::with_zoom_and_pan_enabled(true) registers an InputState callback that consumes scroll events when Ctrl is held. When we set with_zoom_and_pan_enabled(false), egui_graphs stops consuming scroll events, but egui's ScrollArea or parent Ui widgets may still interpret them as scroll/pan. Our post-render handle_custom_navigation reads smooth_scroll_delta / raw_scroll_delta, but by the time it runs, the scroll events may have been consumed by egui's own scroll handling earlier in the frame.

Fix: Intercept scroll events before GraphView renders by injecting a ui.input_mut() call that converts scroll deltas into zoom state, or by using ui.interact() with a Sense::hover() on the graph rect to claim the input.

Why Camera Fit doesn't fire

The custom apply_pending_fit_to_screen_request reads app.fit_to_screen_requested, but by then the flag has already been consumed by take_pending_fit_to_screen_request(), which is called in an earlier code path. Additionally, the flag must survive until the MetadataFrame is available in egui's persisted data; on the first frame after graph init, it may not exist yet.

Fix: Use a two-phase approach: set a durable flag that persists across frames until successfully applied, and only clear it after confirming the MetadataFrame write succeeded.

Why startup zoom has no effect

pending_initial_zoom is set in the constructor, but apply_pending_initial_zoom fires before the MetadataFrame is populated by egui_graphs on its first layout pass. The zoom is attempted, finds no MetadataFrame, does nothing, and the flag is never retried.

Fix: Same durable-flag pattern. Additionally, startup should trigger Camera Fit instead of a fixed zoom value, since a fixed zoom can't account for the number or spread of nodes.

Why nodes drift off-screen

The FR center gravity force (state.extras.0.params.c = 0.18) pulls toward graph-space origin (0, 0). After the user pans, the viewport center diverges from (0, 0). Nodes with weak or no edge forces get pulled toward graph-space origin, which is now off-screen.

Fix: Update the gravity locus to track the viewport center in graph space. This means the gravity target shifts as the user pans, keeping nodes attracted toward what the user is actually looking at.

Why tile operations feel confusing

egui_tiles exposes Container::Linear(LinearLayout { dir: LinearDir::Horizontal | Vertical }) as a visible tab title when the container appears in the tab strip. This is architecturally correct: container tiles are first-class nodes that can appear anywhere tabs can appear. In Graphshell, all_panes_must_have_tabs: true intentionally wraps panes in Tabs, so split/merge flows often surface container nodes. The current rendering path falls through to format!("{:?}", container.kind()) without structural cues, so valid architecture is presented with ambiguous UX.

Fix: Keep container visibility, but make it explicit and teachable. Override tab_title_for_tile (not just tab_title_for_pane) to render semantic labels and optionally directional glyphs (e.g., Split ↔, Split ↕, Tabs, Grid) plus lightweight affordances that explain what selecting that container means.


Implementation Phases

Phase 1: Input Ownership (fixes scroll-to-zoom)

Goal: Scroll wheel over graph pane = zoom. No Ctrl required. Configurable.

Approach: Pre-render input interception.

  1. Before GraphView::new() renders, call ui.input_mut(|i| ...) to:
    • Read smooth_scroll_delta.y and raw_scroll_delta.y.
    • If the graph rect is hovered (check via stored response or ui.rect_contains_pointer), zero out the scroll deltas so egui/egui_graphs won't interpret them as scroll.
    • Store the consumed scroll delta in an app-owned field (app.pending_wheel_zoom_delta).
  2. In handle_custom_navigation (post-render), read app.pending_wheel_zoom_delta and apply the zoom transform to MetadataFrame.
  3. The scroll_zoom_requires_ctrl setting gates step 1: if true, only consume scroll when Ctrl is held.

Why this works: By zeroing the scroll delta before the GraphView widget runs, no other widget can consume it. The zoom application happens post-render against the now-populated MetadataFrame.
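
The claim decision in steps 1 and 3 can be isolated as a pure function, which keeps the gating logic testable without an egui frame. This is a sketch under assumptions: `WheelClaim` and `claim_wheel_delta` are hypothetical names, and in the real path the result would presumably be applied inside the `ui.input_mut(|i| ...)` closure by zeroing the scroll delta fields and storing the claimed value in app.pending_wheel_zoom_delta.

```rust
/// Outcome of the pre-render interception pass for one frame.
#[derive(Debug, PartialEq)]
pub struct WheelClaim {
    /// Delta stored for the post-render zoom step (pending_wheel_zoom_delta).
    pub pending_zoom_delta: f32,
    /// Delta left in the input state for other widgets to consume.
    pub remaining_scroll_delta: f32,
}

/// Decide whether the graph pane claims this frame's vertical scroll delta.
pub fn claim_wheel_delta(
    scroll_delta_y: f32,
    graph_hovered: bool,
    ctrl_held: bool,
    scroll_zoom_requires_ctrl: bool,
) -> WheelClaim {
    // With the opt-out setting enabled, only Ctrl+scroll counts as zoom.
    let modifier_ok = !scroll_zoom_requires_ctrl || ctrl_held;
    if graph_hovered && modifier_ok && scroll_delta_y != 0.0 {
        // Claim: zero the delta so no later widget sees it.
        WheelClaim { pending_zoom_delta: scroll_delta_y, remaining_scroll_delta: 0.0 }
    } else {
        // Pass through untouched (e.g. the pointer is over a webview pane).
        WheelClaim { pending_zoom_delta: 0.0, remaining_scroll_delta: scroll_delta_y }
    }
}
```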

Zoom pivot transform (pointer-relative zoom, exact form):

fn apply_zoom_around_point(meta: &mut MetadataFrame, pivot_screen: Pos2, factor: f32) {
    let new_zoom = (meta.zoom * factor).clamp(ZOOM_MIN, ZOOM_MAX);
    // Keep the graph-space point under the pointer stationary:
    // pivot_screen = pan + pivot_graph * zoom  (before and after)
    // => new_pan = pivot_screen - pivot_graph * new_zoom
    //            = pivot_screen - (pivot_screen - pan) / zoom * new_zoom
    meta.pan = pivot_screen - (pivot_screen - meta.pan) * (new_zoom / meta.zoom);
    meta.zoom = new_zoom;
}

This function should be extracted as a named, unit-testable helper so the coordinate-space arithmetic can be verified in isolation without a full harness frame.
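
A unit test of that kind might check the one property that matters: the graph-space point under the pointer is the same before and after the zoom. The sketch below uses stand-in types rather than egui's, so it compiles on its own; `V2`, `Camera`, `screen_to_graph`, and the ZOOM_MIN / ZOOM_MAX values are assumptions, not the project's actual definitions.

```rust
/// Stand-in 2D vector (the real code would use egui's Pos2 / Vec2).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct V2 {
    pub x: f32,
    pub y: f32,
}

/// Stand-in for the pan/zoom fields of MetadataFrame.
pub struct Camera {
    pub pan: V2,
    pub zoom: f32,
}

// Assumed clamp limits; the plan does not state the real values.
pub const ZOOM_MIN: f32 = 0.1;
pub const ZOOM_MAX: f32 = 10.0;

/// Inverse of `screen = pan + graph * zoom`.
pub fn screen_to_graph(cam: &Camera, screen: V2) -> V2 {
    V2 {
        x: (screen.x - cam.pan.x) / cam.zoom,
        y: (screen.y - cam.pan.y) / cam.zoom,
    }
}

/// Same arithmetic as the helper above, written per component.
pub fn apply_zoom_around_point(cam: &mut Camera, pivot_screen: V2, factor: f32) {
    let new_zoom = (cam.zoom * factor).clamp(ZOOM_MIN, ZOOM_MAX);
    let ratio = new_zoom / cam.zoom;
    cam.pan = V2 {
        x: pivot_screen.x - (pivot_screen.x - cam.pan.x) * ratio,
        y: pivot_screen.y - (pivot_screen.y - cam.pan.y) * ratio,
    };
    cam.zoom = new_zoom;
}
```

Because the pan update uses the clamped zoom, the pivot stays fixed even when the requested factor would exceed the zoom limits.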

Note (lasso drag shares the same input race): Mouse drag on the graph pane for lasso selection must also be claimed before egui's own drag-panning consumes the pointer event. The pre-render ui.input_mut() infrastructure built here should be designed generically enough to support claiming drag events for lasso in the same interception pass. See Phase 1 tasks below.

Files: render/mod.rs

Tasks:

  • Add pending_wheel_zoom_delta: f32 field to GraphBrowserApp.
  • In render_graph_in_ui_collect_actions, before GraphView render: ui.input_mut() to intercept and zero scroll deltas when graph is hovered.
  • Implement apply_zoom_around_point(meta, pivot_screen, factor) as a named helper with unit tests.
  • In handle_custom_navigation, consume pending_wheel_zoom_delta and call apply_zoom_around_point with current pointer position as pivot.
  • Design the pre-render interception block to also support claiming drag/pointer events (needed for lasso โ€” same race, same fix location).
  • Remove old apply_scroll_zoom_without_ctrl function entirely.
  • Verify scroll_zoom_requires_ctrl setting is respected.

Phase 2: Durable Camera Commands (fixes fit + startup zoom)

Goal: Camera Fit (Z/C), Focus Selection, and startup zoom always succeed.

Approach: Replace one-shot booleans with durable command enums that retry until the MetadataFrame is ready.

  1. Replace fit_to_screen_requested: bool and pending_initial_zoom: Option<f32> with a single pending_camera_command: Option<CameraCommand> enum:

    enum CameraCommand {
        Fit,              // Fit all nodes with relax factor (also used on startup)
        FitSelection,     // Fit selected nodes (tighter)
        SetZoom(f32),     // Absolute zoom
    }

    No StartupFit variant: startup uses Fit directly. The durable retry-until-MetadataFrame-ready pattern handles the first-frame timing race for both startup and keypress paths identically. A separate variant would add complexity without enabling any different behavior.

  2. handle_custom_navigation attempts to apply the pending command. If the MetadataFrame doesn't exist yet, it leaves the command in place for the next frame.

  3. On successful application, clear the command.

  4. Startup: set pending_camera_command = Some(CameraCommand::Fit) in the constructor. Remove DEFAULT_STARTUP_ZOOM constant and pending_initial_zoom field.

  5. Z key: if 2+ selected, CameraCommand::FitSelection; else CameraCommand::Fit.

  6. C key: always CameraCommand::Fit.
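
The retry-until-ready behaviour in steps 2 and 3, together with the Z-key rule in step 5, can be sketched with stand-in types. `Metadata` stands in for egui_graphs' MetadataFrame, and `App`, `apply_pending_camera_command`, and `command_for_z_key` are hypothetical names; the fit arms are placeholders rather than a real bounds computation.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum CameraCommand {
    Fit,
    FitSelection,
    SetZoom(f32),
}

/// Stand-in for egui_graphs' MetadataFrame.
#[derive(Debug, Default)]
pub struct Metadata {
    pub zoom: f32,
}

#[derive(Default)]
pub struct App {
    pub pending_camera_command: Option<CameraCommand>,
    /// None until the graph widget has completed its first layout pass.
    pub metadata: Option<Metadata>,
}

impl App {
    pub fn request_camera_command(&mut self, cmd: CameraCommand) {
        self.pending_camera_command = Some(cmd);
    }

    /// Called once per frame. Returns true only when a command was applied.
    pub fn apply_pending_camera_command(&mut self) -> bool {
        let Some(cmd) = self.pending_camera_command else {
            return false;
        };
        // Metadata missing: leave the command in place so the next frame retries.
        let Some(meta) = self.metadata.as_mut() else {
            return false;
        };
        match cmd {
            CameraCommand::SetZoom(z) => meta.zoom = z,
            // Placeholder: a real fit would derive zoom and pan from node bounds.
            CameraCommand::Fit | CameraCommand::FitSelection => meta.zoom = 1.0,
        }
        // Clear only after the write succeeded.
        self.pending_camera_command = None;
        true
    }
}

/// Z key: two or more selected nodes fit the selection, otherwise fit everything.
pub fn command_for_z_key(selected_count: usize) -> CameraCommand {
    if selected_count >= 2 {
        CameraCommand::FitSelection
    } else {
        CameraCommand::Fit
    }
}
```

The startup and keypress paths differ only in who calls `request_camera_command`; the per-frame apply step is shared.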

Tuning constants (named, top of render/mod.rs):

  • CAMERA_FIT_PADDING: f32 = 1.1 (bounding-box padding multiplier).
  • CAMERA_FIT_RELAX: f32 = 0.5 (zoom-back factor; 0.5 = 50% as tight as mathematical fit).
  • CAMERA_FOCUS_SELECTION_PADDING: f32 = 1.2 (tighter padding for selection fit).

Files: app.rs, render/mod.rs, input/mod.rs

Tasks:

  • Define CameraCommand enum in app.rs.
  • Replace fit_to_screen_requested, pending_initial_zoom, pending_zoom_to_selected_request with pending_camera_command: Option<CameraCommand>.
  • Update request_fit_to_screen() → request_camera_command(CameraCommand::Fit).
  • Update Z/C key handlers in input/mod.rs to emit the correct CameraCommand.
  • In handle_custom_navigation: single dispatch site for CameraCommand with retry-on-missing-metadata.
  • Remove apply_pending_initial_zoom, apply_pending_fit_to_screen_request, apply_pending_zoom_to_selected_request (consolidated into one function).
  • Constructor: initialize with CameraCommand::Fit.

Phase 3: Viewport-Tracking Gravity (fixes node drift)

Goal: Center gravity pulls nodes toward the viewport center, not graph-space origin.

Approach: Each frame, compute the viewport center in graph space from the current MetadataFrame (pan + zoom), and pass it to the physics simulation as the gravity target.

  1. After GraphView renders and MetadataFrame is populated, compute:
    viewport_center_graph = (viewport_center_screen - meta.pan) / meta.zoom
    
  2. Write this to the FR state's gravity target: state.extras.0.params.target = viewport_center_graph.
  3. This requires extending FruchtermanReingoldWithCenterGravityState (or its params) to accept a target point instead of defaulting to (0, 0). If the upstream crate doesn't expose this, apply the gravity force manually in apply_semantic_clustering_forces or a new apply_viewport_gravity helper.

Fallback (if upstream doesn't support target point): After the FR layout pass, apply a manual force toward the viewport center to all nodes. This is less elegant but achieves the same result.
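
A minimal sketch of that fallback, assuming node positions are available as plain coordinate pairs after the layout pass. `viewport_center_graph` mirrors the formula in step 1; `apply_viewport_gravity` is the helper name the plan proposes, but its signature and the `strength` parameter here are assumptions.

```rust
/// Convert the screen-space viewport center into graph space:
/// graph = (screen - pan) / zoom.
pub fn viewport_center_graph(
    center_screen: (f32, f32),
    pan: (f32, f32),
    zoom: f32,
) -> (f32, f32) {
    (
        (center_screen.0 - pan.0) / zoom,
        (center_screen.1 - pan.1) / zoom,
    )
}

/// Nudge every node a fraction `strength` of the way toward `target`.
/// Intended to run once per frame, after the FR layout pass.
pub fn apply_viewport_gravity(positions: &mut [(f32, f32)], target: (f32, f32), strength: f32) {
    for pos in positions.iter_mut() {
        pos.0 += (target.0 - pos.0) * strength;
        pos.1 += (target.1 - pos.1) * strength;
    }
}
```

Because the pull is proportional to distance, a lone node far from the viewport moves fastest and slows as it approaches the target.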

Gravity locus dampening: Snapping the gravity target to the exact viewport center every frame causes nodes to aggressively chase rapid panning gestures. Use an exponential lerp instead:

self.gravity_target = self.gravity_target.lerp(viewport_center_graph, 0.05);

A factor of 0.05 per frame (at 60fps) closes about 95% of the gap within the first second (0.95^60 ≈ 0.046) and is effectively settled within about 3 seconds, so nodes drift gently toward where the user is looking rather than lurching. Expose this as a named constant GRAVITY_TARGET_LERP_FACTOR: f32 = 0.05 so it can be tuned alongside the other physics parameters.
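
The settling behaviour can be checked numerically. The sketch below applies the per-frame lerp to a plain coordinate pair and computes how much of the original gap remains after a given number of frames; `lerp_target` and `remaining_gap` are illustrative helpers, not part of the plan. Since the factor is applied per frame, the settling time scales with frame rate.

```rust
pub const GRAVITY_TARGET_LERP_FACTOR: f32 = 0.05;

/// One frame of exponential smoothing toward `goal`.
pub fn lerp_target(current: (f32, f32), goal: (f32, f32), t: f32) -> (f32, f32) {
    (
        current.0 + (goal.0 - current.0) * t,
        current.1 + (goal.1 - current.1) * t,
    )
}

/// Fraction of the original gap left after `frames` updates with factor `t`.
pub fn remaining_gap(t: f32, frames: u32) -> f32 {
    (1.0 - t).powi(frames as i32)
}
```

At 60 frames the remaining gap is roughly 4.6%, and at 180 frames it is on the order of 0.01%.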

Files: render/mod.rs, possibly egui_graphs fork

Tasks:

  • Add gravity_target: Pos2 field to the app or physics state (initialized to (0, 0)).
  • Check if FruchtermanReingoldWithCenterGravity params support a configurable gravity target.
  • If yes: lerp gravity_target toward viewport center each frame, then set as target.
  • If no: add apply_viewport_gravity helper that applies a small force toward gravity_target after layout.
  • Define GRAVITY_TARGET_LERP_FACTOR: f32 = 0.05 constant alongside other physics tuning constants.
  • Verify single nodes stay on-screen after panning.
  • Verify nodes don't lurch during fast pan gestures (feel drifts, not snaps).

Phase 4: Tile Tree Semantics & Discoverability (fixes user confusion)

Goal: Preserve true architecture in the UI while making it predictable and understandable. Users should understand why container tabs exist and what actions they represent.

Approach:

  1. Container semantic labels: In tile_behavior.rs, container fallback currently does format!("{:?}", container.kind()). Replace this with explicit container labels in tab_title_for_tile:

    • ContainerKind::Horizontal → Split ↔
    • ContainerKind::Vertical → Split ↕
    • ContainerKind::Tabs → Tab Group
    • ContainerKind::Grid → Grid

    Keep names short and stable for persistence, screenshots, and user guidance.
  2. Clarify architecture in-product: Add a concise tooltip/help text for container tabs:

    • "Split tabs represent layout groups, not content panes."
    • Include one sentence on how to collapse them (close tabs on one side, or merge by drag).
  3. Simplification invariants: Keep all_panes_must_have_tabs: true (required for local tab strips), and explicitly verify simplification behavior:

    • single-child Linear collapses,
    • same-direction nested linears join,
    • cross-direction nesting is preserved,
    • lone pane tabs remain wrapped.
  4. Split UX contract: When a user drags a tab outside the strip (current "detach to split" behavior), the resulting split should:

    • Show the graph pane on one side and the detached webview on the other.
    • Pane tabs show pane titles; container tabs show semantic split/group labels.
    • If the split is later collapsed (all tabs closed on one side), the layout should simplify back to a single pane.
  5. Container tab activation: When a user clicks a Split ↔ / Split ↕ / Tab Group container tab (promoting the container to the active tab in its parent strip), the intended behavior is no-op / passive focus: the container becomes the active tab in the strip but does not navigate, collapse, or perform any destructive action. The container tab's selection state is ephemeral: it exists because the user clicked it, and is superseded as soon as the user clicks a content pane. This must be explicitly handled in tab_title_for_tile and the tab selection callback so it doesn't fall through to undefined behavior.

    Future upgrade path: Stage 8D (2026-02-22_workbench_tab_semantics_overlay_and_promotion_plan.md) will upgrade container tab activation for demoted semantic tab groups: clicking the inverted-tab chrome affordance dispatches PromotePaneToSemanticTabGroup. That is a distinct surface and a follow-on concern; it does not change the no-op / passive focus behavior for Split and structural container tabs defined here.

  6. Documentation sync: Keep design_docs/TERMINOLOGY.md as source-of-truth and add a brief "Workbench Layout" section to the help panel explaining Tile, Pane, Container, Split, and Tab Group semantics.
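
The label mapping in item 1 amounts to an exhaustive match. The sketch below declares a local `ContainerKind` enum with the four variants named above so it stands alone; in the real code the match would be over the egui_tiles type, and `container_tab_label` is a hypothetical helper name.

```rust
/// Local stand-in with the four container kinds named in the plan.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ContainerKind {
    Tabs,
    Horizontal,
    Vertical,
    Grid,
}

/// Short, stable, user-facing label for a container tab.
/// An exhaustive match means a new kind cannot silently fall back to `{:?}`.
pub fn container_tab_label(kind: ContainerKind) -> &'static str {
    match kind {
        ContainerKind::Horizontal => "Split ↔",
        ContainerKind::Vertical => "Split ↕",
        ContainerKind::Tabs => "Tab Group",
        ContainerKind::Grid => "Grid",
    }
}
```

Returning `&'static str` keeps the labels allocation-free and makes them easy to compare against TERMINOLOGY.md in a test.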

Files: desktop/tile_behavior.rs, desktop/tile_view_ops.rs

Tasks:

  • Replace format!("{:?}", container.kind()) fallback with semantic labels in tab_title_for_tile.
  • Add tooltip/help affordance for container tabs.
  • Specify and implement container-tab activation as no-op / passive focus (no collapse, no navigation).
  • Verify simplification invariants with targeted tests (including same-direction join + cross-direction preserve).
  • Test split → close → simplify flow and tab-detach predictability.
  • Ensure terminology alignment between UI strings and TERMINOLOGY.md.
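
The simplification invariants to verify can be stated precisely on a toy tree. This is an illustrative model only, not egui_tiles' implementation: `Tile`, `Dir`, and `simplify` are hypothetical, and the real tests would build an egui_tiles tree and run its own simplification pass.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Dir {
    Horizontal,
    Vertical,
}

/// Toy tile tree: panes, linear splits, and tab groups.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Tile {
    Pane(&'static str),
    Linear(Dir, Vec<Tile>),
    Tabs(Vec<Tile>),
}

/// Apply the four invariants bottom-up.
pub fn simplify(tile: Tile) -> Tile {
    match tile {
        Tile::Linear(dir, children) => {
            let mut flat = Vec::new();
            for child in children.into_iter().map(simplify) {
                match child {
                    // Same-direction nested linears join into the parent.
                    Tile::Linear(d, grandchildren) if d == dir => flat.extend(grandchildren),
                    // Cross-direction nesting (and everything else) is preserved.
                    other => flat.push(other),
                }
            }
            if flat.len() == 1 {
                // A single-child Linear collapses to its child.
                flat.pop().unwrap()
            } else {
                Tile::Linear(dir, flat)
            }
        }
        // Tabs are kept even with one child: lone panes remain wrapped.
        Tile::Tabs(children) => Tile::Tabs(children.into_iter().map(simplify).collect()),
        pane => pane,
    }
}
```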

Validation Checklist

Wheel Zoom

  • Mouse wheel over graph pane zooms without Ctrl (default setting).
  • Mouse wheel over graph pane with scroll_zoom_requires_ctrl = true requires Ctrl.
  • Trackpad two-finger scroll over graph pane zooms.
  • Mouse wheel over webview pane scrolls page content (does not zoom graph).
  • Zoom is pointer-relative (zooms toward cursor position).

Camera Fit

  • Z key with 0 or 1 selected nodes: fits all nodes with relaxed zoom.
  • Z key with 2+ selected nodes: fits selected nodes with tighter zoom.
  • C key: always fits all nodes.
  • On startup with existing graph: camera fits to nodes automatically.
  • On startup with empty graph: no crash, camera at default position.
  • Camera fit with a single node does not zoom in so far that the node fills the pane.

Node Stability

  • Single node with no edges stays on-screen after 5 seconds.
  • After panning, nodes drift toward new viewport center (not back to origin).
  • Multiple disconnected nodes cluster near viewport center.

Tile Tree

  • Container tabs use semantic labels (Split ↔, Split ↕, Tab Group, Grid) instead of raw enum debug strings.
  • Pane tabs remain content-centric (Graph, webview title, diagnostics title).
  • Closing last tab in a split collapses the split.
  • Dragging a tab out of the strip creates a clean split.
  • Users can distinguish content panes vs layout-group tabs without ambiguity.
  • Clicking a container tab does not collapse the split, navigate, or produce any destructive action.

Architecture Notes

Why post-render helpers fail for scroll zoom

The egui frame lifecycle is: input → layout/render → post-render. Scroll events are consumed during layout/render by whatever widget first claims them. By the time our post-render helpers run, the deltas read from ui.input() may already be zero because another widget consumed them. The only reliable interception point is ui.input_mut() before the widget that would consume them.

Why one-shot flags fail for camera commands

egui_graphs creates MetadataFrame lazily on its first layout pass. Any camera command that fires before that pass finds no MetadataFrame and silently fails. One-shot flags (bool → false after first attempt) don't survive this race. Durable commands (Option<Enum> → None only after successful MetadataFrame write) do.

Consistency principle

See Interaction Model Defaults (Revised) above. The architectural rationale for Phase 1's pre-render interception is that scroll events must be claimed before any widget renders; the only way to guarantee hover-based routing is to act before the frame's layout/render pass begins.

Captured decisions from architecture deep-dive (2026-02-23)

These points are now treated as design constraints for future UX changes:

  1. Container nodes are first-class and should not be treated as rendering bugs. Horizontal/Vertical originate from real Container::Linear nodes in the tree.

  2. Nested containers are expected and valuable. The model intentionally supports arbitrary composition (Tabs ↔ Linear ↔ Grid) with simplification, not strict one-level pane grouping.

  3. Multiple tab bars are an intended capability. Because panes are wrapped in Tabs (all_panes_must_have_tabs: true), each split region can own an independent local tab strip.

  4. UX should expose structure semantically, not hide structure categorically. Default policy is rename/reframe container labels for clarity, not blanket suppression.

  5. Terminology must track actual code architecture. TERMINOLOGY.md remains authoritative and must be updated whenever tile model/UI language changes.


Phase 5: Secondary Input Surfaces (Absorbed 2026-02-24)

This phase absorbs and replaces 2026-02-24_input_surface_polish_plan.md.

Scope and constraints

  1. Radial menu, context menu, and command palette are discovery/execution surfaces over ActionRegistry.
  2. InputRegistry remains the single binding authority for triggers.
  3. No parallel command enums; execution flows through ActionRegistry::execute(...).

Implementation tasks

  • Replace hardcoded radial command/domain enums with action-metadata-driven sectors from ActionRegistry category data.
  • Support directional navigation while radial is active (arrow keys + gamepad stick) via InputRegistry bindings.
  • Group context menu actions by ActionCategory from ActionRegistry::list_actions_for_context(context).
  • Ensure context menu and command palette route shared actions to identical handlers.
  • Keep implementation modular (desktop/radial_menu.rs, desktop/context_menu.rs) and leave render/mod.rs as orchestration callsite.
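
The "one registry, many surfaces" constraint can be sketched as follows. The names ActionRegistry, ActionCategory, and execute come from this plan, but every signature here is an assumption, as are `Action`, `register`, `grouped_by_category`, and the two sample handlers; context filtering is omitted for brevity.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum ActionCategory {
    Camera,
    Layout,
}

/// One registered action. The handler takes a log in place of real app state.
pub struct Action {
    pub id: &'static str,
    pub category: ActionCategory,
    pub handler: fn(&mut Vec<String>),
}

#[derive(Default)]
pub struct ActionRegistry {
    actions: Vec<Action>,
}

impl ActionRegistry {
    pub fn register(&mut self, action: Action) {
        self.actions.push(action);
    }

    /// Menu surfaces derive their groups from metadata, not a parallel enum.
    pub fn grouped_by_category(&self) -> BTreeMap<ActionCategory, Vec<&'static str>> {
        let mut groups: BTreeMap<ActionCategory, Vec<&'static str>> = BTreeMap::new();
        for action in &self.actions {
            groups.entry(action.category).or_default().push(action.id);
        }
        groups
    }

    /// Single execution path shared by radial menu, context menu, and palette.
    pub fn execute(&self, id: &str, log: &mut Vec<String>) -> bool {
        match self.actions.iter().find(|a| a.id == id) {
            Some(action) => {
                (action.handler)(log);
                true
            }
            None => false,
        }
    }
}

pub fn camera_fit(log: &mut Vec<String>) {
    log.push("camera.fit".to_string());
}

pub fn split_right(log: &mut Vec<String>) {
    log.push("layout.split_right".to_string());
}
```

Since every surface calls `execute` with an action id, the context menu and command palette cannot drift apart in what a shared action does.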

Validation

  • Radial open/selection works through InputRegistry bindings.
  • Radial actions execute through ActionRegistry only (no hardcoded parallel enum).
  • Context menu grouping and command palette entries resolve to the same handlers.
  • Dialogs are used for destructive/branching decisions; toasts are used for non-blocking acknowledgements/progress.

Dependency Map

Phase 1 (Input Ownership)  ─── no dependencies
Phase 2 (Camera Commands)  ─── no dependencies (can parallelize with Phase 1)
Phase 3 (Viewport Gravity) ─── depends on Phase 2 (needs MetadataFrame access pattern)
Phase 4 (Tile Clarity)     ─── no dependencies (can parallelize with everything)

Phases 1, 2, and 4 can be implemented in parallel. Phase 3 should follow Phase 2.

โš ๏ธ **GitHub.com Fallback** โš ๏ธ