2026 03 27_physics_spike_metrics - mark-ik/graphshell GitHub Wiki
Date: 2026-03-27
Status: Metrics defined; baselines pending runtime measurement
Purpose: Stage 3 output of the physics worker spike. Defines the success criteria and measurement protocol that must be satisfied before any worker implementation proceeds.
A physics worker is only worth building if it measurably reduces frame-budget pressure without introducing position divergence or velocity-loss regressions. This document defines the three metric families and how to collect baseline numbers.
See also:
- Spike Stage 1 receipt: `graph/layouts/graphshell_force_directed.rs` (doc comment block)
- Spike Stage 2 receipt: `2026-03-27_egui_retained_state_efficiency_and_physics_worker_evaluation_plan.md` (ownership table and ordered phases)
- egui_graphs efficiency improvements: the `structural_dirty`/`visual_dirty` split should happen before Stage 3 baseline measurement, since it changes the velocity-loss rate.
What it measures: How much of the egui frame budget the synchronous physics step consumes. If the step is fast enough, a worker adds complexity with no benefit.
How to measure:
Wrap the physics step region in `render/mod.rs` (lines 561–670: `set_layout_state` → `GraphView::add()` → `get_layout_state` → `apply_graph_physics_extensions`) with a `std::time::Instant` pair and emit via the existing `emit_span_duration` helper:

```rust
let t0 = std::time::Instant::now();
// ... physics step region ...
emit_span_duration("render::physics_step", t0.elapsed().as_micros() as u64);
```

The `emit_span_duration` function is in `shell/desktop/runtime/diagnostics.rs:147`.
Graphs to benchmark:
| N | Description |
|---|---|
| 100 nodes | Typical active session |
| 500 nodes | Large knowledge graph |
| 1000 nodes | Stress case |
Pass criteria:
- If p99 frame time for the physics step is < 1 ms at N=500, a worker is not justified.
- If p99 is > 2 ms at N=500 or > 1 ms at N=100, a worker is worth prototyping.
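The pass criteria above depend on an average and a p99 over the collected span durations. A minimal sketch of that summary step follows, assuming the samples are available as a slice of microsecond values. `StepStats` and `summarize` are hypothetical names, not part of the codebase, and the nearest-rank percentile method is one reasonable choice rather than a documented project convention.

```rust
/// Summary statistics for one benchmark run, in microseconds.
/// Hypothetical type, introduced for illustration only.
#[derive(Debug, PartialEq)]
pub struct StepStats {
    pub avg_us: u64,
    pub p99_us: u64,
}

/// Computes the integer mean and the nearest-rank 99th percentile.
/// Returns `None` when no samples were collected.
pub fn summarize(samples_us: &[u64]) -> Option<StepStats> {
    if samples_us.is_empty() {
        return None;
    }
    let mut sorted = samples_us.to_vec();
    sorted.sort_unstable();

    // Accumulate in u128 so long runs cannot overflow the sum.
    let sum: u128 = sorted.iter().map(|&s| s as u128).sum();
    let avg_us = (sum / sorted.len() as u128) as u64;

    // Nearest rank: 1-based rank = ceil(0.99 * n), computed in integers.
    let rank = (sorted.len() * 99 + 99) / 100;
    let p99_us = sorted[rank - 1];

    Some(StepStats { avg_us, p99_us })
}
```

The resulting `p99_us` at N=500 and N=100 can then be compared against the 1000 µs and 2000 µs thresholds.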
Baseline numbers (pending measurement):
| N | avg (µs) | p99 (µs) | measured at |
|---|---|---|---|
| 100 | pending | pending | pending |
| 500 | pending | pending | pending |
| 1000 | pending | pending | pending |
What it measures: Whether a proposed worker path produces the same node positions as the synchronous path after N steps. If the FR step is non-deterministic across async boundaries (e.g. due to floating-point ordering differences or state races), the worker model is incorrect by construction.
How to measure:
- Run a deterministic graph (fixed seed positions, no drag, no lens modification) for N=1000 frames synchronously. Capture the final node positions as a `Vec<(NodeKey, Pos2)>` snapshot.
- Run the same graph via the proposed worker path (copy-out → off-thread step → copy-in). Capture the same snapshot.
- Assert that all positions agree within epsilon (suggested: 1e-3 in each axis).
Pass criteria:
- Position divergence after N=1000 frames < 1e-3 in both x and y for all nodes.
- If divergence is larger, the copy-out / copy-in boundary introduces ordering differences that make the step non-reproducible; the worker design is invalid.
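The snapshot comparison can be sketched as below. This is an illustration under stated assumptions: `NodeKey` and `Pos2` are local stand-ins for the real types so the example compiles without egui, `max_axis_divergence` and `snapshots_agree` are hypothetical helpers, and node keys are assumed to be unique within each snapshot.

```rust
use std::collections::HashMap;

/// Stand-in for the real node key type (assumption).
pub type NodeKey = u32;

/// Stand-in for `egui::Pos2` (assumption), so the sketch has no dependencies.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Pos2 {
    pub x: f32,
    pub y: f32,
}

/// Returns the largest per-axis difference between the two snapshots.
/// Fails if the node sets differ or any difference is NaN or infinite.
pub fn max_axis_divergence(
    sync: &[(NodeKey, Pos2)],
    worker: &[(NodeKey, Pos2)],
) -> Result<f32, String> {
    if sync.len() != worker.len() {
        return Err(format!("node count differs: {} vs {}", sync.len(), worker.len()));
    }
    // Index the worker snapshot by key so ordering differences do not matter.
    let by_key: HashMap<NodeKey, Pos2> = worker.iter().copied().collect();
    let mut max = 0.0_f32;
    for (key, a) in sync {
        let b = by_key
            .get(key)
            .ok_or_else(|| format!("node {key} missing from worker snapshot"))?;
        let (dx, dy) = ((a.x - b.x).abs(), (a.y - b.y).abs());
        // `f32::max` silently ignores NaN, so reject non-finite values first.
        if !dx.is_finite() || !dy.is_finite() {
            return Err(format!("non-finite position for node {key}"));
        }
        max = max.max(dx).max(dy);
    }
    Ok(max)
}

/// True when every node agrees within `epsilon` on both axes.
pub fn snapshots_agree(
    sync: &[(NodeKey, Pos2)],
    worker: &[(NodeKey, Pos2)],
    epsilon: f32,
) -> bool {
    matches!(max_axis_divergence(sync, worker), Ok(d) if d < epsilon)
}
```

Reporting the maximum divergence rather than only a boolean makes it easier to tell a near miss from a structural mismatch when the check fails.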
Baseline numbers: N/A (comparison metric, not an absolute baseline).
What it measures: How often per session the FR velocity is reset to zero due to a
full `egui_state_dirty` rebuild of `EguiGraphState`. Every full rebuild calls
`EguiGraphState::from_graph()`, which seeds positions from `Node::projected_position()`
and discards all accumulated FR velocity, causing a visible physics stutter.
How to measure:
Add a diagnostics emit inside `EguiGraphState::from_graph()` in
`model/graph/egui_adapter.rs`:

```rust
emit_event(DiagnosticEvent::MessageSent {
    channel_id: CHANNEL_GRAPH_EGUI_STATE_REBUILT, // new channel; see below
    byte_len: 0,
});
```

A new channel `graph:egui_state_rebuilt` (severity: Info) should be registered.
Count this channel's events per session in the Diagnostics Inspector pane.
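Turning the per-session event count into the rebuilds/min figure is simple arithmetic, sketched here for clarity. Both function names are hypothetical, and the sketch assumes the event count and session length are read off manually from the Diagnostics Inspector rather than through any existing API.

```rust
use std::time::Duration;

/// Converts a rebuild event count over a session into rebuilds per minute.
/// Returns `None` for a zero-length session, where the rate is undefined.
pub fn rebuilds_per_minute(rebuild_count: u64, session: Duration) -> Option<f64> {
    let secs = session.as_secs_f64();
    if secs <= 0.0 {
        return None;
    }
    Some(rebuild_count as f64 * 60.0 / secs)
}

/// A worker must not add rebuilds beyond the post-split rate.
pub fn worker_rate_ok(after_split_rate: f64, worker_rate: f64) -> bool {
    worker_rate <= after_split_rate
}
```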
Baseline and target:
| Condition | Expected rate | Notes |
|---|---|---|
| Before `structural_dirty`/`visual_dirty` split | High: triggered by selection, badge, crash-flag changes (40+ `egui_state_dirty` sites) | Baseline |
| After split | Low: triggered only by node/edge add/remove | Target |
| Physics worker active | Should be ≤ "after split" rate | Worker must not introduce additional rebuilds |
Baseline numbers (pending measurement):
| Condition | rebuilds/min (typical session) | measured at |
|---|---|---|
| Before split | pending | pending |
| After split | pending | pending |
- Build with `--release` (physics step time is not representative in debug builds).
- Load the same fixture graph for each N (100/500/1000 nodes); use a saved `.gsw` workspace or a deterministic in-memory fixture.
- Let physics run for 30 seconds with no user interaction.
- Read `emit_span_duration("render::physics_step", ...)` events from the Diagnostics Inspector ring buffer.
- Record average and p99 in the table above.
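For the deterministic in-memory fixture option, one possible approach is to derive starting positions from a fixed seed with a small linear congruential generator, so every run at a given N starts from identical positions. The sketch below is an assumption-laden illustration: `FixtureNode` and `fixture_nodes` are hypothetical, the generator is a generic LCG chosen to avoid external crates, and wiring the result into the real graph model is left out.

```rust
/// Hypothetical fixture node: an index plus a seeded starting position.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct FixtureNode {
    pub id: u32,
    pub x: f32,
    pub y: f32,
}

/// Generates `n` nodes whose positions depend only on `seed` and `extent`.
/// Coordinates fall between 0 and `extent`.
pub fn fixture_nodes(n: u32, seed: u64, extent: f32) -> Vec<FixtureNode> {
    let mut state = seed;
    let mut next_unit = move || {
        // 64-bit LCG step; the top 24 bits map exactly onto an f32 in [0, 1).
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (state >> 40) as f32 / (1u32 << 24) as f32
    };
    (0..n)
        .map(|id| {
            let x = next_unit() * extent;
            let y = next_unit() * extent;
            FixtureNode { id, x, y }
        })
        .collect()
}
```

Calling `fixture_nodes(500, 42, 1000.0)` twice yields the same vector, which is the property the benchmark and the divergence check both rely on.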
After baselines are collected:
- If p99 < 1 ms at N=500 and velocity-loss rate is acceptable after split: Do not build a physics worker. Close the spike as "not worth it."
- If p99 > 2 ms at N=500 or velocity-loss is unacceptable after split: Proceed to worker implementation per the conditional architecture in the spike plan (`2026-03-27_egui_retained_state_efficiency_and_physics_worker_evaluation_plan.md`, Phase E).
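The two decision rules can be written down as a small function, sketched below. `SpikeDecision` and `decide` are hypothetical names. The `Undecided` variant is an assumption added for the case the rules do not cover (p99 between 1 ms and 2 ms with acceptable velocity loss); the document does not say what to do there.

```rust
/// Possible outcomes of the spike. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
pub enum SpikeDecision {
    /// Close the spike as "not worth it".
    DoNotBuild,
    /// Proceed to worker implementation (Phase E of the spike plan).
    ProceedToWorker,
    /// Not covered by the stated rules (assumption: needs a judgment call).
    Undecided,
}

/// Applies the decision rules to the N=500 p99 (in microseconds) and the
/// post-split velocity-loss verdict.
pub fn decide(p99_us_at_500: u64, velocity_loss_acceptable: bool) -> SpikeDecision {
    if p99_us_at_500 > 2_000 || !velocity_loss_acceptable {
        SpikeDecision::ProceedToWorker
    } else if p99_us_at_500 < 1_000 {
        SpikeDecision::DoNotBuild
    } else {
        SpikeDecision::Undecided
    }
}
```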
Before collecting Stage 3 baselines, complete:
- `structural_dirty`/`visual_dirty` split in `model/graph/egui_adapter.rs`; this changes the rebuild rate that Metric 3 measures.
- Add the `graph:egui_state_rebuilt` diagnostic channel to the channel registry.
- Add the `render::physics_step` span emit to `render/mod.rs`.