2026 02 27_egui_stack_assessment - mark-ik/graphshell GitHub Wiki
Date: 2026-02-27
Status: Background rationale / comparative research (superseded as the primary strategy source)
Scope: Background analysis of how to get the best results from egui + egui_tiles + egui_graphs, what technical boundaries must exist between Graphshell and the egui stack, and what tradeoffs informed the later migration strategy.
Use this doc for:
- rationale
- tradeoff analysis
- historical framing
Do not use this doc as the canonical execution plan.
Use implementation_strategy/aspect_render/2026-02-27_egui_wgpu_custom_canvas_migration_strategy.md for current migration sequencing and issue posture.
The evaluated egui stack was viable for Graphshell, but only if Graphshell treated it as a set of backends rather than as product architecture.
Historical decision snapshot: this document was written when the working assumption was that Graphshell would retain egui and egui_tiles, and eventually migrate away from both egui_graphs and egui_glow.
That framing is now historical in one important respect:
- the
egui_glow->egui_wgpuUI-backend cut has landed -
egui_graphsreplacement is now conditional rather than automatic
The present problems are mostly not "egui is wrong"; they are boundary problems:
- too much Graphshell orchestration still lives inside framework integration files,
- multiple layers can still mutate closely-related state in the same frame,
-
egui_graphsandegui_tilesare both being asked to support workbench semantics they do not natively model.
The right strategy is:
-
Encircle the current stack first: keep
egui,egui_tiles,egui_graphs, andegui_glowonly long enough to narrow them to paint/layout/event roles and create stable replacement seams. - Make Graphshell the sole authority for interaction semantics: focus, camera policy, viewer dispatch, selection semantics, lifecycle, and persistence invariants all belong to app code.
-
Replace
egui_graphswith a Graphshell-owned custom canvas: this is the primary product-surface migration and the core long-term target. -
Replace
egui_glowwithegui_wgpuonce the runtime viewer surface bridge is proven: the renderer backend migration follows the canvas seam and should not lead it. -
Keep
egui_tilesunless it becomes a proven blocker: it remains useful as the layout/workbench host far longer thanegui_graphs. - Avoid long-lived forks unless forced: prefer thin adapters or upstreamable patches; fork only for short-lived, tightly-scoped gaps.
Rule of thumb:
Extend, wrap, and constrain while stabilization and core feature delivery dominate.
Replace or go custom only when framework constraints become the main source of churn.
From Cargo.toml at the time of writing:
| Crate | Version | What Graphshell uses it for |
|---|---|---|
egui |
0.33.3 |
Immediate-mode desktop UI, widget chrome, panels, popups, overlays |
egui-winit |
0.33.3 |
winit input bridge |
egui_glow |
0.33.3 |
OpenGL-backed egui renderer |
egui_graphs |
0.29.0 |
Graph widget, force-directed layout state, node/edge events |
egui_tiles |
0.14.1 |
Docking/tiling tree layout |
egui-file-dialog |
0.12.0 |
File picker |
egui-notify |
0.21 |
Toast notifications |
The codebase already contains the right architectural instinct in several places:
-
render/mod.rs disables
egui_graphszoom/pan (with_zoom_and_pan_enabled(false)) and routes camera movement through Graphshell's custom navigation path. -
render/mod.rs uses an event sink (
Rc<RefCell<Vec<Event>>>) and converts framework events into Graphshell actions instead of mutating most app state inline. -
shell/desktop/workbench/tile_compositor.rs already treats rendering/compositing as Graphshell-owned logic layered on top of
egui_tiles, not somethingegui_tilesitself can provide. -
model/graph/egui_adapter.rs already acts as a useful boundary between Graphshell graph state and
egui_graphs.
That means Graphshell does not need a panic rewrite. It needs stricter ownership boundaries.
The largest migration blocker identified in this research was not egui_tiles; it was the renderer/compositor coupling present at the time:
-
shell/desktop/ui/gui.rs was built around
egui_glow::EguiGlow. -
shell/desktop/workbench/compositor_adapter.rs was explicitly OpenGL-bound and used
egui_glow::CallbackFnplus GL state guardrails. - Composited runtime viewer content was therefore not just "egui-rendered"; it was wired into a GL callback path with concrete OpenGL invariants.
That means the first serious technical question in any egui_glow -> egui_wgpu migration is:
How do current runtime viewer surfaces cross the GL ->
wgpuboundary without unacceptable latency, copies, or complexity?
Until that is answered, the renderer migration is not ready.
Upstream source review reinforces the same conclusion:
-
eguiis optimized for immediate-mode UI composition and custom widget authoring, not for being a domain-specific scene engine. -
egui_tilesis a docking/layout crate centered on aTreeplus aBehaviorimplementation; it is not a workbench runtime or persistence authority. -
egui_graphsis a graph widget; it bundles rendering, interaction, and layout helpers. It is not a spatial browser/workbench engine.
Those crates are useful, but they are not substitutes for Graphshell's domain model.
This is the main practical risk today.
When Graphshell and the framework both believe they may mutate related state in the same frame, "fighting" bugs appear:
- camera jitter or camera drift,
- selection mismatches,
- drag state desynchronization,
- focus moving to a visually stale pane,
- physics state appearing to lag or snap.
Current high-risk zones:
| State | Framework can hold/mutate | Graphshell can hold/mutate | Risk |
|---|---|---|---|
| Camera | egui_graphs::MetadataFrame |
GraphViewFrame / camera commands |
Camera "fighting" |
| Physics |
egui_graphs layout state |
GraphWorkspace.physics and per-view simulation |
Drift / hidden writes |
| Selection | Graph widget internal selection/hover state | selected_nodes |
Divergence after complex sequences |
| Focus | egui response focus and hover state | workbench focus + active pane state | Input routed to wrong owner |
| Tile layout |
egui_tiles::Tree runtime tree |
app policies and persistence reconstruction | Structural mismatch |
If Graphshell does not enforce one authority per state category, every new feature increases the chance of subtle regressions.
egui_graphs and egui_tiles are generic UI crates. Graphshell is not building a generic graph viewer or generic docking UI.
Graphshell-specific semantics that do not naturally fit widget assumptions:
- workbench lifecycle (
Active/Warm/Cold), - viewer routing and render-mode dispatch,
- tile compositor pass scheduling,
- graph-as-browser semantics,
- multi-view graph policies,
- persistence invariants over tile trees and frame state,
- explicit authority boundaries between semantic graph, spatial layout, and runtime instances.
The longer these semantics are expressed as ad hoc code inside widget adapters, the more the product model gets squeezed into the framework's default shape.
This risk is manageable today because Graphshell is still on upstream crates with no long-lived local forks.
It becomes expensive if:
- Graphshell starts patching internals of
egui_graphsoregui_tiles, - app logic depends on undocumented crate behavior,
- large integration modules absorb crate-specific assumptions everywhere.
Upgrade drag is therefore mostly a boundary discipline problem, not yet a dependency problem.
This is not the current top blocker, but it is the medium-term ceiling:
-
eguiis immediate mode, so the app rebuilds UI descriptions every frame. -
egui_glowis OpenGL-oriented and not ideal for a future deeply custom graph render pipeline. -
egui_graphsis not a full custom scene renderer; it is a widget with layout helpers. - Graphshell's most distinctive future needs (graph-native compositing, richer physics, custom hit testing, more explicit render passes) will eventually outgrow a widget-oriented graph layer.
- The current compositor path is tied to OpenGL callback semantics, which raises the cost of moving the UI renderer to
wgpuuntil a practical surface-interop plan exists.
That is why the graph canvas is the first planned replacement, while egui_tiles remains in the target stack.
If the upstream stack were ideal for Graphshell, it would offer three things more explicitly:
Graphshell needs stronger primitives for:
- "this region owns drag semantics right now,"
- "this pointer sequence is captured by this subsystem,"
- "camera updates come from one path only."
Without that, Graphshell must manually prevent duplicate behavior through policy gates and careful event ordering.
Graphshell benefits when a framework can be used as:
- a pure renderer,
- a pure hit-test/event source,
- or a pure layout helper,
without also pulling in built-in interaction policy.
egui_graphs is useful, but it couples those concerns more tightly than Graphshell ideally wants.
Graphshell needs workbench-specific concepts that generic docking crates do not define:
- pane activation/deactivation contracts,
- compositor scheduling hooks,
- tile lifecycle cleanup,
- persistence invariants and structural validation,
- focus transfer and cross-pane routing rules.
Those extension points are not really "missing features" in egui_tiles; they are product-specific runtime semantics. Graphshell must own them explicitly.
This is the highest-value near-term work.
Graphshell should formalize one owner per state category.
Recommended boundary:
| Concern | Graphshell owns | egui stack owns |
|---|---|---|
| Semantic graph | Nodes, edges, metadata, lifecycle, selection truth | Nothing |
| Camera policy | Bounds, commands, focus-target behavior, persistence | Ephemeral per-frame widget metadata only |
| Graph interaction semantics | Click meaning, lasso rules, multi-select rules, drag semantics | Raw pointer/key observations and widget-local hit results |
| Physics policy | Config, presets, per-view simulation ownership, persisted state | Temporary layout engine state bridge only |
| Workbench semantics | Pane identity, focus, active tile, routing, lifecycle, persistence invariants | Tile rect/layout computation only |
| Viewer routing | Viewer ID selection, render mode policy, degradation rules | Nothing |
| Compositor | Pass ordering, overlay rules, surface scheduling | Nothing |
| UI chrome content | App meaning, commands, settings state | Widget drawing, layout helpers, ephemeral widget state |
The principle is simple:
- Graphshell owns all durable meaning.
- The egui stack owns only ephemeral rendering and widget-local mechanics.
This is the most important implementation rule.
The app should:
- compute policies,
- derive the current scene,
- route intent,
- decide semantic outcomes,
- persist durable state.
The framework should:
- draw UI,
- provide layout rectangles,
- emit raw widget/graph events,
- store strictly local transient frame data where unavoidable.
In practice:
-
egui_graphsshould be treated as a graph canvas backend, not as a graph authority. -
egui_tilesshould be treated as a tile layout backend, not as a workbench authority. -
eguishould be treated as the chrome UI toolkit, not as the owner of interaction semantics.
Before adding new interaction complexity, Graphshell should lock down existing invariants.
Required test coverage:
- camera never escapes bounds across frame sequences,
- custom navigation is the only path that changes the effective camera policy,
- pinned nodes do not move during drag/physics sequences,
- selection truth remains synchronized after click/drag/lasso sequences,
- tile tree invariants hold after split/merge/close/open operations,
- pane focus transfer is deterministic after close or retarget,
- physics state round-trips through the layout bridge without silent drift.
These tests make framework integration safer by detecting the exact category of bug that "dual authority" causes.
There should be only two acceptable integration styles:
- Upstreamable changes
- Thin local adapters
There should not be a third category where Graphshell quietly depends on deep crate internals spread across large modules.
That means:
- avoid scattering crate-specific workarounds through broad orchestration files,
- isolate framework-specific logic into bridge modules,
- keep any local patch small, documented, and easy to delete.
Two files currently hold too much mixed responsibility:
Recommended split for the graph path:
render/graph_canvas_backend.rsrender/graph_camera_bridge.rsrender/graph_event_bridge.rsrender/graph_selection_bridge.rsrender/graph_physics_bridge.rs
Recommended split for the workbench path:
shell/desktop/workbench/tile_layout_backend.rsshell/desktop/workbench/focus_router.rsshell/desktop/workbench/pane_dispatch.rsshell/desktop/workbench/tile_persistence_contract.rs
This is how Graphshell stops "framework adapters" from becoming de facto controllers.
Going custom is not automatically better. It is only justified when it solves the problems the framework keeps reintroducing.
A Graphshell-owned custom layer uniquely unlocks:
No hidden defaults, no widget-provided drag semantics, no ambiguity over who consumed a pointer sequence.
This matters when Graphshell wants:
- richer graph gestures,
- custom camera animation contracts,
- multi-modal input routing,
- exact focus/selection transitions,
- domain-specific interaction timing.
A custom graph canvas can be built around Graphshell's actual needs:
- graph view as a first-class scene,
- custom hit testing and culling,
- richer layout models than a generic graph widget provides,
- physics that reflect Graphshell's workbench semantics instead of widget assumptions.
When UX/spec is novel and central to product identity, owning the critical rendering and interaction layer removes a class of future blockers:
- upstream API changes stop breaking core product behavior,
- framework constraints stop shaping feature design,
- performance and rendering decisions become product-driven rather than crate-driven.
This is the main long-term argument for a custom graph canvas.
Before choosing an option, it is important to separate three different migrations:
-
Renderer backend migration:
egui_glow->egui_wgpu -
Graph canvas migration:
egui_graphs-> Graphshell custom canvas -
Workbench layout migration:
egui_tiles-> custom docking/layout engine (deferred unlessegui_tilesbecomes a proven blocker)
These should not be treated as one rewrite. They have different risks, different dependencies, and different rollback paths.
Keep temporarily: egui + egui_tiles + egui_graphs + egui_glow
Change: Integration discipline and ownership boundaries
This is the best short-term move because it:
- preserves current delivery velocity,
- reduces bug classes immediately,
- creates replacement seams before any rewrite,
- avoids premature framework churn.
This is the recommended immediate path, but only as a staging state.
Keep: egui, egui_tiles
Replace: egui_graphs with a Graphshell-owned graph canvas backend
This is the most likely medium-term path because:
- the graph canvas is where Graphshell diverges most from generic widget assumptions,
- the workbench tree is still reasonably served by
egui_tiles, - the graph path is where custom rendering/interaction control yields the most product value.
This is the recommended first major replacement and the chosen canvas direction.
Keep: egui, egui_tiles
Migrate: egui_glow -> egui_wgpu
Replace: egui_graphs with a Graphshell-owned graph canvas backend
This is the likely long-term target if Graphshell wants a unified wgpu render path, but the sequencing should be reversed in implementation planning: establish the custom canvas seam first, then do the backend swap.
This path only makes sense after Graphshell has answered the GL -> wgpu compositor question for runtime viewer surfaces. If that answer is poor (for example, excessive copies or unstable interop), the backend swap should wait while the graph canvas and backend seams are prepared first.
This is the "encirclement" path:
- keep
egui_tileswhile treatingegui_graphsandegui_glowas legacy integration layers to be replaced, - disable more default behavior,
- move more meaning into Graphshell-owned controllers,
- allow the framework layer to shrink to backends over time.
This is effectively Option A executed in stages, and it is the correct transition path even if Graphshell later chooses Option B.
Possible, but costly.
This is acceptable as a short-lived bridge if:
- Graphshell needs one missing API or bug fix,
- the diff stays small,
- the patch can plausibly be upstreamed or removed quickly.
Lower-value first move. egui_tiles is not the main current architectural mismatch.
Technically possible. Strategically weak.
The overlap between egui_graphs and egui_tiles is not where Graphshell's real architecture belongs. Their "redundancy" is mostly widget-level assumptions. The correct place to unify behavior is not inside a combined fork; it is inside Graphshell-owned controller and scene layers.
This is the highest-maintenance option and should be avoided unless Graphshell is already committed to long-term internal ownership of both forks.
This should only happen if the egui stack becomes the main source of churn.
Plausible alternatives:
-
iced: stronger message/update architecture, better fit for explicit state flow, but still immature for Graphshell's custom workbench needs and offers no drop-in equivalent to the current graph + docking combination. -
xilem: interesting to watch, but still explicitly early and not a practical replacement target today. -
makepad: stronger custom rendering posture, but much more opinionated and less aligned with Graphshell's current architecture. -
slint: strong for structured application UI, weaker fit for a graph-heavy spatial workbench and carries licensing/product tradeoffs. -
Fully custom (
winit+ renderer + AccessKit + Graphshell UI systems): maximum control, maximum cost.
A full framework switch is not the recommended near-term move; the preferred target remains egui + egui_tiles + egui_wgpu + a Graphshell-owned custom canvas.
Before attempting a renderer or canvas migration, Graphshell should answer these questions explicitly.
- Can current runtime viewer content be presented in a
wgpu-backed frame without unacceptable copies? - Is there a practical zero-copy or low-copy path from the current GL-oriented runtime viewer surfaces into the future render path?
- If not, what is the measured cost of CPU readback + re-upload or GPU copy bridges?
- What is the acceptable per-frame budget for that bridge under normal multi-pane use?
This is the hard technical gate for egui_glow -> egui_wgpu.
- Will Graphshell own the
wgpu::Instance/Adapter/Device/Queue, or will the egui integration layer own them? - Can all desired render and compute subsystems share one coherent
wgpudevice? - Will Graphshell pin direct
wgpuusage to the version expected byegui_wgputo avoid split dependency trees?
These answers determine whether the future graphics stack is coherent or fragmented.
- Will the custom canvas render to an offscreen texture, then present that texture in egui?
- Or will it render directly into the egui frame via backend-specific callback integration?
- Which model gives the best tradeoff for latency, testability, and backend coupling?
This choice should be made intentionally early.
- What frame-time budget must the custom canvas meet?
- What node-count and pane-count targets must it support?
- Which platforms must be first-class on day one?
- What are the acceptable degradation modes if GPU budgets are exceeded?
Without these targets, the migration cannot be evaluated honestly.
- What accessibility obligations apply to the graph surface?
- How will the custom canvas expose diagnostics for frame timing, copy counts, and degraded modes?
- What feature flags or fallback paths allow rollback if the new path regresses?
These are product requirements, not optional polish.
Graphshell should not start the backend swap until it can answer this with confidence:
How does a current runtime viewer surface, which today is composed through a GL callback path, become visible in the new
wgpuframe without an unacceptable latency or bandwidth penalty?
If that answer is unknown, the migration should remain in research/spike mode.
The custom canvas should be a Graphshell subsystem, not just "another widget."
-
Scene layer
- Graphshell-derived renderable scene data only
- nodes, edges, labels, overlays, thumbnails, diagnostics primitives
-
Camera/projection layer
- explicit camera state and projection modes
- begin with orthographic 2D, then support projected 2.5D / isometric, then true 3D only if justified
-
Interaction layer
- hit testing, lasso, drag, hover, camera navigation
- consumes input snapshots and emits intents
-
Render backend layer
- resource caches, GPU buffers, textures, pipelines
- texture or callback presentation path
-
Presentation bridge
- consumes pane rect, scale factor, input snapshot, and scene
- returns render output plus emitted intents/hover state
-
Diagnostics layer
- frame timings, draw counts, copy counts, fallback/degradation events
The custom canvas should not become a second app model. It must not own:
- semantic graph truth,
- node identity,
- selection truth,
- pane lifecycle policy,
- viewer routing,
- persistence,
- command semantics outside the canvas interaction contract.
The cleanest future seam is a Graphshell-owned interface such as:
GraphCanvasBackend
fed by Graphshell-owned inputs such as:
GraphCanvasSceneGraphCanvasInputGraphCanvasFrameConfig
This lets Graphshell prepare the seam now while still using the existing egui stack.
This is the core contract.
These are product authorities and should not be delegated:
- semantic graph state,
- node and edge identity,
- viewer ID selection,
- render mode policy,
- focus and active pane ownership,
- pane open/close/split semantics,
- tile lifecycle semantics,
- camera policy and persistence,
- selection truth,
- command interpretation,
- physics policy and persisted configuration,
- compositor scheduling and pass ordering,
- degradation/fallback policy,
- persistence and invariants.
If one of these is currently implicit in a widget callback or framework state object, that is a boundary leak.
These are acceptable framework responsibilities:
- widget layout,
- widget painting,
- immediate-mode UI composition,
- ephemeral hover/focus state local to a widget,
- raw input observation,
- tile rectangle computation,
- widget-local drag/hover geometry,
- graph widget-local render caches,
- per-frame temporary layout metadata used strictly as a backend bridge.
These are implementation mechanics, not product semantics.
At any point in time, it should be obvious:
- who owns the truth,
- who is allowed to mutate it,
- whether a framework value is authoritative or only a cache/bridge,
- whether a given callback is allowed to make semantic decisions or only emit events.
If that is unclear, the code will drift back toward dual-authority bugs.
- Document the ownership matrix in architecture docs.
- Add debug assertions for illegal state changes.
- Extract bridge modules from large integration files.
- Expand interaction conformance tests.
- Treat all framework callbacks as event sources, not semantic authorities.
Introduce Graphshell-owned interfaces such as:
GraphCanvasBackendTileLayoutBackendInteractionRouterFocusAuthorityWorkbenchScene
Then implement them using the current egui stack.
This gives Graphshell clean seams without an immediate rewrite.
The planned first replacement is:
- replace
egui_graphs, - keep
egui, - keep
egui_tileslonger.
This preserves the useful parts of the stack while moving Graphshell's most product-specific surface under full local control.
After the custom canvas seam is stable:
- prove the runtime viewer surface bridge,
- choose device ownership,
- select texture-vs-callback presentation,
- then migrate
egui_glow->egui_wgpu.
This sequencing isolates the real risk and avoids turning the backend swap into a blind dependency churn project.
Graphshell should stay on the current stack while:
- the main bugs are still boundary/ownership bugs in Graphshell code,
- the stack still accelerates delivery,
- framework constraints are annoying but not the dominant source of churn.
Graphshell should move toward a custom graph canvas when:
- core UX keeps being distorted to fit widget assumptions,
- adapter code becomes larger than the value the widget provides,
- framework workarounds dominate graph-canvas work,
- performance/rendering requirements clearly exceed what the widget path can do cleanly.
Graphshell should consider a fuller custom UI stack only if:
- even the chrome layer becomes blocked by framework constraints,
- or the maintenance cost of the entire stack outweighs its delivery advantage.
That threshold has not been reached yet.
Primary sources consulted on 2026-02-27:
-
eguidocs and repository: -
egui_tilesdocs and repository: -
egui_graphsdocs and repository: -
iceddocs and repository: -
xilemrepository and docs: -
makepadrepository: -
slintdocs:
These sources are useful for understanding current crate scope, intended use, and maturity. The Graphshell-specific recommendation in this document is an inference from those sources combined with the current code structure.