# Linear Time Complexity (chiba233/yumeDSL GitHub Wiki)
This document now stays within a single Markdown page: the front half is the theorem body for the current full-parse implementation, and the back half is an in-document appendix that preserves walkthroughs, branch inventories, historical paths, empirical constants, and incremental-parsing analysis. That keeps everything on one page without letting the theorem object drift into engineering commentary.
The main text proves exactly two claims, and only for the current implementation of the normal full-parse entry points.
Theorem S (parseStructural)
T_parseStructural(n) = Θ(n)
Theorem R (parseRichText, library-owned portion)
T_parseRichText, library-owned(n) = Θ(n)
If user-supplied handler work is included in the total wall-clock time, this document does not silently relabel that work as a library theorem. The precise total formula is instead:
T_parseRichText,total(n) = Θ(n + T_user-handlers(n))
- **Input size:** `n = text.length`, measured in UTF-16 code units.
- **Proof object:** the normal full-parse entry paths of `parseStructural(text, options)` and `parseRichText(text, options)`; not `parseIncremental(...)`, `session.applyEdit(...)`, or `session.applyEditWithDiff(...)`.
- **Current-implementation boundary:** the main text speaks only about the current structural / render / ownership / EOF-recovery implementation. Historical implementations and version evolution are moved to appendices instead of being interleaved with the theorem body.
- **Render boundary:** the `parseRichText` theorem counts library-owned render work on parser-produced structural trees: traversal, materialization, degradation, trimming, buffering, and token assembly performed by the library itself.
- **Explicit exclusions:** the complexity of user `handler.inline` / `raw` / `block` bodies, highly overlapping external trees manually injected by callers, and caller-driven output amplification are not smuggled into the library's Θ(n) guarantee.
Under that boundary, the main text does only four things: define the charging scheme, prove why the current implementation keeps those charged quantities linear, close the Θ(n) upper and lower bounds for parseStructural, and then lift the same boundary-aware argument to parseRichText.
The main text now leaves branch inventory behind and works directly with a cost decomposition. The point is to define summable library-owned quantities and then show why none of them can repeatedly charge the same source region without bound. Let the source length be n = |source|.
Let:
- `N_iter`: total iterations of the top-level `while (stack.length > 0)` loop;
- `B_skip`: total bytes skipped by `findNextBoundaryChar(...)` over ordinary text runs;
- `B_escape`: total bytes consumed by escape / literal-text branches;
- `B_head`: total bytes scanned by tag-head / end-tag / separator / shorthand-head recognition;
- `B_form`: total bytes scanned by form-classification scans such as `findTagArgClose(...)`;
- `B_close`: total bytes scanned by `raw` / `block` close finders;
- `B_probe`: total bytes touched by ownership checks, ancestor-owner comparisons, and close-ownership probes;
- `B_replay`: total bytes/work involved in EOF malformed-inline replay / downgrade / child attachment;
- `N_frame`: total parse-frame creation / pop / parent-child attachment operations.
Then the structural phase can first be written as:
T_structural(n)
= c0·N_iter
+ c1·B_skip
+ c2·B_escape
+ c3·B_head
+ c4·B_form
+ c5·B_close
+ c6·B_probe
+ c7·B_replay
+ c8·N_frame
Writing the formula is the easy part. The real proof obligation is to show that none of these terms can grow by repeatedly rescanning the same source region without bound.
findNextBoundaryChar(...) advances across maximal ordinary-text runs. Different loop iterations skip disjoint source intervals, so:
B_skip = Σ length(skipped_run_i) <= n
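The disjoint-runs argument can be made concrete with a minimal sketch. The boundary-character set and the helper below are hypothetical stand-ins for the library's actual `findNextBoundaryChar(...)`; the point is only that successive runs start where the previous one ended, so the skipped intervals never overlap:

```typescript
// Hypothetical sketch of the run-skipping accounting; not the library's actual code.
// A "boundary char" here is any character assumed able to start syntax.
const BOUNDARY = new Set(["(", ")", "[", "]", "\\"]);

// Advance from `i` to the next boundary character (or end of input).
function findNextBoundaryChar(text: string, i: number): number {
  let j = i;
  while (j < text.length && !BOUNDARY.has(text[j])) j++;
  return j;
}

// Sum the lengths of all skipped runs over one forward pass.
function totalSkipped(text: string): number {
  let i = 0;
  let skipped = 0;
  while (i < text.length) {
    const j = findNextBoundaryChar(text, i);
    skipped += j - i;                  // run [i, j) is charged exactly once
    i = j < text.length ? j + 1 : j;   // consume the boundary char itself
  }
  return skipped;
}
```

Since every run [i, j) is disjoint from every other, `totalSkipped(text)` can never exceed `text.length`, which is exactly the B_skip ≤ n claim.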
The escape branch either consumes escapeChar + nextChar as a pair or consumes the current byte as literal text. In either case frame.i moves forward, and bytes already consumed by the escape path are not re-consumed by that same escape path on the same frame. Hence:
B_escape = Σ length(escape_chunk_i) <= n
B_head is the term that most often triggers skepticism because tag-head, end-tag, separator, and shorthand recognition can look like repeated speculative probing. The summable invariant is:
- successful head recognition corresponds to a real finite head interval in the source;
- failed recognition in the current implementation either degrades immediately to text or advances `frame.i` past the current decision point, rather than staying pinned at the same head start and repeating an unbounded full scan;
- in the current shorthand-ownership design, push and close are split, so the same tail is no longer re-judged through repeated frame-to-frame full reclassification.
Lemma H (head-attempt charging). For any source position p, the number of times it can be charged into B_head is bounded by a constant.
The point is not to say only “it seems not to rescan much,” but to pin down the finite event classes that may happen after one head-attempt:
- **Successful recognition.** The head interval is immediately assigned to a created structure (full tag, shorthand child, or end-tag-like dispatch), so the current frame does not keep retrying the same start position in the same role without bound.
- **Failed recognition with forward progress.** After failure, the implementation either degrades immediately to text or advances `frame.i` to a strictly larger position, so the same frame cannot remain pinned at that head start and perform unbounded full retries.
- **Recovery replay.** A failed frame's tag head may be replayed back into the parent during recovery, but the replay object is exactly one failed-frame head, and each failed-frame head enters that replay accounting at most once.
Therefore each position p can be charged to only finitely many event classes: one normal recognition/failure path plus constant-many recovery replays. There is no path on which the same offset is fully charged by unboundedly many head-attempts. Hence:
B_head <= C_head · n = O(n)
Therefore B_head can be treated as the sum of finitely many head scans, with each head start triggering only constant-many bounded scans:
B_head = Σ length(head_scan_i) = O(n)
B_form is counted separately for form-classification scans such as findTagArgClose(...),
rather than being folded into B_close. The reason is semantic as well as analytical: this family
does not search for a raw/block closer, but classifies a newly encountered head once.
More precisely, each findTagArgClose(...) call is tied to the event “classify this newly
encountered head”:
- on success, that head immediately enters the corresponding structure-creation path;
- on failure, that head degrades or causes the current frame to advance;
- under the current implementation, the same frame does not relaunch an unbounded number of fresh form-classification windows from the same start position.
Hence:
B_form = Σ length(form_scan_i) = O(n)
It is not enough here to collapse every scan into one vague “close-search” bucket. The proof-relevant scanner families are:
- `findRawClose` for raw-content close search;
- `findBlockClose` for block-content close search, with its own depth/nesting state machine;
- the shorthand ownership probe, which is not counted in `B_close` at all but in `B_probe`.
Lemma C (close-scan family charging). For each of findRawClose and findBlockClose, a given window is not relaunched as a fresh full search unboundedly many times under the same parent-frame context.
Concretely:

- for one successful or failed `findRawClose` search, the internal scan pointer advances monotonically across the target raw window; after the search, the parent either advances to `nextI` or the structure degrades, so the same raw window is not reopened as an unbounded number of fresh full searches under the same parent frame;
- for one successful or failed `findBlockClose` search, the call may manage block depth and skip nested raw/inline regions, but the scan pointer inside that call is still monotone over the inspected block window; the current implementation does not relaunch the same block window as a fresh full search unboundedly many times under the same parent frame.
So the linearity of B_close is not “trust me, these scans are all roughly the same”; it is that the genuine close-finder families for raw/block windows separately satisfy the charging condition “window-local monotone advance + no unbounded relaunch.”
So:
B_close = Σ length(close_scan_i) = O(n)
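The "window-local monotone advance" condition can be sketched with a toy close-finder. The function name, escape handling, and close marker below are hypothetical simplifications; the real `findRawClose` / `findBlockClose` track more state. What matters is that the scan pointer only moves forward across the window, and a failed search returns -1 exactly once instead of being relaunched:

```typescript
// Hypothetical close-finder sketch: one monotone forward scan per window.
function findRawCloseSketch(
  text: string,
  from: number,
  close: string,
  escapeChar = "\\"
): number {
  let i = from;
  while (i < text.length) {
    if (text[i] === escapeChar) { i += 2; continue; } // escape pair: one jump
    if (text.startsWith(close, i)) return i;          // close found at i
    i++;                                              // pointer never moves back
  }
  return -1; // window exhausted: the caller degrades, never re-runs this window
}
```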
This subsection gives only the closure formula for recovery. The actual proof bridges are the support lemmas later in this section:
- **E1:** why the probe performed by `resolveShorthandOwnershipPush(...)` is amortized linear;
- **E2:** why close-site `defer-parent` is a one-shot downgrade rather than a multi-round rewind of the same suffix;
- **E3:** why EOF malformed-inline replay introduces only one extra recovery rescan.
`ancestorEndTagOwnerIndex`, `buildMalformedInlineReplayPlan(...)`, `replayMalformedInlineChainAtEof(...)`, and `downgradeInlineIntoParent(...)` matter here because they are the implementation facts that make E1/E2/E3 true; this paragraph is not intended to replace those lemmas with a summary slogan.
So the new recovery term looks like:
T_new-recovery(m)
= Θ(number_of_frames)
+ Θ(total_replayed_head_bytes)
+ Θ(total_child_attach_ops)
Each frame, replayed head segment, and child attachment is charged only once in that accounting, so they can be absorbed into:
B_probe + B_replay = O(n)
In an even more explicit summation form:
T_new-recovery(m) = Σ Θ(1) + Σ replay_bytes_i + Σ attach_i = O(m) <= O(n)
That is the formal source of why the current recovery term still folds into the linear upper bound.
Collecting the previous bounds gives:
T_structural(n)
= c0·N_iter
+ c1·B_skip
+ c2·B_escape
+ c3·B_head
+ c4·B_form
+ c5·B_close
+ c6·B_probe
+ c7·B_replay
+ c8·N_frame
with:
N_iter = O(n)
B_skip <= n
B_escape <= n
B_head = O(n)
B_form = O(n)
B_close = O(n)
B_probe = O(n)
B_replay = O(n)
N_frame = O(n)
Therefore there exists a constant C_s > 0 such that:
T_structural(n) <= C_s·n
And structural parsing is not o(n) either, because it must at least inspect the input's boundary-relevant characters and perform node/frame work proportional to the parse it constructs. So:
T_structural(n) = Θ(n)
Let:
- `N_render_iter`: iterations of the explicit render-stack main loop;
- `B_textbuf`: total text-like bytes touched by `bufferText(...)` / `flushTextBuf(...)`;
- `B_raw_unescape`: total bytes scanned while unescaping raw content inside `renderRawNode(...)`;
- `B_materialize`: total token/byte volume materialized into `TextToken[]` on unknown-tag passthrough / degrade paths;
- `B_trim`: total token/byte volume handled by trim helpers such as `trimBlockBoundaryTokens(...)`;
- `N_dispatch`: total inline/raw/block/text/comment dispatch operations;
- `T_user-handlers(n)`: extra work performed inside user-supplied handler bodies on this input of length `n`.
Then the render side can be written as:
T_render(n)
= d0·N_render_iter
+ d1·B_textbuf
+ d2·B_raw_unescape
+ d3·B_materialize
+ d4·B_trim
+ d5·N_dispatch
+ T_user-handlers(n)
The separation between library-owned work and user-handler work is essential; otherwise the claimed Θ(n) silently changes meaning.
- `N_render_iter = O(number_of_nodes) = O(n)`: each structural node is pushed, popped, and dispatched only finitely many times;
- `B_textbuf = O(n)`: text-like content enters the buffer and is flushed into tokens without unbounded repeated concatenation of the same library-owned text segment;
- `B_raw_unescape = O(n)`: `renderRawNode(...)` performs a single forward scan over raw content, and lazy `parts` allocation changes only constants;
- `B_materialize = O(n)`: full materialization of the same `TextToken[]` object happens only finitely many times; the current implementation tracks already-materialized arrays to avoid repeated full expansion of the same array object;
- `B_trim = O(n)`: trim helpers touch only boundary token regions, and the cumulative size of those regions is bounded by the total library-owned token volume.
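The B_textbuf accounting can be illustrated with a minimal buffer sketch. The `TextBuf` class below is a hypothetical stand-in; the library's actual `bufferText(...)` / `flushTextBuf(...)` internals may differ. The invariant it demonstrates is that each text chunk is appended once and joined into a token once, so total joined bytes never exceed total input text bytes:

```typescript
// Hypothetical sketch of text buffering for the B_textbuf = O(n) argument.
type TextToken = { kind: "text"; value: string };

class TextBuf {
  private parts: string[] = [];

  // O(1) amortized append; the chunk is stored, not re-concatenated.
  bufferText(chunk: string): void {
    this.parts.push(chunk);
  }

  // Joins the buffered chunks exactly once into a single token, then resets.
  flushTextBuf(out: TextToken[]): void {
    if (this.parts.length === 0) return;
    out.push({ kind: "text", value: this.parts.join("") });
    this.parts = []; // flushed chunks are never joined again
  }
}
```

Because a chunk participates in exactly one `join`, the sum of all flushed token lengths is bounded by the total text fed into `bufferText`, which is the linearity claim.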
Hence, when we count library-owned work only, we get:
T_render, library-owned(n)
= d0·N_render_iter
+ d1·B_textbuf
+ d2·B_raw_unescape
+ d3·B_materialize
+ d4·B_trim
+ d5·N_dispatch
<= C_r·n
so:
T_render, library-owned(n) = O(n)
And because render must still traverse a linear-size portion of nodes/output on the library-owned side, under the normal assumption that library-owned output volume is linear in input size, we also have:
T_render, library-owned(n) = Θ(n)
So the complete, non-slogan formula is:
T_parseStructural(n) = T_structural(n) = Θ(n)
T_parseRichText, library-owned(n)
= T_structural(n) + T_render, library-owned(n)
= Θ(n) + Θ(n)
= Θ(n)
If user-handler complexity is included, the precise statement must instead be:
T_parseRichText,total(n) = Θ(n + T_user-handlers(n))
That is why this page keeps insisting on library-owned Θ(n): not to dodge the hard part, but to separate the linear guarantee owned by the library from the extra complexity deliberately introduced by the caller.
To avoid reading the same Θ(n) claim as “two separate proofs,” the dependency order of the main text should be fixed explicitly:
- the main proof spine is the charging scheme in the previous section, where `T_structural(n)` and `T_parseRichText, library-owned(n)` are decomposed into summable cost terms;
- the next section is not a second independent proof; it is a support-lemma layer whose only purpose is to prove that those charged terms are each O(n);
- the theorem closes only once: by substituting the bounds established below back into the charging scheme above.
More concretely, the support obligations map to the earlier symbols as follows:
- invariant I₁ and the main-loop iteration analysis support `N_iter = O(n)` and the claim that source-consumption terms do not overlap without bound;
- the auxiliary-scan covering lemma and the branch-level analysis support `B_form = O(n)` and `B_close = O(n)`, together with `B_head = O(n)` for the structural scanning terms;
- the recovery-overhead lemmas support `B_probe = O(n)` and `B_replay = O(n)`;
- the render-side linear-summation argument supports `T_render, library-owned(n) = O(n)`.
So the intended reading order is not “one proof in the first half, another proof in the second half,” but:
charging scheme
-> support lemmas for each charged term
-> substitute bounds back into the total formula
-> close the theorem
The intuition "every character is scanned once" needs a tighter statement. Here is the proof sketch, derived directly from the source.
The task of this section is not to re-prove the whole theorem from scratch. It is to discharge the upper-bound obligations required by the charging scheme above.
The structural parser (`parseNodesWithFactory`, structural.ts line 624+) maintains an explicit `stack: ParseFrame[]`. Each iteration of the main `while (stack.length > 0)` loop does exactly one of:

- **Frame completion** — `frame.i >= frame.textEnd`: pops the frame (no character consumed).
- **Character consumption** — all other branches: advances `frame.i` by ≥ 1.
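The two iteration kinds can be sketched as a stripped-down main loop. The `Frame` shape below is hypothetical and far smaller than the real `ParseFrame`; it keeps only the fields needed to show that every iteration either pops or advances, so the loop always terminates:

```typescript
// Skeletal sketch of the two iteration kinds; not the library's real loop.
interface Frame {
  i: number;          // cursor into the source
  textEnd: number;    // exclusive scan boundary for this frame
  consumed: string[]; // characters this frame consumed (for illustration)
}

function runMainLoopSketch(text: string): string[] {
  const root: Frame = { i: 0, textEnd: text.length, consumed: [] };
  const stack: Frame[] = [root];
  while (stack.length > 0) {
    const frame = stack[stack.length - 1];
    if (frame.i >= frame.textEnd) {
      stack.pop();                       // (1) frame completion: no char consumed
      continue;
    }
    frame.consumed.push(text[frame.i]);  // (2) character consumption...
    frame.i++;                           // ...always advances by ≥ 1
  }
  return root.consumed;
}
```

Every iteration either shrinks the stack or strictly advances a cursor bounded by `textEnd`, which is the skeleton of the N_iter = O(n) bound.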
**Definition.** A frame f is called a *successful frame* if it completes via the A2 path (`flushBuffer` + `stack.pop()` + `completeChild`, structural.ts lines 648–652) rather than via the A1 error-recovery path (`INLINE_NOT_CLOSED` / `SHORTHAND_NOT_CLOSED`, lines 629–646). The root frame (`returnKind === null`) always completes via A2.

**Definition.** The *consumed range* of a frame f is the interval [startPos, endPos) that f actually scans through character-consuming iterations (startPos = initial value of `f.i`, endPos = value of `f.i` when f completes or is popped).

**Invariant I₁ statement.** At the end of `parseNodesWithFactory` execution, the consumed ranges of all successful frames form a partition of [0, n) — i.e., (a) their union = [0, n), and (b) they are pairwise disjoint.
Strong induction on the total number of main-loop iterations T.
Base case (T = 1). The input is the empty string (n = 0). The root frame is created with
frame.i = 0 = textEnd, immediately triggering A2 completion. The sole successful frame's
consumed range is [0, 0) = ∅, consistent with [0, 0).
Inductive step. Assume I₁ holds for all executions with T′ < T iterations. Consider which branch the top-of-stack frame enters at iteration T. We analyze all 8 branches (A–H):
(A) Frame completion — `frame.i >= frame.textEnd` (lines 628–652).

- **A1** (unclosed inline error recovery, lines 629–646): The frame is NOT a successful frame; its consumed range is discarded. `appendBuf(parent, frame.tagStartI, frame.argStartI)` appends the tag-head text to the parent's buffer; `parent.i = frame.argStartI` makes the parent resume scanning from `argStartI`. This means the region `[argStartI, textEnd)` that the frame had scanned will be re-scanned by the parent (or its subsequent children). Since the frame itself is not a successful frame, this overlap does not violate I₁. But it introduces extra iterations — see Error recovery overhead below.
- **A2** (normal completion, lines 648–652): The frame is a successful frame. `completeChild` sets `parent.i` to the right endpoint of the child frame's consumed range:
  - inline: `parent.i = closeStart + child.inlineCloseWidth` (line 370). For full DSL inline, `inlineCloseWidth = endTag.length`; for normal shorthand close, `inlineCloseWidth = tagClose.length`. Under full-form close competition, shorthand does not close with width = 0; it yields via ownership arbitration (degrades and rewinds to `argStart`).
  - rawArgs / blockArgs / blockContent: `parent.i` is already set to `nextI` by the respective push path (lines 609, 670), or the child frame's `[contentStart, closeStart)` naturally does not extend beyond the parent's `textEnd`.

  Thus the parent continues from where the child stopped — no overlap, no gap.
(B) Escape sequence (lines 659–672). `frame.i = next`, where `next > i` (`readEscapedSequence` advances by ≥ 1). Single-frame advancement, no frame operations.

(C) Inline frame argClose detection (lines 675–836). Fires only when `inlineCloseToken !== null`. The frame model uses two fields — `inlineCloseToken` (the token that closes this frame) and `inlineCloseWidth` (how many source chars to consume on close). Sub-branches:

- **C1** (shorthand frame, `inlineCloseToken === tagClose`, lines 678–684): It first runs same-position ownership arbitration: if the cursor is also a full-form `endTag` for the parent, the shorthand yields and degrades; otherwise `startsWith(tagClose, i)` closes the shorthand frame with `inlineCloseWidth = tagClose.length`, then `flushBuffer` + `stack.pop()` + `completeChild`. Still O(1) per cursor, no per-level rescan.
- **C2** (full DSL frame, endTag match → inline close, lines 689–695): `startsWith(endTag, i)` → `inlineCloseWidth = endTag.length`, `flushBuffer` + `stack.pop()` + `completeChild`. `completeChild` sets `parent.i = closeStart + inlineCloseWidth`.
- **C3** (rawOpen `)%` match, lines 638–697): After finding the raw close, `parent.i = nextI` (line 609), where `nextI = closeStart + rawClose.length`. The child frame consumes `[argStart, argClose)`. The raw content range `[contentStart, closeStart)` is not character-consumed by any frame (it is passed directly as a literal to `factory.raw`). The parent skips the entire construct. If `findRawClose` returns -1 (line 565), the degraded path treats the tag head + rawOpen as text, `parent.i = contentStart`.
- **C4** (blockOpen `)*` match, lines 699–770): Similar to C3. `parent.i = nextI = closeStart + blockClose.length` (line 670). A blockContent child frame is pushed with `textEnd = closeStart`, non-overlapping with the parent.
- **C5** (ordinary `)` with no recognized suffix, lines 772–775): `frame.i += tagClose.length`. Single-frame advancement.
(D) Unexpected endTag in non-inline frame (lines 841–845). `frame.i += endTag.length`. Pure text consumption, single-frame advancement.

(E) Pipe separator (lines 850–858). `frame.i += tagDivider.length`. Fires only in insideArgs frames. Single-frame advancement.

(F) No tag match / ordinary character (lines 862–870). When `readTagStartInfo` returns null:

- If the frame is inline (`inlineCloseToken !== null`), `readInlineShorthandStart` is tried. If a shorthand match is found and gating allows it, `tryPushInlineShorthandChild` pushes a child frame with `inlineCloseToken = tagClose`, `implicitInlineShorthand = true` — same push semantics as H1 but for the shorthand form. If no shorthand match, it falls through.
- Otherwise: `appendBuf(frame, i, i + 1)`, `frame.i++`. The simplest branch — consumes 1 character.
(G) Depth limit degradation (lines 876–890). `frame.i = degradedEnd`, where `degradedEnd = skipTagBoundary(…)` returns the position after the tag boundary. Single-frame skip over the entire tag head. No child frame pushed.
(H) Tag head recognized, form determination and dispatch (lines 799–904).

- **H1** (nested tag inside inline frame, lines 892–918): When a full DSL tag head is detected inside an inline frame (`inlineCloseToken !== null`):
  - The current implementation does not close shorthand here via `inlineCloseWidth = 0`. Competition is resolved at close-token positions (C1/C2), not at tag-head detection.
  - If `tryPushInlineChild` succeeds, it pushes a child frame (`child.i = argStart`, `child.textEnd = frame.textEnd`) and does NOT call `getTagCloserType`. The parent's `frame.i` does not advance — the child takes over. Upon child completion, `completeChild` sets `parent.i`. If `tryPushInlineChild` is rejected by gating, the tag head is skipped via `skipTagBoundary` and treated as text.
- **H2** (non-inline frame, `getTagCloserType` returns null, lines 923–941): `findTagArgClose` found no matching `)`. The entire tag head degrades to text: `appendBuf(frame, i, info.argStart)`, `frame.i = info.argStart`. `frame.i` advances by ≥ 1 (`info.argStart > i` because the tag name is ≥ 1 char + `(`).
- **H3** (inline form, `closerInfo.closer === endTag`, lines 938–952): Same push logic as H1.
- **H4** (raw form, lines 946–995): `findRawClose` locates the closing marker. `frame.i = nextI` (line 830); pushes a rawArgs child frame with `textEnd = argClose`.
- **H5** (block form, lines 998–1060): Similar to H4. `frame.i = nextI` (line 892); pushes blockArgs then blockContent child frames. All child frame ranges stay within the parent's `textEnd`.
Synthesis. Across all branches:

- When a child frame is pushed, its `[startPos, endPos)` is always a sub-interval of the parent's not-yet-consumed range.
- When a child completes, `completeChild` or the push path sets `parent.i` past the child's consumed range.
- Therefore parent and successful child consumed ranges do not overlap and have no gaps.

The root frame starts at 0 with `textEnd = n`. All successful frames' consumed ranges form a partition of [0, n). ∎
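Invariant I₁ is mechanically checkable. The helper below is a hypothetical verification aid, not library code: given the consumed ranges of all successful frames, it confirms they are pairwise disjoint, gap-free, and cover exactly [0, n):

```typescript
// Hypothetical checker for invariant I₁: consumed ranges [start, end) of
// successful frames must partition [0, n).
type Range = [start: number, end: number];

function isPartition(ranges: Range[], n: number): boolean {
  const sorted = [...ranges].sort((a, b) => a[0] - b[0]);
  let cursor = 0;
  for (const [s, e] of sorted) {
    if (s !== cursor || e < s) return false; // gap, overlap, or inverted range
    cursor = e;
  }
  return cursor === n; // the union must reach exactly n
}
```

Empty ranges like [3, 3) are allowed, matching the base case where the root frame on empty input consumes [0, 0).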
Note. This proof does not cover `findTagArgClose` / `findBlockClose` / `findInlineClose` auxiliary boundary scans — they access the source outside the main loop. Their complexity is analyzed separately in the Auxiliary scan covering lemma.
This section has to be read together with the 1.4.4 changelog, because this is exactly where the
newly documented complexity escape hatch used to live.
Before 1.4.4, the most dangerous path was not ordinary well-formed inline nesting. It was the
combination of broken shorthand / unclosed inline chains with ownership competition and EOF
recovery, producing a pattern like “pop one frame → return to the parent → re-enter the main loop
→ discover the next layer is still broken”.
The problem was not that one individual branch was intrinsically expensive. The problem was that the same remaining suffix could be dragged back into the main loop repeatedly across the nested chain, so the accumulated work could form a triangular series:
k + (k - 1) + (k - 2) + ... + 1
That is O(k^2).
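The quadratic blow-up is just the triangular number: if a broken chain of depth k re-enters a suffix of length j at layer j, the accumulated work is k(k+1)/2. A trivial numeric check of that identity:

```typescript
// Numeric check of the pre-1.4.4 accounting model: layer j of a broken chain
// of depth k re-charges a suffix of length j, summing to k(k+1)/2.
function triangular(k: number): number {
  let total = 0;
  for (let j = 1; j <= k; j++) total += j; // k + (k - 1) + ... + 1, reordered
  return total;
}
```

By contrast, the post-1.4.4 single-replay accounting charges each layer a constant amount, i.e. Θ(k) for the whole chain.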
What 1.4.4 does is remove the two multipliers that made that escape hatch real:
- **Nearest ancestor full-form close ownership is now carried explicitly on each frame.** `ParseFrame.ancestorEndTagOwnerIndex` lets shorthand logic reach the nearest ancestor `endTag` owner in O(1) during both push and close arbitration. It no longer needs to "re-discover" that owner indirectly by falling back into the parent and re-observing the same close position.
- **Malformed EOF inline chains are now planned once and replayed once.** `buildMalformedInlineReplayPlan(...)` computes the whole malformed chain first, and `replayMalformedInlineChainAtEof(...)` emits errors, pops the chain, and performs only one recovery resume.
So after 1.4.4, the right mental model is no longer “every broken layer may drag the same tail
back into the main loop again”. It is:
- shorthand candidates that can be rejected at push time are rejected at push time,
- shorthand closes that must yield to an ancestor full-form close are downgraded at the close site,
- malformed EOF chains are replayed in one batch,
- the parser no longer performs frame-by-frame re-entry across the same suffix.
The three recovery costs should therefore be analyzed separately.
tryPushInlineShorthandChild(...) no longer relies on the older monolithic
resolveShorthandOwnership(...) shape. It delegates directly to
resolveShorthandOwnershipPush(...) in structuralOwnership.ts.
That helper does three conceptually different things:
- **O(1) owner checks first**
  - if the current frame itself is the full-form owner and `argStart` overlaps a same-cursor `endTag`, it immediately returns `defer-parent`;
  - if the nearest ancestor owner matches `endTag` at that same point, it also immediately returns `defer-parent`.
- **A forward probe only when the current frame itself is a full-form owner**
  - it scans rightward from `argStart`;
  - escape sequences are skipped in one jump;
  - a full DSL tag head stops the probe with `reject = false`;
  - a `tagClose` stops the probe and then checks whether that position is also the start of `endTag`.
- **Probe results are cached in `shorthandProbe`**
  - the cache key is effectively "this frame under this `textEnd` scan boundary, over this `[startI, boundaryI]` window", not a global free-floating position;
  - later shorthand candidates inside the same reusable window reuse the cached result instead of probing the same window from scratch again.
So the key fact here is not “one probe may scan several characters”. The key fact is: the same window is not rebuilt over and over.
For a fixed frame and fixed textEnd, a source position is either:
- visited during the first window-building probe, or
- covered by later reuse of `currentProbe`.
Therefore the total push-time shorthand probing cost is the sum of first-time window constructions, which is linear under amortization, not “scan the same tail again for every shorthand candidate”.
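The amortization can be sketched with a toy cache. The shapes below are hypothetical simplifications of the real `shorthandProbe` state (the real verdict and cache key are richer): the first candidate in a window pays for the forward scan, and later candidates inside the same `(frame, textEnd)` window hit the cache in O(1):

```typescript
// Hypothetical sketch of the probe-window cache; not the library's actual code.
interface ProbeCache {
  startI: number;    // left edge of the probed window
  boundaryI: number; // right edge (the frame's textEnd scan boundary)
  reject: boolean;   // cached probe verdict
}

let probeScans = 0; // counts characters actually visited by probes

// First-time window construction: one monotone forward scan.
function probeWindow(text: string, startI: number, boundaryI: number): boolean {
  for (let i = startI; i < boundaryI; i++) probeScans++;
  return false; // placeholder verdict (reject = false)
}

function resolveWithCache(
  text: string,
  startI: number,
  boundaryI: number,
  cache: { current?: ProbeCache }
): boolean {
  const c = cache.current;
  // Reuse in O(1) when the candidate falls inside the already built window.
  if (c && c.startI <= startI && boundaryI <= c.boundaryI) return c.reject;
  const reject = probeWindow(text, startI, boundaryI);
  cache.current = { startI, boundaryI, reject };
  return reject;
}
```

In this model, total probe cost is the sum of first-time window constructions, which is the amortized-linear claim for T_probe.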
The other crucial 1.4.4 change is downgradeInlineIntoParent(...).
Its current semantics are:
- `flushBuffer(frame)`: finalize any text still buffered in the child frame;
- `appendBuf(parent, frame.tagStartI, frame.argStartI)`: replay the shorthand / inline tag head as plain parent text;
- `flushBuffer(parent)`: stabilize the parent's current text state;
- `appendNodesWithMergedText(parent.nodes, frame.nodes)`: attach the already parsed child nodes directly to the parent;
- `parent.i = nextParentI`: continue in the parent from the current close site onward.
This is fundamentally different from the old “rewind to argStartI and re-scan the whole tail”
shape.
- Old reading: the child frame's already parsed body is effectively thrown away, and the parent has to reconstruct the suffix again.
- Current reading: the tag head is degraded to text, but the child frame's already established body nodes are preserved and attached upward; the parent continues from `nextParentI`.
So for each shorthand / inline frame that yields via defer-parent:
- the frame can degrade at most once,
- once degraded it leaves the active stack and does not compete for ownership again,
- its already produced `frame.nodes` are moved upward only once.
If m_deg is the number of degraded frames in a parse and N_deg_nodes is the total number of
nodes attached upward through this path, then:
T_defer-parent = O(m_deg + N_deg_nodes)
and both terms are ≤ O(n), so close-site ownership yielding remains O(n) overall.
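The one-shot nature of the downgrade can be sketched as a single node transfer. The `MiniFrame` shape below is hypothetical; the real `downgradeInlineIntoParent(...)` also merges adjacent text nodes and handles more position state. The point is that the child's parsed nodes move upward exactly once and the frame never re-enters ownership competition:

```typescript
// Hypothetical one-shot downgrade sketch mirroring the five steps above.
interface Node { text: string }
interface MiniFrame {
  nodes: Node[];
  buf: string;
  i: number;
  tagStartI: number;
  argStartI: number;
}

function downgradeIntoParentSketch(
  source: string,
  frame: MiniFrame,
  parent: MiniFrame,
  nextParentI: number
): void {
  // flushBuffer(frame): finalize the child's buffered text.
  if (frame.buf) { frame.nodes.push({ text: frame.buf }); frame.buf = ""; }
  // appendBuf(parent, tagStartI, argStartI): tag head becomes parent text.
  parent.buf += source.slice(frame.tagStartI, frame.argStartI);
  // flushBuffer(parent): stabilize the parent's text state.
  if (parent.buf) { parent.nodes.push({ text: parent.buf }); parent.buf = ""; }
  // Attach the child's already parsed nodes upward, exactly once.
  parent.nodes.push(...frame.nodes);
  frame.nodes = []; // the degraded frame never competes again
  // Continue in the parent from the close site.
  parent.i = nextParentI;
}
```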
When `tryFinalizeFrameAtEof(...)` sees `frame.inlineCloseToken !== null`, it no longer handles the situation as "current frame errors → pop one frame → return to the main loop". It directly calls `replayMalformedInlineChainAtEof(frame)`.
That path has three distinct phases:

- **Build the replay plan**
  - `buildMalformedInlineReplayPlan(startFrame, getFrameByParentIndex)` walks only the parent chain;
  - it does not scan source characters;
  - therefore its cost is O(h), where `h` is the length of the malformed inline chain.
- **Emit errors and pop the whole chain**
  - each frame in `replayPlan.chain` emits one error;
  - then each is popped from the stack;
  - again this is O(h), proportional only to chain length.
- **Perform a single recovery resume**
  - if a first non-inline parent exists above the chain, the current implementation only replays the outermost degraded tag head from the plan: `appendBuf(parent, resumeTagStartI, resumeArgStartI)`;
  - then it sets `parent.i = resumeArgStartI`;
  - then the parent continues.
The key point is: the whole malformed chain causes only one recovery resume scan.
EOF happens at most once in a parse call, so the cost of “resume from resumeArgStartI and scan
what remains” is bounded by:
T_eof-rescan ≤ parent.textEnd - resumeArgStartI ≤ n
It is no longer multiplied by chain length, and it no longer forms the frame-by-frame triangular re-entry series.
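The single-resume shape can be sketched as a pure parent-chain walk. The frame records and return shape below are hypothetical (the real `buildMalformedInlineReplayPlan(...)` also records error kinds and more resume state); what the sketch shows is that plan construction touches no source characters, so its cost is O(h) in chain length:

```typescript
// Hypothetical sketch of the one-pass replay plan: walk only the parent chain.
interface ChainFrame {
  parentIndex: number | null; // index of the parent frame, null for the root
  inline: boolean;            // is this an (unclosed) inline/shorthand frame?
  tagStartI: number;
  argStartI: number;
}

function buildReplayPlanSketch(
  frames: ChainFrame[],
  startIndex: number
): { chain: number[]; resumeTagStartI: number; resumeArgStartI: number } | null {
  const chain: number[] = [];
  let idx: number | null = startIndex;
  // Walk upward while frames are still inline (malformed chain members).
  while (idx !== null && frames[idx].inline) {
    chain.push(idx);
    idx = frames[idx].parentIndex;
  }
  if (chain.length === 0) return null;
  // Only the outermost malformed frame's tag head is replayed, once.
  const outer = frames[chain[chain.length - 1]];
  return { chain, resumeTagStartI: outer.tagStartI, resumeArgStartI: outer.argStartI };
}
```

Emitting one error per chain entry and performing one resume from `resumeArgStartI` then costs O(h) plus a single ≤ n rescan, matching the accounting above.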
For a single parseNodesWithFactory(...) call, the extra work introduced by inline / shorthand
ownership and EOF recovery satisfies:
T_recovery-extra(n)
= T_probe + T_defer-parent + T_replay-plan + T_eof-rescan
≤ O(n) + O(n) + O(h) + O(n)
= O(n)
where h is the malformed inline-chain length at EOF, and h ≤ the number of live frames ≤ O(n).
Proof ingredients.
- `T_probe`: first-time probe-window construction in `resolveShorthandOwnershipPush(...)`, linear under amortization;
- `T_defer-parent`: each yielding frame degrades once, and its already produced nodes are attached upward once;
- `T_replay-plan`: parent-chain walk only, no source scan;
- `T_eof-rescan`: one resume after replay, not one resume per broken layer.
So inline / shorthand malformed recovery no longer creates quadratic re-entry. ∎
For the main while loop inside parseNodesWithFactory, the post-1.4.4 count is best split into
four parts:
- **Successful-frame source-consuming iterations ≤ n**
  - directly from invariant I₁.
- **Bulk non-boundary text-run append length ≤ n**
  - since 1.4.3, plain text no longer enters the full branch tree one character at a time;
  - but the sum of all run lengths is still bounded by total source length.
- **Extra ownership / downgrade / EOF-replay work ≤ c_err · n**
  - from Lemma E₁ above;
  - importantly, `c_err` is now an implementation constant, not a frame-by-frame multiplier such as 2^D from the old re-entry reading.
- **Stack operations: frame completion / push / pop / child attachment ≤ O(number of frames) ≤ O(n)**
  - each frame corresponds to at least one recognized structural fragment or tag head;
  - push, pop, `completeChild(...)`, and `appendNodesWithMergedText(...)` all run linearly in those frame / node counts.
So the combined loop-and-recovery bound is:
T_loop(n) ≤ c_scan · n + c_err · n + c_stack · n
= C_loop · n
where C_loop depends on syntax density, input shape, and implementation constants, but not on n.
For well-formed input, c_err is close to 0.
For malformed input that used to trigger the pre-1.4.4 bad path, c_err is still a constant, rather than a multiplier that grows with re-entry depth.
Root / block frames call getTagCloserType, which in turn calls findTagArgClose to classify tag
form. Since 1.4.4, shorthand ownership probing must be included in the auxiliary-scan story as
well; otherwise the document would still leave open the question “is the main loop linear while
probe / replay work quietly explodes off to the side?”.
So this section should distinguish two auxiliary-scan families:

- root / block form-classification scans: `findTagArgClose`, `findBlockClose`, `findRawClose`
- inline shorthand ownership scans: the forward probe in `resolveShorthandOwnershipPush(...)`
Lemma A₁. For one complete parse, total auxiliary scan work satisfies:
T_aux(n) = T_form-scan(n) + T_shorthand-probe(n) ≤ c_aux · n
where c_aux depends on syntax and input shape, but not on n.
Proof ingredients.

- `findTagArgClose` / `findBlockClose` / `findRawClose` still run only in root / block-level form classification, not once per inline nesting level;
- after successful form classification, either a child consumes that region or the parent advances its `nextI`, so those scan intervals are absorbed by subsequent consumption / skipping;
- shorthand ownership probing is already covered by Lemma E₁ as linear under amortization, because each `(frame, textEnd)` window is built once and then reused;
- the pre-1.4.4 multiplier ("auxiliary scan itself is cheap, but frame-by-frame replay multiplies it") is gone; the relationship is now additive, not multiplicative.
Therefore total auxiliary scan work still fits c_aux · n. ∎
For 1.4.3+, per-iteration work is best split into three buckets:

- Boundary-position branch cost
  - when execution actually enters the branch cascade, it runs a fixed set of `startsWith` / equality checks;
  - under the default syntax this is still well approximated by a bounded constant k.
- Bulk text-run append cost
  - `appendBuf(...)` should no longer be described as "always O(1)";
  - more precisely, one invocation is O(batch size), where batch size is the distance to the next boundary returned by `findNextBoundaryChar(...)`;
  - but the sum of all such batch sizes is still ≤ n on the successful path.
- Auxiliary scan and cache cost
  - non-inline form detection may still perform `findTagArgClose` scans;
  - 1.4.3 adds `getTagCloserTypeWithCache(...)` for large windows, but its cache lifetime is only a single parse call;
  - escape-token matching now pays a one-time per-`SyntaxConfig` preparation cost, then checks only the same-lead candidate bucket on subsequent uses.
So the parser is no longer best described as "O(1) helper work per consumed character". The more accurate statement is:
- successful-frame consumption + bulk text append = O(n)
- root/block auxiliary scans = amortized O(n)
- cache construction changes constants, not the asymptotic bound
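The bulk-append accounting above can be sketched with a toy boundary scanner. This is a minimal sketch with hypothetical names and a simplified, frame-independent boundary set — not the library's actual `findNextBoundaryChar`, which also handles escape tokens, the current `inlineCloseToken`, and shorthand start characters:

```typescript
// Toy model of the 1.4.3 fast skip: advance to the next boundary character,
// append the whole non-boundary run in one step, and count total work.
const BOUNDARY = new Set(["$", ")", "|", "\\"]); // simplified boundary set

function findNextBoundary(text: string, from: number): number {
  let i = from;
  while (i < text.length && !BOUNDARY.has(text[i])) i++;
  return i; // index of the next boundary, or text.length
}

function scan(text: string): { batches: number[]; touches: number } {
  const batches: number[] = [];
  let touches = 0;
  let i = 0;
  while (i < text.length) {
    const b = findNextBoundary(text, i);
    if (b > i) {
      batches.push(b - i); // one appendBuf-style bulk append, O(batch size)
      touches += b - i;
      i = b;
    } else {
      touches += 1; // boundary-site decision (the branch cascade in the real parser)
      i += 1;
    }
  }
  return { batches, touches };
}

const { batches, touches } = scan("hi $$b(world)$$!");
// batches = [3, 7, 1], touches = 16 = n: every position is touched exactly
// once, which is the Σ(batch sizes) + (boundary decisions) ≤ n argument.
console.log(batches, touches);
```

Each invocation is O(batch size), but the batches and boundary decisions partition the input, so the total stays linear regardless of how the runs are distributed.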
Combining the components, we can write:
T_structural(n) ≤ k · n + s · n
= C₁ · n
where:

- `k` is the main-loop per-character branch budget (empirically about 3–15 under default syntax),
- `s` is the amortized constant from auxiliary scans such as root/block-level form detection and block/raw closing-boundary searches.
So C₁ = k + s is a constant that depends on the implementation and input shape, but not on n.
For the render layer, the core traversal (renderNodes) visits each IndexedStructuralNode
exactly once. If handlers further call materializeTextTokens, then under WeakSet-based skipping,
each same-reference TextToken[] array is fully materialized at most once within one render pass.
Under a fixed handler implementation where single-handler output size remains linearly bounded by the
input, we have nodes ≤ O(n) and tokens ≤ O(n), so:
T_render(n) ≤ C₂ · n
Preconditions. The bound above relies on the following constraints on handler implementations:

- Per-handler output size ≤ α · (source span of the input node), where α is a constant depending on the handler. This guarantees total handler output ≤ α · n.
- Handlers do not perform super-linear external operations (e.g., full-text search, sorting). If a handler contains O(n log n) or higher logic, T_render's bound increases accordingly.
- `materializeTextTokens`'s WeakSet deduplication is keyed by array identity: the same `TextToken[]` reference is materialized at most once within one render pass. Distinct arrays with equivalent logical content are still materialized separately.

These conditions are satisfied by all built-in yumeDSL handlers. Third-party custom handlers must ensure them independently.
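The identity-keyed dedup can be sketched as follows. This is a hypothetical minimal shape (the real `materializeTextTokens` takes more context and does real materialization work); it only demonstrates the WeakSet-based "same reference, at most once per pass" guarantee:

```typescript
type TextToken = { type: "text"; value: string };

// Toy model of WeakSet-based materialization skipping: a given TextToken[]
// *reference* is expanded at most once per render pass.
function makeMaterializer() {
  const materialized = new WeakSet<TextToken[]>();
  let expansions = 0;
  return {
    materialize(tokens: TextToken[]): void {
      if (materialized.has(tokens)) return; // same array object: skip
      materialized.add(tokens);
      expansions += 1; // stand-in for the real per-array materialization work
    },
    get expansions() {
      return expansions;
    },
  };
}

const pass = makeMaterializer();
const shared: TextToken[] = [{ type: "text", value: "hi" }];
pass.materialize(shared);
pass.materialize(shared); //      deduped: same array object
pass.materialize([...shared]); // NOT deduped: distinct array, equal content
console.log(pass.expansions); // 2
```

Note the third call: content equality does not dedupe, only reference identity does — which is exactly why the precondition above is phrased in terms of array identity.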
Therefore:
T_parseRichText(n) = T_structural(n) + T_render(n) ≤ (C₁ + C₂) · n = C · n
This gives the formal T(n) ≤ C · n bound.
The above proves T(n) ≤ C · n (upper bound). For completeness, we show T(n) = Ω(n) (lower bound) also holds, yielding T(n) = Θ(n).
Proposition. For any input of length n > 0, parseRichText time T(n) ≥ n.
Proof. The root frame of parseNodesWithFactory has textEnd = n. The root frame (or its
descendant frames) must consume every position in [0, n) — otherwise some source content would
not appear in the output AST (the root frame's textEnd covers the full text, and I₁ guarantees
all positions are consumed by successful frames).
But the lower bound can no longer be phrased as “each character-consuming iteration advances only a
constant number of positions,” because 1.4.3+ fast skip may advance across an entire non-boundary
text run in one iteration. The correct lower bound here is a source-touch lower bound:
- for any position p ∈ [0, n), either it is ultimately consumed by some successful frame, or it lies inside a non-boundary text run scanned by `findNextBoundaryChar(...)`;
- if p lies in a non-boundary run, the parser must at least read/compare that position in order to determine that it is not a current-frame boundary character and keep advancing the fast-skip pointer;
- if p is a boundary position, the parser must at least perform one boundary-site decision there to determine whether it is escape, tag head, close, separator, or ordinary text;
- therefore, regardless of which class p falls into, the current implementation incurs at least one library-owned primitive source touch on it (a read, comparison, or inclusion in a monotone scan).
Hence the total number of source touches is at least linear in n:
source_touches(n) >= n
and each primitive source touch has constant cost, so:
T(n) = Ω(n)
Combined: T(n) = O(n) ∧ T(n) = Ω(n) ⟹ T(n) = Θ(n). ∎
The in-document appendix below preserves the implementation walkthroughs, branch-coverage inventories, historical bad-path comparisons, empirical constants, and incremental-parsing analysis. It stays on the same page and supplies implementation locations, historical comparison, and engineering context, but it does not change the theorem object proved above.
The next section explains only why the historical recovery chain was quadratic. The main proof page now proves only the current implementation.
The quadratic term did not come from the mere existence of the main loop. It came from the historical malformed-shorthand / EOF-recovery path that repeatedly re-entered the same remaining suffix frame by frame, redoing ownership and close classification.
Let a toxic sample contain a malformed shorthand chain of depth m that must be downgraded or recovered at EOF. The old path can be abstracted as:
- the first fallback rechecks a remaining suffix of size about m;
- the second fallback rechecks a suffix of size about m-1;
- ...
- the last fallback rechecks a suffix of size about 1.
Then the extra recovery cost of the old behavior is:
T_old-recovery(m)
= Θ(m) + Θ(m-1) + ... + Θ(1)
= Σ_{k=1..m} Θ(k)
= Θ(m²)
If m scales with input size, that single recovery branch is enough to destroy the linear upper bound.
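The arithmetic behind that sum is just the triangular number; a quick illustrative check of the old cost model (not library code):

```typescript
// Old recovery cost model: the k-th fallback rechecks a suffix of size k,
// so total extra work is 1 + 2 + ... + m = m(m+1)/2 = Θ(m²).
function oldRecoveryCost(m: number): number {
  let total = 0;
  for (let k = 1; k <= m; k++) total += k; // Θ(k) recheck per fallback
  return total;
}

console.log(oldRecoveryCost(1000)); // 500500 — quadratic in m
// The post-1.4.4 path instead charges each degraded frame a constant, so the
// same chain costs c · m and stays inside the linear budget.
```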
The appendices preserve code-location walkthroughs, branch inventories, version evolution, historical bad paths, empirical constants, and incremental-parsing analysis. They are engineering context, not part of the theorem object's definition.
The diagram below lists every step parseRichText("...", options) actually takes,
based on the current implementation in parse.ts / resolveOptions.ts / structural.ts /
scanner.ts / render.ts / builders.ts. No ellipses, no "and so on" — every function
named here corresponds to a locatable piece of source.
1.4.3reading note. Since1.4.3, the structural main loop is no longer best read as "every character enters the full branch cascade". Before the cascade, it now runsfindNextBoundaryChar(...)and bulk-skips non-boundary text runs. The current boundary set includestagPrefix,tagClose,tagDivider(inside args only),escapeChar(when escapes are enabled for the frame), the currentinlineCloseToken, andtagName.isTagStartChar(...)when shorthand mode is active. So the chart below should be read as a boundary-triggered control-flow map, not a per-character branch tree.
flowchart TD
A[text + ParseOptions] --> B[parseRichText parse.ts]
B --> C{empty text?}
C -->|yes| Z[return empty array]
C -->|no| D[resolveParsePipelineBase parse.ts]
subgraph BASE[Base Resolve]
D --> D1[buildGatingContext resolveOptions.ts]
D --> D2[resolveBlockTags parse.ts]
D --> D3[resolveBaseOptions resolveOptions.ts]
D3 --> D4[resolve syntax/tagName/tracker]
end
D --> E[createId seed or options.createId]
E --> F[createRenderContextFromBase parse.ts]
F --> G[withLegacyAmbientState compat wrapper]
G --> H[parseStructuralWithResolved structural.ts]
subgraph STRUCT[structural.ts main path]
H --> H1[parseNodesWithFactory]
H1 --> H2[prepare syntax closures]
H2 --> H3[stack = makeFrame root]
H3 --> H4[while stack.length > 0]
H4 --> H4A[A frame-finish branch]
H4 --> H4B[B escape branch]
H4 --> H4C[C inline argClose branch]
H4 --> H4D[D unexpected endTag branch]
H4 --> H4E[E args separator branch]
H4 --> H4F[F tag-head branch]
H4 --> H4G[G inline shorthand branch]
H4 --> H4H[H text fallback branch]
H4F --> H5[getTagCloserType]
H5 --> H6[findTagArgClose scanner.ts]
end
H1 --> I[StructuralNode array]
subgraph SCAN[scanner.ts helpers]
I --> I1[findBlockClose]
I --> I2[findRawClose]
I --> I3[findInlineClose compat-only]
end
I --> J[renderNodes render.ts]
J --> J1[explicit stack traversal]
J1 --> J2[mergeTextToken / handler dispatch]
J2 --> J3[materializeTextTokens builders.ts]
J3 --> J4[WeakSet hit => skip repeated materialization]
J4 --> K[TextToken array]
The chart maps to concrete execution stages as follows:
- Entry stage (`parse.ts`)
  - `parseRichText(text, options)` first runs an empty-string short-circuit.
  - Non-empty input enters `resolveParsePipelineBase(text, options)`.
  - `createId` is resolved (`options.createId` or default `rt-${seed++}`).
  - `createRenderContextFromBase(...)` builds render context.
  - `withLegacyAmbientState(...)` wraps the structural + render pipeline for legacy ambient compatibility.
- Base resolve stage (`parse.ts` + `resolveOptions.ts`)
  - `buildGatingContext(handlers, allowForms)` splits handlers by form and prepares gating indices.
  - `resolveBlockTags(gating.handlers, options.blockTags)` derives block-tag lookup and applies explicit overrides.
  - `resolveBaseOptions(text, options)` resolves the final `syntax` / `tagName` / `tracker`.
- Structural entry (`structural.ts`)
  - `parseStructuralWithResolved(...)` calls `parseNodesWithFactory(...)`.
  - `parseNodesWithFactory` initializes closure-local syntax tokens and helper functions.
  - The parser uses an explicit `stack = [makeFrame(root)]` and iterates with `while (stack.length > 0)`.
- Structural main-loop branches (`A` to `H`)
  - `A` frame done: flush/pop/complete child or return root.
  - `B` escape sequence: emit `escape` node and advance.
  - `C` inline argClose: when `inlineCloseToken !== null`, detects the frame's close token. For shorthand frames (`inlineCloseToken === tagClose`), a bare `)` closes the frame. For full DSL frames, `)` triggers suffix-based dispatch: `)$$` → inline close, `)%` → raw form, `)*` → block form.
  - `D` unexpected `endTag` in a non-inline frame.
  - `E` args separator (`tagDivider`) handling.
  - `F` tag-head detection + form decision via `getTagCloserType` / `getTagCloserTypeWithCache` (`findTagArgClose` entry for non-inline frames only).
  - `G` inline shorthand recognition: when inside an inline frame and no tag-head matches, `readInlineShorthandStart` tries to match `name(` patterns for implicit inline shorthand.
  - `H` fallback text append / bulk non-boundary run append.
- Scanner helper stage (`scanner.ts`)
  - `findTagArgClose` for form decision.
  - `findBlockClose` for the block close boundary.
  - `findRawClose` for the raw close boundary.
  - `findInlineClose` remains a compatibility path.
- Render stage (`render.ts` + `builders.ts`)
  - `renderNodes(...)` uses explicit-stack traversal over structural nodes.
  - Node dispatch handles `text` / `escape` / `inline` / `raw` / `block` with handler calls.
  - `appendToken` merges adjacent text tokens.
  - `materializeTextTokens(...)` uses `WeakSet(materializedArrays)` to skip repeated subtree materialization.
- Output stage
  - `parseRichText(...)` returns `TextToken[]`.
  - `parseStructural(...)` returns `StructuralNode[]` at structural stage exit.
Four places in this diagram are worth a second look — the first three are exactly the spots that turned the pre-1.1.2 O(n²) into Θ(n), and the fourth bounds what the render-layer dedup does and does not guarantee:

- Branch C (inline argClose) is fully resolved inside the main loop. When it sees `)`, it checks the frame's `inlineCloseToken` to decide on the spot whether this is a shorthand close (token `)`, with ownership arbitration when a full-form close competes), or for full DSL frames checks the suffix (`$$` / `%` / `*`) to decide whether this is an inline close, a raw form, or a block form. There is no "scan ahead again to find the closer" step. Before 1.1.2 this used to call `findInlineClose` / `findTagArgClose`, re-sweeping the tail at every inline level — that was O(n·d) at depth d.
- `findTagArgClose` in Branch F still exists, but it only runs when a new tag header appears and only on non-inline frames. Since 1.4.3, large remaining windows may go through `getTagCloserTypeWithCache(...)`, which reuses closer decisions within the same parse call. This changes constants, not the asymptotic class.
- Escape lookup also changed in 1.4.3: escapable token sets are cached per `SyntaxConfig`, and candidate tokens are bucketed by first character. In practice this makes each escape attempt closer to "check only matching-lead candidates" rather than "scan every token".
- The render-layer WeakSet dedup only guarantees that the same `TextToken[]` array object is fully materialized at most once within a single render pass. Distinct arrays that contain the same logical subtree are still materialized separately.
If you want to see how this diagram plays out on a concrete input step by step, read on to "Quick tour 2".
1.4.3note. Since1.4.3, the main loop runsfindNextBoundaryChar(...)before entering the branch cascade, so consecutive non-boundary characters are bulk-advanced in a singleappendBufstep. The current boundary set includestagPrefix,tagClose,tagDivider(inside args only),escapeChar, the currentinlineCloseToken, andtagName.isTagStartChar(...)when shorthand mode is enabled. The table below intentionally preserves the pre-1.4.3per-character expansion only to explain branch semantics; the fast skip changes constants, not correctness or asymptotic complexity.
To avoid hand-waving with ellipses, the table below lists every iteration of
the parseNodesWithFactory main loop. The input simultaneously exercises a tag
header, an inline nesting, plain text, and the inline-close cascade (n = 23):
index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
char : h i $ $ b ( w o $ $ i ( r l d ) $ $ ) $ $ !
Relevant tokens under the default syntax: tagPrefix = "$$", tagOpen = "(",
tagClose = ")", endTag = ")$$" (length 3). The inline frame's close detection
first matches tagClose and then matches endTag starting at the same i. In the current implementation this is a same-cursor scanEndTagAt(...) check followed by same-cursor suffix dispatch inside the full DSL inline path.
Initial state: stack = [root], where root is built by
makeFrame(text, 0, false, 0) with textEnd = 23 and inlineCloseToken = null.
Legend for the table:

- frame is the top of the stack (`b` / `i` denote inline child frames named by their `tag`; `R` is the root).
- stack shows the entire stack at the start of the iteration (top on the right).
- i₀ → i₁ is `frame.i` before and after the iteration.
- branch identifies the source-level branch actually taken (line numbers are current structural.ts).
- buf / nodes is the cumulative state of that frame after this iteration.
| # | frame | stack | i₀ → i₁ | branch taken (line) | accumulated state of this frame |
|---|---|---|---|---|---|
| 1 | R | [R] | 0 → 1 | frame not done (628) → escape miss (659) → inlineCloseToken === null, skip inline argClose block (675) → unexpected endTag miss (non-inline branch) → readTagStartInfo miss → appendBuf | R.buf = "h" |
| 2 | R | [R] | 1 → 2 | same path as #1, appendBuf | R.buf = "hi" |
| 3 | R | [R] | 2 → 3 | same as above, appendBuf | R.buf = "hi " |
| 4 | R | [R] | 3 → 7 | escape miss → inlineCloseToken null → unexpected endTag miss → readTagStartInfo matches $$b(: tagPrefix $$ at 3, name b at 5, tagOpen ( at 6. Then getTagCloserType → findTagArgClose decides this is the inline form → tryPushInlineChild (497): flushBuffer pushes [0,3) as text node "hi " into R.nodes, builds child frame b and pushes it; child has b.i = info.argStart = 7, b.inlineCloseToken = endTag (i.e. )$$), b.tagStartI = 3, b.argStartI = 7. Stack top is now b; next iteration starts on b. | R.nodes = [text "hi "]; R.buf cleared |
| 5 | b | [R, b] | 7 → 8 | escape miss → inlineCloseToken !== null enters inline argClose block (675): inlineCloseToken !== tagClose (full DSL frame), so enters suffix path; startsWith(")", 7) → false → falls through to tag-head check; readTagStartInfo at 7 → 'w' is not tagPrefix → inline shorthand miss → appendBuf | b.buf = "w" |
| 6 | b | [R, b] | 8 → 9 | same path as #5, appendBuf | b.buf = "wo" |
| 7 | b | [R, b] | 9 → 13 | escape miss → tagClose ")" miss → readTagStartInfo matches $$i(: tagPrefix at 9, name i at 11, tagOpen at 12. Since b is an inline frame (inlineCloseToken !== null), takes H1 path (no getTagCloserType) → tryPushInlineChild: flushBuffer pushes "wo" as text node into b.nodes; pushes child frame i with i.i = 13, i.inlineCloseToken = endTag ()$$), i.tagStartI = 9, i.argStartI = 13 | b.nodes = [text "wo"]; b.buf cleared |
| 8 | i | [R, b, i] | 13 → 14 | escape miss → tagClose ")" at 13 miss → readTagStartInfo at 13 → 'r' is not tagPrefix → inline shorthand miss → appendBuf | i.buf = "r" |
| 9 | i | [R, b, i] | 14 → 15 | same as #8, appendBuf | i.buf = "rl" |
| 10 | i | [R, b, i] | 15 → 16 | same as #8, appendBuf | i.buf = "rld" |
| 11 | i | [R, b, i] | 16 → 19 | escape miss → inlineCloseToken !== tagClose (full DSL frame); tagClose ")" at 16 hits, enters suffix check: startsWith(")$$", 16) → chars 16,17,18 = )$$ → true → inline close path: i.inlineCloseWidth = endTag.length = 3, flushBuffer pushes "rld" as text node into i.nodes → stack.pop() → completeChild(i): closeStart = 16, nextI = closeStart + 3 = 19, pushes factory.inline("i", i.nodes, meta, false) into b.nodes and sets b.i = 19. Stack top returns to b. | frame i is gone; b.nodes = [text "wo", inline "i"(...)] |
| 12 | b | [R, b] | 19 → 22 | escape miss → tagClose ")" at 19 hits; endTag check: chars 19,20,21 = )$$ → true → inline close: b.inlineCloseWidth = endTag.length = 3, flushBuffer (b.buf empty, no-op) → pop → completeChild(b): closeStart = 19, nextI = 22, pushes factory.inline("b", b.nodes, meta, false) into R.nodes and sets R.i = 22 | frame b is gone; R.nodes = [text "hi ", inline "b"(...)] |
| 13 | R | [R] | 22 → 23 | escape miss → inlineCloseToken null, skip inline argClose block → unexpected endTag miss → readTagStartInfo miss → appendBuf | R.buf = "!" |
| 14 | R | [R] | 23 | frame.i >= frame.textEnd (628) hits → flushBuffer pushes "!" as text node into R.nodes → pop → returnKind === null → return frame.nodes | R.nodes = [text "hi ", inline "b"(...), text "!"] |
The main loop runs 14 iterations total to consume 23 source positions plus 2 inline pops. You can verify:
- Char-consuming iterations (#1–#3, #5–#6, #8–#10, #13) = 9, exactly matching the 9 plain text characters `h i ␠ w o r l d !`. Each one strictly advances `i` by 1.
- Tag-header jump iterations (#4, #7) = 2. Each one lets the parent frame jump `i` past an entire `$$X(` header (4 characters) in one step. Those 4 positions are then consumed by the child frame in its own subsequent scanning — so at the final-frame level every position still belongs to exactly one frame.
- Inline-close iterations (#11, #12) = 2. Each one advances `i` past `)$$` (3 characters); again, those 3 positions are accounted for in the parent frame as parent-frame consumption, with no overlap across frames.
- Frame-done iteration (#14) = 1, no character consumed.
Add up "the 4-character jump when the parent pushes a child" with "the child's own subsequent character-by-character scan", and every source position is consumed by exactly one frame at the final-frame level. This is the invariant the formal section below proves rigorously.
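That per-frame accounting can be checked mechanically as a tiling test over the spans listed above. The following is hypothetical bookkeeping code, not part of the library; span attribution follows the bullets (header positions to the child frame, close tokens to the parent frame):

```typescript
// Spans consumed per frame in the walkthrough above (input length n = 23):
//   root R : "hi " [0,3), close of b [19,22), "!" [22,23)
//   frame b: header $$b( [3,7), "wo" [7,9), close of i [16,19)
//   frame i: header $$i( [9,13), "rld" [13,16)
const spans: Array<[number, number]> = [
  [0, 3], [19, 22], [22, 23], // R
  [3, 7], [7, 9], [16, 19],   // b
  [9, 13], [13, 16],          // i
];

// True iff the spans cover [0, n) with no gap and no overlap — invariant I₁.
function tiles(spans: Array<[number, number]>, n: number): boolean {
  const sorted = [...spans].sort((a, b) => a[0] - b[0]);
  let cursor = 0;
  for (const [s, e] of sorted) {
    if (s !== cursor || e <= s) return false; // gap or overlap
    cursor = e;
  }
  return cursor === n;
}

console.log(tiles(spans, 23)); // true — every position owned by exactly one frame
```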
Outside the main loop, this input triggers the following helper scans (all from scanner.ts, only at root/block level):
| call site | start | end | scanned interval | length |
|---|---|---|---|---|
| `findTagArgClose` (#4) | 7 | 19 | [7, 19) | 12 |
| `findTagArgClose` (#7) | 13 | 16 | [13, 16) | 3 |
The second findTagArgClose runs in a deeper context (frame b is already an
inline child), but the caller is not the inline frame itself — it is
getTagCloserType doing a one-shot form decision for the new tag $$i(.
Physically that call is nested inside an inline frame, but semantically it is
still root/block-level form detection because getTagCloserType only fires when
readTagStartInfo first encounters a new tag header. That is a fundamentally
different cost model from "rescan the closer at every inline nesting level":
the former costs O(number-of-tag-headers), the latter costs O(depth · n) and
gave the old O(n²).
Helper scans cover 12 + 3 = 15 characters total, partially overlapping with the
23 positions consumed by the main loop. Total work stays inside
n + Σ(helper-scan lengths) = O(n), matching the closed-form bound
T_structural(n) ≤ k·n + s·n from the "Formal upper bound → Per-iteration work"
subsection below.
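Under the same accounting, the combined cost can be modeled as a per-position charge profile — each position pays 1 for main-loop consumption plus 1 for every helper scan that covers it. This is an illustrative sketch, not library code:

```typescript
// Charge model: linearity means the maximum per-position charge is a
// constant (here ≤ 1 + number-of-overlapping-helper-scans).
function chargeProfile(n: number, scans: Array<[number, number]>): number[] {
  const charge = new Array<number>(n).fill(1); // main-loop consumption touch
  for (const [s, e] of scans) {
    for (let p = s; p < e; p++) charge[p] += 1; // helper-scan read of position p
  }
  return charge;
}

// Walkthrough 2 numbers: n = 23, findTagArgClose scans [7,19) and [13,16).
const charges = chargeProfile(23, [[7, 19], [13, 16]]);
const totalWork = charges.reduce((a, b) => a + b, 0);
const maxCharge = Math.max(...charges);
console.log(totalWork, maxCharge); // 38 3 — i.e. 23 + 12 + 3 touches, max 3 per position
```

A bounded maximum charge is the concrete content of the amortized constant `s`: no position is re-read more than a constant number of times, so the total stays within (1 + s) · n.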
The example above contains only full DSL structures ($$name(...)$$). Since 1.3
the parser supports shorthand form name(...) inside inline argument regions —
the prefix and suffix are omitted and a bare ) closes the frame. The following
input mixes full DSL with shorthand nesting, tracing every iteration to show
shorthand push / close / preemption by full DSL.
Input (n = 27):
index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
char: $ $ b o l d ( h e l l o b o l d ( w o r l d ) ) $ $
Meaning: $$bold(hello bold(world))$$ — the outer tag is a full DSL bold; inside
its argument region there is a shorthand bold(world).
Relevant tokens are the same as before: tagPrefix = "$$", tagOpen = "(",
tagClose = ")", endTag = ")$$". Shorthand frames have
inlineCloseToken = tagClose (i.e. )); full DSL frames have
inlineCloseToken = endTag (i.e. )$$).
| # | frame | stack | i₀ → i₁ | branch taken (line) | accumulated state of this frame |
|---|---|---|---|---|---|
| 1 | R | [R] | 0 → 7 | readTagStartInfo matches $$bold(: tagPrefix at 0, name "bold" at 2, tagOpen at 6. getTagCloserType → findTagArgClose decides inline form → tryPushInlineChild: pushes child bold, bold.i = 7, bold.inlineCloseToken = endTag | R.nodes = []; R.buf cleared |
| 2 | bold | [R, bold] | 7 → 8 | escape miss → tagClose ")" at 7 miss → readTagStartInfo miss → readInlineShorthandStart at 7: name "hello" read successfully but startsWith("(", 12) → char 12 is space → false → shorthand miss → appendBuf | bold.buf = "h" |
| 3 | bold | [R, bold] | 8 → 9 | same as #2, appendBuf | bold.buf = "he" |
| 4 | bold | [R, bold] | 9 → 10 | same | bold.buf = "hel" |
| 5 | bold | [R, bold] | 10 → 11 | same | bold.buf = "hell" |
| 6 | bold | [R, bold] | 11 → 12 | same | bold.buf = "hello" |
| 7 | bold | [R, bold] | 12 → 13 | space ' ' is not tagStartChar → shorthand miss → appendBuf | bold.buf = "hello " |
| 8 | bold | [R, bold] | 13 → 18 | readTagStartInfo miss ('b' is not tagPrefix $$) → readInlineShorthandStart at 13: name "bold", tagNameEnd=17, startsWith("(", 17) → true → tag "bold" in registeredTags → supportsInlineForm passes → returns info. tryPushInlineShorthandChild: ambiguity probe scans from argStart=18, checking escape / full DSL tag head / tagClose at each position; hits ) at 23 (boundary=23), startsWith(")$$", 23) → chars 23,24,25 = ), ), $ ≠ )$$ → reject = false → probe passes. flushBuffer pushes "hello " into bold.nodes. Pushes shorthand child bold(sh): bold(sh).i = 18, bold(sh).inlineCloseToken = ")", bold(sh).implicitInlineShorthand = true | bold.nodes = [text "hello "] |
| 9 | bold(sh) | [R, bold, bold(sh)] | 18 → 19 | inlineCloseToken = ")" === tagClose → shorthand frame → startsWith(")", 18) → false → readTagStartInfo miss → readInlineShorthandStart miss ("world" not followed by "(") → appendBuf | bold(sh).buf = "w" |
| 10 | bold(sh) | [R, bold, bold(sh)] | 19 → 20 | same as #9 | bold(sh).buf = "wo" |
| 11 | bold(sh) | [R, bold, bold(sh)] | 20 → 21 | same | bold(sh).buf = "wor" |
| 12 | bold(sh) | [R, bold, bold(sh)] | 21 → 22 | same | bold(sh).buf = "worl" |
| 13 | bold(sh) | [R, bold, bold(sh)] | 22 → 23 | same | bold(sh).buf = "world" |
| 14 | bold(sh) | [R, bold, bold(sh)] | 23 → 24 | inlineCloseToken = ")"; first checks scanEndTagAt(23), which is not a full endTag in this example, so ownership does not defer. Then startsWith(")", 23) → true → C1 shorthand close: inlineCloseWidth = tagClose.length = 1, flushBuffer pushes "world" into bold(sh).nodes → pop → completeChild: closeStart=23, nextI=24, factory.inline("bold", nodes, meta, true) pushed into bold.nodes, sets bold.i = 24 | bold(sh) gone; bold.nodes = [text "hello ", inline "bold"(sh)(...)] |
| 15 | bold | [R, bold] | 24 → 27 | tagClose ")" at 24 hits → startsWith(")$$", 24) → chars 24,25,26 = )$$ → true → C2 inline close: inlineCloseWidth = 3, flushBuffer (buf empty) → pop → completeChild: closeStart=24, nextI=27, factory.inline("bold", nodes, meta, false) pushed into R.nodes, sets R.i = 27 | bold gone; R.nodes = [inline "bold"(...)] |
| 16 | R | [R] | 27 | frame.i >= frame.textEnd → flushBuffer (buf empty) → pop → return frame.nodes | R.nodes = [inline "bold"(...)] |
The main loop runs 16 iterations to consume 27 source positions plus 2 pops.
- Char-consuming iterations (#2–#7, #9–#13) = 11, matching the 11 text characters `h e l l o ␠ w o r l d`.
- Tag-header jump iterations (#1, #8) = 2. #1 skips the full DSL header `$$bold(` (7 chars); #8 skips the shorthand header `bold(` (5 chars). In both cases the child frame will account for those positions in its own scanning.
- Frame-close iterations (#14, #15) = 2. #14 (shorthand close) consumes just 1 character `)`. #15 (full DSL close) consumes 3 characters `)$$`.
- Frame-done iteration (#16) = 1, no character consumed.
In addition to findTagArgClose (seen in Walkthrough 2), shorthand mode
introduces one extra helper scan — the ambiguity probe
(tryPushInlineShorthandChild(...), which in the current implementation delegates its ownership decision to resolveShorthandOwnershipPush(...)).
When the parent frame is a full DSL frame (inlineCloseToken === endTag), the
parser must verify that the first "boundary" inside the shorthand argument region
does not cause the shorthand to consume the parent's endTag. The probe scans
forward character by character, stopping at one of three conditions:
- Escape sequence — skipped, scanning continues.
- Full DSL tag header (`readTagStartInfo` matches `$$name(`) — boundary is set to this position, `reject = false` (the shorthand frame would be preempted by the full DSL structure before reaching it, so no ambiguity).
- `tagClose` (`)`) — boundary is set to this position; `startsWith(endTag, boundary)` is checked: if the `)` is the start of endTag (`)$$`), `reject = true` and the shorthand is rejected.
In this example (#8), the probe scans from 18 to 23 and finds ) (boundary=23).
startsWith(")$$", 23) → false → reject = false → probe passes. If the boundary lands at
textEnd, scanEndTagAt may return truncated-prefix; this is treated as non-full.
The result is
cached in shorthandProbe: ShorthandProbeState | null (lazily created, containing
textEnd / startI / boundaryI / reject), and reused only under the same
textEnd scan-boundary version to avoid cross-window stale reuse.
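The reuse condition can be sketched as follows. This is a hypothetical minimal shape — the real `ShorthandProbeState` carries more fields and the real probe also recognizes escapes and full DSL tag heads — but it shows the amortization point: one `(startI, textEnd)` window, one scan:

```typescript
type ProbeState = { textEnd: number; startI: number; boundaryI: number; reject: boolean };

// Toy probe cache: a result is reused only when the scan window matches,
// so a degraded/rewound frame never reuses a stale boundary.
function makeProbeCache() {
  let cached: ProbeState | null = null;
  let computes = 0;
  return {
    probe(text: string, startI: number, textEnd: number): ProbeState {
      if (cached && cached.textEnd === textEnd && cached.startI === startI) {
        return cached; // same window: no second source scan
      }
      computes += 1; // one linear scan per distinct window
      let boundaryI = startI;
      while (boundaryI < textEnd && text[boundaryI] !== ")") boundaryI++;
      const reject = text.startsWith(")$$", boundaryI); // does ")" begin endTag?
      cached = { textEnd, startI, boundaryI, reject };
      return cached;
    },
    get computes() {
      return computes;
    },
  };
}

const cache = makeProbeCache();
const sample = "world)x";
cache.probe(sample, 0, sample.length);
cache.probe(sample, 0, sample.length); // cache hit: no second scan
console.log(cache.computes); // 1
```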
| call site | start | end | scanned interval | length |
|---|---|---|---|---|
| `findTagArgClose` (#1) | 7 | 23 | [7, 23) | 16 |
| `tryPushInlineShorthandChild` probe (#8) | 18 | 23 | [18, 23) | 5 |
Helper scans cover 16 + 5 = 21 characters total, partially overlapping with the
27 positions consumed by the main loop. Total work stays inside
n + Σ(helper-scan lengths) = O(n).
This is the 1.3.x "poison" sample that most easily causes wrong ownership if the
parser uses ad-hoc special cases. Like 2b, we list the main-loop progression, but
this case includes two explicit "degrade + rewind to argStart" cycles.
Input (n = 45, custom syntax): `=bold<天気がbold<いlink<baidu.com|い>=>から>=散歩しましょう`

Syntax used in this walkthrough (same as tests/shorthandBehavior.test.ts):

- tagPrefix = "="
- tagOpen = "<"
- tagClose = ">"
- tagDivider = "|"
- endTag = ">="
Character index map:
index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
char : = b o l d < 天 気 が b o l d < い l i n k < b a i d u . c o m | い > = > か ら > = 散 歩 し ま し ょ う

Stack labels:

- `R`: root frame
- `Bf`: outer full-form `=bold<...>=` frame (inlineCloseToken = ">=")
- `Bs`: shorthand `bold<...>` candidate frame (inlineCloseToken = ">")
- `Ls`: shorthand `link<...>` candidate frame (inlineCloseToken = ">")
| # | frame | stack | i₀ → i₁ | Matched branch (single ownership resolver across phases) | State snapshot |
|---|---|---|---|---|---|
| 1 | R | [R] | 0 → 6 | readTagStartInfo matches =bold<; tryPushInlineChild pushes full-form Bf. | R.buf cleared |
| 2 | Bf | [R, Bf] | 6 → 7 | Plain-text append (天). | Bf.buf="天" |
| 3 | Bf | [R, Bf] | 7 → 8 | Plain-text append (気). | Bf.buf="天気" |
| 4 | Bf | [R, Bf] | 8 → 9 | Plain-text append (が). | Bf.buf="天気が" |
| 5 | Bf | [R, Bf] | 9 → 14 | readInlineShorthandStart matches bold<; resolveShorthandOwnershipPush(...) returns allow; push Bs. | Bf.nodes += text("天気が") |
| 6 | Bs | [R, Bf, Bs] | 14 → 15 | Plain-text append (い). | Bs.buf="い" |
| 7 | Bs | [R, Bf, Bs] | 15 → 20 | readInlineShorthandStart matches link<; resolveShorthandOwnershipPush(...) returns allow; push Ls. | Bs.nodes += text("い") |
| 8 | Ls | [R, Bf, Bs, Ls] | 20 → 31 | Scans baidu.com + divider + い: text segments use text append; the divider hits tagDivider and emits a separator. | Ls emits a separator and keeps side text |
| 9 | Ls | [R, Bf, Bs, Ls] | 31 → 31 | At tagClose (>), resolveShorthandOwnershipClose(...) runs first: the same cursor can form outer endTag (>=), so the result is defer-parent. Ls degrades. | Ls enters degrade |
| 10 | Bs | [R, Bf, Bs] | 20 → 31 | After Ls degradation, parent rewinds to argStart and rescans; the link<... segment is absorbed as plain text in Bs. | Bs absorbs that segment |
| 11 | Bs | [R, Bf, Bs] | 31 → 31 | At the same competition point, Bs also gets resolveShorthandOwnershipClose(...) = defer-parent; Bs degrades too. | Bs enters degrade |
| 12 | Bf | [R, Bf] | 14 → 36 | Parent Bf rewinds to argStart and rescans, accumulating the corresponding text span; no full close occurs before index 36. | Bf.buf updated |
| 13 | Bf | [R, Bf] | 36 → 38 | Full endTag (>=) matches; Bf closes normally via completeChild and is emitted to root. | R.nodes += inline(bold) |
| 14 | R | [R] | 38 → 45 | Tail 散歩しましょう appended as plain text; EOF pop and return. | done |
Output (implicitInlineShorthand=false/true are identical): `天気がbold<いlink<baidu.com|い>から>=散歩しましょう`

Note: the `|` is processed by the pipe-divider branch during scanning (not as a plain text character); it appears preserved in the final text because this path later degrades/rewinds and settles as text.
Why this matters: push, close, and fallback all call the same ownership
resolver. That prevents split-brain behavior where one phase "salvages"
shorthand and steals close ownership from a stable full-form parent.
When a shorthand close point competes with a full DSL endTag at the same cursor,
ownership arbitration runs first: if endTag fully matches, shorthand yields
(degrades to text and rewinds to argStart) and the parent frame consumes that
same close token. This guarantees full DSL syntax takes priority over shorthand.
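The arbitration rule can be sketched in a few lines of TypeScript. This is a simplified model with hypothetical names and signatures; the real `resolveShorthandOwnershipClose(...)` works on parser frames, not bare strings.

```typescript
// Simplified model of the same-cursor arbitration described above:
// when a shorthand close point and a full DSL endTag can both match at
// cursor i, the full endTag wins and the shorthand frame defers upward.
type CloseDecision = "close-shorthand" | "defer-parent";

function resolveShorthandClose(
  text: string,
  i: number,
  tagClose: string, // e.g. ">"  (the shorthand close token)
  endTag: string    // e.g. ">=" (the full DSL end tag)
): CloseDecision {
  // Ownership arbitration runs first: if the full endTag matches at the
  // same cursor, the shorthand frame yields (degrades to text) and the
  // stable full-form parent consumes the close token.
  if (text.startsWith(endTag, i)) return "defer-parent";
  if (text.startsWith(tagClose, i)) return "close-shorthand";
  throw new Error("not called at a close site");
}
```

Both checks are same-position `startsWith` probes, which is why this arbitration stays O(1) per close site.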
- `i` is strictly monotonic. Across both examples, `frame.i` either stays the same (frame-done pop) or strictly increases. No branch ever rewinds `i`.
- Inline close needs no rescanning. Full DSL frames hit `)` and perform same-position token checks (endTag/rawOpen/blockOpen) — `findInlineClose` is never called. Shorthand frames also stay local: they run same-position ownership arbitration before consuming `)`. This keeps each iteration O(1) and is the key reason inline nesting stays O(n) since 1.1.2.
- Helper scans are enumerable. Both examples produce helper scans independent of inline nesting depth — `findTagArgClose` is triggered by root/block-level form decisions, and the shorthand ambiguity probe is triggered by disambiguation with cacheable results. That is the structural reason the asymptotic bound stays linear.
Stretch this table out to n = 200,000 and both the iteration count and the total helper-scan coverage grow linearly — which is exactly what the formal section below proves rigorously.
Since 1.1.2, both parser entry points have a worst-case Θ(n) time complexity:
| API | Worst-case | Reason |
|---|---|---|
| `parseStructural` | Θ(n) | Single scan pointer `i` in a `while` main loop; no per-level rescan |
| `parseRichText` | Θ(n) | `parseStructural` Θ(n) + render traversal Θ(n) = Θ(n); if handlers further call materialization helpers, those additional steps also remain linear |
Two independent O(n²) bottlenecks were eliminated:

- Structural layer — inline forward scan removed. Before 1.1.2, every inline nesting level called `findInlineClose` / `findTagArgClose` (scanner.ts) to pre-scan forward for the matching close. With d nesting levels each scanning ~n characters, the worst case was O(n·d) = O(n²). Now: inline child frames use an `inlineCloseToken` field (structural.ts, main loop) that records the exact token needed to close the frame — `endTag` (i.e. `)$$`) for full DSL inline, or `tagClose` (i.e. `)`) for shorthand inline. The frame checks for its close token at the current scan position — the comment at the push site says "不需要预扫 findTagArgClose … 纯 inline 深嵌套保持 O(n)" ("no need to pre-scan findTagArgClose … purely inline deep nesting stays O(n)"). The old `findInlineClose` / `findTagArgClose` functions still exist in scanner.ts but are only called from `getTagCloserType` (root/block-level form detection) and `findBlockClose` (block-close boundary search); neither path is entered inside an inline frame.
- Render layer — `materializeTextTokens` re-traversal eliminated. Each handler call used to recurse into the full subtree. A module-level `WeakSet` (`materializedArrays` in builders.ts) now marks processed arrays; subsequent calls hitting the same array skip recursion entirely (builders.ts, lines 86–87), collapsing O(n²) to O(n).
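The WeakSet idea can be reduced to a minimal sketch. The `Tok` shape, the `visits` counter, and this `materialize` are simplified stand-ins for the real builders.ts code, kept only to show why repeat calls skip recursion.

```typescript
// Sketch of WeakSet-based memoization: each token array is fully walked at
// most once; any later call on the same array object returns immediately.
interface Tok { text: string; children?: Tok[] }

const materialized = new WeakSet<Tok[]>();
let visits = 0; // instrumentation for this sketch, not in the library

function materialize(tokens: Tok[]): void {
  if (materialized.has(tokens)) return; // already processed: skip entirely
  materialized.add(tokens);
  for (const t of tokens) {
    visits++;
    if (t.children) materialize(t.children); // children arrays memoized too
  }
}
```

Because membership is keyed on the array object itself, repeated handler calls over the same subtree contribute O(1) each, so total materialization work stays bounded by total token count.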
Both fixes are documented in the Performance deep-nesting section.
getTagCloserType / getTagCloserTypeWithCache still call findTagArgClose at root/block level
to determine whether a tag is inline / raw / block form. Since 1.4.3 the call is split by
remaining text length: ≤ 256 characters uses the uncached getTagCloserType; > 256 characters
uses getTagCloserTypeWithCache (sharing argClose scan results via a lazily created Map
cache). This does not reintroduce O(n²) because:
- It runs once per tag at the top level or inside a block content frame — not per nesting level.
- `findBlockClose` (scanner.ts) internally handles nested block depth, skips inner raw spans via `findRawClose`, and skips inner inline spans via `findInlineClose`. Within a single `findBlockClose` call, the scan pointer still advances monotonically and does not reintroduce per-level backtracking rescans, so the total work of that call remains O(n).
Saying only “the main loop is Θ(n)” is not the same as accounting for every library-owned branch in the current implementation. This section makes that explicit, function by function:
- where the branch is triggered,
- what its local one-hit cost is,
- which global bound absorbs it,
- and where the analysis stops because we would otherwise be counting user handler code rather than library-owned control flow.
Here “covered” means specifically:
- the normal `parseStructural(...)` / `parseRichText(...)` entry paths,
- the current structural + render implementation shipped by the library,
- excluding the internal complexity of user-provided handler callback bodies.
| Top-level branch | Trigger point | Local one-hit cost | Where it is absorbed |
|---|---|---|---|
| `tryFinalizeFrameAtEof(frame)` | start of each loop iteration | O(1) on normal completion; malformed EOF replay handled separately below | normal completion folds into O(number of frames); post-1.4.4 EOF replay folds into T_recovery-extra = O(n) |
| `findNextBoundaryChar(...)` fast skip | non-boundary text run | O(run length) for one skip | the sum of all skipped run lengths is ≤ n, so it folds into source-consumption O(n) |
| `readEscapedForFrame(...)` | escape attempt | O(1) on miss; O(escape-sequence length) on hit | escape spans are disjoint and fold into total source/escape consumption O(n) |
| `tryConsumeInlineCloseAtCursor(...)` | inline-frame close site | O(1) same-cursor arbitration; raw/block close search accounted separately | C1/C2/C5 fold into main-loop O(n); C3/C4 close searches fold into auxiliary scans O(n) |
| non-inline `scanEndTagAt(...)` = full | unexpected close | O(1) | folds into main-loop O(n) |
| `insideArgs && startsWith(tagDivider)` | arg separator | O(1) | folds into main-loop O(n) |
| `tryConsumeTagOrTextAtCursor(...)` | full tag / shorthand / text dispatch | depends on its sub-branch | sub-branches are split below into tag-dispatch O(n) + helper-scan O(n) |
| defensive fallback `appendBuf(...); i++` | theoretical safety net | O(1) | folds into main-loop O(n) |
| Sub-branch | Local cost | Global absorption |
|---|---|---|
| C1 shorthand normal close | O(1) same-cursor ownership check + O(1) close | folds into main-loop O(n) |
| C1 shorthand defer-parent | O(number of already produced child nodes attached upward) | post-1.4.4 folds into T_defer-parent = O(n) |
| C2 full DSL inline close (`endTag`) | O(1) | folds into main-loop O(n) |
| C3 raw form (`)%`) | one linear scan of that raw-content window via `findRawClose` | all raw-close searches fold into auxiliary scans O(n) |
| C3 raw-unclosed degradation | O(1) error + O(tag-head span) source fallback | folds into auxiliary scanning / degraded-source total O(n) |
| C4 block form (`)*`) | one linear scan of that block window via `findBlockClose` | all block-close searches fold into auxiliary scans O(n) |
| C4 block-unclosed degradation | O(1) error + O(tag-head span) source fallback | folds into auxiliary scanning / degraded-source total O(n) |
| C5 ordinary `)` | O(1) | folds into main-loop O(n) |
| Sub-branch | Local cost | Global absorption |
|---|---|---|
| `readTagStartInfo(...)` recognizes a full DSL head | O(tag-name length) | total tag-head character volume is ≤ n, so this folds into O(n) |
| inline-frame `tryPushInlineChild(...)` | O(1) push; does not call `getTagCloserType` | folds into O(number of frames) ≤ O(n) |
| `readInlineShorthandStart(...)` recognizes a shorthand head | O(tag-name length) | folds into total tag-head volume O(n) |
| `resolveShorthandOwnershipPush(...)` owner fast-path | O(1) | folds into T_probe / T_recovery-extra = O(n) |
| `resolveShorthandOwnershipPush(...)` forward probe | O(probe-window length) per first build | each (frame, textEnd) window is built once, so total cost is amortized O(n) |
| `getTagCloserType(...)` / `getTagCloserTypeWithCache(...)` | O(length of that tag's arg window) | only root/block-level form classification triggers it, so the total folds into auxiliary scans O(n) |
| `findTagArgClose(...)` returns null (H2) | failed scan O(remaining window) + tag-head degradation | under the current 1.4.4 recovery model this folds into auxiliary scans + recovery-extra O(n) |
| gating rejection → `skipTagBoundary(...)` | O(tag-boundary length) | total skipped boundary length is ≤ n, so this folds into O(n) |
| `depthLimit` degradation | O(tag-boundary length) via `skipTagBoundary(...)` | still absorbed by total boundary length O(n) |
| ordinary text fallback | O(1) or batch O(run length) | folds into source consumption O(n) |
These recovery branches need to be listed explicitly, because they determine whether recovery can reintroduce superlinear work. The currently covered recovery branches are:
| Recovery branch | Current implementation | Global absorption |
|---|---|---|
| shorthand push-time owner conflict | `resolveShorthandOwnershipPush(...)` can reject immediately | O(1), or amortized probe O(n) |
| shorthand close-site owner conflict | `resolveShorthandOwnershipClose(...)` + `downgradeInlineIntoParent(...)` | T_defer-parent = O(n) |
| EOF malformed inline / shorthand chain | `buildMalformedInlineReplayPlan(...)` + `replayMalformedInlineChainAtEof(...)` | T_replay-plan + T_eof-rescan = O(n) |
| historical bad path: frame-by-frame re-entry on the same suffix | existed before 1.4.4 | explicitly analyzed here as old O(n²), new O(n) |
| Render-loop branch | Local cost | Global absorption |
|---|---|---|
| frame end `frame.index >= frame.nodes.length` | `flushTextBuf` + pop/resume; `flushTextBuf` is O(buffered text length) | every buffered text segment is joined/merged once, so this folds into output-text volume O(n) |
| `consumeNextLB` trims leading line break | O(trimmed newline length), typically O(1) | total trimmed line-break length is bounded by rendered text length, so O(n) |
| text / escape / separator | `bufferText(...)` amortized O(1); root-mode `readEscaped(...)` is O(raw escape length) | total text-like output volume folds into O(rendered text size) ≤ O(n) on the library-owned side |
| inline node | `flushTextBuf` + push child render frame | folds into O(number of nodes) ≤ O(n) |
| raw node | `flushTextBuf` + `renderRawNode(...)` | see next table; total raw-content work folds into O(n) |
| block node | `flushTextBuf` + push child render frame | folds into O(number of nodes) ≤ O(n) |
| Function / branch | Local cost | Global absorption |
|---|---|---|
| `renderInlineNode(...)`: `!supportsInlineForm(...)` → `degradeToSource(...)` | O(node source-span length) | in the normal parseRichText pipeline many illegal forms are already filtered by structural gating; the remaining degradation cost is charged to the degraded source span and kept inside O(n) |
| `renderInlineNode(...)`: unknown-tag passthrough | one `materializeTextTokens(childTokens)` + `appendToken` loop | each identical `TextToken[]` object is fully materialized at most once, so this folds into O(total token materialization) |
| `renderInlineNode(...)`: `handler.inline` / fallback | the handler call itself is outside the library guarantee; library-side `createToken` / `appendToken` is O(size of produced token draft) | the library-owned part folds into O(rendered token count); user handler-body work is separate |
| `renderRawNode(...)`: no `handler.raw` → `degradeToSource(...)` | O(raw-node source span) | charged to the degraded source span, folding into O(n) |
| `renderRawNode(...)`: escaped raw-close restoration | O(rawContent length) with lazy parts allocation | total rawContent length is bounded by source length, so O(n) |
| `renderRawNode(...)`: `normalizeBlockTagContent(...)` | O(rawContent length) | folds into raw-content total O(n) |
| `renderBlockNode(...)`: no `handler.block` → `degradeToSource(...)` | O(block-node source span) | charged to the degraded source span, folding into O(n) |
| `renderBlockNode(...)`: fast path in `trimBlockBoundaryTokens(...)` | O(1) | folds into O(number of block nodes) ≤ O(n) |
| `renderBlockNode(...)`: trimming path with `cloneToken(...)` | O(size of the trimmed edge-token subtree) | only modified edge tokens are cloned; for normal parser output the total folds into overall token size O(n) |
| `bufferText(...)` / `flushTextBuf(...)` | O(1) amortized per push; O(buffer length) per flush | every buffered segment is joined/merged once, so total cost folds into text-output volume O(n) |
| `mergeTextToken(...)` / `appendToken(...)` | O(1), including adjacent-text merge | folds into token-append count O(n) |
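The buffer-then-flush accounting above can be reduced to a minimal sketch using bare strings instead of token drafts. This is illustrative only; the library's `bufferText` / `flushTextBuf` manage token output, not a plain string array.

```typescript
// Each segment is pushed once (amortized O(1)) and joined once at flush
// time, so the total flush cost is bounded by the text-output volume.
const buf: string[] = [];
const out: string[] = [];

function bufferText(s: string): void {
  buf.push(s); // amortized O(1) per push
}

function flushTextBuf(): void {
  if (buf.length === 0) return; // empty flush is a no-op
  // single-segment fast path avoids the join; join covers the general case
  out.push(buf.length === 1 ? buf[0] : buf.join(""));
  buf.length = 0;
}
```

The key invariant is that no character ever re-enters the buffer after a flush, which is exactly why the per-flush O(buffer length) cost sums to O(n) overall.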
| Helper path | Covered here? | Boundary note |
|---|---|---|
| `materializeTextTokens(...)` | yes | the analysis assumes library-generated token arrays; the same array object is fully materialized at most once |
| `degradeToSource(...)` | yes, with boundary note | for the normal parseRichText pipeline, the degraded-span total is charged as O(n); if a user manually feeds render a highly overlapping external tree, that is outside the normal parser guarantee |
| `trimBlockBoundaryTokens(...)` / `cloneToken(...)` | yes | under normal parser output, cloning touches only trimmed edge-token subtrees and folds into total token size |
| `createToken(...)` | only the library-side wrapper cost | if a user handler constructs a huge draft object, that additional construction cost is not the library's own complexity |
| user `handler.inline/raw/block` bodies | not covered | user code can itself do O(m²) work, which obviously exceeds the library-owned Θ(n) guarantee |
The more precise statement is now:
- structural top-level branches and their sub-branches are covered, not only the headline main loop;
- the 1.4.4 ownership / EOF recovery path is covered in both old and new form;
- render-loop, node-dispatch, text-buffer, raw/block trim/degrade/materialization branches are covered on the library-owned side;
- user handler bodies are explicitly excluded rather than smuggled into the library's Θ(n) claim.
So the accurate umbrella statement becomes:
T_library-owned(parseRichText, n) = Θ(n)
where “library-owned” explicitly excludes extra work performed inside user-provided handlers. Under that scope, the current structural / render / recovery / helper branches are now all brought under the complexity coverage of this page.
This appendix is historical only. It explains why the old bad path was quadratic and which frame-by-frame re-entry mechanism the current implementation removes. It is not part of the main theorem statement.
When the changelog says “a quadratic path was found”, it is not talking about the normal well-formed fast path, nor about every malformed input in the abstract. It is talking about the specific combination of:
- deep shorthand / inline nesting,
- ownership competition with an ancestor full-form close,
- and malformed state carried all the way into EOF recovery.
Putting old and new readings side by side makes the significance clearer:
| Dimension | Bad-path reading before 1.4.4 | Reading since 1.4.4 |
|---|---|---|
| ancestor close ownership | had to be rediscovered after parent re-entry at the same close position | `ancestorEndTagOwnerIndex` gives the nearest owner in O(1) |
| shorthand yielding | could degrade into "yield one layer, then let the parent reconsider the whole suffix" | `resolveShorthandOwnershipClose(...)` + `downgradeInlineIntoParent(...)` resolve it in place and continue from `nextParentI` |
| malformed EOF chain | could keep feeding control back into the main loop layer by layer | `buildMalformedInlineReplayPlan(...)` computes once, replay pops once |
| suffix rescans | could accumulate as a triangular series | reduced to one recovery resume |
| worst-case cost of that path | O(n²) | O(n) |
So the 1.4.4 change is not merely “a constant-factor optimization”. It restores the complexity
class of a real recovery path from quadratic back to linear.
The analysis above describes the current implementation. But if this page is meant to function as a design note or paper-style explanation, the old model should remain visible too. Otherwise the reader sees that the quadratic path is gone, but not why it used to exist.
The old multiplier can be abstracted as this chain:

```
deep shorthand / inline close competition
  → current layer cannot stabilize ownership
  → return to the parent and observe the same suffix again
  → parent carries part of that suffix back into the main loop
  → the next layer repeats the same pattern
```
If the still-unresolved suffix length at a given moment is k, the accumulated work can be written as:

```
W_old(k) = k + (k - δ1) + (k - δ1 - δ2) + ...
```

where each δi > 0 means "some progress happened". But as long as the parent still has to re-examine essentially the same tail after each recovery bounce, this is a triangular series; in the worst uniform case it becomes:

```
W_old(k) = k + (k - 1) + (k - 2) + ... + 1 = Θ(k²)
```
In other words, the old problem was not “there exists an O(n) scan helper”. The real problem was:
- the same suffix re-entered the main loop too many times,
- ownership information was not local enough, so parent re-entry had to re-establish it,
- EOF recovery did not settle the whole malformed chain in one shot.
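The uniform worst case can be checked numerically with a small sketch (illustrative only, not library code): if every recovery bounce leaves the parent a suffix only one character shorter, the total re-examined work is the triangular number k(k+1)/2.

```typescript
// Models the old bad path's uniform worst case: the parent re-scans the
// still-unresolved suffix once per bounce, shrinking it by 1 each time.
function oldPathWork(k: number): number {
  let total = 0;
  for (let remaining = k; remaining >= 1; remaining--) {
    total += remaining; // one full re-scan of the remaining suffix
  }
  return total; // = k(k+1)/2 = Θ(k²)
}
```

Doubling k roughly quadruples `oldPathWork(k)`, which is the quadratic signature the 1.4.4 single-shot replay removes.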
1.4.4 cuts all three multipliers:
| Old multiplier | 1.4.4 cut point |
|---|---|
| nearest owner had to be rediscovered through parent re-entry | `ancestorEndTagOwnerIndex` makes nearest ownership frame-local O(1) data |
| shorthand yielding returned too much of the tail to the parent | `downgradeInlineIntoParent(...)` only replays the tag head, keeps already-parsed child nodes, and jumps to `nextParentI` |
| EOF recovery propagated pressure layer by layer | `buildMalformedInlineReplayPlan(...)` + `replayMalformedInlineChainAtEof(...)` plan first, replay once |
Keeping the old reasoning visible matters because it shows:
- how the old complexity actually grew, not just that it disappeared,
- which exact causal chain the new implementation breaks,
- why the changelog says “restores the linear guarantee” rather than merely “improves constants”.
The constant C is not uniform — different input shapes stress different branches. Here are the three worst-case axes and the concrete inputs that maximize them.
\z\z\z\z\z\z\z\z\z\z...
Every \ triggers the escape-prefix hit path. readEscapedSequence matches the escape char, then
checks all 9 escapableTokens in length-descending order (default syntax: )$$, )%, )*,
%%, **, (, ), |, \). Since z doesn't start any of them, all 9 miss. The \ is
then treated as a plain character (not an escape). This is the input that maximizes
per-character branch count: k ≈ 12–15 per position.
A valid escape like \(\(\(\(... matches ( after 6 misses (the first 5 tokens plus %%/**
do not start with (), so the per-\ cost is lower but still significant.
In both cases there is no rescan — the pointer still advances by at least 1 per iteration, so total work remains linear.
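A lead-char bucketing scheme like the one 1.4.3 introduced can be sketched as follows. The structure here is an assumption for illustration; the real scanner's bucket layout may differ. The point is that `\z` rejects in one Map lookup instead of scanning all nine tokens.

```typescript
// Group escapable tokens by first character, longest-first within a bucket
// (mirroring the length-descending scan order described above).
function buildEscapeBuckets(tokens: string[]): Map<string, string[]> {
  const buckets = new Map<string, string[]>();
  for (const tok of [...tokens].sort((a, b) => b.length - a.length)) {
    const bucket = buckets.get(tok[0]) ?? [];
    bucket.push(tok);
    buckets.set(tok[0], bucket);
  }
  return buckets;
}

function matchEscapable(
  text: string,
  i: number,
  buckets: Map<string, string[]>
): string | null {
  const bucket = buckets.get(text[i]); // O(1) reject for non-matching leads
  if (!bucket) return null;
  for (const tok of bucket) {
    if (text.startsWith(tok, i)) return tok; // longest match wins
  }
  return null;
}
```

Per-position cost drops from "all E tokens" to "size of one bucket", which is the bound B in the branch-cascade table later on this page.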
$$a($$a($$a($$a($$a($$a(x)$$)$$)$$)$$)$$)$$
Each $$a( pushes one frame onto the explicit stack. With default syntax, minimum tag overhead is
4 chars ($$ + 1 name char + (), so maximum stack depth ≤ n/4. Since 1.1.2, the parser uses
an explicit array stack — not the call stack — so this is bounded only by heap memory. No stack
overflow at any depth.
The 50-million-layer stress test in the Performance page demonstrates this empirically.
$$a(x)$$$$a(x)$$$$a(x)$$$$a(x)$$...
Every tag sits at root level, so each one triggers getTagCloserType → findTagArgClose. The
form-detection scan overlaps with characters later parsed by child frames, so those regions incur
an additional constant-factor linear overhead. This does not change the Θ(n) result.
No single input can maximize all three axes simultaneously. Here is the formal argument.
Proposition. Let E be the number of characters consumed by escape sequences, T be the number of characters consumed by tag structures (tag head + arguments + end tags), and P be the number of plain-text characters (neither escaped nor tag structure) in an input of length n. Then E + T + P = n.
Axis 1 vs. Axis 2 mutual exclusivity. Each escape sequence \x occupies 2 characters and
is not recognized as a tag head (the escape check precedes the tag-head check, structural.ts
line 659). As E increases, the characters available for tag structures decrease: T ≤ n − E.
Maximum stack depth ≤ T/4 (each tag head ≥ 4 characters), so:
stack_depth ≤ (n − E) / 4
Axis 2 vs. Axis 3 mutual exclusivity. Deep nesting requires tags to recursively nest as
inline form. Inside inline frames (H1 path), getTagCloserType is not called (structural.ts
lines 892–918), so no form-detection overhead is incurred. Only root/block-level frame tags
trigger getTagCloserType. At nesting depth d, the number of root-level tags in a given region
is ≤ n/(4d), reducing total form-detection scan accordingly.
Axis 1 vs. Axis 3 mutual exclusivity. Escape characters do not form tag heads, so they never trigger form detection. More escapes means fewer form-detection calls.
Conclusion. The three axes' costs form a convex combination bounded linearly:
C_total = c₁ · E + c₂ · T + c₃ · P ≤ max(c₁, c₂, c₃) · n = O(n)
where c₁, c₂, c₃ are the per-character costs of the escape branch, tag processing (including stack operations + form detection), and plain-text branch, respectively. All three are constants independent of n. The realized worst case is always a trade-off among them, and the total stays within C · n. ∎
The main loop at each character position i executes a fixed cascade of checks. Here is the
exact sequence, counted from source (structural.ts line 624+):
| Step | Check | Calls | When |
|---|---|---|---|
| 0 | Fast text skip: `findNextBoundaryChar` → `charCodeAt` comparisons (up to 5 boundary lead codes) | ≤ 5 | Always (since 1.4.3); also checks `isTagStartChar` when shorthand enabled |
| 1 | Frame end: `frame.i >= frame.textEnd` | 1 cmp | Always |
| 2 | Escape: `readEscapedSequence` → `startsWith(escapeChar)` | 1 | Always |
| 2b | Escapable scan (only if escape char matches) | ≤ B | B = lead-char bucket size (since 1.4.3: tokens bucketed by first char, no longer scanning all E) |
| 3 | Inline argClose: `inlineCloseToken` check + `startsWith(tagClose)` | 1–2 | Inline frame only |
| 3b | Suffix: `startsWith(endTag)`, `startsWith(rawOpen)`, `startsWith(blockOpen)` | ≤ 3 | tagClose hit in full DSL frame |
| 4 | Unexpected endTag: `startsWith(endTag)` | 1 | Non-inline frame only |
| 5 | Separator: `startsWith(tagDivider)` | 1 | `insideArgs` only |
| 6 | Tag head: `readTagStartInfo` → `startsWith(tagPrefix)` + char class checks | 1+ | Always (if no earlier match) |
| 6b | Inline shorthand: `readInlineShorthandStart` → tag name + `(` check | 1 | Inline frame, no tag-head match |
| 7 | (removed in 1.3: bare `(` parenDepth tracking no longer needed) | — | — |
1.4.3 changes. Step 0's fast text skip bulk-skips consecutive non-boundary characters, so plain-text spans no longer walk steps 1–6b character by character. Step 2b's escapable scan changed from "iterate all E tokens" to "bucket by leading character and only test the matching bucket" — sequences like `\z` whose first character does not match any token's leading character now return immediately instead of scanning all 9 tokens.
Totals by input class (default syntax):
| Input class | k before 1.4.3 | k in 1.4.3 (with fast skip) | Path taken |
|---|---|---|---|
| Plain text (no syntax chars) | 3–4 | ≤ 5 (boundary scan only, bulk skip) | fast skip jumps entire span → next iteration hits boundary |
| Normal rich text | 4–6 | 4–6 | boundary hit → normal branch cascade |
| Dense inline | 5–8 | 5–8 | frequent tagClose / tagPrefix hits, fast skip has limited benefit |
| Dense inline + shorthand | 6–9 | 6–9 | tagClose miss → tagPrefix miss → shorthand check → appendBuf or push |
| Escape-prefix heavy | 12–15 | 5–7 | lead-char bucketing: \z bucket miss returns immediately, no longer scans all tokens |
Fast skip in 1.4.3 mainly benefits the plain-text and escape-heavy extremes; tag-dense input sees roughly unchanged k. For the dominant real-world case (normal rich text), k ≈ 4–6.
n = text.length // UTF-16 code units
For ASCII DSL fixtures, bytes ≈ code units ≈ characters. The benchmark inputs:
| Benchmark group | n |
|---|---|
| Full-document parse | 204,803 |
| Position tracking | 204,840 |
| Lightweight utilities | 200,067 |
For deep-nesting fixtures the test parameter is nesting depth d, but each layer adds a constant number of source characters, so n = c · d + O(1) and Θ(n) = Θ(d).
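For the default syntax, such a fixture can be generated as below. This is a sketch; the constant c = 7 assumes the `$$a(` … `)$$` layering shown earlier on this page, with a single-character payload.

```typescript
// Each nesting layer contributes 4 chars of head ("$$a(") and 3 chars of
// tail (")$$"), so source length is 7·d + 1 and Θ(n) = Θ(d).
function deepNestFixture(d: number): string {
  return "$$a(".repeat(d) + "x" + ")$$".repeat(d);
}
```

For example, `deepNestFixture(1)` yields `$$a(x)$$` (8 code units = 7·1 + 1).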
This section is kept as a historical baseline from the v1.1.9 era. It is useful for
understanding constant-factor scale under a fixed methodology, but it should not be read as the
current 1.4.3 release number set.
For current release-era numbers, use the Performance page. The table below stays here only as an older same-methodology reference point.
Environment: Kunpeng 920 / aarch64 / Node v24.14.0. The numbers below all come from
the same benchmark script (.tmp/bench-pos-single.mjs) under identical methodology:
- Input: fixed fixture, n = 204,803 bytes (~200 KB)
- Each cell = 20 fresh cold-boot node processes, run strictly sequentially with no concurrency, so JIT / heap state never leaks across trials
- Inside each process: 100 warmups (so V8 fully tiers up the parser — anything less leaves the measurement bimodal because the optimization threshold may or may not have fired by call 31) → 15 measurement runs, take the per-process median as that process's representative sample
- The reported ms = arithmetic mean of the 20 per-process medians; ns/cu = ms × 10⁶ / n
| API | tracking off | tracking on | overhead |
|---|---|---|---|
| `parseRichText` | 25.58 ms (124.9 ns/cu) | 34.06 ms (166.3 ns/cu) | +33.2% |
| `parseStructural` | 17.75 ms (86.7 ns/cu) | 22.61 ms (110.4 ns/cu) | +27.4% |
These are lower than the older "30.6 / 23.3 ms" wiki numbers for two reasons at
once: the v1.1.9 baseline already included several rounds of render / scanner
constant-factor wins on top of the version those older numbers came from; and
this methodology (100 warmup + 15-sample median × 20 cold processes)
is more stable than the old "3 warmup + 20 in-process samples for all 4 cases
in one process", where the in-process JIT state penalised whichever case ran
first. To reproduce, run node .tmp/bench-pos-single.mjs <target> directly.
Data represents the average of 5 independent process runs, each containing 50 internal samples.
| Operation | Time | Per unit |
|---|---|---|
| `printStructural` | ~2.00 ms | ~10.0 ns/char — Θ(total node text size) |
| `buildZones` | ~0.74 ms | Θ(top-level nodes); 1,897 zones from 6,321 nodes |
| `walkTokens` | ~0.50 ms | Θ(total tokens); 9,165 visits |
buildZones iterates only the top-level StructuralNode[] array (zones.ts), so its cost scales
with the number of top-level nodes, not with source length directly. The per-char figure
(~3.7 ns/char) is a derived number from this specific benchmark and should not be extrapolated to
inputs with different node-to-character ratios.
These constants are machine-specific and benchmark-specific; they are not comparable across groups.
The two parser entry points (parseStructural, parseRichText) and materializeTextTokens are
fully stack-iterative (explicit stack + while loop). All utility functions are now also
stack-iterative:
| Function | Recursion style | Stack-safe? |
|---|---|---|
| `printStructural` | Iterative (explicit stack + `while` loop) | Yes |
| `walkTokens` | Iterative (explicit stack + `while` loop) | Yes |
| `mapTokens` | Iterative (explicit stack + `while` loop) | Yes |
For typical document depths (hundreds of levels) this is not a concern. It will not overflow even for pathological inputs in the tens-of-thousands range.
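The explicit-stack pattern these utilities share can be reduced to a minimal sketch (simplified token shape, not the library's actual `Token` type or visitor API): an array stack replaces call-stack recursion, so depth is bounded by heap memory rather than engine stack limits.

```typescript
// Depth-first count of a token tree without recursion: a 100,000-deep
// chain walks fine where a recursive version would overflow the call stack.
interface Token { type: string; children?: Token[] }

function countTokens(roots: Token[]): number {
  let visited = 0;
  const stack: Token[] = [...roots].reverse();
  while (stack.length > 0) {
    const tok = stack.pop()!;
    visited++;
    if (tok.children) {
      // push in reverse so children are visited left-to-right
      for (let i = tok.children.length - 1; i >= 0; i--) {
        stack.push(tok.children[i]);
      }
    }
  }
  return visited;
}
```

Each node is pushed and popped exactly once, so the walk is Θ(total tokens) regardless of depth.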
The complexity is always Θ(n), but the wall-clock time for a given n depends on how much work each source character causes. Four input-shape dimensions dominate:
How often the scanner hits a tag prefix and enters open/close/gating logic.
| Input profile | Effective multiplier |
|---|---|
| Plain text / low DSL | ~1× |
| Normal rich text | ~1.5–3× |
| Dense inline tags | ~3–6× |
| Dense inline + deep nest | ~4–8× |
More nodes per character → more object allocation, more children array work, more token
materialization on the parseRichText path.
The flushBuffer path in the structural scanner uses a segment-pair optimization: for the common
case of 1–2 text segments it concatenates directly without allocating a temporary array
(structural.ts lines 312–315). This keeps the per-flush constant low for typical inputs.
Dense siblings and deep chains have similar n but different cache behavior and different
children attachment patterns. The Performance memory profile already shows this:
200 KB dense inline and 20k nested inline have different heap/RSS profiles at similar input scales.
parseRichText = structural scan + render materialization, so its constant is always higher than
parseStructural on the same input.
For engineering estimation, the total cost decomposes as:
T ≈ α · n_chars + β · n_tags + γ · n_nodes + δ · depth_max
For parseRichText, add + ε · n_tokens.
All these derived quantities are upper-bounded by n_chars, so the asymptotic class stays Θ(n).
But in practice these are the knobs that determine fast vs. slow for a given source length.
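As a sketch, the decomposition can be turned into a back-of-envelope estimator. All coefficients below are made-up placeholders to calibrate per machine, not measured library constants.

```typescript
// T ≈ α·n_chars + β·n_tags + γ·n_nodes + δ·depth_max (+ ε·n_tokens for
// parseRichText). Every term is linear in its metric, and every metric is
// upper-bounded by n_chars, so the asymptotic class stays Θ(n).
interface ParseMetrics {
  nChars: number;
  nTags: number;
  nNodes: number;
  depthMax: number;
  nTokens?: number; // parseRichText path only
}

function estimateCostNs(m: ParseMetrics): number {
  // Hypothetical per-unit costs in nanoseconds; calibrate on your machine.
  const alpha = 90, beta = 40, gamma = 30, delta = 5, epsilon = 20;
  return (
    alpha * m.nChars +
    beta * m.nTags +
    gamma * m.nNodes +
    delta * m.depthMax +
    (m.nTokens !== undefined ? epsilon * m.nTokens : 0)
  );
}
```

The value of such an estimator is comparative, not absolute: for two inputs of the same length it explains which one will be slower, based on the density knobs listed below.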
| Derived metric | What it captures |
|---|---|
| `n_chars` | Baseline scan cost |
| `n_tags` | Tag-prefix hits, open/close/gating branches |
| `n_nodes` | Object allocation, children arrays |
| `depth_max` | Explicit stack frame depth, cache locality |
| `n_tokens` | Render-layer materialization (rich text only) |
n is always source length; what varies across inputs is the per-character work density — driven by tag density, node density, nesting depth, and the render path.
Sections 1.1–1.2.3 analyse the initial parse, which is always Θ(n). Incremental parsing targets a fundamentally different cost model: after a document has been parsed once, subsequent edits should cost proportional to the changed region, not the full document.
The proof object in this section is the call-time cost of the update APIs. It is not “whatever lazy work may happen later is retroactively charged to the same call.”
Theorem U (single incremental update-path call)
T_update-call(n) = O(n)
Here T_update-call(n) means the worst-case time of one incremental update-path call
itself (core path: updateIncrementalInternal(...)). It includes:
- `assertValidEdit(...)`
- fingerprint comparison
- `findDirtyRange(...)`
- `reparseDirtyWindowUntilStable(...)`
- `isSafeRightReuse(...)`
- `deferShiftZone(...)`
- document assembly plus `installLazyDocument(...)`
- `cloneParseOptions(...)` when the incremental path is actually committed

It explicitly excludes two later-stage costs:

- lazy materialization triggered by a subsequent first read of `doc.tree` / `doc.zones`;
- incremental diff added by `session.applyEditWithDiff(...)` after the update completes.
Theorem S (session call)
T_session-applyEdit-call(n) = O(n)
because createIncrementalSession(...).applyEdit(...) only wraps updateIncrementalInternal(...)
with constant-cost strategy gating, sampling, and fallback orchestration.
This section uses the same structure as the main theorem body: cost decomposition first, support lemmas second, closure last. The spine is not the pipeline diagram by itself, but the additive formula:
```
T_edit-call = T_validate
            + T_fingerprint
            + T_dirtyFind
            + T_reparse
            + T_probe
            + T_shift
            + T_assemble
            + T_clone
```
The subsections below discharge these terms one by one. The final closure then uses Δ ≤ n, Z ≤ n, Z_right ≤ Z ≤ n, and the boundary assumption that H does not grow with document length n, to derive the worst-case O(n) bound.
The linear guarantee in this section first covers the update path itself (the path used by
session.applyEdit(...)):
`applyEditWithDiff(...)` is not a standalone exported function; it is a method on the object returned by `createIncrementalSession(...)`. Starting in 1.3.8, it adds an extra structural-diff phase after the update completes; that diff stage has its own complexity model and should not be conflated with the parser/updater Θ(n) / O(n) guarantee.
The implementation core is updateIncrementalInternal(...). Public update calls
reach it through createIncrementalSession(...), whose returned session object
layers adaptive strategy gating on top.
Variable definitions used throughout this section:
| Symbol | Definition |
|---|---|
| n | `newSource.length` — full document length after the edit |
| Δ | `edit.newText.length` — inserted text length |
| d_old | `edit.oldEndOffset - edit.startOffset` — deleted span in previous source |
| Z | number of zones in the previous snapshot (`prevZones.length`) |
| z_dirty | number of zones in the dirty range after expansion |
| H | total handler data size across `parseOptions.handlers` |
| W_dirty | character width of the dirty window (the reparsed slice) |
| Z_right | number of right-side reused zones |
| W_probe | character width of the seam-probe reparse window |
```mermaid
flowchart LR
  subgraph INCR["updateIncrementalInternal pipeline"]
    direction LR
    S1["① assertValidEdit<br/>O(Δ)"]
    S2["② fingerprint check<br/>O(H) / O(1)"]
    S3["③ zone guards<br/>O(1)"]
    S4["④ findDirtyRange<br/>O(Z)"]
    S5["⑤ reparseDirtyWindow<br/>≤ O(n)"]
    S6["⑥ seam probe<br/>O(W_probe)"]
    S7["⑦ deferShiftZone<br/>O(Z_right)"]
    S8["⑧ assemble<br/>O(Z)"]
    S1 --> S2 --> S3 --> S4 --> S5 --> S6 --> S7 --> S8
  end
  S3 -. " empty zones / low zone / tail gap " .-> FB["fullRebuild Θ(n)"]
  S2 -. " fingerprint mismatch " .-> FB
  S5 -. " budget exceeded " .-> FB
  S6 -. " signature mismatch " .-> FB
```
updateIncrementalInternal(...) executes a fixed sequence of pipeline stages.
Each stage is analysed independently; no stage feeds back into an earlier one.
Three checks:
- Range bounds: O(1) comparisons.
- Expected-length: one subtraction + one addition → O(1).
- Text match: `newSource.slice(edit.startOffset, edit.startOffset + edit.newText.length)` produces a string of length Δ, then a `===` comparison of that string against `edit.newText` → O(Δ).
T_validate(Δ) = O(Δ)
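A minimal sketch of these three checks (illustrative TypeScript — the `Edit` shape and the error messages are assumptions, not the library's actual signatures):

```typescript
interface Edit { startOffset: number; oldEndOffset: number; newText: string }

function assertValidEdit(newSource: string, oldLength: number, edit: Edit): void {
  // Range bounds — O(1) comparisons
  if (edit.startOffset < 0 || edit.oldEndOffset < edit.startOffset || edit.oldEndOffset > oldLength) {
    throw new RangeError("edit range out of bounds");
  }
  // Expected length — one subtraction + one addition, O(1)
  const expected = oldLength - (edit.oldEndOffset - edit.startOffset) + edit.newText.length;
  if (newSource.length !== expected) throw new RangeError("length mismatch");
  // Text match — slice of length Δ plus a === comparison, O(Δ)
  const inserted = newSource.slice(edit.startOffset, edit.startOffset + edit.newText.length);
  if (inserted !== edit.newText) throw new RangeError("inserted text mismatch");
}
```

Only the last check scales with the edit, which is exactly why T_validate is O(Δ) rather than O(n).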
Hashes the syntax fields (8 keys → O(1)), tagName (2 function refs → O(1)),
and allowForms (array of length k → O(k)). The handlers fingerprint is
folded in via a single u32 feed per handler entry.
T_fingerprint(H) = O(H) where H = handler count + |allowForms| + |syntax keys|
For a repeated call on the same doc.parseOptions object the result is
cached via getCachedOptionsFingerprint (WeakMap lookup → O(1)).
In the session path, options rarely change between edits, so the common case is
O(1).
Single-pass linear scan of prevZones that simultaneously tracks the overlap
range and the insertion index (for pure-insertion edits with no overlap).
The scan touches each zone at most once → O(Z).
After overlap detection the range is expanded by LEFT_LOOKBEHIND_ZONES = 1
(incremental.ts:431) on the left. The right boundary is the last overlapping
zone.
T_dirtyFind(Z) = O(Z)
This is the critical path. The function parses the slice
newSource[dirtyStartNew..dirtyEndNew] and checks whether the right boundary
of the produced tree aligns with the old zone boundary. If not, it expands
dirtyTo by one zone and re-parses the (now wider) window.
Each iteration:
- computes window boundaries → O(1)
- calls `parseWithPositions` on a window of width W_i → O(W_i)
- calls `buildZones` on the resulting tree → O(nodes in W_i)
- checks boundary stability → O(1)
In the worst case, expansion runs k times with successive window widths W_1, W_2, …, W_k. Each W_i ≥ W_{i−1} because the window only grows. The cumulative reparsed byte count is therefore Σ_{i=1}^{k} W_i — a potential O(k · W_k) quadratic term.
But: the cumulative budget guard enforces a fixed internal limit:
`cumulativeBudget = Math.max(newSource.length * 2, 1024)`
This is an implementation constant, not a public tuning knob. The loop checks
`nextCumulativeReparsedBytes > cumulativeBudget`.
Once exceeded, the function returns budgetExceeded = true immediately, and
updateIncrementalInternal falls back to fullRebuild() — which is a fresh
parseIncremental(newSource), itself O(n) (proven in §1.2).
Therefore the total reparse work across all expansion iterations is bounded
by min(2n, Σ W_i). If the budget is reached, the subsequent full rebuild is
O(n). In either path:
T_reparse(n) ≤ O(n)
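The budget logic can be sketched as follows (hypothetical signature — `widths` stands in for the successive window widths produced by boundary expansion, and the parse itself is elided):

```typescript
// Sketch of the cumulative reparse budget: each expansion iteration adds its
// window width to a running counter, and the guard fires BEFORE parsing a
// window that would push the counter past max(2n, 1024).
function reparseWithBudget(
  n: number,        // newSource.length
  widths: number[], // successive (growing) dirty-window widths
  stableAt: number, // iteration index at which the boundary stabilises
): { budgetExceeded: boolean; reparsedBytes: number } {
  const cumulativeBudget = Math.max(n * 2, 1024);
  let cumulative = 0;
  for (let i = 0; i < widths.length; i++) {
    const next = cumulative + widths[i];
    if (next > cumulativeBudget) {
      // Caller falls back to fullRebuild(), itself O(n).
      return { budgetExceeded: true, reparsedBytes: cumulative };
    }
    cumulative = next; // ...parse window of width widths[i] here, O(widths[i])...
    if (i + 1 >= stableAt) return { budgetExceeded: false, reparsedBytes: cumulative };
  }
  return { budgetExceeded: false, reparsedBytes: cumulative };
}
```

Because the guard fires before the over-budget parse runs, the reparsed byte total can never exceed 2n regardless of how many expansion iterations occur.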
Validates that the zones to the right of the dirty window can be reused without re-parsing. Detailed analysis follows in the next section.
Each surviving right zone is wrapped with a deferred offset delta. Per-zone cost is O(1) (see §Lazy delta shifting cost model below).
T_shift(Z_right) = O(Z − z_dirty)
`const nextRawZones = [...leftZones, ...dirtyZones, ...rightZones];` — three array spreads. Total elements = Z_left + z_dirty_new + Z_right ≤ Z + z_dirty_new. Since z_dirty_new ≤ Z, the total element count is O(Z):
T_assemble(Z) = O(Z)
This section proves the key invariant that prevents quadratic blowup in
reparseDirtyWindowUntilStable.
Claim. The total number of source bytes reparsed across all expansion
iterations of a single updateIncrementalInternal call is at most 2n, where
n = newSource.length.
Proof.
1. The cumulative byte counter `cumulativeReparsedBytes` is initialised to 0 at the start of `updateIncrementalInternal(...)`.
2. Each iteration of `reparseDirtyWindowUntilStable(...)` adds the current window width `reparsedWindowSize` to the counter.
3. Before parsing, the guard checks `if (nextCumulativeReparsedBytes > cumulativeBudget) return { budgetExceeded: true, … }`, where `cumulativeBudget = Math.max(newSource.length * 2, 1024)`.
4. If the guard fires, the caller invokes `fullRebuild()`, which is a single full-document parse — O(n) by the analysis of §1.2.
5. If the guard never fires, the sum of all window widths is ≤ 2n. Each window is parsed in O(W_i) time (linear-time parser, §1.2), so the total parse work is O(Σ W_i) ≤ O(2n) = O(n).
6. Therefore, regardless of whether the budget is exhausted:
   T_reparse(n) ≤ O(n)
Worst-case trigger. The budget fires when every zone boundary in the dirty window is "just unstable" — the right-boundary check at lines 791–793 fails each time, forcing one more zone into the window. Even in this pathological case, the budget caps total work at 2n before falling back to a fresh O(n) parse. The combined cost is O(n) + O(n) = O(n). ∎
isSafeRightReuse(...) re-parses a small window around
the seam (the boundary between dirty zones and reused right zones) and compares
the result against the old zones via structural signatures.
Probe window width.
probeZoneCount = min(RIGHT_REUSE_PROBE_ZONES, oldRightZones.length)
= min(2, oldRightZones.length)
probeWindowZones = min(probeZoneCount + RIGHT_REUSE_PROBE_EXTRA_ZONES,
oldRightZones.length)
= min(probeZoneCount + 1, oldRightZones.length)
The constants are currently hard-coded as RIGHT_REUSE_PROBE_ZONES = 2 and
RIGHT_REUSE_PROBE_EXTRA_ZONES = 1, so the probe covers at most 3 old zones
mapped to new-source offsets.
Let W_probe be the character width of those 3 zones in the new source.
The probe calls parseWithPositions(newSource.slice(…), …) on a slice of
length W_probe → O(W_probe) parse cost.
Signature comparison cost.
After parsing, the probe compares at most probeZoneCount (≤ 2) zones by
structural signature. The signature budget is:
RIGHT_REUSE_PROBE_SIGNATURE_NODE_BUDGET = 4096 // line 332
Initialised at line 510:
`const signatureBudget: SignatureBudget = { remaining: 4096 };`
`nodeSignature` (incremental.ts:500–560) consumes 1 budget unit per node
(line 410). If the budget reaches 0, signature computation returns undefined
and the probe bails — forcing a full rebuild.
Per node, nodeSignature calls fnvFeedStringBounded (hash.ts:36–49) which
touches at most BOUNDED_HEAD + BOUNDED_TAIL = 64 characters regardless of
content length (hash.ts:34–35). The remaining hash operations (fnvFeedU32,
fnvInit) are O(1).
T_signature_per_node = O(1) (bounded by 64-char window)
T_signature_total ≤ O(4096) = O(1) (constant budget)
Old-zone signatures are typically cached (WeakMap zoneSignatureCache,
incremental.ts:448) from the initial parseIncremental call (line 982), so
zoneSignature(expected) at line 521 is O(1).
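The bounded-hash idea can be sketched like this (a minimal illustration of hashing only a fixed head and tail of each string; the constants and helper names here are assumptions, not the exact contents of hash.ts):

```typescript
// FNV-1a over at most the first 32 and last 32 UTF-16 code units of a string,
// so per-node signature cost is O(1) regardless of content length.
const FNV_OFFSET = 0x811c9dc5;
const FNV_PRIME = 0x01000193;
const BOUNDED_HEAD = 32;
const BOUNDED_TAIL = 32;

function fnvFeedChar(hash: number, code: number): number {
  // xor-then-multiply step of FNV-1a, kept in u32 range
  return Math.imul(hash ^ code, FNV_PRIME) >>> 0;
}

function fnvFeedStringBounded(hash: number, s: string): number {
  if (s.length <= BOUNDED_HEAD + BOUNDED_TAIL) {
    for (let i = 0; i < s.length; i++) hash = fnvFeedChar(hash, s.charCodeAt(i));
    return hash;
  }
  // Long strings: hash only the head and tail windows; the middle is skipped.
  for (let i = 0; i < BOUNDED_HEAD; i++) hash = fnvFeedChar(hash, s.charCodeAt(i));
  for (let i = s.length - BOUNDED_TAIL; i < s.length; i++) hash = fnvFeedChar(hash, s.charCodeAt(i));
  return hash;
}
```

The trade-off is deliberate: two long strings that differ only in the skipped middle hash identically, which is acceptable for a probe that only needs cheap, conservative signatures and can always fall back to a full rebuild.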
Probe reparse cost.
W_probe spans at most 3 zone widths. In the typical case, zones are much smaller than n, so W_probe ≪ n. However, in the degenerate case of very few, very large zones, W_probe can approach n.
This is acceptable because even though the probe's reparsed bytes are not added to the `cumulativeBudget` counter of `reparseDirtyWindowUntilStable` (they are tracked separately as `probeSliceBytes`), the full-rebuild fallback at line 989 caps the combined cost at O(n) for the probe reparse plus O(n) for the fallback rebuild:
T_probe(n) = O(W_probe) + O(1)
≤ O(n) + O(1) (worst case)
= O(n)
cloneParseOptions (incremental.ts:258–284) deep-copies the parse options only
after all early full-rebuild guards have passed (1.2.5+). If any guard triggers
a full rebuild, no clone occurs — parseIncrementalInternal clones internally.
- Top-level spread: O(|fields|) = O(1) (fixed set of known keys).
- `cloneHandlersSnapshot` (incremental.ts:239–251): iterates handler keys and recursively deep-clones each handler's `data` property. Cost: O(H), where H = Σ (handler key count + nested data size).
- `allowForms` spread: O(|allowForms|), folded into H.
- `frozenSnapshotsWeakSet` guard (line 179): if the options object has already been cloned and frozen, the function returns immediately → O(1).
In the session path, if options is undefined (the common case — options
don't change between edits), the old snapshot is reused directly — no clone
at all.
T_clone(H) = O(H) per edit, where H is independent of n
When right zones survive reuse, their source offsets must be shifted by
delta = newSource.length − doc.source.length. Eagerly traversing every node
in every right zone would cost O(total_nodes_in_right_zones) — potentially
O(n). The implementation defers this work.
deferShiftZone (incremental.ts:768–782):
- `pendingShiftDeltas.get(zone)` — WeakMap lookup → O(1).
- Creates a new `Zone` object with shifted `startOffset`/`endOffset` but the same `nodes` array reference → O(1).
- `pendingShiftDeltas.set(newZone, totalDelta)` → O(1).
- Copies the signature cache entry if present → O(1).
T_deferShift_per_zone = O(1)
T_deferShift_total = O(Z_right) = O(Z − z_dirty)
materializeZone (incremental.ts:786–797):
Triggered lazily when a consumer first accesses doc.zones (line 682).
Calls shiftNode on every node in the zone. shiftNode uses an iterative
explicit stack, visiting every node in the subtree once:
T_materialize_per_zone = O(nodes_in_zone)
T_materialize_total = O(total_nodes) = O(n)
Crucially, materialization happens on first consumer access, not during the
edit. If the consumer never reads doc.zones (e.g., only reads doc.source),
materialization cost is 0.
If the consumer reads after k successive edits, the accumulated deltas are
composed: each deferShiftZone adds its delta to the existing pending delta
(line 635–649). Materialization still traverses each node exactly once,
applying the composed delta in a single pass:
T_materialize_after_k_edits = O(n) (same as after 1 edit)
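The defer-then-materialize pattern can be sketched in a few lines (illustrative shapes — the real `Zone` carries parse nodes, not bare offsets, and the cache-copying details are elided):

```typescript
// Edits record a pending delta per zone in O(1); the composed delta is
// applied to node offsets in a single pass on first consumer read.
interface Zone { startOffset: number; endOffset: number; nodes: number[] }

const pendingShiftDeltas = new WeakMap<Zone, number>();

function deferShiftZone(zone: Zone, delta: number): Zone {
  // Compose with any delta already pending on this zone — O(1) per zone.
  const prior = pendingShiftDeltas.get(zone) ?? 0;
  const shifted: Zone = {
    startOffset: zone.startOffset + delta, // zone boundaries shift eagerly
    endOffset: zone.endOffset + delta,
    nodes: zone.nodes,                     // same array reference; nodes untouched
  };
  pendingShiftDeltas.set(shifted, prior + delta);
  return shifted;
}

function materializeZone(zone: Zone): Zone {
  const delta = pendingShiftDeltas.get(zone) ?? 0;
  if (delta === 0) return zone;
  // Single pass over the zone's nodes, applying the composed delta once.
  const nodes = zone.nodes.map((offset) => offset + delta);
  pendingShiftDeltas.delete(zone);
  return { ...zone, nodes };
}
```

Two successive deferred shifts compose into one pending delta, so materialization after k edits still touches each node exactly once.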
Aggregate amortized analysis. Consider an editing session with k consecutive applyEdit
calls, after which the consumer reads doc.tree, triggering a single materialization.
- Each `applyEdit` costs ≤ O(W_dirty_i + Z + H), where W_dirty_i is the dirty window width for the i-th edit. The materialization flag is set (`dirty = true`), but no materialization is performed.
- The final materialization traverses all nodes, executing `shiftNode`'s explicit-stack iteration: each node is visited once. Total cost = O(node_count) = O(n).
Therefore, the total cost for k edits + 1 materialization:
Σᵢ T_edit(i) + T_materialize = Σᵢ O(W_dirty_i + Z + H) + O(n)
If each edit is local (W_dirty_i ≪ n), edit total ≤ k · O(W_dirty_max + Z + H), and the O(n) materialization cost is paid only once. Amortizing materialization across k edits:
amortized_per_edit = O(W_dirty + Z + H) + O(n/k)
When k ≥ n / W_dirty, the amortized materialization cost is ≤ O(W_dirty), matching the per-edit cost. This makes the payoff of lazy materialization precise: frequent edits plus deferred reads yield optimal amortization.
Combining all pipeline stages:
T_edit(n, Δ, Z, H)
≤ T_validate(Δ) + T_fingerprint(H) + T_dirtyFind(Z)
+ T_reparse(n) + T_probe(n) + T_shift(Z)
+ T_assemble(Z) + T_clone(H)
≤ O(Δ) + O(H) + O(Z) + O(n) + O(n) + O(Z) + O(Z) + O(H)
= O(n + Δ + Z + H)
Since Δ ≤ n (inserted text cannot exceed new source length) and Z ≤ n (each zone spans ≥ 1 character) and H is independent of n:
T_edit(n) ≤ O(n) (worst case, dominated by reparse budget or full-rebuild fallback)
For the typical case where the edit is localized (small Δ, small dirty window, most zones reused):
T_edit_typical ≤ O(W_dirty + Z_right + H)
where W_dirty is the dirty window width (proportional to edit locality, not to n) and Z_right is the number of right zones getting lazy-shifted (O(1) each). In an editor scenario with single-character keystrokes, W_dirty is typically a few hundred characters and Z_right is the number of zones after the cursor — each costing O(1) via deferred shifting.
Formal conditions for "typical local edit". The W_dirty ≪ n assumption does not hold for all edits. The following are sufficient conditions for sub-linear behavior:
- Bounded edit span: the replacement interval `[replaceStart, replaceEnd)` satisfies replaceEnd − replaceStart ≤ δ, where δ is a constant independent of n (e.g., δ = 0 for single-character insertion, δ ≤ line_width for single-line replacement).
- Edit does not cross zone boundaries: the replacement interval falls entirely within a single zone z_i. This ensures `findDirtyRange` only inspects z_i and its neighbors, giving W_dirty ≤ |z_i| + |z_{i±1}|.
- Edit does not break tag closure structure: if the edit inserts or deletes syntax delimiters (`$$`, `(`, `)`, etc.), it may trigger large-scale reparsing. "Typical" edits modify characters within text regions, not syntax boundary characters.
- Reparse budget not exhausted: `cumulativeBudget = Math.max(newSource.length * 2, 1024)` is not consumed by a single reparse. If the dirty window's reparsing exceeds this budget, it falls back to a full rebuild (O(n)).

Under these four conditions: W_dirty ≤ |z_max| + |z_max| = 2 · |z_max|, where |z_max| is indirectly bounded by `softZoneNodeCap` (default 64 nodes; invalid user values are normalized back to that default) — each zone contains at most 64 nodes, and each node's source span has a practical upper bound in typical documents. Therefore W_dirty = O(1) relative to n, giving T_edit_typical = O(Z + H) = O(Z).
The closed-form bound makes the Z dependency explicit: findDirtyRange,
deferShiftZone, and assemble all cost O(Z). When the edit is highly
localised (W_dirty ≪ n), Z becomes the dominant term. More zones ≠ faster.
Zone splitting changes the source of Z. Before 1.2.4, Z was entirely
determined by the distribution of raw / block nodes ("hard zone breakers")
in the document. 1.2.4 introduces softZoneNodeCap (default 64): non-breaker
nodes are flushed into a new zone every time the accumulator reaches the cap.
Invalid or non-positive user values are normalized back to the default, so this
cap remains active. The resulting zone count is approximately:
Z ≈ N_breaker + ⌈N_nonbreaker / softNodeCap⌉
where N_breaker is the raw/block node count and N_nonbreaker is the
text/inline/escape/separator node count. For a given document, smaller
softNodeCap → more zones → narrower dirty windows (W_dirty ↓) but higher
scaffold cost (Z ↑).
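The zone-count approximation can be sketched as a soft-cap flush loop (a minimal model, not the library's `buildZones`; the node shape and `AstNode` name are illustrative):

```typescript
// Hard breakers (raw/block nodes) always terminate a zone; non-breaker nodes
// are flushed into a new zone every time the accumulator reaches softCap,
// yielding roughly Z ≈ N_breaker + ⌈N_nonbreaker / softCap⌉.
type AstNode = { kind: "raw" | "block" | "text" | "inline" };

function countZones(nodes: AstNode[], softCap = 64): number {
  let zones = 0;
  let acc = 0; // non-breaker nodes accumulated in the current zone
  for (const node of nodes) {
    if (node.kind === "raw" || node.kind === "block") {
      if (acc > 0) { zones++; acc = 0; } // flush pending non-breakers first
      zones++;                           // the breaker forms its own boundary
    } else if (++acc === softCap) {
      zones++; acc = 0;                  // soft cap reached: start a new zone
    }
  }
  if (acc > 0) zones++;                  // trailing partial zone
  return zones;
}
```

For example, 130 pure non-breaker nodes at the default cap of 64 split into ⌈130/64⌉ = 3 zones, while every interleaved raw/block node adds a zone of its own plus a split point on each side.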
Empirical data (1 MB document, Kunpeng 920 aarch64, Node v24.14.0, head-of-document single-character replacement):
| Zone count | Source | Incremental median | Full parse | Speedup | Notes |
|---|---|---|---|---|---|
| 264 | Pure inline, softCap=64 | ~12 ms | ~127 ms | ~10× | Sweet spot |
| 1 571 | Sparse raw (every 50 lines) | ~12 ms | ~130 ms | ~10× | Sweet spot |
| 3 757 | Moderate raw (every 20 lines) | ~15 ms | ~130 ms | ~9× | Still good |
| 7 015 | Dense raw (every 10 lines) | ~19 ms | ~130 ms | ~7× | Scaffold rising |
| 17 871 | Ultra-dense raw (every 3 lines) | ~38 ms | ~136 ms | ~3.5× | Z dominates |
Head-of-document edits are worst-case direction (Z_right ≈ Z); tail edits are faster.
Sweet spot: a few hundred to a few thousand zones. In this range the dirty
window is narrow enough while scaffold overhead remains negligible, yielding
~10× speedup. Above ~10 000 zones, the scaffold cost (primarily the single-pass
scan in findDirtyRange and per-zone object creation in deferShiftZone)
consumes most of the savings.
This is why SOFT_ZONE_NODE_CAP defaults to 64 rather than a smaller value —
it keeps Z in the hundreds-to-thousands range for typical documents, landing
squarely in the sweet spot.
When prevZones.length ≤ 1 (the entry guard in updateIncrementalInternal(...)), the incremental path has no
reusable zone boundaries — the dirty window is the entire document. In this
case the findDirtyRange + seam probe + assemble overhead is pure waste.
1.2.4 adds a guard at the entry of updateIncrementalInternal: if
prevZones.length ≤ 1, it immediately calls fullRebuild(), skipping the
entire incremental pipeline. This guarantees that single-zone documents (pure
text, pure inline, no handlers, etc.) are never slower than a bare full
parse.
createIncrementalSession (src/incremental/incremental.ts) wraps
tryUpdateIncrementalInternal with an adaptive strategy layer.
applyEdit:
| Step | Cost |
|---|---|
| Strategy check (`strategy === "full-only"`) | O(1) |
| Edit ratio computation | O(1) |
| Ratio gate (`editRatio > maxEditRatioForIncremental`) | O(1) |
| Cooldown gate (`preferFullMode && cooldownRemaining > 0`) | O(1) |
| `tryUpdateIncrementalInternal` call | T_edit |
| Timing: `now()` before + after | O(1) |
| `recordBounded(incrementalDurations, …)` | O(1) |
| `recordBounded(fallbackMarks, …)` | O(1) |
| `maybeAdaptPolicy()` | see below |
maybeAdaptPolicy:
- Early exit if `strategy !== "auto"` → O(1).
- `average(fallbackMarks)` iterates at most `sampleWindowSize` elements → O(sampleWindowSize).
- `average(incrementalDurations)` and `average(fullDurations)` → same bound.
- `sampleWindowSize` is bounded by session options (default 24); the `recordBounded` helper enforces `bucket.length ≤ sampleWindowSize` by shifting excess entries.
T_adaptPolicy = O(sampleWindowSize) = O(24) = O(1)
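The bounded sample buckets behind this O(1) bound can be sketched as follows (the helper names mirror the text, but these exact signatures are assumptions):

```typescript
// A bounded sample bucket: pushing past the window evicts the oldest entry,
// so both recording and averaging are O(sampleWindowSize) = O(1).
function recordBounded(bucket: number[], value: number, sampleWindowSize = 24): void {
  bucket.push(value);
  while (bucket.length > sampleWindowSize) bucket.shift(); // drop oldest samples
}

function average(bucket: number[]): number {
  if (bucket.length === 0) return 0;
  let sum = 0;
  for (const v of bucket) sum += v;
  return sum / bucket.length;
}
```

Because the bucket length never exceeds the fixed window, adaptive policy decisions cost the same no matter how many edits the session has processed.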
Therefore the total session overhead per edit is O(1) — constant, independent of n:
T_session_edit = T_edit + O(1) = same asymptotic class as the raw update core path
applyEditWithDiff(...) has two parts:
T_applyEditWithDiff = T_applyEdit + T_diff
T_applyEdit has already been proven worst-case O(n). The additional cost comes
from T_diff.
From 1.4.0+, this diff path is also explicitly budgeted by default:
- `refinementDepthCap = 64`
- `maxComparedNodes = 20_000`
- `maxAnchorCandidates = 128`
- `maxOps = 512`
- `maxSubtreeNodes = 256`
- `maxMilliseconds = 8`
There is also a hard cutoff:
MAX_FULL_FALLBACK_DIFF_REFINEMENT_SOURCE_LENGTH = 20_000
If the session is already in full-fallback mode and either side exceeds that size,
the implementation skips deep refinement and goes directly to conservative diff.
These budgets and cutoffs are there to bound degradation and allow coarser output.
They do not turn session.applyEditWithDiff(...) into a linear-time guarantee.
When fine-grained refinement throws, or when the session has already taken the
full-fallback route on a large document, the implementation switches to
buildConservativeTokenDiff(...).
That path does only three meaningful things:
- reads `previousDoc.tree.length` / `nextDoc.tree.length` → O(1)
- builds the whole-tree `patches`/`ops` envelope → O(1)
- copies `oldNodes`/`newNodes` once with `slice()` → O(N_old + N_new)
Let:
N_old = previousDoc.tree.length
N_new = nextDoc.tree.length
N_root = N_old + N_new
Then:
T_diff_conservative = O(N_root)
So the conservative route remains linear.
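The shape of that conservative envelope can be sketched in a few lines (illustrative — the real diff emits richer op/patch records than this placeholder):

```typescript
// A whole-tree conservative diff: a single replace-all op whose cost is
// dominated by two slice() copies, i.e. O(N_old + N_new) overall.
function buildConservativeTokenDiff<T>(oldNodes: readonly T[], newNodes: readonly T[]) {
  return {
    ops: [{
      kind: "replace-all" as const,
      removedCount: oldNodes.length,  // O(1) length reads
      insertedCount: newNodes.length,
      oldNodes: oldNodes.slice(),     // O(N_old) defensive copy
      newNodes: newNodes.slice(),     // O(N_new) defensive copy
    }],
  };
}
```

The point is that degradation trades patch granularity for a guaranteed linear bound: the consumer gets one coarse op instead of a minimal edit script.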
The refined route is driven by diffNodeArrays(...) plus
processNestedNodeArrayDiffs(...). It is not a single left-to-right pass: it
may perform anchor discovery, partition a range, then continue refining the
remaining subranges and nested containers.
Define:
M = total old/new nodes actually refined by this diff
P = total emitted structural diff ops
For any one container-range pair with combined length m,
diffNodeArrays(...) performs:
- Prefix/suffix trimming — linear scan from both ends → O(m)
- Anchor search via `findAnchors(...)`:
  - old/new signature-count maps → O(m)
  - candidate sorting → O(a log a), a ≤ m
  - LIS extraction → O(a log a)
  - so the anchor phase is O(m log m)
- If no anchors survive:
  - equal-length branch: aligned comparison + optional nested-task enqueue → O(m)
  - non-equal-length branch: splice fallback + node copying → O(m)
Therefore one range satisfies:
R(m) ≤ O(m log m) + child-range refinement cost
In the worst case, each anchor step isolates only a tiny stable island and
leaves one large subrange of size m - 1 for the next iteration. That gives:
R(m) = R(m - 1) + O(m log m)
Unrolling the recurrence:
R(m) = O(Σ_{i=1..m} i log i) = O(m² log m)
So fine-grained structural diff is not Θ(n) in the worst case. Its goal is high-quality local change description, not the same strict linear bound the parser itself provides.
- `diffRefinementDepthCap` only limits how far refinement continues into nested `children`/`args`.
- Once the depth cap is exceeded, the algorithm degrades to a container-level splice.
- This bounds deep-chain refinement cost, but it does not remove the worst-case O(m² log m) behaviour caused by repeated anchor partitioning on wide ranges.
- After refinement, the final `ops` array is sorted once:
T_sort_ops = O(P log P)
So for applyEditWithDiff(...):
T_applyEditWithDiff
= T_applyEdit + T_diff
≤ O(n) + min(
O(N_root), // conservative diff
O(M² log M + P log P) // refined diff
)
In the common local-edit case, M is usually limited to a small root slice plus
a small number of nested containers, so real cost is far below this upper
bound. But the strict linear-time guarantee belongs to the parser / updater,
not to fine-grained diff refinement.
| Operation | Worst case | Typical (local edit) | Notes |
|---|---|---|---|
| `parseIncremental` (initial) | Θ(n) | Θ(n) | Equivalent to `parseStructural` + `buildZones` |
| `session.applyEdit` | O(n) | O(W_dirty) | Plus O(1) strategy overhead |
| `session.applyEditWithDiff` | Typically near O(n + T_diff_local); under default budgets the worst case is forced back toward an O(n)-bounded update path via coarse degradation | O(W_dirty + T_diff_local) | Fine-grained diff is not strictly linear, but the default session now hard-stops once compare / anchor / op / subtree / wall-clock budgets are exceeded |
| Materializing `doc.zones` | O(n) | O(n) | Lazy delta applied in single pass |
In the common editor scenario (keystroke-level edits, W_dirty ≪ n):
Full parse per keystroke: Θ(n) × k keystrokes = Θ(k·n)
Incremental per keystroke: O(W_dirty) × k = O(k·W_dirty) (+ one O(n) materialization)
The default session diff path also carries hard refinement budgets: if the root pass exceeds budget it immediately falls back to a whole-tree conservative diff; if nested refinement exceeds budget it keeps the root patch shape but degrades to root-level splice ops. In other words, **the worst case no longer keeps paying unbounded cost just to chase a smaller patch**.
Those budgets are accumulated through an internal `DiffBudgetState`; exhaustion is not treated as an exception, but as an ordinary control-flow branch that immediately coarsens the diff.
In addition, when an edit stays on the incremental update path, root-level fine diff is now constrained to the source window of the incremental dirty zone before anchor / nested refinement begins. That keeps the normal-case cost closer to `O(W_dirty + T_diff_local)` instead of refining the entire root tree.
If you care which version introduced this as the default behavior, see [Version Semantics Notes](en-Version-Semantics).
The incremental path wins whenever k · W_dirty + n < k · n, i.e. whenever k > n / (n − W_dirty). For W_dirty < n/2 this already holds at k = 2, and for keystroke-scale edits (W_dirty ≪ n) at essentially any k ≥ 2.
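The break-even condition is easy to check numerically (a trivial helper; the parameter values below are illustrative, not benchmarks):

```typescript
// Returns true when k incremental edits plus one O(n) materialization beat
// k full parses: k·W_dirty + n < k·n, i.e. k > n / (n − W_dirty).
function incrementalWins(n: number, wDirty: number, k: number): boolean {
  return k * wDirty + n < k * n;
}
```

With a 1 MB document and a ~300-character dirty window, two keystrokes already favor the incremental path, while a single edit never can (the one-time O(n) cost is unavoidable either way).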
Three orthogonal axes produce worst-case behaviour specific to the incremental path.
Construction. A document where every zone's last node is a partial tag that can absorb content from the next zone. Each expansion step adds one zone, and the right-boundary stability check fails every time.
Analysis. Expansion proceeds until either:
- The boundary stabilises (loop terminates), or
- `cumulativeReparsedBytes > cumulativeBudget = max(2n, 1024)`.
Window widths form an increasing sequence W_1 < W_2 < … < W_k. Even without
overlap, Σ W_i ≤ 2n before the budget fires. The subsequent fullRebuild() is
O(n). Total cost:
T_worst_expand = O(2n) + O(n) = O(n)
The potentially quadratic Σ_{i=1}^{k} W_i term never materialises: the budget prevents the sum from exceeding 2n.
Construction. A document with Z = 3 zones of roughly equal width n/3 each. An edit touches zone 1; zones 2–3 are right candidates. The probe window spans min(3, 2) = 2 zones of width ≈ 2n/3.
1.2.4 note: The low-zone guard (`prevZones.length ≤ 1` in `updateIncrementalInternal(...)`) diverts such documents to full rebuild before the probe is reached. Therefore the degenerate case of a single giant zone never hits the seam probe in practice. However, Z = 2–3 still reaches the probe with large windows.
Analysis. Probe reparse: O(2n/3) = O(n). Signature comparison: bounded by
RIGHT_REUSE_PROBE_SIGNATURE_NODE_BUDGET = 4096 nodes, each costing
O(1) via fnvFeedStringBounded (hash.ts:36–49, bounded to 64 characters).
Signature cost = O(4096) = O(1).
T_worst_probe = O(n) + O(1) = O(n)
Combined with reparse budget: the total across dirty-window reparse + probe is at most O(n) + O(n) = O(n). If both hit their worst case simultaneously, the full-rebuild fallback caps the total at O(n).
Construction. The caller passes a different parseOptions object on every
applyEdit call, causing the fingerprint check (incremental.ts:1052–1076) to
detect a mismatch every time.
Analysis. Mismatch triggers fullRebuild() immediately — before any dirty
range computation, reparse, or probe. Each full rebuild is O(n).
T_worst_fingerprint = O(n) per edit
This is the expected fallback behaviour, not a bug. The fingerprint check is
O(H) — one hash computation — which is negligible compared to the O(n) rebuild
it triggers. The session's maybeAdaptPolicy will detect the repeated
fallbacks and switch strategy to "full-only" after minSamplesForAdaptation
(default 6) samples, eliminating the fingerprint overhead entirely.