Version Semantics - chiba233/yumeDSL GitHub Wiki

Version Semantics Notes

This page collects version-specific semantic or observable behavior changes.

It does not repeat the full API reference. Field signatures, usage details, and complexity derivations still live on their topic pages. This page focuses on:

  • behavior differences that matter when upgrading
  • contract changes across versions of the same public API
  • user-visible recovery / degradation differences outside fully valid input

Compatibility notes

These are upgrade-sensitive notes that are often asked about even when they are not specific to incremental parsing. For the API details, jump to the linked wiki pages:

| Topic | Since | Notes |
| --- | --- | --- |
| Same tag name supports both inline + block/raw | Earlier than 1.0.7 | The capability already existed before 1.0.7; that release fixed an inline-close bug where inline $$tag(...)$$ could incorrectly consume the following newline by applying block normalization rules. See DSL Syntax / Writing Tag Handlers |
| createParser partial-override deep merge | 1.0.11+ | See API Reference / DslContext |
| declareMultilineTags supports inline form (MultilineForm: "inline") | 1.0.14+ | See Handler Helpers |
| Implicit inline shorthand name(...) | 1.3.0+ | See DSL Syntax / Error Handling |

onError version semantics (1.1.x)

This is a compatibility note, not a performance note. The summary below compares the same fixture set across parseRichText and stripRichText, looking only at the error-code sequence delivered to the onError callback.

| Behavior group | Versions |
| --- | --- |
| A | 1.1.0, 1.1.5 |
| B | 1.1.1 |
| C | 1.1.2, 1.1.3, 1.1.4 |

So onError semantics inside 1.1.x did not evolve monotonically. There were at least two behavior drifts. Compatibility checks now use 1.1.2 / 1.1.3 / 1.1.4 as the onError baseline.

The practical differences can be summarized in three lines:

  • 1.1.0 and 1.1.5 form one group: on parseRichText / stripRichText they more readily report block-degrade paths as INLINE_NOT_CLOSED
  • 1.1.1 is its own group: compared with 1.1.2 to 1.1.4, it does not report BLOCK_CLOSE_MALFORMED for the "[Block/Unknown] text mixed with unknown block" class
  • 1.1.2 / 1.1.3 / 1.1.4 form one group: those three versions share the same onError semantics, and that group is now treated as the compatibility baseline

A further cross-version check across the final mirror baselines (1.1.10 / 1.2.7 / 1.3.9 / 1.4.1 / current workspace) shows one thing:

  • the onError code sequences for parseRichText / stripRichText stay aligned across that set
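
The grouping above can be reproduced mechanically: record the error-code sequence each version delivers to onError for the same fixture, then bucket the versions whose sequences match. The sketch below is a minimal, self-contained illustration. `groupBySequence` is a hypothetical helper (not part of yumeDSL), and the `recorded` values are placeholders chosen only to mirror the A / B / C split, not measured output.

```typescript
// Hypothetical helper: bucket versions whose recorded onError code sequences are identical.
type CodeSequences = Record<string, readonly string[]>; // version -> codes in callback order

function groupBySequence(recorded: CodeSequences): string[][] {
  const buckets = new Map<string, string[]>();
  for (const version of Object.keys(recorded)) {
    const key = JSON.stringify(recorded[version]); // order-sensitive fingerprint
    const bucket = buckets.get(key);
    if (bucket) bucket.push(version);
    else buckets.set(key, [version]);
  }
  return Array.from(buckets.values());
}

// Placeholder data for illustration only; real sequences come from running each version
// against the shared fixture set.
const recorded: CodeSequences = {
  "1.1.0": ["INLINE_NOT_CLOSED"],
  "1.1.1": [],
  "1.1.2": ["BLOCK_CLOSE_MALFORMED"],
  "1.1.3": ["BLOCK_CLOSE_MALFORMED"],
  "1.1.4": ["BLOCK_CLOSE_MALFORMED"],
  "1.1.5": ["INLINE_NOT_CLOSED"],
};

const groups = groupBySequence(recorded);
```

Because the fingerprint is order-sensitive, two versions that emit the same codes in a different order land in different groups, which matches the "error-code sequence" comparison described above.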

Incremental parsing / diff timeline

| Version | What changed | Integration impact |
| --- | --- | --- |
| 1.2.0 | Introduced parseIncremental plus low-level update helpers | Incremental structural cache first appeared |
| 1.2.1 | Added createIncrementalSession(...) + adaptive auto strategy | Session-first integration became recommended |
| 1.2.2 | Added right-reuse seam probe, option fingerprint checks, internal rebuild accounting | Behavior became safer but more conservative |
| 1.2.3 | Removed low-level updater exports, trimmed session-only types | Public integration is session-first |
| 1.2.4 | Right-side lazy delta shifting; bounded signature sampling; zone splitting for pure-inline (softZoneNodeCap); low-zone guard | Head-of-file edit performance greatly improved; pure-inline documents benefit from incremental |
| 1.2.5 | cloneParseOptions deferred; full-rebuild reuses position tracker; findDirtyRange fused into a single pass | Reduced constant overhead on full-rebuild paths |
| 1.3.8 | Added applyEditWithDiff(...); multi-island diff; path-aware ops | Callers can consume structural diffs directly |
| 1.3.9 | Added TokenDiffResult.isNoop; clarified empty-tree no-op semantics for conservative diff; under tight budgets the diff is still more willing to emit fine-grained set-text ops | Callers can distinguish "edit happened but structure did not change" from an ordinary non-empty diff; integrations that depend on fine-grained diff payloads are closest to the older style here |
| 1.4.0 | Incremental APIs are documented as a stable integration surface; session fallback and diff expectations are versioned as stable behavior; default session diff now uses hard budgets plus dirty-window-localized refinement | Existing 1.3.x integrations can upgrade directly to 1.4.x, but should treat session-first behavior as the stable contract; worst-case diff now coarsens earlier and more readily collapses to conservative splice output |
| 1.4.1 | Deep malformed inline edits no longer fall onto an extremely slow EOF-recovery path; EOF recovery semantics are unified across full parse / dirty-window reparse / seam probe | Editor-style deletion of many trailing )$$ closers no longer tends to stall; whole-document and incremental recovery behavior now stay aligned |
| 1.4.2 | extractText(singleToken) fix; createEasySyntax({ closeMiddle }) now actually affects close-derivation; added filterTokens | Helper / convenience API runtime behavior aligns with type-level story; no incremental-path changes |
| 1.4.3 | Main-loop fast text skip (findNextBoundaryChar, charCodeAt boundary-code comparisons); escape token first-char bucketing; getTagCloserTypeWithCache for large frames; ParseFrame slimmed | Internal constant-factor optimization only, no change to incremental public semantics; full-parse throughput improved |
| 1.4.4 | Fixes the deep shorthand close-ownership regression: ParseFrame now carries ancestorEndTagOwnerIndex, so shorthand close conflicts defer again to the nearest ancestor full-form endTag. At the same time, malformed shorthand recovery is intentionally stricter: once shorthand degrades, it is treated as text and no longer rebuilt through owner-chain recovery; EOF-unclosed shorthand/inline hands control back to the parent from argStart. Renderer gains text-merge buffering and lazy-alloc raw-close unescape scanning | Deep shorthand inputs such as =bold<bold<bold<...>>>>>>= no longer steal the ancestor full-form close; malformed shorthand now produces a coarser, more text-like parseStructural shape than 1.4.3; renderer throughput improved on text-heavy and raw-heavy documents; no incremental public semantics change |
| 1.4.5 | Fixed inline blockTag render-position end mapping: when an inline tag is explicitly declared as a blockTag (forms: ["inline"]) and consumes the first post-close line break, token position.end now includes that consumed break, matching the existing raw/block path; both LF and CRLF are counted by actual consumed width (+1 / +2) | In parseRichText, inline(blockTag) token position now matches the normalized render semantics, and aligns exactly with the next text token start; no public API changes and no incremental public-semantic changes |
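
The 1.3.9 row mentions that callers can tell "edit happened but structure did not change" apart from an ordinary diff. A minimal sketch of how a consumer might branch on that flag follows. `DiffLike` is a local stand-in, not the library's TokenDiffResult: only `isNoop` is taken from this page, while the `ops` field name and `decideRenderAction` are assumptions made for illustration.

```typescript
// Stand-in for the library's TokenDiffResult. Only `isNoop` is taken from this page;
// the `ops` field name and its element type are assumptions for illustration.
interface DiffLike {
  isNoop: boolean;
  ops: readonly unknown[];
}

type RenderAction = "skip" | "patch";

// Decide what a renderer should do after an edit has been applied to a session.
function decideRenderAction(diff: DiffLike): RenderAction {
  // The edit happened, but the structure did not change: nothing to re-render.
  if (diff.isNoop) return "skip";
  // Otherwise apply whatever ops were reported, fine-grained or coarse.
  return "patch";
}
```

Keeping the "patch" branch agnostic about op granularity is deliberate here, since the 1.4.0 row notes that worst-case diffs may collapse to coarser splice output.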

Public type / API-surface changes

This section tracks upgrade notes where runtime behavior may be unchanged, but exported helpers, callback signatures, config fields, or the visible TypeScript surface become stricter or more explicit.

Changes to existing public surface

| Version | Change | Integration impact |
| --- | --- | --- |
| 0.1.4 | ParseError.code narrowed from string to ErrorCode | Error-code matching, serialization, and pattern-matching can become union-based |
| 0.1.7 | TextToken gained an index signature ([key: string]: unknown) | Handler-added fields became visible to the type system |
| 1.0.0 | Removed ParseOptions.mode = "highlight" | Older code still passing "highlight" must migrate to parseStructural |
| 1.0.4 | Existing TagHandler signatures gained optional ctx?; multiple existing helpers/utilities also gained ctx? | Older handlers still work, but wrappers and hand-written types should include the new parameter |
| 1.0.6 | ParserBaseOptions gained baseOffset? / tracker? | Wrapper types around substring/slice parsing should include the new fields |
| 1.0.8 | withSyntax, withCreateId, withTagNameConfig, and getSyntax gained explicit { suppressDeprecation?: boolean } | The old ambient-state helpers had a compatible public-signature expansion |
| 1.0.14 | MultilineForm expanded from raw \| block to raw \| block \| inline | Code that declares or exhaustively matches MultilineForm must include "inline" |
| 1.1.0 | Parser.print() gained per-call PrintOptions override support; deriveBlockTags / resolveBlockTags parameter types narrowed from Record<string, unknown> to Record<string, TagHandler> | Print wrappers and direct helper calls should follow the newer signatures |
| 1.2.3 | Removed old low-level updater root exports; IncrementalDocument no longer exposes optionsFingerprint | Incremental integration moved to session-first, and code depending on old exports/fields must migrate |
| 1.2.4 | IncrementalSessionOptions gained softZoneNodeCap? | Pure-inline zone granularity became caller-configurable |
| 1.3.5 | createEasyStableId gained disambiguationScope?: "parse" \| "lifetime", and the default semantics changed to parse-scoped reset | Reusing one stable-ID generator across parses now has different default ID semantics |
| 1.3.9 | TokenDiffResult gained isNoop | Downstream code can branch explicitly on "edit happened but structure did not change" |
| 1.4.0 | IncrementalSessionOptions gained diff?, introducing IncrementalDiffRefinementOptions | Diff-budget refinement moved into the caller-visible config surface |
| 1.4.x | IncrementalDocument.zones / tree became readonly | Code that mutates snapshot arrays may now fail to type-check |
| 1.4.2 | createEasySyntax parameter type expanded from Partial<SyntaxInput> to Partial<SyntaxInput> & { closeMiddle?: string } | New optional field, backward compatible; when present it participates in rawClose / blockClose derivation, but does not leak onto the returned runtime object |
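
The 1.0.14 row is the kind of change that an exhaustive match catches at compile time. The sketch below shows the usual `never` check. The `MultilineForm` union is redeclared locally so the snippet is self-contained (real code would import the type from the library), and `describeForm` is a hypothetical function used only for illustration.

```typescript
// Local mirror of the union described in the table (1.0.14+); in real code, import the
// type from the library instead of redeclaring it.
type MultilineForm = "raw" | "block" | "inline";

function describeForm(form: MultilineForm): string {
  switch (form) {
    case "raw":
      return "raw multiline body";
    case "block":
      return "block multiline body";
    case "inline":
      return "inline form (added to the union in 1.0.14)";
    default: {
      // If the union grows again, this assignment stops type-checking.
      const unreachable: never = form;
      return unreachable;
    }
  }
}
```

Code written against the older two-member union would have failed at the `never` assignment after upgrading, which is the intended signal that a new case needs handling.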

Newly added public surface

| Version | Added surface | Why it matters |
| --- | --- | --- |
| 0.1.5 | createParser(), Parser | Pre-bound parser configuration became a formal public entry |
| 0.1.8 | allowForms, TagForm, createSimpleInlineHandlers, createSimpleBlockHandlers, createSimpleRawHandlers, declareMultilineTags, createPassthroughTags | Tag-form gating and bulk handler helpers first entered the public surface |
| 0.1.12 | tagName, TagNameConfig, DEFAULT_TAG_NAME, createTagNameConfig, MultilineForm, BlockTagInput, BlockTagLookup | Custom tag-name rules and multiline config types became public |
| 0.1.18 / 0.1.19 | parseStructural(...), parser.structural(), ParserBaseOptions, StructuralNode, StructuralParseOptions | Structural parsing became a first-class public entry |
| 1.0.2 | trackPositions, SourcePosition, SourceSpan, TextToken.position?, StructuralNode.position? | Source-position tracking entered the public result surface |
| 1.0.5 | createPipeHandlers(...), createTextToken(...), convenience readers on PipeArgs | Pipe-helper integration was reorganized around newer public utilities |
| 1.0.9 | createEasyStableId(options?), EasyStableIdOptions | Stable-ID generation entered the public API surface |
| 1.0.15 | buildZones(...), Zone | Zone-level structural caching first gained a public helper/type pair |
| 1.1.0 | NarrowToken, NarrowDraft, NarrowTokenUnion, createTokenGuard | Token-narrowing helpers entered the public surface |
| 1.2.0 | parseIncremental, updateIncremental, tryUpdateIncremental | Incremental structural updates first became public |
| 1.2.1 | createIncrementalSession(...) plus session result types/strategy knobs | Session-first editor integration first became public |
| 1.3.0 | implicitInlineShorthand, InlineShorthandOption, implicitInlineShorthand?: true, SHORTHAND_NOT_CLOSED | Shorthand config, node markers, and shorthand-specific diagnostics entered the public surface |
| 1.3.8 | applyEditWithDiff(...), IncrementalSessionApplyWithDiffResult, TokenDiffResult, StructuralDiffOp, StructuralDiffPath, StructuralDiffPathSegment | Session diff moved from root-level patches to a structured diff contract |
| 1.4.2 | filterTokens(tokens, predicate), FilterVisitor<T> | Shorthand for mapTokens keep/drop with type-predicate narrowing support; semantics are tree pruning rather than flat filtering |
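
The phrase "tree pruning rather than flat filtering" in the 1.4.2 row can be pictured with a small stand-alone sketch. This is not the library's filterTokens: `SketchToken` and `pruneTokens` are hypothetical, and the sketch assumes that dropping a node also drops its subtree while kept nodes have their children filtered recursively. Check the API reference for the actual contract.

```typescript
// Hypothetical token shape and helper; this is NOT the library's filterTokens, only a
// sketch of "tree pruning" as opposed to filtering a flattened list.
interface SketchToken {
  type: string;
  children?: SketchToken[];
}

function pruneTokens(
  tokens: readonly SketchToken[],
  keep: (token: SketchToken) => boolean,
): SketchToken[] {
  const out: SketchToken[] = [];
  for (const token of tokens) {
    if (!keep(token)) continue; // assumption: dropping a node drops its whole subtree
    out.push(
      token.children
        ? { ...token, children: pruneTokens(token.children, keep) }
        : { ...token },
    );
  }
  return out;
}

const tree: SketchToken[] = [
  { type: "bold", children: [{ type: "text" }, { type: "link", children: [{ type: "text" }] }] },
];

// The result is still a tree: "bold" keeps its "text" child, the "link" subtree is gone.
const withoutLinks = pruneTokens(tree, (token) => token.type !== "link");
```

A flat filter over a flattened token list would instead return the surviving tokens without their nesting, which is the distinction the table row draws.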

Bug fixes with public observable impact

| Version | Fix | Observable impact |
| --- | --- | --- |
| 1.0.7 | Inline $$tag(...)$$ no longer consumes the following newline for tags that already supported both inline and block/raw forms | This was a fix on existing behavior, not the first release of the capability |
| 1.3.1 | Unbalanced parentheses no longer make the whole nested tag tree collapse into plain text | Invalid-input recovery becomes more local and more predictable |
| 1.4.1 | Deep malformed inline edits no longer fall onto an extremely slow EOF-recovery path, and full / incremental recovery semantics are now aligned | Deleting many trailing )$$ closers no longer tends to stall in editors, and whole-document vs incremental recovery stays consistent |
| 1.4.2 | extractText(singleToken) no longer returns an empty string; createEasySyntax({ closeMiddle }) now actually affects close-derivation | Runtime helper behavior now matches the widened type-level story, while closeMiddle remains input-only rather than a returned config key |
| 1.4.4 | Deeply nested shorthand no longer steals ancestor full-form endTag close; malformed shorthand recovery also changes to "degrade means text", without rebuilding already-textified shorthand heads through owner recovery | Inputs like =bold<bold<bold<...>>>>>>= no longer steal the ancestor full-form close; parseStructural text-node segmentation for malformed / unclosed shorthand differs from 1.4.3 and is now closer to "whole text span + parent-scope rescan" |
| 1.4.5 | The first consumed post-close line break for inline(blockTag) is now included in inline token position.end, aligning with raw/block render-position semantics | Previously the line break was consumed but not covered by any token render span; now inline(blockTag) spans include it, with correct LF/CRLF width mapping |

1.4.5: inline(blockTag) trailing line-break position alignment

1.4.5 fixes a render-semantics vs position-mapping split:

  • raw(blockTag) and block(blockTag) already included the consumed post-close line break in token position.end.
  • inline(blockTag) previously consumed the same break but still left token position ending at inline close.

So in parseRichText(...), there was a gap where a line break had been consumed but did not belong to any token's render span.

From 1.4.5 onward:

  • For tags declared as blockTags: [{ tag, forms: ["inline"] }], inline token position.end includes the consumed line break.
  • LF contributes +1, CRLF contributes +2.
  • That end position now aligns with the next text token position.start.

Minimal example:

  • Input: $$bold(x)$$\nbar (the tag including its close is 11 characters; offsets are 0-based)
    • Old: inline span [0, 11), next text starts at 12 (the consumed \n at offset 11 was unowned)
    • New: inline span [0, 12), next text starts at 12
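
The width rule (LF adds 1, CRLF adds 2) can be sketched as a small offset calculation. `endWithConsumedBreak` is a hypothetical helper written for this note, not a yumeDSL export, and it assumes 0-based, end-exclusive offsets. The sample input uses a shorter tag name than the example above so the numbers are easy to count by hand.

```typescript
// Hypothetical helper (not a yumeDSL export): given the source text and the offset just
// past an inline close, return the end offset once one trailing line break is consumed.
function endWithConsumedBreak(source: string, closeEnd: number): number {
  if (source.startsWith("\r\n", closeEnd)) return closeEnd + 2; // CRLF counts as 2
  if (source.startsWith("\n", closeEnd)) return closeEnd + 1; // LF counts as 1
  return closeEnd; // no line break directly after the close
}

const lf = "$$tag(x)$$\nnext";
const crlf = "$$tag(x)$$\r\nnext";
const closeEnd = "$$tag(x)$$".length; // 10

const lfEnd = endWithConsumedBreak(lf, closeEnd); // 11, the offset where "next" starts
const crlfEnd = endWithConsumedBreak(crlf, closeEnd); // 12, the offset where "next" starts
```

In both cases the computed end equals the start of the following text, which is the alignment property described for 1.4.5.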

What this means when upgrading:

  • This is a publicly observable behavior fix (position mapping), not an API expansion.
  • It mostly affects integrations that rely on trackPositions (editor mapping, highlighting anchors, source-aligned diagnostics).
  • It only affects inline form for tags explicitly declared with forms: ["inline"]; plain inline semantics stay unchanged.

1.4.4: parseStructural text segmentation changed for malformed inline / shorthand

This is an intentional semantic change relative to 1.4.3, and it is easy to spot in version-behavior comparisons.

The current rule set is:

  • full-form syntax is the only real structural container
  • shorthand is syntax sugar
  • once shorthand / inline degrades during recovery, it does not keep any extra structural recovery rights
  • the nearest still-valid full-form parent scope rescans from argStart and decides the final structure

Because of that, parseStructural(...) now more often returns coarser merged text nodes for malformed or unclosed inline / shorthand input, instead of splitting the result into "tag head text" + "body text" the way 1.4.3 often did.

Typical examples:

  • $$bold(hello
    • 1.4.3: ["$$bold(", "hello"]
    • 1.4.4: ["$$bold(hello"]
  • before $$bold(hello) after
    • 1.4.3: ["before ", "$$bold(", "hello) after"]
    • 1.4.4: ["before ", "$$bold(hello) after"]

This is an AST-shape change caused by the new recovery model, not content loss; the returned text still covers the same source span.
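
If a test suite only needs to confirm that the recovered text is the same, one option is to normalize both shapes before comparing. The sketch below merges runs of adjacent text nodes. `SketchNode` is a simplified, hypothetical node shape (the real StructuralNode type carries more fields), and `mergeAdjacentText` is not a library function.

```typescript
// Hypothetical, simplified node shape; the real StructuralNode type carries more fields.
type SketchNode =
  | { type: "text"; value: string }
  | { type: "tag"; name: string; children: SketchNode[] };

// Merge runs of adjacent text nodes so 1.4.3-style and 1.4.4-style trees compare equal.
function mergeAdjacentText(nodes: readonly SketchNode[]): SketchNode[] {
  const out: SketchNode[] = [];
  for (const node of nodes) {
    if (node.type === "tag") {
      out.push({ type: "tag", name: node.name, children: mergeAdjacentText(node.children) });
      continue;
    }
    const last = out[out.length - 1];
    if (last !== undefined && last.type === "text") {
      out[out.length - 1] = { type: "text", value: last.value + node.value };
    } else {
      out.push({ type: "text", value: node.value });
    }
  }
  return out;
}

const text = (value: string): SketchNode => ({ type: "text", value });

// The two segmentations listed above for "before $$bold(hello) after".
const shape143 = mergeAdjacentText([text("before "), text("$$bold("), text("hello) after")]);
const shape144 = mergeAdjacentText([text("before "), text("$$bold(hello) after")]);
```

After normalization both arrays contain a single text node covering the full source, so a snapshot comparison no longer depends on which segmentation the parser chose. Suites that intentionally assert segmentation should not normalize this way.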

How to read 1.4.4 when upgrading

The 1.4.4 semantic change can be summarized in four points:

  1. Deep shorthand close ownership is fixed again
    • deeply nested shorthand no longer steals the ancestor full-form close token.
  2. Malformed shorthand now has stricter recovery rights
    • once shorthand degrades, it is treated as text;
    • already-textified shorthand heads are no longer rebuilt back into structure through owner recovery.
  3. parseStructural therefore returns a coarser AST shape
    • this mainly shows up in text-node segmentation for malformed / unclosed inline and shorthand input;
    • compared with 1.4.3, the result is now closer to "whole text span + parent-scope rescan from argStart".
  4. This is not a new public API surface and not a valid-input grammar drift
    • valid-input structural results should not change because of this alone;
    • the observable change is concentrated in recovery / degradation behavior outside fully valid input.

In practice, when upgrading:

  • If you care about valid-input parseRichText / stripRichText / parseStructural output, 1.4.4 is not a new syntax rewrite.
  • If you care about the text-node segmentation shape of malformed-input parseStructural trees, then 1.4.4 is an intentional semantic change relative to 1.4.3.
  • If you use incremental sessions, the public session contract did not gain new modes or fields here; but whenever an edit lands on full parse / full-fallback, the resulting structural tree follows this newer recovery model.

1.4.1: EOF recovery for unclosed inline input

1.4.1 changes one very specific but editor-relevant recovery behavior:

  • the input reaches EOF with an unclosed inline tag
  • some inner content has already been parsed successfully
  • older recovery tended to merge the remaining tail back into one text node
  • now the already-parsed tail content is preserved instead of being merged back into the innermost unclosed tag head

Observable difference (more complete EOF-unclosed inline examples)

| Input | Old | New |
| --- | --- | --- |
| $$bold(hello | text("$$bold(hello") | text("$$bold("), text("hello") |
| $$bold($$italic(hello | text("$$italic(hello") | text("$$italic("), text("hello") |
| $$bold($$italic($$code(x | text("$$code(x") | text("$$code("), text("x") |
| $$bold(hello $$italic(world | text("$$bold(hello "), text("$$italic(world") | text("$$bold("), text("hello "), text("$$italic("), text("world") |
| $$bold(a $$italic(b $$code(c)$$ d | text("$$bold(a "), text("$$italic(b "), inline(code, 1ch), text(" d") | text("$$bold("), text("a "), text("$$italic("), text("b "), inline(code, 1ch), text(" d") |
| $$bold(text)$$$$italic(unclosed | inline(bold, 1ch), text("$$italic(unclosed") | inline(bold, 1ch), text("$$italic("), text("unclosed") |
What this means

  • No change for valid input: this only affects EOF recovery when an inline tag is left unclosed.
  • Finer-grained recovery: already-parsed tail content is no longer collapsed back into the innermost unclosed tag head.
  • Full / incremental consistency: the important 1.4.1 change is that whole-document and incremental recovery now use the same semantics.
  • The final mirror baselines split cleanly here: 1.1.10 / 1.2.7 / 1.3.9 share the older shape, while 1.4.1+ share the newer one.

Upgrade note

If your integration asserts the exact recovery tree shape for invalid input (for example editor fixtures, recovery snapshots, or compatibility layers that depend on "merge the whole tail into one text node"), re-check those assertions when upgrading to 1.4.1.

If you only rely on valid-input semantics, or you treat invalid input merely as a transient editing state, you usually do not need any change.
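
For suites that run the same recovery fixtures against several library versions, one way to handle the split is to pick the expected shape from the version under test. The sketch below is a minimal illustration. `parseVersion`, `compareVersions`, `expectedEofRecoveryShape`, and the two shape labels are hypothetical names, pre-release tags are not handled, and only the 1.4.1 threshold from this section is encoded; the 1.4.4 section describes a further parseStructural change, so a real suite may need more than one threshold.

```typescript
// Parse "major.minor.patch" into numbers; assumes purely numeric parts (no pre-release tags).
function parseVersion(version: string): [number, number, number] {
  const parts = version.split(".").map((part) => Number.parseInt(part, 10));
  return [parts[0] ?? 0, parts[1] ?? 0, parts[2] ?? 0];
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: string, b: string): number {
  const [aMajor, aMinor, aPatch] = parseVersion(a);
  const [bMajor, bMinor, bPatch] = parseVersion(b);
  return aMajor - bMajor || aMinor - bMinor || aPatch - bPatch;
}

// Hypothetical fixture labels; which snapshot each label maps to is up to the test suite.
type RecoveryShape = "merged-tail" | "split-tail";

function expectedEofRecoveryShape(version: string): RecoveryShape {
  return compareVersions(version, "1.4.1") >= 0 ? "split-tail" : "merged-tail";
}
```

Comparing numerically rather than as strings matters for baselines such as 1.1.10, which would sort incorrectly under plain string comparison.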
