Token Structure - chiba233/yumeDSL GitHub Wiki
The parser turns DSL text into a token tree. Every node in the tree is a TextToken.
Your handler returns a TokenDraft (a half-finished token); the parser then adds id and, when position tracking is enabled, position to produce the final TextToken.
```text
Your DSL text
    │
    ▼
Parser scans
    │
    ├─ Plain text → TextToken { type: "text", value: "Hello ", id: "rt-0" }
    │
    └─ $$bold(world)$$ → calls your handler
           │
           ▼
       Handler returns TokenDraft
       { type: "bold", value: [...children] }
           │
           ▼
       Parser adds id + position
           │
           ▼
       TextToken { type: "bold", value: [...], id: "rt-1" }
```
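The flow above can be sketched in a few lines. This is an illustration only, not the library's implementation: `createFinalizer` and `finalizeDraft` are hypothetical names, the types are local copies of the shapes described on this page, and position tracking is left out.

```typescript
// Local copies of the shapes described on this page (not imported from the library).
interface TextToken {
  type: string;
  value: string | TextToken[];
  id: string;
  [key: string]: unknown;
}

interface TokenDraft {
  type: string;
  value: string | TextToken[];
  [key: string]: unknown;
}

// Hypothetical helper: returns a function that stamps drafts with sequential "rt-N" ids.
function createFinalizer(): (draft: TokenDraft) => TextToken {
  let next = 0;
  return (draft) => ({ ...draft, id: `rt-${next++}` });
}

const finalizeDraft = createFinalizer();

// Plain text is finalized first...
const hello = finalizeDraft({ type: "text", value: "Hello " });

// ...then a draft returned by a handler; its extra field (url) is carried over.
const link = finalizeDraft({ type: "link", value: [], url: "https://example.com" });

console.log(hello.id, link.id, link.url);
```

The spread keeps every field the draft carried, which mirrors the rule that extra handler fields survive on the final token.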
```typescript
interface TextToken {
  type: string;                 // "text" or the type returned by your handler
  value: string | TextToken[];  // text content or child token array
  id: string;                   // unique within a parse
  position?: SourceSpan;        // source coordinates (only with trackPositions)
  [key: string]: unknown;       // extra fields from handler
}
```

| Field | What it is |
|---|---|
| `type` | `"text"` for plain text, or the type your handler returned (usually the tag name like `"bold"`, but can be any string like `"version-note"`) |
| `value` | Text node → `string`; inline/block tag → `TextToken[]` (children); raw tag → `string` (raw content) |
| `id` | Sequential by default: `rt-0`, `rt-1`, .... For stable IDs, see Stable Token IDs |
| `position` | Only when `trackPositions: true`. See Source Position Tracking |
| `[key]` | Whatever extra fields your handler returns, e.g. link's `url`, code's `lang` |
To discriminate the value type, check `typeof token.value === "string"`: a string means a text or raw token; anything else is a child token array.
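As a minimal sketch of that check in use, the function below flattens a token tree to plain text. The `TextToken` interface is a local copy of the shape shown above, `toPlainText` is a hypothetical helper name, and the sample tree is built by hand rather than produced by the parser.

```typescript
// Local copy of the TextToken shape from this page (not imported from the library).
interface TextToken {
  type: string;
  value: string | TextToken[];
  id: string;
  [key: string]: unknown;
}

// Collect the plain text of a token tree by branching on the type of `value`.
function toPlainText(tokens: TextToken[]): string {
  return tokens
    .map((token) =>
      typeof token.value === "string"
        ? token.value // text or raw tag: the string is the content
        : toPlainText(token.value), // inline/block tag: recurse into children
    )
    .join("");
}

// Hand-built tree for illustration; the ids are made up.
const sample: TextToken[] = [
  { type: "text", value: "Hello ", id: "a" },
  { type: "bold", value: [{ type: "text", value: "world", id: "b" }], id: "c" },
];

console.log(toPlainText(sample)); // "Hello world"
```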
```typescript
interface TokenDraft {
  type: string;
  value: string | TextToken[];
  [key: string]: unknown;
}
```

This is what handlers return. It has the same shape as TextToken but without id and position, which the parser adds.

A handler must set type and value. Any extra fields are preserved:
```typescript
// Inside a handler:
return {
  type: "link",
  value: childTokens,          // required
  url: "https://example.com",  // extra field, kept on the final TextToken
};
```

The base TextToken uses an index signature for flexibility, but you can, and should, define precise types for your tags.
- Public library boundary: `parseRichText()` returns generic `TextToken[]` (open for unknown extra fields)
- App boundary: your renderer/handlers narrow tokens with a local `TokenMap`

Keeping these two layers separate gives you both extensibility and strict type checks.
Define a token map, then use createTokenGuard for zero-boilerplate narrowing:
```typescript
import {
  type NarrowToken,
  type NarrowDraft,
  type NarrowTokenUnion,
  createTokenGuard,
  parseRichText,
  type TextToken,
} from "yume-dsl-rich-text";

// 1. Define a token map: each key is a type, each value is its extra fields
interface MyTokenMap {
  text: Record<string, never>;
  bold: Record<string, never>;
  link: { url: string };
  code: { lang: string };
}

type MyToken = NarrowTokenUnion<MyTokenMap>;

// 2. Create a type guard
const is = createTokenGuard<MyTokenMap>();

// Render a value that is either a raw string or a child token array
const renderChildren = (value: TextToken["value"]) =>
  Array.isArray(value) ? value.map(render).join("") : value;

// 3. Narrow in if branches; TypeScript infers extra fields automatically
function render(token: TextToken): string {
  if (is(token, "text")) return typeof token.value === "string" ? token.value : renderChildren(token.value);
  if (is(token, "bold")) return `<b>${renderChildren(token.value)}</b>`;
  if (is(token, "link")) return `<a href="${token.url}">${renderChildren(token.value)}</a>`;
  if (is(token, "code")) return `<pre data-lang="${token.lang}">${renderChildren(token.value)}</pre>`;
  return "";
}

// 4. If you want a discriminated union in specific modules
//    (`input` and `handlers` are assumed to be defined elsewhere):
const tokens = parseRichText(input, { handlers }) as MyToken[];
```

Utility types:
| Type | What it does |
|---|---|
| `NarrowToken<TType, TExtra?>` | Narrow a TextToken to a specific type literal + known extra fields |
| `NarrowDraft<TType, TExtra?>` | Narrow a TokenDraft for handler return type annotations |
| `NarrowTokenUnion<TMap>` | Generate a union of NarrowToken from a token map; useful for exhaustive switch |
| `createTokenGuard<TMap>()` | Create a runtime type guard that narrows TextToken by type key |

Tip: for token types with no extra fields, prefer `Record<string, never>` over `{}` to avoid strict ESLint `no-empty-object-type` warnings.
Handler-side type safety with NarrowDraft:
```typescript
import { type NarrowDraft, type TagHandler, parsePipeArgs } from "yume-dsl-rich-text";

type LinkDraft = NarrowDraft<"link", { url: string }>;

const linkHandler: TagHandler = {
  inline: (tokens, ctx): LinkDraft => {
    const args = parsePipeArgs(tokens, ctx);
    return {
      type: "link",
      url: args.text(0),                     // ← forget this and TS reports an error
      value: args.materializedTailTokens(1),
    };
  },
};
```

If you prefer explicit interfaces:
```typescript
import { parseRichText, type TextToken } from "yume-dsl-rich-text";

// 1. Define interfaces per tag
interface PlainText extends TextToken { type: "text"; value: string; }
interface BoldToken extends TextToken { type: "bold"; value: TextToken[]; }
interface LinkToken extends TextToken { type: "link"; url: string; value: TextToken[]; }
interface CodeBlockToken extends TextToken { type: "code"; lang: string; value: string; }

// 2. Union type
type MyToken = PlainText | BoldToken | LinkToken | CodeBlockToken;

// 3. Cast once at the parse boundary (`input` and `handlers` are assumed to be defined elsewhere)
const tokens = parseRichText(input, { handlers }) as MyToken[];

// 4. Exhaustive switch: the `never` assignment fails to compile if a case is missing
function render(token: MyToken): string {
  switch (token.type) {
    case "text": return token.value;
    case "bold": return `<b>${token.value.map(t => render(t as MyToken)).join("")}</b>`;
    case "link": return `<a href="${token.url}">${token.value.map(t => render(t as MyToken)).join("")}</a>`;
    case "code": return `<pre data-lang="${token.lang}">${token.value}</pre>`;
    default: { const _: never = token; return String(_); }
  }
}
```

Don't want to define interfaces? A runtime typeof check works too:
```typescript
if (token.type === "link" && typeof token.url === "string") {
  console.log("Link to:", token.url);
}
```

This is less safe (no exhaustiveness check), but fine for ad-hoc access.
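If the same ad-hoc check shows up in several places, it can be wrapped in a small helper. The sketch below is one possible shape, assuming a local copy of the `TextToken` interface; `getStringField` is a hypothetical name and is not part of the library.

```typescript
// Local copy of the TextToken shape from this page (not imported from the library).
interface TextToken {
  type: string;
  value: string | TextToken[];
  id: string;
  [key: string]: unknown;
}

// Hypothetical helper: read an extra field only if it is actually a string.
function getStringField(token: TextToken, key: string): string | undefined {
  const field = token[key];
  return typeof field === "string" ? field : undefined;
}

// Hand-built tokens for illustration.
const linkToken: TextToken = { type: "link", value: [], id: "rt-0", url: "https://example.com" };
const boldToken: TextToken = { type: "bold", value: [], id: "rt-1" };

console.log(getStringField(linkToken, "url")); // "https://example.com"
console.log(getStringField(boldToken, "url")); // undefined
```

Like the inline check, this gives no exhaustiveness guarantee; it only keeps the `unknown` handling in one place.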