Writing Tag Handlers - chiba233/yumeDSL GitHub Wiki

Writing Tag Handlers

Most tags should use Handler Helpers for bulk registration. Only write a manual TagHandler when helpers can't express your logic — conditional output, validation, side effects.

About ctx: The ctx parameter in handler callbacks is a context object passed by the parser. You don't need to know what it is — just include it. See DslContext if you're curious.

Signature note: This page covers the raw TagHandler interface, where the first parameter is raw tokens (inline) or arg (raw/block). If you use createPipeHandlers, it pre-parses pipe arguments so the callback signature is different — the first parameter becomes PipeArgs. See the comparison table below.


What a TagHandler looks like

TagHandler = {
    inline?  → called when user writes $$tag(content)$$
    raw?     → called when user writes $$tag(arg)%...%end$$
    block?   → called when user writes $$tag(arg)*...*end$$
}

interface TagHandler {
    inline?: (tokens: TextToken[], ctx?: DslContext) => TokenDraft;
    raw?: (arg: string | undefined, content: string, ctx?: DslContext) => TokenDraft;
    block?: (arg: string | undefined, content: TextToken[], ctx?: DslContext) => TokenDraft;
}

Only implement the forms your tag supports. If a user writes a form the handler does not implement, the entire markup degrades to literal text and no error is raised.
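As a minimal sketch of a handler that implements only one form, the following declares local stand-in types that mirror the interface above. The field layout of TextToken and TokenDraft is an assumption made so the snippet is self-contained; import the real types from the library in practice. The `badge` tag is hypothetical and only illustrates conditional output.

```typescript
// Local stand-ins mirroring the interface above. The field layout of
// TextToken and TokenDraft is an assumption; import the real types in practice.
interface TextToken {
    type: string;
    value: string | TextToken[];
    [key: string]: unknown;
}
interface TokenDraft {
    type: string;
    value: string | TextToken[];
    [key: string]: unknown;
}
type DslContext = unknown;

interface TagHandler {
    inline?: (tokens: TextToken[], ctx?: DslContext) => TokenDraft;
    raw?: (arg: string | undefined, content: string, ctx?: DslContext) => TokenDraft;
    block?: (arg: string | undefined, content: TextToken[], ctx?: DslContext) => TokenDraft;
}

// Hypothetical "badge" tag: inline form only, with conditional output.
const badge: TagHandler = {
    inline: (tokens, _ctx) => {
        // Treat the content as empty when every child is a blank text leaf.
        const isEmpty = tokens.every(
            (t) => typeof t.value === "string" && t.value.trim() === "",
        );
        return isEmpty
            ? { type: "badge-empty", value: [] }
            : { type: "badge", value: tokens };
    },
    // No raw or block form here, so $$badge(x)%...%end$$ would fall back to literal text.
};
```

Leaving `raw` and `block` undefined is the whole mechanism: there is nothing to register or disable for the forms a tag does not support.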

Two signature systems compared

| Form | Raw TagHandler (this page) | createPipeHandlers wrapper |
| --- | --- | --- |
| inline | `(tokens: TextToken[], ctx?) => TokenDraft` | `(args: PipeArgs, ctx?) => TokenDraft` |
| raw | `(arg: string \| undefined, content: string, ctx?) => TokenDraft` | `(args: PipeArgs, content: string, ctx?, rawArg?) => TokenDraft` |
| block | `(arg: string \| undefined, content: TextToken[], ctx?) => TokenDraft` | `(args: PipeArgs, content: TextToken[], ctx?, rawArg?) => TokenDraft` |

The raw version's first parameter is raw data (tokens or arg string). The wrapper pre-parses pipe arguments, so the first parameter becomes PipeArgs.


Parameters at a glance

inline

inline: (tokens, ctx) => TokenDraft

| Param | What it is |
| --- | --- |
| tokens | Recursively parsed children from inside the parens. Note: escape sequences in text leaves are still raw. |
| ctx | Parse context. Forward it to utility functions. |

On escapes: user writes $$bold(hello \| world)$$, you get text value "hello \\| world" (backslash still there). Want clean text? Use materializeTextTokens(tokens, ctx) or parsePipeArgs.
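To make "still raw" versus "clean" concrete, here is a rough sketch. `unescapeText` is a hypothetical stand-in, not the library's materializeTextTokens, and it assumes a backslash simply escapes the character that follows it; the DSL's actual escape rules may be narrower or broader.

```typescript
// Hypothetical helper, NOT part of yume-dsl-rich-text.
// Assumption: "\x" always means the literal character "x".
function unescapeText(raw: string): string {
    let out = "";
    for (let i = 0; i < raw.length; i++) {
        const ch = raw.charAt(i);
        if (ch === "\\" && i + 1 < raw.length) {
            out += raw.charAt(i + 1); // keep the escaped character, drop the backslash
            i++;
        } else {
            out += ch;
        }
    }
    return out;
}

// What an inline handler sees in the text leaf for $$bold(hello \| world)$$:
const rawValue = "hello \\| world";
// What a handler usually wants to emit instead:
const clean = unescapeText(rawValue); // "hello | world"
```

In real handlers, prefer the library utilities named above, since they also receive ctx and know the parser's actual syntax.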

raw

raw: (arg, content, ctx) => TokenDraft

| Param | What it is |
| --- | --- |
| arg | Text between ( and )%. undefined when empty. Raw string, not pipe-parsed. |
| content | Raw body: a verbatim string, with no nested tag parsing. |
| ctx | Parse context |

For: code blocks, math, embedded JSON — anything that shouldn't be recursively parsed.
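A raw handler is also a natural place for validation, since the body arrives as one untouched string. The sketch below uses a hypothetical `json` tag and local stand-in types (their field layout is an assumption); it emits a separate error token rather than throwing when the body is not valid JSON.

```typescript
// Local stand-ins; the field layout is an assumption for this sketch.
interface TextToken {
    type: string;
    value: string | TextToken[];
    [key: string]: unknown;
}
interface TokenDraft {
    type: string;
    value: string | TextToken[];
    [key: string]: unknown;
}
type RawHandler = (arg: string | undefined, content: string, ctx?: unknown) => TokenDraft;

// Hypothetical "json" tag, written by the user as: $$json(label)% {"a": 1} %end$$
const jsonRaw: RawHandler = (arg, content) => {
    // arg is undefined when the parens are empty, so fall back to a default label.
    const label = arg?.trim() || "data";
    try {
        const parsed: unknown = JSON.parse(content);
        // Re-serialize so downstream code always receives normalized JSON text.
        return { type: "json", label, value: JSON.stringify(parsed, null, 2) };
    } catch {
        // Validation failure: keep the original body and flag it with a distinct type.
        return { type: "json-error", label, value: content };
    }
};
```

Returning an error token keeps the parse going; whether to surface the problem to the reader is then a rendering decision.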

block

block: (arg, content, ctx) => TokenDraft

| Param | What it is |
| --- | --- |
| arg | Same raw arg string as in the raw form |
| content | TextToken[]: the block body, already recursively parsed |
| ctx | Parse context |
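For comparison with the raw form, here is a small block-handler sketch. The `note` tag and its list of kinds are hypothetical, and the stand-in types are assumptions. The point is that content is already a token tree, so the handler only validates arg and passes the children through.

```typescript
// Local stand-ins; the field layout is an assumption for this sketch.
interface TextToken {
    type: string;
    value: string | TextToken[];
    [key: string]: unknown;
}
interface TokenDraft {
    type: string;
    value: string | TextToken[];
    [key: string]: unknown;
}
type BlockHandler = (arg: string | undefined, content: TextToken[], ctx?: unknown) => TokenDraft;

// Hypothetical "note" tag, written by the user as: $$note(warning)* ...markup... *end$$
const KNOWN_KINDS = ["info", "warning", "danger"];

const noteBlock: BlockHandler = (arg, content) => {
    const requested = (arg ?? "").trim().toLowerCase();
    // Validate the arg; unknown or missing kinds fall back to "info".
    const kind = KNOWN_KINDS.indexOf(requested) !== -1 ? requested : "info";
    // content is already parsed, so it becomes the token's children unchanged.
    return { type: "note", kind, value: content };
};
```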

Full example: helpers + manual mix

import {
    createSimpleInlineHandlers,
    createPipeHandlers,
    parseRichText,
    parsePipeTextArgs,
    type TagHandler,
    type TokenDraft,
    type DslContext,
} from "yume-dsl-rich-text";

// Simple tags via helpers
const simple = createSimpleInlineHandlers(["bold", "italic", "underline"]);

// Pipe-aware tags via createPipeHandlers
const piped = createPipeHandlers({
    link: {
        inline: (args, ctx) => ({
            type: "link",
            url: args.text(0),
            value: args.materializedTailTokens(1),
        }),
    },
});

// Custom logic → manual handler
const manual: Record<string, TagHandler> = {
    code: {
        raw: (arg, content, ctx): TokenDraft => {
            const pipeArgs = parsePipeTextArgs(arg ?? "", ctx);
            return {
                type: "code",
                lang: pipeArgs.text(0, "text"),
                label: pipeArgs.text(1, ""),
                value: content.trim(),
            };
        },
    },
};

// Merge
const handlers = { ...simple, ...piped, ...manual };
const tokens = parseRichText("$$bold(Hello)$$ $$link(https://example.com | click)$$", { handlers });

PipeArgs

The structured view returned by parsePipeArgs / parsePipeTextArgs. See Handler Utilities.

interface PipeArgs {
    parts: TextToken[][];
    has: (index: number) => boolean;
    text: (index: number, fallback?: string) => string;
    materializedTokens: (index: number, fallback?: TextToken[]) => TextToken[];
    materializedTailTokens: (startIndex: number, fallback?: TextToken[]) => TextToken[];
}

| Member | What it does |
| --- | --- |
| parts | Raw token segments (escapes not yet resolved) |
| has(i) | Does segment i exist? |
| text(i) | Plain text of segment i (unescaped and trimmed) |
| materializedTokens(i) | Tokens of segment i (text unescaped, structure preserved) |
| materializedTailTokens(start) | All segments from start onward, merged. Use it for "everything after this is free-form text that may contain pipes". |
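One way to use has is to branch on optional segments. The sketch below copies the PipeArgs interface locally and defines a hypothetical `linkFromArgs` helper; the TextToken and TokenDraft shapes, including the `{ type: "text", value }` leaf, are assumptions rather than confirmed library types.

```typescript
// Local stand-ins. PipeArgs is copied from the interface above; the shapes of
// TextToken and TokenDraft are assumptions made for this sketch.
interface TextToken {
    type: string;
    value: string | TextToken[];
    [key: string]: unknown;
}
interface TokenDraft {
    type: string;
    value: string | TextToken[];
    [key: string]: unknown;
}
interface PipeArgs {
    parts: TextToken[][];
    has: (index: number) => boolean;
    text: (index: number, fallback?: string) => string;
    materializedTokens: (index: number, fallback?: TextToken[]) => TextToken[];
    materializedTailTokens: (startIndex: number, fallback?: TextToken[]) => TextToken[];
}

// Hypothetical helper: builds a link token from already-parsed pipe arguments.
function linkFromArgs(args: PipeArgs): TokenDraft {
    const url = args.text(0);
    // With only one segment, reuse the URL as the visible label.
    // The { type: "text", value } leaf shape is an assumption.
    const label: TextToken[] = args.has(1)
        ? args.materializedTailTokens(1)
        : [{ type: "text", value: url }];
    return { type: "link", url, value: label };
}
```

Because the helper depends only on the interface, it can be called from a createPipeHandlers callback or after parsePipeArgs in a manual handler.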

Typical usage

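import { parsePipeArgs, type TagHandler } from "yume-dsl-rich-text";

const link: TagHandler = {
    inline: (tokens, ctx) => {
        const args = parsePipeArgs(tokens, ctx);
        return {
            type: "link",
            url: args.text(0),                         // "https://example.com"
            value: args.materializedTailTokens(1, []),  // "Click | here | for details" all merged
        };
    },
};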

parsePipeTextList

Simplest pipe split — string in, string array out:

parsePipeTextList("ts | Demo | Label");  // → ["ts", "Demo", "Label"]
parsePipeTextList("a \\| b | c");        // → ["a | b", "c"]

Go-to for raw/block handlers splitting the arg parameter:

code: {
    raw: (arg, content, ctx) => {
        const parts = parsePipeTextList(arg ?? "");
        return { type: "code", lang: parts[0] || "text", value: content };
    },
}
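The two parsePipeTextList results shown above can be reproduced with a short hand-written splitter. This is a hypothetical re-implementation for illustration only, not the library's code: it assumes `\|` is the only pipe escape and that each segment is trimmed. Use the real parsePipeTextList in handlers.

```typescript
// Hypothetical sketch, NOT the library implementation.
// Assumptions: "\|" is a literal pipe, and every segment is trimmed.
function splitPipeText(input: string): string[] {
    const parts: string[] = [];
    let current = "";
    for (let i = 0; i < input.length; i++) {
        const ch = input.charAt(i);
        if (ch === "\\" && input.charAt(i + 1) === "|") {
            current += "|"; // escaped pipe: keep it as literal text
            i++;
        } else if (ch === "|") {
            parts.push(current.trim()); // unescaped pipe: segment boundary
            current = "";
        } else {
            current += ch;
        }
    }
    parts.push(current.trim()); // final segment after the last pipe
    return parts;
}

const plain = splitPipeText("ts | Demo | Label"); // ["ts", "Demo", "Label"]
const escaped = splitPipeText("a \\| b | c");     // ["a | b", "c"]
```

The escape check has to run before the boundary check; otherwise an escaped pipe would be split like any other.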