
Security Policy

Supported Versions

Version	Supported
1.x	Yes
0.x	No (development)

For detailed security considerations and a safe UGC tutorial, see the Security wiki page and Safe UGC Chat tutorial.

Reporting a Vulnerability

Please do not open a public issue for security vulnerabilities.

Instead, report them privately via one of the following:

GitHub private vulnerability reporting: Go to the Security tab and click "Report a vulnerability".
Email: Send details to the repository maintainer (see GitHub profile).

What to Include

Description of the vulnerability
Steps to reproduce
Affected version
Impact assessment (if known)

What to Expect

Acknowledgment within 48 hours
Status update within 7 days
A fix or mitigation plan for confirmed vulnerabilities

Scope

This policy covers yume-dsl-rich-text. It does not cover:

Vulnerabilities in rendering layers you build on top of the parser (that is your application code)
Pure performance regressions or resource amplification that still stay within the documented fallback / budget / coarse-diff semantics
Denial of service via extremely large input, extremely large single edits, or highly repetitive adversarial input. Enforce depthLimit, input-size limits, edit-size limits, and request-level timeouts in your application

Known Security Considerations

1. The rendering layer remains the primary trust boundary

URLs are not validated: if you render link tags as <a>, sanitize the url field yourself
raw content is passed through as-is: if you render it as HTML, you must escape or sanitize it
In other words, successful parsing does not mean the content is safe: yume-dsl-rich-text enforces structural correctness, not HTML / URL policy.
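
As an illustration of what "sanitize the url field yourself" and "escape or sanitize it" can look like in a rendering layer, here is a minimal sketch. The helper names (sanitizeUrl, escapeHtml, renderLink) and the protocol allow-list are assumptions made for this example; they are not part of yume-dsl-rich-text, and a production renderer may prefer an established sanitizer library.

```typescript
// Hypothetical rendering-layer helpers; none of these come from yume-dsl-rich-text.
const ALLOWED_PROTOCOLS = new Set(["http:", "https:", "mailto:"]);

/** Returns a normalized URL if its scheme is allow-listed, otherwise null. */
function sanitizeUrl(input: string): string | null {
  try {
    const parsed = new URL(input.trim());
    return ALLOWED_PROTOCOLS.has(parsed.protocol) ? parsed.href : null;
  } catch {
    // Relative or malformed URLs are rejected in this sketch.
    return null;
  }
}

const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

/** Escapes the five characters that matter in HTML text and attribute values. */
function escapeHtml(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

/** Renders a link only when the URL passes the allow-list; otherwise emits plain escaped text. */
function renderLink(url: string, label: string): string {
  const safeUrl = sanitizeUrl(url);
  const safeLabel = escapeHtml(label);
  if (safeUrl === null) return safeLabel;
  return `<a href="${escapeHtml(safeUrl)}" rel="noopener noreferrer">${safeLabel}</a>`;
}
```

An allow-list of schemes is used here rather than a block-list, so that unexpected schemes such as javascript: or data: are rejected by default.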

2. `handlers` / `syntax` / `tagName` are trusted configuration, not user input

This library intentionally allows the host application to provide:

handlers
syntax
tagName
createId
tracker

Those hooks exist for application-controlled customization, not for end users to supply parsing code or parsing rules dynamically.

From the implementation:

handler functions execute directly inside the parse / render pipeline
createId participates in the parse lifecycle
syntax and tagName directly affect scanning and recognition
tracker and baseOffset directly affect reported positions

So do not expose these objects as untrusted input, and do not mutate them in place across a session lifetime. The security model assumes they are trusted application-side configuration.
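
One way to keep that boundary explicit is to build the configuration once from application code, freeze it, and copy only allow-listed scalar values from a request. The sketch below assumes a hypothetical option shape (TrustedParserConfig) and a hypothetical buildParseOptions helper; the real option types of the library may differ, and letting a request lower depthLimit is an illustrative choice, not a library feature.

```typescript
// Hypothetical option shape for illustration; the library's real types may differ.
type Handler = (...args: unknown[]) => unknown;

interface TrustedParserConfig {
  readonly handlers: Readonly<Record<string, Handler>>;
  readonly depthLimit: number;
}

// Built once at startup from application code only, then frozen.
const appHandlers: Record<string, Handler> = {};
const TRUSTED_CONFIG: TrustedParserConfig = Object.freeze({
  handlers: Object.freeze(appHandlers),
  depthLimit: 50,
});

/**
 * Copies only an allow-listed scalar from the request body.
 * Hooks such as handlers are never read from the request.
 */
function buildParseOptions(requestBody: unknown): TrustedParserConfig {
  let depthLimit = TRUSTED_CONFIG.depthLimit;
  if (typeof requestBody === "object" && requestBody !== null) {
    const requested = (requestBody as Record<string, unknown>).depthLimit;
    if (typeof requested === "number" && Number.isInteger(requested) && requested > 0) {
      // A request may lower the limit, never raise it.
      depthLimit = Math.min(requested, TRUSTED_CONFIG.depthLimit);
    }
  }
  return Object.freeze({ handlers: TRUSTED_CONFIG.handlers, depthLimit });
}
```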

3. `depthLimit` is only the first resource guardrail

The source code defaults depthLimit to 50. That helps bound unusually deep nesting, but it does not mean:

total document size is bounded
single-edit cost is bounded
diff cost is bounded

For untrusted input, you still need separate limits for total source size, single-edit size, and request-level execution time.
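
A minimal sketch of such pre-parse guards follows. The limit values, the Edit shape, and the function names are assumptions for this example and should be tuned per deployment; the checks run before any library call, so they do not depend on the parser API.

```typescript
// Hypothetical limits; choose values that match your deployment.
const LIMITS = { maxSourceChars: 64_000, maxEditChars: 8_000 } as const;

type GuardResult = { ok: true } | { ok: false; reason: string };

// Hypothetical edit shape: replace [start, end) of the current source with `text`.
interface Edit {
  start: number;
  end: number;
  text: string;
}

/** Rejects sources above the size cap. Length is counted in UTF-16 code units, not bytes. */
function checkSource(source: string): GuardResult {
  return source.length > LIMITS.maxSourceChars
    ? { ok: false, reason: "source too large" }
    : { ok: true };
}

/** Validates the edit range, the edit size, and the size of the resulting document. */
function checkEdit(currentLength: number, edit: Edit): GuardResult {
  const validRange =
    Number.isInteger(edit.start) &&
    Number.isInteger(edit.end) &&
    edit.start >= 0 &&
    edit.end >= edit.start &&
    edit.end <= currentLength;
  if (!validRange) return { ok: false, reason: "invalid edit range" };

  const removed = edit.end - edit.start;
  if (Math.max(removed, edit.text.length) > LIMITS.maxEditChars) {
    return { ok: false, reason: "edit too large" };
  }
  const nextLength = currentLength - removed + edit.text.length;
  if (nextLength > LIMITS.maxSourceChars) {
    return { ok: false, reason: "resulting source too large" };
  }
  return { ok: true };
}
```

Request-level execution time is not covered by these checks; it has to be enforced by the surrounding request, job, or worker infrastructure.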

4. Resource risk must be understood in heap / CPU terms, not just input bytes

Structural parsing, position tracking, incremental documents, zones, signatures, and diff payloads can all consume substantially more memory than the original source text.

In practice, the real budget you need to control is not only "how many KB / MB came in", but also:

post-parse AST / token-tree size
position-tracking overhead
previous/current snapshots retained by incremental sessions
ops / patches / dirtySpan data produced by diff

For untrusted-input deployments, treat heap budget and concurrent parse count as first-class limits, not just request-body size.
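
The idea of budgeting heap and concurrency together can be sketched as a small admission controller. Everything here is hypothetical: the ParseAdmission class is not part of the library, and the amplification factor (estimated heap bytes per source character) is a placeholder that should be replaced by measurements from your own workload.

```typescript
// Hypothetical admission controller; not part of yume-dsl-rich-text.
class ParseAdmission {
  private activeParses = 0;
  private reservedBytes = 0;
  private readonly maxConcurrent: number;
  private readonly heapBudgetBytes: number;
  private readonly amplification: number;

  constructor(maxConcurrent: number, heapBudgetBytes: number, amplification = 20) {
    this.maxConcurrent = maxConcurrent;
    this.heapBudgetBytes = heapBudgetBytes;
    // Assumed heap bytes per source character; measure before relying on it.
    this.amplification = amplification;
  }

  /** Reserves budget for one parse. Returns a release function, or null if over budget. */
  tryAcquire(sourceLength: number): (() => void) | null {
    const estimate = sourceLength * this.amplification;
    if (this.activeParses >= this.maxConcurrent) return null;
    if (this.reservedBytes + estimate > this.heapBudgetBytes) return null;

    this.activeParses += 1;
    this.reservedBytes += estimate;

    let released = false;
    return () => {
      if (released) return; // releasing twice is a no-op
      released = true;
      this.activeParses -= 1;
      this.reservedBytes -= estimate;
    };
  }
}
```

A caller would acquire before parsing, reject or queue the request when tryAcquire returns null, and call the release function in a finally block.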

5. Incremental sessions are correctness-first, not constant-cost APIs

createIncrementalSession(...) is designed to advance to the correct next snapshot first. Because of that, the API explicitly allows:

mode to be either incremental or full-fallback
fallbackReason values such as INTERNAL_FULL_REBUILD, AUTO_LARGE_EDIT, and AUTO_COOLDOWN
auto mode to move toward full rebuilds when recent samples say incremental reuse is no longer worth it

The default adaptive settings make that clear:

sampleWindowSize = 24
minSamplesForAdaptation = 6
maxFallbackRate = 0.35
switchToFullMultiplier = 1.1
fullPreferenceCooldownEdits = 12
maxEditRatioForIncremental = 0.2
softZoneNodeCap = 64

These values are runtime heuristics for "fall back sooner when incremental is no longer a good trade", not security guarantees.
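
Because fallbacks are expected behavior, it can be more useful to observe them than to try to prevent them. The sketch below tracks the fallback rate over a sliding window. Only the mode and fallbackReason names come from the text above; the exact result shape, the string literals, and the FallbackMonitor class are assumptions for this example.

```typescript
// Assumed outcome shape: only `mode` and `fallbackReason` are named in the text above.
interface EditOutcome {
  mode: "incremental" | "full-fallback";
  fallbackReason?: string;
}

// Hypothetical application-side monitor; not part of the library.
class FallbackMonitor {
  private readonly window: EditOutcome[] = [];
  private readonly windowSize: number;

  constructor(windowSize = 24) {
    this.windowSize = windowSize;
  }

  /** Records one edit outcome and drops the oldest sample once the window is full. */
  record(outcome: EditOutcome): void {
    this.window.push(outcome);
    if (this.window.length > this.windowSize) this.window.shift();
  }

  /** Fraction of recent edits that ended in a full fallback (0 when no samples exist). */
  fallbackRate(): number {
    if (this.window.length === 0) return 0;
    const fallbacks = this.window.filter((o) => o.mode === "full-fallback").length;
    return fallbacks / this.window.length;
  }

  /** Counts fallbacks in the window by reason, for dashboards or alerts. */
  reasonCounts(): Map<string, number> {
    const counts = new Map<string, number>();
    for (const outcome of this.window) {
      if (outcome.mode !== "full-fallback") continue;
      const reason = outcome.fallbackReason ?? "UNKNOWN";
      counts.set(reason, (counts.get(reason) ?? 0) + 1);
    }
    return counts;
  }
}
```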

6. `applyEditWithDiff(...)` budgets guarantee bounded degradation, not stable fine-grained diffs

The current default diff-refinement budgets are:

refinementDepthCap = 64
maxComparedNodes = 20000
maxAnchorCandidates = 128
maxOps = 512
maxSubtreeNodes = 256
maxMilliseconds = 8

Their purpose is to let refinement degrade early when fine-grained diffing is no longer worth it. The guarantee is bounded fallback, not a promise of small dirty spans, fine patches, or low latency.

From the code, the following outcomes are expected behavior rather than security bugs by themselves:

a root-level splice
a single coarse replace patch
empty unchangedRanges
dirtySpanOld / dirtySpanNew expanding to most or all of the document
refinement throwing and degrading to a conservative diff

Consumers must therefore not assume that applyEditWithDiff(...) will:

always stay fine-grained
always remain near the edited window
always be cheaper than a full rebuild
always be safe to expose directly to untrusted traffic
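
A consumer can plan for coarse results by deciding up front when to abandon patching and re-render everything. The following sketch assumes a hypothetical span shape with start and end offsets and a hypothetical 50% threshold; the actual shape of dirtySpanNew in the library may differ.

```typescript
// Hypothetical shape of a dirty span; the library's real type may differ.
interface Span {
  start: number;
  end: number;
}

type RenderStrategy = "patch" | "full-rerender";

/**
 * Falls back to a full re-render when the dirty span is missing
 * or covers at least `coarseThreshold` of the new document.
 */
function chooseRenderStrategy(
  dirtySpanNew: Span | undefined,
  documentLength: number,
  coarseThreshold = 0.5,
): RenderStrategy {
  if (!dirtySpanNew || documentLength <= 0) return "full-rerender";
  const covered = Math.max(0, dirtySpanNew.end - dirtySpanNew.start);
  return covered / documentLength >= coarseThreshold ? "full-rerender" : "patch";
}
```

Treating a coarse diff as a normal branch, rather than as an error, keeps the consumer correct when refinement degrades.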

7. Repetitive `raw` / `block` input and deep inline trees are still resource hotspots

The current implementation already has fallback, budgeting, and coarse-diff paths, but highly repetitive raw / block input, deep inline structures, and anchor-unfriendly edits can still amplify:

CPU comparison cost
dirty-span size
diff degradation frequency
full-fallback frequency

Treat those as deployment-level resource risks to constrain and monitor, not something the parser alone can fully neutralize.

8. Minimum deployment requirements for untrusted input

If you use this library for UGC, collaborative editing, chat messages, drafts, or any other untrusted-input workflow, you should at minimum:

cap total document size
cap single-edit size or edit ratio
enforce wall-clock timeouts and cancellation for requests / jobs / workers
use more conservative applyEditWithDiff(...) budgets on the server side, or disable diff entirely
never treat raw as trusted HTML / DOM content
never let end users directly control handlers, syntax, tagName, createId, or tracker
monitor full-fallback frequency, unusually large dirty spans, abnormal latency, and abnormal heap growth as resource warning signals
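
As one example of "more conservative applyEditWithDiff(...) budgets", the sketch below scales the defaults listed in section 6 by a factor. The DiffBudgets interface and tightenBudgets helper are hypothetical, and how budgets are actually passed to the library is not shown here; check the library's API documentation for the real option shape.

```typescript
// Field names and default values are copied from section 6; the interface itself is an assumption.
interface DiffBudgets {
  refinementDepthCap: number;
  maxComparedNodes: number;
  maxAnchorCandidates: number;
  maxOps: number;
  maxSubtreeNodes: number;
  maxMilliseconds: number;
}

const DEFAULT_DIFF_BUDGETS: DiffBudgets = {
  refinementDepthCap: 64,
  maxComparedNodes: 20000,
  maxAnchorCandidates: 128,
  maxOps: 512,
  maxSubtreeNodes: 256,
  maxMilliseconds: 8,
};

/** Scales every budget by `factor` in (0, 1], never dropping below 1. */
function tightenBudgets(base: DiffBudgets, factor: number): DiffBudgets {
  if (!(factor > 0 && factor <= 1)) {
    throw new RangeError("factor must be in (0, 1]");
  }
  const scale = (value: number): number => Math.max(1, Math.floor(value * factor));
  return {
    refinementDepthCap: scale(base.refinementDepthCap),
    maxComparedNodes: scale(base.maxComparedNodes),
    maxAnchorCandidates: scale(base.maxAnchorCandidates),
    maxOps: scale(base.maxOps),
    maxSubtreeNodes: scale(base.maxSubtreeNodes),
    maxMilliseconds: scale(base.maxMilliseconds),
  };
}

// Example: a server-side profile at a quarter of the defaults.
const SERVER_DIFF_BUDGETS = tightenBudgets(DEFAULT_DIFF_BUDGETS, 0.25);
```

Tighter budgets make refinement degrade to coarse diffs sooner, which trades diff quality for a lower worst-case cost per edit.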

License

This project is licensed under the MIT License. Security fixes are provided on a best-effort basis.