Security Policy
Supported Versions
| Version | Supported |
|---|---|
| 1.x | Yes |
| 0.x | No (development) |
For detailed security considerations and a safe UGC tutorial, see the Security wiki page and Safe UGC Chat tutorial.
Reporting a Vulnerability
Please do not open a public issue for security vulnerabilities.
Instead, report them privately via one of the following:
- GitHub private vulnerability reporting: Go to the Security tab and click "Report a vulnerability".
- Email: Send details to the repository maintainer (see GitHub profile).
What to Include
- Description of the vulnerability
- Steps to reproduce
- Affected version
- Impact assessment (if known)
What to Expect
- Acknowledgment within 48 hours
- Status update within 7 days
- A fix or mitigation plan for confirmed vulnerabilities
Scope
This policy covers yume-dsl-rich-text. It does not cover:
- Vulnerabilities in rendering layers you build on top of the parser (that is your application code)
- Pure performance regressions or resource amplification that still stay within the documented fallback / budget / coarse-diff semantics
- Denial of service via extremely large input, extremely large single edits, or highly repetitive adversarial input. Enforce depthLimit, input-size limits, edit-size limits, and request-level timeouts in your application
Known Security Considerations
1. The rendering layer remains the primary trust boundary
- URLs are not validated: if you render link tags as <a>, sanitize the url field yourself
- raw content is passed through as-is: if you render it as HTML, you must escape or sanitize it
- In other words, successful parsing does not mean safe content. yume-dsl-rich-text enforces structural correctness, not HTML / URL policy
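The escaping and URL checks described above live in your rendering code, not in the parser. A minimal sketch of what that might look like follows. The helper names (`escapeHtml`, `sanitizeUrl`, `renderLink`) and the scheme allow-list are assumptions for illustration and are not part of yume-dsl-rich-text.

```typescript
// Hypothetical rendering-layer helpers; not part of yume-dsl-rich-text.
// The allow-list is an assumption: adjust it to your application's policy.
const ALLOWED_PROTOCOLS = new Set(["http:", "https:", "mailto:"]);

/** Escape the characters that are significant in HTML text and attribute values. */
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

/** Return a normalized absolute URL if its scheme is allow-listed, otherwise null. */
function sanitizeUrl(raw: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(raw.trim());
  } catch {
    return null; // relative or malformed URLs are rejected in this sketch
  }
  return ALLOWED_PROTOCOLS.has(parsed.protocol) ? parsed.href : null;
}

/** Render a link; fall back to escaped plain text when the URL is rejected. */
function renderLink(url: string, label: string): string {
  const safeUrl = sanitizeUrl(url);
  const text = escapeHtml(label);
  if (safeUrl === null) return text;
  return `<a href="${escapeHtml(safeUrl)}" rel="noopener noreferrer">${text}</a>`;
}
```

Rejecting everything that is not an allow-listed absolute URL is deliberately strict. If your application needs relative links, resolve them against a known base before applying the same scheme check.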
2. handlers / syntax / tagName are trusted configuration, not user input
This library intentionally allows the host application to provide:
- handlers
- syntax
- tagName
- createId
- tracker
Those hooks exist for application-controlled customization, not for end users to supply parsing code or parsing rules dynamically.
From the implementation:
- handler functions execute directly inside the parse / render pipeline
- createId participates in parse lifecycle
- syntax and tagName directly affect scanning and recognition
- tracker and baseOffset directly affect reported positions
So do not expose these objects as untrusted input, and do not mutate them in place across a session lifetime. The security model assumes they are trusted application-side configuration.
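One way to keep configuration on the trusted side is to let a request select a named, server-defined preset rather than supply any configuration itself. The sketch below assumes a hypothetical `TrustedParserConfig` shape with a placeholder handler signature. The real option types come from the library and may differ.

```typescript
// Hypothetical config shape; the real option types come from the library.
// The handler signature here is a placeholder, not the library's.
interface TrustedParserConfig {
  readonly depthLimit: number;
  readonly handlers: Readonly<Record<string, (input: string) => string>>;
}

// Presets are defined in application code and frozen so they cannot be mutated in place.
const CHAT_PRESET: TrustedParserConfig = Object.freeze({
  depthLimit: 20,
  handlers: Object.freeze({}),
});

const ARTICLE_PRESET: TrustedParserConfig = Object.freeze({
  depthLimit: 50,
  handlers: Object.freeze({ shout: (input: string) => input.toUpperCase() }),
});

// A Map avoids prototype keys such as "__proto__" resolving to something unexpected.
const PRESETS = new Map<string, TrustedParserConfig>([
  ["chat", CHAT_PRESET],
  ["article", ARTICLE_PRESET],
]);

/** Untrusted input may only pick a preset by name; anything else gets the strictest preset. */
function resolveConfig(requested: unknown): TrustedParserConfig {
  const preset = typeof requested === "string" ? PRESETS.get(requested) : undefined;
  return preset ?? CHAT_PRESET;
}
```

With this shape, a request body that tries to smuggle in its own `handlers` object resolves to the default preset instead of reaching the parser.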
3. depthLimit is only the first resource guardrail
The source code defaults depthLimit to 50. That helps bound unusually deep nesting, but it does not mean:
- total document size is bounded
- single-edit cost is bounded
- diff cost is bounded
For untrusted input, you still need separate limits for total source size, single-edit size, and request-level execution time.
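The size limits mentioned above can be checked before any edit reaches the parser. The following is a minimal sketch under stated assumptions: `TextEdit`, `InputLimits`, and `checkEdit` are hypothetical names, and the limit values in the usage are illustrative rather than recommended defaults. Request-level timeouts still have to be enforced separately.

```typescript
// Hypothetical types and names; none of these come from yume-dsl-rich-text.
interface TextEdit {
  start: number;        // inclusive offset into the current source
  end: number;          // exclusive offset into the current source
  insertedText: string; // replacement for the range [start, end)
}

interface InputLimits {
  maxSourceChars: number; // cap on document length after the edit is applied
  maxEditChars: number;   // cap on characters removed or inserted by one edit
}

type LimitCheck = { ok: true } | { ok: false; reason: string };

/** Validate an edit against size limits before handing it to the parser. */
function checkEdit(sourceLength: number, edit: TextEdit, limits: InputLimits): LimitCheck {
  const rangeValid =
    Number.isInteger(edit.start) &&
    Number.isInteger(edit.end) &&
    edit.start >= 0 &&
    edit.end >= edit.start &&
    edit.end <= sourceLength;
  if (!rangeValid) return { ok: false, reason: "edit range out of bounds" };

  const removed = edit.end - edit.start;
  const inserted = edit.insertedText.length;
  if (Math.max(removed, inserted) > limits.maxEditChars) {
    return { ok: false, reason: "edit too large" };
  }
  if (sourceLength - removed + inserted > limits.maxSourceChars) {
    return { ok: false, reason: "document too large" };
  }
  return { ok: true };
}
```

For example, `checkEdit(95, { start: 0, end: 0, insertedText: "x".repeat(10) }, { maxSourceChars: 100, maxEditChars: 10 })` is rejected because the resulting document would exceed the total-size cap, even though the edit itself is within the per-edit cap.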
4. Resource risk must be understood in heap / CPU terms, not just input bytes
Structural parsing, position tracking, incremental documents, zones, signatures, and diff payloads can all consume substantially more memory than the original source text.
In practice, the real budget you need to control is not only "how many KB / MB came in", but also:
- post-parse AST / token-tree size
- position-tracking overhead
- previous/current snapshots retained by incremental sessions
- ops / patches / dirtySpan data produced by diff
For untrusted-input deployments, treat heap budget and concurrent parse count as first-class limits, not just request-body size.
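Treating heap and concurrency as first-class limits can be approximated with an admission gate in front of the parser. The sketch below is one possible shape, not something the library provides. `ParseAdmission` is a hypothetical class, and the amplification factor (estimated heap bytes per source byte) is a placeholder you would need to measure for your own workload.

```typescript
// Hypothetical admission gate; not part of yume-dsl-rich-text.
class ParseAdmission {
  private inFlight = 0;
  private reservedBytes = 0;
  private readonly maxConcurrent: number;
  private readonly maxReservedBytes: number;
  private readonly amplification: number;

  constructor(maxConcurrent: number, maxReservedBytes: number, amplification: number) {
    this.maxConcurrent = maxConcurrent;
    this.maxReservedBytes = maxReservedBytes;
    // Assumed ratio of heap bytes to source bytes; measure this, do not guess it.
    this.amplification = amplification;
  }

  /** Number of parses currently admitted. */
  get active(): number {
    return this.inFlight;
  }

  /**
   * Reserve a slot and an estimated heap share for one parse.
   * Returns a release function, or null when either limit would be exceeded.
   */
  tryAcquire(sourceBytes: number): (() => void) | null {
    const estimate = sourceBytes * this.amplification;
    if (this.inFlight >= this.maxConcurrent) return null;
    if (this.reservedBytes + estimate > this.maxReservedBytes) return null;

    this.inFlight += 1;
    this.reservedBytes += estimate;

    let released = false;
    return () => {
      if (released) return; // releasing twice is a no-op
      released = true;
      this.inFlight -= 1;
      this.reservedBytes -= estimate;
    };
  }
}
```

Callers would invoke the returned release function in a `finally` block after the parse completes, and reject or queue the request when `tryAcquire` returns null.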
5. Incremental sessions are correctness-first, not constant-cost APIs
createIncrementalSession(...) is designed to advance to the correct next snapshot first. Because of that, the API explicitly allows:
- mode to be either incremental or full-fallback
- fallbackReason values such as INTERNAL_FULL_REBUILD, AUTO_LARGE_EDIT, and AUTO_COOLDOWN
- auto mode to move toward full rebuilds when recent samples say incremental reuse is no longer worth it
The default adaptive settings make that clear:
- sampleWindowSize = 24
- minSamplesForAdaptation = 6
- maxFallbackRate = 0.35
- switchToFullMultiplier = 1.1
- fullPreferenceCooldownEdits = 12
- maxEditRatioForIncremental = 0.2
- softZoneNodeCap = 64
These values are runtime heuristics for "fall back sooner when incremental is no longer a good trade", not security guarantees.
6. applyEditWithDiff(...) budgets guarantee bounded degradation, not stable fine-grained diffs
The current default diff-refinement budgets are:
- refinementDepthCap = 64
- maxComparedNodes = 20000
- maxAnchorCandidates = 128
- maxOps = 512
- maxSubtreeNodes = 256
- maxMilliseconds = 8
Their purpose is to let refinement degrade early when fine-grained diffing is no longer worth it. The guarantee is bounded fallback, not continued promises of small dirty spans, fine patches, or low latency.
From the code, the following outcomes are expected behavior rather than security bugs by themselves:
- a root-level splice
- a single coarse replace patch
- empty unchangedRanges
- dirtySpanOld / dirtySpanNew expanding to most or all of the document
- refinement throwing and degrading to conservative diff
Consumers therefore must not treat applyEditWithDiff(...) as a guarantee that it will:
- always stay fine-grained
- always remain near the edited window
- always be cheaper than a full rebuild
- always be safe to expose directly to untrusted traffic
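Since coarse results are expected behavior, a consumer can decide up front when to stop applying patches and simply replace the whole rendered document. The sketch below illustrates that decision. The `DiffSummary` shape is an assumption made for this example (the real result type of applyEditWithDiff(...) may differ), and the thresholds are illustrative.

```typescript
// Hypothetical summary of a diff result. The field names echo the text above,
// but the exact shape of the library's result is an assumption here.
interface Span {
  start: number;
  end: number;
}

interface DiffSummary {
  dirtySpanNew: Span; // assumed: offsets into the new document
  patchCount: number; // assumed: number of patches the diff produced
}

type UpdateStrategy = "apply-patches" | "replace-all";

/**
 * Decide whether to apply fine-grained patches or re-render everything.
 * Coarse diffs are treated as a normal outcome, not as an error.
 */
function chooseUpdate(
  diff: DiffSummary,
  newDocLength: number,
  maxDirtyRatio = 0.5,
  maxPatches = 200,
): UpdateStrategy {
  if (newDocLength <= 0) return "replace-all";
  const dirtyLength = Math.max(0, diff.dirtySpanNew.end - diff.dirtySpanNew.start);
  if (dirtyLength / newDocLength > maxDirtyRatio) return "replace-all";
  if (diff.patchCount > maxPatches) return "replace-all";
  return "apply-patches";
}
```

A small dirty span with a handful of patches stays on the patch path, while a span covering most of the document falls through to a full replace, which keeps the consumer correct even when refinement degrades.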
7. Repetitive raw / block input and deep inline trees are still resource hotspots
The current implementation already has fallback, budgeting, and coarse-diff paths, but highly repetitive raw / block input, deep inline structures, and anchor-unfriendly edits can still amplify:
- CPU comparison cost
- dirty-span size
- diff degradation frequency
- full-fallback frequency
Treat those as deployment-level resource risks to constrain and monitor, not something the parser alone can fully neutralize.
8. Minimum deployment requirements for untrusted input
If you use this library for UGC, collaborative editing, chat messages, drafts, or any other untrusted-input workflow, you should at minimum:
- cap total document size
- cap single-edit size or edit ratio
- enforce wall-clock timeouts and cancellation for requests / jobs / workers
- use more conservative applyEditWithDiff(...) budgets on the server side, or disable diff entirely
- never treat raw as trusted HTML / DOM content
- never let end users directly control handlers, syntax, tagName, createId, or tracker
- monitor full-fallback frequency, unusually large dirty spans, abnormal latency, and abnormal heap growth as resource warning signals
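The monitoring item in the list above can start as a simple sliding-window counter over the mode reported for each edit. The sketch below is application-side code under stated assumptions: `FallbackMonitor` is a hypothetical class, its window and threshold defaults are illustrative, and it is independent of the library's own adaptive heuristics.

```typescript
// The two mode values are the ones named in this document.
type SessionMode = "incremental" | "full-fallback";

// Hypothetical application-side monitor; not part of yume-dsl-rich-text.
class FallbackMonitor {
  private readonly samples: SessionMode[] = [];
  private readonly windowSize: number;
  private readonly minSamples: number;
  private readonly alertRate: number;

  constructor(windowSize = 100, minSamples = 20, alertRate = 0.5) {
    this.windowSize = windowSize;
    this.minSamples = minSamples;
    this.alertRate = alertRate;
  }

  /** Fraction of recent edits that ended in full-fallback (0 when empty). */
  rate(): number {
    if (this.samples.length === 0) return 0;
    const fallbacks = this.samples.filter((m) => m === "full-fallback").length;
    return fallbacks / this.samples.length;
  }

  /** Record one edit result; returns true when the window warrants an alert. */
  record(mode: SessionMode): boolean {
    this.samples.push(mode);
    if (this.samples.length > this.windowSize) this.samples.shift();
    return this.samples.length >= this.minSamples && this.rate() > this.alertRate;
  }
}
```

In a deployment, a `true` return from `record` would typically feed a metric or rate-limit decision for that session rather than abort the request outright.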
License
This project is licensed under the MIT License. Security fixes are provided on a best-effort basis.