Security - chiba233/yumeDSL GitHub Wiki

Security Policy



Supported Versions

Version    Supported
1.x        Yes
0.x        No (development)

For detailed security considerations and a safe UGC tutorial, see the Security wiki page and Safe UGC Chat tutorial.


Reporting a Vulnerability

Please do not open a public issue for security vulnerabilities.

Instead, report them privately via one of the following:

  • GitHub private vulnerability reporting: Go to the Security tab and click "Report a vulnerability".
  • Email: Send details to the repository maintainer (see GitHub profile).

What to Include

  1. Description of the vulnerability
  2. Steps to reproduce
  3. Affected version
  4. Impact assessment (if known)

What to Expect

  • Acknowledgment within 48 hours
  • Status update within 7 days
  • A fix or mitigation plan for confirmed vulnerabilities

Scope

This policy covers yume-dsl-rich-text. It does not cover:

  • Vulnerabilities in rendering layers you build on top of the parser (that is your application code)
  • Pure performance regressions or resource amplification that stay within the documented fallback / budget / coarse-diff semantics
  • Denial of service via extremely large input, extremely large single edits, or highly repetitive adversarial input. Enforce depthLimit, input-size limits, edit-size limits, and request-level timeouts in your application.

Known Security Considerations

1. The rendering layer remains the primary trust boundary

  • URLs are not validated: if you render link tags as <a>, sanitize the url field yourself
  • raw content is passed through as-is: if you render it as HTML, you must escape or sanitize it
  • In other words, successful parsing does not mean safe content. yume-dsl-rich-text enforces structural correctness, not HTML / URL policy
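
As a minimal sketch of what "sanitize it yourself" can look like in a rendering layer, the helpers below allowlist URL schemes and escape text before it reaches HTML. The names sanitizeUrl, escapeHtml, and renderLink are hypothetical application-side helpers, not part of yume-dsl-rich-text, and the allowlist is an assumption you should adapt to your own policy.

```typescript
// Hypothetical application-side helpers; not part of yume-dsl-rich-text.
const SAFE_PROTOCOLS = new Set(["http:", "https:", "mailto:"]);

/** Returns the normalized URL if its scheme is allowlisted, otherwise null. */
function sanitizeUrl(raw: string): string | null {
  try {
    const parsed = new URL(raw.trim());
    return SAFE_PROTOCOLS.has(parsed.protocol) ? parsed.href : null;
  } catch {
    // Relative and malformed URLs are rejected in this sketch.
    return null;
  }
}

const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

/** Escapes the five characters that are significant in HTML text and attributes. */
function escapeHtml(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

/** Renders a link tag as <a>, or as plain escaped text when the URL is rejected. */
function renderLink(url: string, label: string): string {
  const safeUrl = sanitizeUrl(url);
  const text = escapeHtml(label);
  if (safeUrl === null) return text;
  return `<a href="${escapeHtml(safeUrl)}" rel="noopener noreferrer">${text}</a>`;
}
```

With this sketch, a javascript: URL falls back to plain escaped text, while an https: URL is emitted with its attribute value escaped. If you render raw content into a DOM, a maintained HTML sanitizer is usually a better choice than hand-written escaping.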

2. handlers / syntax / tagName are trusted configuration, not user input

This library intentionally allows the host application to provide:

  • handlers
  • syntax
  • tagName
  • createId
  • tracker

Those hooks exist for application-controlled customization, not for end users to supply parsing code or parsing rules dynamically.

From the implementation:

  • handler functions execute directly inside the parse / render pipeline
  • createId participates in the parse lifecycle
  • syntax and tagName directly affect scanning and recognition
  • tracker and baseOffset directly affect reported positions

So do not expose these objects as untrusted input, and do not mutate them in place across a session lifetime. The security model assumes they are trusted application-side configuration.
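
One way to guard against accidental in-place mutation is to build the configuration once on the application side and freeze it. The sketch below assumes a hypothetical TrustedConfig shape (the allowedTags field is invented for illustration; only depthLimit and its default of 50 come from this page), so treat it as a pattern rather than the library's option type.

```typescript
// Hypothetical config shape for illustration; the library's real option types may differ.
interface TrustedConfig {
  depthLimit: number;
  allowedTags: string[];
}

/** Recursively freezes plain objects and arrays so later writes fail instead of silently applying. */
function deepFreeze<T>(value: T): T {
  if (typeof value === "object" && value !== null && !Object.isFrozen(value)) {
    Object.freeze(value);
    const record = value as unknown as Record<string, unknown>;
    for (const key of Object.getOwnPropertyNames(value)) {
      deepFreeze(record[key]);
    }
  }
  return value;
}

// Built once from application code; nothing from a request body is merged in.
const APP_CONFIG: TrustedConfig = deepFreeze({
  depthLimit: 50,
  allowedTags: ["bold", "link"],
});
```

Freezing does not make untrusted configuration safe. It only helps catch accidental mutation of configuration that was trusted to begin with.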

3. depthLimit is only the first resource guardrail

The source code defaults depthLimit to 50. That helps bound unusually deep nesting, but it does not mean:

  • total document size is bounded
  • single-edit cost is bounded
  • diff cost is bounded

For untrusted input, you still need separate limits for total source size, single-edit size, and request-level execution time.
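
The size limits can be enforced before any parsing happens. The following is a minimal sketch, assuming a hypothetical TextEdit shape (replace source[start, end) with newText) and application-chosen limits; none of these names come from the library.

```typescript
// Hypothetical application-side limits; choose values that fit your deployment.
interface InputLimits {
  maxSourceChars: number; // cap on the document after the edit is applied
  maxEditChars: number;   // cap on the larger of removed and inserted text
}

// Hypothetical edit shape: replace source[start, end) with newText.
interface TextEdit {
  start: number;
  end: number;
  newText: string;
}

type LimitCheck = { ok: true } | { ok: false; reason: string };

/** Validates an edit against size limits without parsing anything. */
function checkEdit(source: string, edit: TextEdit, limits: InputLimits): LimitCheck {
  const { start, end, newText } = edit;
  const validRange =
    Number.isInteger(start) && Number.isInteger(end) &&
    start >= 0 && end >= start && end <= source.length;
  if (!validRange) return { ok: false, reason: "edit range out of bounds" };

  // Lengths are UTF-16 code units, which is what string offsets use in JavaScript.
  const editSize = Math.max(end - start, newText.length);
  if (editSize > limits.maxEditChars) return { ok: false, reason: "edit too large" };

  const resultLength = source.length - (end - start) + newText.length;
  if (resultLength > limits.maxSourceChars) return { ok: false, reason: "document too large" };

  return { ok: true };
}
```

Rejecting oversized input at this stage is cheap because it only inspects lengths. Execution-time limits still have to be enforced separately, since a small input can be expensive to process.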

4. Resource risk must be understood in heap / CPU terms, not just input bytes

Structural parsing, position tracking, incremental documents, zones, signatures, and diff payloads can all consume substantially more memory than the original source text.

In practice, the real budget you need to control is not only "how many KB / MB came in", but also:

  • post-parse AST / token-tree size
  • position-tracking overhead
  • previous/current snapshots retained by incremental sessions
  • ops / patches / dirtySpan data produced by diff

For untrusted-input deployments, treat heap budget and concurrent parse count as first-class limits, not just request-body size.
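
Concurrent parse count can be capped with a small gate in front of whatever runs the parse. This is a sketch of a generic counting semaphore, not something the library provides; ParseGate is a hypothetical name, and it assumes each job runs asynchronously (for example in a worker), since a synchronous parse on the main thread is already serialized.

```typescript
/** Hypothetical application-side gate that limits how many parse jobs run at once. */
class ParseGate {
  private active = 0;
  private readonly waiters: Array<() => void> = [];
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  get activeCount(): number {
    return this.active;
  }

  get queuedCount(): number {
    return this.waiters.length;
  }

  /** Runs the job when a slot is free; otherwise waits in FIFO order. */
  async run<T>(job: () => Promise<T> | T): Promise<T> {
    if (this.active >= this.maxConcurrent) {
      // Wait until a finishing job hands its slot over to us.
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await job();
    } finally {
      const next = this.waiters.shift();
      if (next) {
        next(); // slot is transferred, so the active count stays the same
      } else {
        this.active--;
      }
    }
  }
}
```

In practice you would also bound the queue length and reject new work when it is full, so that waiting jobs cannot themselves accumulate without limit.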

5. Incremental sessions are correctness-first, not constant-cost APIs

createIncrementalSession(...) is designed to advance to the correct next snapshot first. Because of that, the API explicitly allows:

  • mode to be either incremental or full-fallback
  • fallbackReason values such as INTERNAL_FULL_REBUILD, AUTO_LARGE_EDIT, and AUTO_COOLDOWN
  • auto mode to move toward full rebuilds when recent samples say incremental reuse is no longer worth it

The default adaptive settings make that clear:

  • sampleWindowSize = 24
  • minSamplesForAdaptation = 6
  • maxFallbackRate = 0.35
  • switchToFullMultiplier = 1.1
  • fullPreferenceCooldownEdits = 12
  • maxEditRatioForIncremental = 0.2
  • softZoneNodeCap = 64

These values are runtime heuristics for "fall back sooner when incremental is no longer a good trade", not security guarantees.
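
Because the library's heuristics are not a security control, an application may want its own view of how often sessions fall back. The sketch below is an application-side monitor over a sliding window of observed mode values. FallbackMonitor is a hypothetical name, and how you obtain mode from an edit result is left as an assumption.

```typescript
// The two mode values named on this page.
type EditMode = "incremental" | "full-fallback";

/** Hypothetical application-side monitor; independent of the library's own adaptive logic. */
class FallbackMonitor {
  private readonly samples: EditMode[] = [];
  private readonly windowSize: number;

  constructor(windowSize: number) {
    this.windowSize = windowSize;
  }

  /** Records the mode of the latest edit, keeping only the most recent windowSize samples. */
  record(mode: EditMode): void {
    this.samples.push(mode);
    if (this.samples.length > this.windowSize) this.samples.shift();
  }

  /** Fraction of recent edits that were full fallbacks (0 when nothing is recorded). */
  fallbackRate(): number {
    if (this.samples.length === 0) return 0;
    const fallbacks = this.samples.filter((m) => m === "full-fallback").length;
    return fallbacks / this.samples.length;
  }
}
```

A session whose fallback rate stays high is a candidate for rate limiting or closer inspection. The alert threshold is a deployment decision and does not need to match the library's maxFallbackRate.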

6. applyEditWithDiff(...) budgets guarantee bounded degradation, not stable fine-grained diffs

The current default diff-refinement budgets are:

  • refinementDepthCap = 64
  • maxComparedNodes = 20000
  • maxAnchorCandidates = 128
  • maxOps = 512
  • maxSubtreeNodes = 256
  • maxMilliseconds = 8

Their purpose is to let refinement degrade early when fine-grained diffing is no longer worth it. What they guarantee is a bounded fallback, not a standing promise of small dirty spans, fine-grained patches, or low latency.

From the code, the following outcomes are expected behavior rather than security bugs by themselves:

  • a root-level splice
  • a single coarse replace patch
  • empty unchangedRanges
  • dirtySpanOld / dirtySpanNew expanding to most or all of the document
  • refinement throwing and degrading to conservative diff

Consumers therefore must not assume that applyEditWithDiff(...) will:

  1. always stay fine-grained
  2. always remain near the edited window
  3. always be cheaper than a full rebuild
  4. always be safe to expose directly to untrusted traffic
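
A consumer can make this explicit by deciding, per result, whether applying patches is still worthwhile. The sketch below works on a hypothetical DiffSummary that the application derives from the diff result. The field names and thresholds are assumptions and do not mirror the library's actual result shape.

```typescript
// Hypothetical summary derived by the application from a diff result.
interface DiffSummary {
  docLength: number;  // length of the new document
  dirtyStart: number; // start of the dirty span in the new document
  dirtyEnd: number;   // end of the dirty span (exclusive)
  patchCount: number; // number of patches or ops produced
}

type ApplyStrategy = "apply-patches" | "full-rerender";

/** Chooses a full re-render when the diff has degraded to a coarse result. */
function chooseStrategy(
  summary: DiffSummary,
  maxDirtyRatio: number,
  maxPatches: number,
): ApplyStrategy {
  const dirtyLength = Math.max(0, summary.dirtyEnd - summary.dirtyStart);
  // An empty document is treated as fully dirty.
  const dirtyRatio = summary.docLength === 0 ? 1 : dirtyLength / summary.docLength;
  if (dirtyRatio > maxDirtyRatio || summary.patchCount > maxPatches) {
    return "full-rerender";
  }
  return "apply-patches";
}
```

Treating the coarse case as a normal branch, rather than an error, keeps the consumer correct when refinement hits a budget and returns a root-level splice or a single replace patch.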

7. Repetitive raw / block input and deep inline trees are still resource hotspots

The current implementation already has fallback, budgeting, and coarse-diff paths, but highly repetitive raw / block input, deep inline structures, and anchor-unfriendly edits can still amplify:

  • CPU comparison cost
  • dirty-span size
  • diff degradation frequency
  • full-fallback frequency

Treat those as deployment-level resource risks to constrain and monitor, not something the parser alone can fully neutralize.

8. Minimum deployment requirements for untrusted input

If you use this library for UGC, collaborative editing, chat messages, drafts, or any other untrusted-input workflow, you should at minimum:

  1. cap total document size
  2. cap single-edit size or edit ratio
  3. enforce wall-clock timeouts and cancellation for requests / jobs / workers
  4. use more conservative applyEditWithDiff(...) budgets on the server side, or disable diff entirely
  5. never treat raw as trusted HTML / DOM content
  6. never let end users directly control handlers, syntax, tagName, createId, or tracker
  7. monitor full-fallback frequency, unusually large dirty spans, abnormal latency, and abnormal heap growth as resource warning signals

License

This project is licensed under the MIT License. Security fixes are provided on a best-effort basis.