
Add configurable processing limits for JSON parser (StreamReadConstraints) #637

Closed
cowtowncoder opened this issue Aug 23, 2020 · 11 comments

Labels: 2.15 (Issue planned (at earliest) for 2.15), 2.16 (Issue planned (at earliest) for 2.16), performance (Issue related to performance problems or enhancements)
Milestone: 2.16.0-rc1

cowtowncoder (Member) commented Aug 23, 2020

(note: related to/inspired by FasterXML/jackson-databind#2816)

Some aspects of input documents are open to abuse: a malicious sender can craft documents specifically designed to overload a server. This includes things like:

  1. Ridiculously deeply nested documents (it takes only 2 characters to create an Array context, so an 8k document can have 4000 levels of nesting) -- implemented via Add StreamReadConstraints.maxNestingDepth() to constrain max nesting depth (default: 1000) #943
  2. Very long documents (in general): while the streaming decoder has no problem with length per se, higher-level databind can easily exhaust memory even with fairly normal amplification (that is, input content of 10 megabytes can result in a data structure of 100 megabytes) -- implemented via Add configurable limit for the maximum number of bytes/chars of content to parse before failing #1046
  3. Very long names: Jackson's name decoding is optimized for short names, and while it can handle any length (within the bounds of total memory available), performance characteristics are not great beyond, say, a couple of thousand characters -- implemented via Add configurable limit for the maximum length of Object property names to parse before failing (default max: 50,000 chars) #1047
  4. Very long numbers: numbers with a textual length in the tens of thousands of characters might be problematic for some uses -- implemented via Add numeric value size limits via StreamReadConstraints (fixes sonatype-2022-6438) -- default 1000 chars #827
  5. (possibly?) Huge number of properties per JSON Object -- for some use cases, construction of data structures with tens or hundreds of thousands of distinct keys can be problematic -- not implemented, no immediate plans; a new issue can be created if this is still desired (ditto for big Arrays)
  6. Extra-long text segments (megabytes of content) can similarly become problematic -- this could also result from broken encoding (a missing closing quote) -- implemented via Add StreamReadConstraints limit for longest textual value to allow (default: 5M) #863

and although the streaming parser can typically handle many of these cases quite well, they can be very problematic for higher-level processing -- and even for streaming under highly parallel processing.

So. It would be good to create a configurable set of options that:

  1. Default to a sane, safe set of limits for likely problematic cases (like limiting nesting to what is known to typically fit within stack frames; limiting maximum property name length)
  2. Leave more speculative limits (like text length) unlimited (or very high)
  3. Offer a simple way to configure limits (possibly only per JsonFactory, although it would be really nice if per-parser overrides were possible) -- see the configuration sketch after this list
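
For reference, a minimal configuration sketch using the StreamReadConstraints builder API that eventually came out of this issue (jackson-core 2.15+; maxNameLength and maxDocumentLength arrived in 2.16). The specific values are illustrative, not the library defaults:

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.StreamReadConstraints;

public class ConstraintsConfig {
    public static void main(String[] args) {
        // Illustrative values; actual defaults differ (see linked issues)
        StreamReadConstraints constraints = StreamReadConstraints.builder()
                .maxNestingDepth(500)            // case 1: nesting depth
                .maxDocumentLength(20_000_000L)  // case 2: total document length (2.16+)
                .maxNameLength(10_000)           // case 3: property name length (2.16+)
                .maxNumberLength(100)            // case 4: numeric token length
                .maxStringLength(1_000_000)      // case 6: longest textual value
                .build();

        // Configured per JsonFactory; every parser the factory creates
        // inherits these limits
        JsonFactory factory = JsonFactory.builder()
                .streamReadConstraints(constraints)
                .build();
    }
}
```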

Further reading: related material.

cowtowncoder (Member, Author) commented Feb 13, 2022

Since 2.13 has been released, marking this as 2.14 (at the earliest) due to the need for API additions and possible compatibility concerns for some maximum limits.

Also: I think this issue is still important even if databind tackles some of the more immediate concerns (via FasterXML/jackson-databind#2816).

cowtowncoder (Member, Author) commented Feb 14, 2022

Some thinking out loud:

  • Limits on total input document size can probably be checked at the points where input blocks are read (blocking) or fed (async): this is what Woodstox does, and it appears to work beautifully with minimal overhead. Limits probably need to be in units of whatever the input unit is, byte/char (see the sketch after this list).
  • A limit on nesting is a much bigger challenge, but my first instinct is to integrate it into JsonStreamContext/JsonReadContext (2.x) (TokenStreamContext in 3.0) -- that is, check it when trying to add the next context. This can get tricky due to the inability (in 2.x) to throw a checked JsonProcessingException, but we'll cross that bridge if and when we get that far... there's also the obvious challenge of how to pass the limit configuration in the first place.
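
A minimal sketch of the first bullet, assuming a Woodstox-style check at each buffer refill. Class, method, and field names here are hypothetical stand-ins for parser internals, not actual jackson-core API:

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical input source that counts input units as each block is read,
// failing fast once the configured maximum is exceeded.
class BoundedReaderSource {
    private final Reader _in;
    private final long _maxDocumentChars; // limit in input units (chars here)
    private long _totalCharsRead;

    BoundedReaderSource(Reader in, long maxDocumentChars) {
        _in = in;
        _maxDocumentChars = maxDocumentChars;
    }

    // Called whenever the parser needs to refill its input buffer
    int loadMore(char[] buffer) throws IOException {
        int count = _in.read(buffer, 0, buffer.length);
        if (count > 0) {
            _totalCharsRead += count;
            if (_maxDocumentChars > 0L && _totalCharsRead > _maxDocumentChars) {
                throw new IOException("Document length exceeds maximum allowed ("
                        + _maxDocumentChars + " chars)");
            }
        }
        return count;
    }
}
```

Checking per block rather than per token keeps the overhead to a single comparison per buffer load.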

As to configuration settings, there probably needs to be a shared base type (StreamProcessingLimits) with a concrete implementation, plus a singleton "default"/"empty" instance to feed in cases where no explicit configuration is given. Format-specific extensions should also be allowed.
Construction should probably be Builder-style, with limits instances fully immutable (obviously) so they can be shared.

One immediate concern I have, however, is that of adding state to JsonReadContext: it seems we need both the current level (an int) and the configuration settings; the latter would be passed when creating the "root" instance, then passed along to children.
The alternative would be to require the config to be passed on every createXxx() call -- this would avoid the need to hold on to a reference (saving 1 field) but would be much more intrusive, as callers would need to be changed.
So it is probably better to add the 2 fields, as sketched below.
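
A sketch of that two-field design, using hypothetical names (ReadContextSketch, StreamLimits) rather than the real JsonReadContext: the root context receives the shared limits object, and each child bumps and checks the depth. The unchecked exception reflects the 2.x constraint noted above about not being able to throw a checked JsonProcessingException from context creation:

```java
// Hypothetical stand-in for the shared, immutable limits object
class StreamLimits {
    final int maxNestingDepth;

    StreamLimits(int maxNestingDepth) {
        this.maxNestingDepth = maxNestingDepth;
    }
}

// Hypothetical read context showing the two added fields:
// current nesting level plus a reference to the shared limits
class ReadContextSketch {
    private final ReadContextSketch _parent;
    private final int _nestingDepth;    // field 1: current level
    private final StreamLimits _limits; // field 2: config, shared with children

    private ReadContextSketch(ReadContextSketch parent, int depth, StreamLimits limits) {
        _parent = parent;
        _nestingDepth = depth;
        _limits = limits;
    }

    static ReadContextSketch createRootContext(StreamLimits limits) {
        return new ReadContextSketch(null, 0, limits);
    }

    // The check happens when entering a child (Object/Array) context
    ReadContextSketch createChildContext() {
        int depth = _nestingDepth + 1;
        if (depth > _limits.maxNestingDepth) {
            throw new IllegalStateException("Nesting depth (" + depth
                    + ") exceeds maximum allowed (" + _limits.maxNestingDepth + ")");
        }
        return new ReadContextSketch(this, depth, _limits);
    }
}
```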

cowtowncoder (Member, Author) commented

Will not happen for 2.14 but should be the first thing for 2.15.

cowtowncoder (Member, Author) commented

Implemented for inclusion in 2.15:

pjfanning (Member) commented

@cowtowncoder are there any more of these that you want to do for 2.15?

pjfanning (Member) commented

The total size could readily be limited using stream wrappers like those in https://github.com/pjfanning/json-size-limiter/tree/main/src/main/java/com/github/pjfanning/json/util (that code is itself borrowed); see the sketch below. It should also be possible to find ByteBuffer equivalents. There is an argument that many web service frameworks already support limiting the acceptable size of inputs, and that users should prefer those approaches: the earlier a too-large input is spotted, the better.
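
A minimal sketch of such a wrapper as a plain FilterInputStream; the SizeLimitedInputStream name is hypothetical, and the linked json-size-limiter code is the fuller version:

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Fails once more than maxBytes have been read from the underlying stream.
// skip()/mark() handling is omitted for brevity.
class SizeLimitedInputStream extends FilterInputStream {
    private final long maxBytes;
    private long bytesRead;

    SizeLimitedInputStream(InputStream in, long maxBytes) {
        super(in);
        this.maxBytes = maxBytes;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            count(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count(n);
        }
        return n;
    }

    private void count(long n) throws IOException {
        bytesRead += n;
        if (bytesRead > maxBytes) {
            throw new IOException("Input exceeds maximum allowed size ("
                    + maxBytes + " bytes)");
        }
    }
}
```

Usage would be as simple as `factory.createParser(new SizeLimitedInputStream(rawIn, 10_000_000))`.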

cowtowncoder (Member, Author) commented

Yes, ideally we'd be able to target (1) via JsonStreamContext, and (2) for both input and output sizes.
But I don't know how the timing would work out.

I think the existence of stream wrappers is useful and can work for now, even if in-core limits are eventually added.

But deeper nesting checks should definitely be added in core at some point. Just not sure if the timing works for 2.15.

And field names... that's an interesting one. Not commonly reported as an attack, but conceivably it could be a vector in the same way as (and more than) max String token length.

pjfanning (Member) commented

It seems like (2) is something that can be left to users. I think it is reasonable for Jackson to concentrate on the cases where comparatively few input bytes can lead to bad outcomes.

cowtowncoder (Member, Author) commented

@pjfanning It could be, although I implemented this in Woodstox a few years back and it really is rather easy to add. Then again, jackson-core supports input decorators, so it's even easier for users to wrap streams themselves.

But I agree it's not a high priority at this point.

cowtowncoder (Member, Author) commented

Filed #1046 for max document length -- I think it is worth adding, after thinking it through. Hope to work on it for 2.16.

@cowtowncoder cowtowncoder changed the title Add configurable "document/processing limits" for JSON parser Add configurable processing limits for JSON parser (StreamReadConstraints) Jun 12, 2023
@cowtowncoder cowtowncoder added the 2.16 Issue planned (at earliest) for 2.16 label Oct 26, 2023
cowtowncoder (Member, Author) commented

Out of the 6 ideas, 5 are now implemented and included in 2.16.0-rc1; no plans yet for the last one (max number of properties per Object) -- a new issue can be filed if need be.
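
For the record, a quick sketch of how a violation now surfaces (assuming jackson-core 2.15+, where the default maxNestingDepth is 1000 per #943): exceeding a limit fails fast with a StreamConstraintsException:

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.exc.StreamConstraintsException;

public class NestingLimitDemo {
    public static void main(String[] args) throws Exception {
        // Pathological input: 2000 levels of Array nesting in ~4000 chars,
        // well past the default limit of 1000
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) sb.append('[');
        for (int i = 0; i < 2000; i++) sb.append(']');

        JsonFactory factory = new JsonFactory(); // default constraints apply
        try (JsonParser p = factory.createParser(sb.toString())) {
            while (p.nextToken() != null) { } // drain tokens until failure
        } catch (StreamConstraintsException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```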

Closing this issue as completed.

@cowtowncoder cowtowncoder added this to the 2.16.0-rc1 milestone Oct 26, 2023