JSTEP 3 - FasterXML/jackson-future-ideas GitHub Wiki

JsonNode improvements (for Jackson 3, some in 2.x)

Related change issues: [fill me in]

Author

Tatu Saloranta (@cowtowncoder)

Version history

  • 2019-01-26: first skeletal revision
  • 2019-02-10: copy existing plan over from earlier wiki page, with minimal edits
  • 2019-09-10: Update wrt work on databind#2237
  • 2021-10-11: Update potential JsonNodeFeatures
  • 2022-01-04: Add ALLOW_ARRAY_MERGE, OBJECT_MERGE JsonNodeFeatures.
  • 2022-01-16: Add links to JSTEP-7 (new JsonNodeFeature as one of DataTypeFeatures)

Background

Although the "Tree Model" -- operating on JSON content via a JsonNode-based object model which represents content exactly as-is, without transformations -- has been around since Jackson 1.0, it has not been worked on as extensively as full databinding to/from POJOs. Yet it is widely used and valued by users for its flexibility. Over time some of the original design choices have proven problematic (for example, methods not being able to throw JsonProcessingException for invalid coercions, or methods in container nodes returning JsonNode, which prevents chaining of many calls), and in ways that cannot be changed without breaking backwards compatibility.

With 3.0 we have a perfect opportunity to further improve JsonNode API, as well as fix problems in return type declarations. We can also benefit from other changes, in particular JSTEP-4 which will make it possible for all methods to throw properly typed exceptions.

Changes: Configurability

As of Jackson 2.x, only some DeserializationFeatures (and few if any SerializationFeatures) affect handling of JsonNode. This is because most of them are POJO-centric, whereas JsonNode is meant to be a faithful representation of JSON that was read, or is to be written out, with few changes.

But users do have legitimate need/desire to make SOME changes. It's just that these changes are not necessarily aligned with changes to POJO handling.

With that, I suggest addition of...

JsonNodeFeature

NOTE: being moved under JSTEP-7 -- will be removed from here soon (as of 2022-01-16).

This would be enumeration like DeserializationFeature with choices like:

  • Null skipping:
    • For ObjectNode properties:
      • READ_NULL_PROPERTIES (default: true) -- are null valued properties represented as NullNodes in resulting ObjectNode or skipped?
      • WRITE_NULL_PROPERTIES (default: true) -- are null valued fields in ObjectNode written out as JSON or skipped?
    • For ArrayNode elements:
      • READ_NULL_ELEMENTS (default: true) -- are null elements in JSON arrays represented as NullNodes in resulting ArrayNode or skipped?
      • WRITE_NULL_ELEMENTS (default: true) -- are null elements in ArrayNode written out as JSON or skipped?
  • Property sorting (Similar to MapperFeature.SORT_PROPERTIES_ALPHABETICALLY, but for ObjectNodes)
    • SORT_KEYS_ON_READ (default: false) - use TreeMap for ObjectNode when reading
    • SORT_KEYS_ON_WRITE (default: false) - dynamically check for SortedMap, construct intermediate TreeMap
    • Could force use of TreeMap on reads; on writes we probably need to determine this dynamically
  • CONVERT_POJOS_FULLY (default: ?) -- when converting values (mapper.convertValue(), treeToValue()), are opaque values contained in POJONode:
    1. Serialized explicitly by matching serializer (true) -- which will essentially transform it into non-opaque value (like Map, List etc)
    2. Written out as "writePOJO", which in case of TokenBuffer will retain opaque value exactly as is
  • Coercion/Leniency: while 2.12 added "Coercion Config", it does not work well with JsonNode. So how about:
    • LENIENT_SCALAR_CONVERSION (default: false?): do we allow more speculative coercion, like boolean from int (zero -> false, otherwise true)
    • or, maybe better yet: ALLOW_LENIENT_NUMBERS, ALLOW_LENIENT_BOOLEANS etc? We can afford granularity
  • Numeric, floating-point/decimal
    • TRIM_DECIMAL_TRAILING_ZEROES (default: true in 2.x; false for 3.x?): do we force dropping of trailing zeroes for BigDecimal valued nodes?
      • Note: hard-coded in JsonNodeFactory currently, should not be
    • Something about forcing use of BigDecimal? There is DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS already, but should we have something separate?
  • Merging: instead of relying on configOverrides, allow preventing merging
    • ALLOW_ARRAY_MERGE (default: true)
    • ALLOW_OBJECT_MERGE (default: true)

One thing to note is that while there could be separate NodeReadFeature/NodeWriteFeature types, I think the set of values we need is small enough, and the features closely enough related, that it makes more sense to have just one set. It is also a bit easier to implement.
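
As a rough illustration of the single-enum approach, the sketch below models some of the features listed above together with their proposed defaults. The type name `NodeFeatureSketch` and its helper methods are hypothetical and only show the general shape; features whose defaults are still open (such as CONVERT_POJOS_FULLY) are left out.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch, not an actual Jackson type: one enum covering both
// read-side and write-side node features, each carrying its proposed default.
enum NodeFeatureSketch {
    READ_NULL_PROPERTIES(true),
    WRITE_NULL_PROPERTIES(true),
    READ_NULL_ELEMENTS(true),
    WRITE_NULL_ELEMENTS(true),
    SORT_KEYS_ON_READ(false),
    SORT_KEYS_ON_WRITE(false),
    ALLOW_ARRAY_MERGE(true),
    ALLOW_OBJECT_MERGE(true);

    private final boolean enabledByDefault;

    NodeFeatureSketch(boolean enabledByDefault) {
        this.enabledByDefault = enabledByDefault;
    }

    boolean enabledByDefault() {
        return enabledByDefault;
    }

    // Collects the features that are enabled unless explicitly turned off
    static Set<NodeFeatureSketch> defaults() {
        Set<NodeFeatureSketch> enabled = EnumSet.noneOf(NodeFeatureSketch.class);
        for (NodeFeatureSketch f : values()) {
            if (f.enabledByDefault) {
                enabled.add(f);
            }
        }
        return enabled;
    }
}
```

Keeping read and write features in one enum means a single `EnumSet` (or bit mask) can hold the whole configuration, which is part of why one set is simpler to implement than two.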

Changes: Exception reporting

With 3.0, JsonProcessingException becomes unchecked (see JSTEP-4)

This is relevant for JsonNode, too, since we can start throwing formal Jackson exceptions, without having to declare them separately for all (or just some) accessors. This is relevant for a few entries here:

  • JsonNode.toString() can pass any exceptions for real serialization as-is, with no additional wrapping

However: I think it also makes sense to introduce at least one new Jackson exception type, like:

  • ValueCoercionException (similar to new Streaming API exception, InputCoercionException)

which can then be used by methods that attempt coercion (like JsonNode.asIntValue()), but fail due to incompatible types (or perhaps parsing error, from String to number). This exception type should retain and expose information like:

  • Source token type (JsonToken.VALUE_STRING)
  • Target shape (as token, like JsonToken.VALUE_NUMBER_INT)
  • If available, target Java type (java.lang.Integer#TYPE for int, for example)
  • We do not necessarily have information on path, but possibly exception catch/rethrow could re-create this
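
To make the list above concrete, here is a minimal sketch of what such an exception might carry. Everything in it is an assumption for illustration: `ValueCoercionExceptionSketch` is not an existing Jackson class, and `TokenSketch` is a small stand-in for `JsonToken` so that the example has no dependencies.

```java
// Stand-in for Jackson's JsonToken so the sketch compiles without dependencies
enum TokenSketch {
    VALUE_STRING, VALUE_NUMBER_INT, VALUE_NUMBER_FLOAT, VALUE_TRUE, VALUE_FALSE, VALUE_NULL
}

// Hypothetical shape of the proposed exception; unchecked, in line with JSTEP-4
class ValueCoercionExceptionSketch extends RuntimeException {
    private static final long serialVersionUID = 1L;

    private final TokenSketch sourceToken;  // shape of the value actually found
    private final TokenSketch targetShape;  // shape the caller asked for
    private final Class<?> targetType;      // may be null if not known

    ValueCoercionExceptionSketch(String message, TokenSketch sourceToken,
            TokenSketch targetShape, Class<?> targetType) {
        super(message);
        this.sourceToken = sourceToken;
        this.targetShape = targetShape;
        this.targetType = targetType;
    }

    TokenSketch getSourceToken() { return sourceToken; }

    TokenSketch getTargetShape() { return targetShape; }

    Class<?> getTargetType() { return targetType; }
}
```

A failing String-to-int coercion could then be reported as `new ValueCoercionExceptionSketch("Cannot coerce String \"abc\" to int", TokenSketch.VALUE_STRING, TokenSketch.VALUE_NUMBER_INT, Integer.TYPE)`.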

Changes: Additional traversal, mutation

Removal

  • For ContainerNode (ObjectNode, ArrayNode):
    • removeNullValues() (or: separate for object ("removeNullProperties()") vs array ("removeNullElements")?)
    • clear() / removeAll() (remove all entries)

Traversal

It should be possible to stream through both values (array elements, values of properties), and in case of ObjectNode, entries (name and value). Not sure what is the best Java 8 pattern to follow, perhaps take consumers?
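
The two candidate patterns (consumers vs. streams) can be compared with a small sketch. `ObjectTraversalSketch` is a hypothetical stand-in for ObjectNode that stores plain `Object` values; the method names are assumptions, not proposed API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.stream.Stream;

// Hypothetical stand-in for ObjectNode, kept dependency-free for illustration
final class ObjectTraversalSketch {
    private final Map<String, Object> props = new LinkedHashMap<>();

    ObjectTraversalSketch put(String name, Object value) {
        props.put(name, value);
        return this;
    }

    // Option 1: consumer-based traversal over (name, value) entries
    void forEachEntry(BiConsumer<String, Object> action) {
        props.forEach(action);
    }

    // Option 2: stream-based traversal over values only
    Stream<Object> valueStream() {
        return props.values().stream();
    }

    // Option 2, continued: stream-based traversal over entries
    Stream<Map.Entry<String, Object>> entryStream() {
        return props.entrySet().stream();
    }
}
```

The consumer form is the smaller API surface; the stream form composes with `filter`/`map`/`collect` at the cost of exposing `Map.Entry` in the signature.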

Transformation

Maybe allow a way to "map"? Might become problematic for ObjectNode, if and when modifying while traversing: maybe should just create a new ObjectNode / ArrayNode if and when making any change(s)?

Will definitely need help defining good Java 8 API additions here
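
One way to sidestep the modify-while-traversing problem is for "map" to always build a new container. The sketch below shows that idea on a hypothetical `ObjectMapSketch` stand-in (plain `Object` values, made-up method names); it is not a proposal for the actual signatures.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for ObjectNode; mapValues() never mutates the receiver
final class ObjectMapSketch {
    private final Map<String, Object> props;

    ObjectMapSketch(Map<String, Object> props) {
        // Defensive copy so later changes to the argument do not leak in
        this.props = new LinkedHashMap<>(props);
    }

    Object get(String name) {
        return props.get(name);
    }

    // Applies fn to every property value and returns a NEW container,
    // leaving this one untouched
    ObjectMapSketch mapValues(UnaryOperator<Object> fn) {
        Map<String, Object> copy = new LinkedHashMap<>();
        props.forEach((name, value) -> copy.put(name, fn.apply(value)));
        return new ObjectMapSketch(copy);
    }
}
```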

Changes: Extended set of scalar value accessors

As of 2.x, there are a few accessors for type like, say, int:

  • intValue(): return int if (and only if!) we got numeric integer node (JsonToken.VALUE_NUMBER_INT) -- but won't throw exception, returns default value (0) otherwise
  • asInt(): same as asInt(0)
  • asInt(int defaultValue): return int value from JsonToken.VALUE_NUMBER_INT (if within value range) OR if coercible (from double or String); otherwise return defaultValue

But none actually throws an exception: partly for historical reasons (it was not done initially), and partly because the methods do not declare JsonProcessingException (or IOException), so starting to throw one would break backwards compatibility.

So there are a couple of things to improve:

  1. Should allow throwing of exceptions, for case where no default is specified. We can now do this more easily as we throw unchecked exception -- meaning accessors are still safe with Java 8 streaming
  2. Should allow use of Java 8 Optional, as that is useful for stream() operations (and now we can use Java 8 features)

This would lead to a bigger set of methods, once again for int:

  1. intValue() as before, return value if JsonToken.VALUE_NUMBER_INT, within Java int range -- but if not, throw exception
  2. intValue(int defaultValue) as intValue() except returns defaultValue instead of exception
  3. optIntValue() similar to intValue() but returns OptionalInt, either present (as per intValue()) or absent (instead of exception)
  4. asInt() (or asIntValue()?) similar to intValue() but allows coercion from double and String (and possibly, with config, from true/false). But if not, throw exception
  5. asInt(int defaultValue) like asInt() but instead of exception, return defaultValue for case of no valid value
  6. asOptInt() (maybe?): similar to asInt() but instead of exception, return "absent" value
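
The six variants above can be sketched on a toy scalar holder. `IntAccessSketch` is hypothetical and wraps a plain `Object` rather than a real node; `IllegalStateException` stands in for whatever Jackson exception would actually be thrown, and the rule used for double coercion (whole values only) is an assumption, since the proposal does not fix it.

```java
import java.util.OptionalInt;

// Hypothetical stand-in for a scalar JsonNode; holds Integer, Long, Double, String, ...
final class IntAccessSketch {
    private final Object value;

    IntAccessSketch(Object value) {
        this.value = value;
    }

    // 3. Strict: present only for an integral number within int range
    OptionalInt optIntValue() {
        if (value instanceof Integer) {
            return OptionalInt.of((Integer) value);
        }
        if (value instanceof Long) {
            long l = (Long) value;
            if (l >= Integer.MIN_VALUE && l <= Integer.MAX_VALUE) {
                return OptionalInt.of((int) l);
            }
        }
        return OptionalInt.empty();
    }

    // 1. Strict, throws if not an int-compatible integer
    int intValue() {
        return optIntValue().orElseThrow(
                () -> new IllegalStateException("Not an int-compatible integer: " + value));
    }

    // 2. Strict, with default instead of exception
    int intValue(int defaultValue) {
        return optIntValue().orElse(defaultValue);
    }

    // 6. Lenient: also accepts whole-valued doubles and numeric Strings (assumed rules)
    OptionalInt asOptInt() {
        OptionalInt strict = optIntValue();
        if (strict.isPresent()) {
            return strict;
        }
        if (value instanceof Double) {
            double d = (Double) value;
            if (d == Math.rint(d) && d >= Integer.MIN_VALUE && d <= Integer.MAX_VALUE) {
                return OptionalInt.of((int) d);
            }
        }
        if (value instanceof String) {
            try {
                return OptionalInt.of(Integer.parseInt(((String) value).trim()));
            } catch (NumberFormatException e) {
                // not a valid int literal: fall through to "absent"
            }
        }
        return OptionalInt.empty();
    }

    // 4. Lenient, throws if no coercion applies
    int asInt() {
        return asOptInt().orElseThrow(
                () -> new IllegalStateException("Cannot coerce to int: " + value));
    }

    // 5. Lenient, with default instead of exception
    int asInt(int defaultValue) {
        return asOptInt().orElse(defaultValue);
    }
}
```

With this layout the throwing and defaulting variants are thin wrappers over the two `OptionalInt`-returning methods, so the strict and lenient rules each live in exactly one place.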

Now: set of operations may be smaller for other more esoteric types. We should also have generic

  • opt(String / int)

as counterpart to get() / path(), which would return Optional<JsonNode>.
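
A minimal sketch of how an Optional-returning lookup chains, assuming a hypothetical `OptNodeSketch` stand-in for JsonNode (object nodes with named children, plus a text payload for scalars); only the property-name variant is shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for JsonNode: either a scalar with text, or an object with children
final class OptNodeSketch {
    private final Map<String, OptNodeSketch> children = new LinkedHashMap<>();
    private final String text; // null for object nodes

    OptNodeSketch(String text) {
        this.text = text;
    }

    static OptNodeSketch object() {
        return new OptNodeSketch(null);
    }

    OptNodeSketch set(String name, OptNodeSketch child) {
        children.put(name, child);
        return this;
    }

    String text() {
        return text;
    }

    // Counterpart to get(): empty Optional instead of null when the property is missing
    Optional<OptNodeSketch> opt(String name) {
        return Optional.ofNullable(children.get(name));
    }
}
```

Nested lookups then compose without null checks, for example `root.opt("a").flatMap(n -> n.opt("b")).map(OptNodeSketch::text).orElse("none")`.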

Changes: Misc coercion, checking?

Is there need for casting with something like

  • asObject()? (requires forward-ref to subtype, not really optimal -- although could maybe use local type bound of ` -- need to test)
  • similarly, asArray()?

Changes: Misc other

  • Rename methods with "text" to use "string", similarly types ("TextNode" -> "StringNode"): deprecate old methods in cases where significant usage exists (textValue(), asText())
  • "Bulk" add methods for scalars? ("addAll()", "putAll()") -- and maybe just for simple things like String, int, long values?
    • Maybe via JsonNodeFactory?

Changes: Misc added in 2.10

  • JsonNode.toString() will guarantee valid JSON output (assuming default vanilla ObjectMapper settings)
  • JsonNode.isEmpty() works as alias for idiom size() == 0

Changes: "required" methods (added in 2.10)

(note: these were added via databind#2237)

  • required(String / int) for basically "get, throw exception if no node with name/index"
  • requiredAt(JsonPointer): same as above, but with JsonPointer
  • require() -- returns JsonNode (or subtype, co-variant) unless MissingNode; if MissingNode, throw exception (unchecked, Jackson-specific)
  • requireNonNull() -- same as require(), but fail both for MissingNode and NullNode