JSTEP 7 - FasterXML/jackson-future-ideas GitHub Wiki

New DataTypeFeature configuration options (Jackson 2.x)

Related JSTEPs:

  • JSTEP-3 contains/-ed aspects of JsonNodeFeature
  • JSTEP-5 discusses Unification of Date/Time handling

Related Jackson project issues:

Author

Tatu Saloranta (@cowtowncoder)

Version history

  • 2022-01-16: The first draft version
  • 2022-03-20: First parts implemented for 2.14

Background

Over time, various XxxFeature on/off options have proven useful and popular with developers: they are easy to set, change and (for the most part) understand.

The original set of "Features" was configurable at the Streaming API and Databind levels, including:

  • JsonParser.Feature / JsonGenerator.Feature / JsonFactory.Feature for Streaming API
  • MapperFeature / SerializationFeature / DeserializationFeature for databind

For the Streaming API, a further split between format-specific (often JSON-specific) and generic (across all or most formats) features resulted in:

  • JsonParser.Feature split into generic StreamReadFeature and XxxReadFeature (like JsonReadFeature)
  • JsonGenerator.Feature split into generic StreamWriteFeature and XxxWriteFeature (like JsonWriteFeature).

But while this split allowed better support for format-specific features (via the Streaming API), there is no similar mechanism for more granular configuration of datatype-specific features. This has led to the inclusion of some "too [datatype] specific" features at the databind level; for example:

  • DeserializationFeature.READ_ENUMS_USING_TO_STRING
  • DeserializationFeature.READ_DATE_TIMESTAMPS_AS_NANOSECONDS
  • SerializationFeature.WRITE_DATES_AS_TIMESTAMPS

Such configuration goes against the idea that these Features should be cross-cutting across dataformats AND datatypes. It is worth noting, however, that there is a need for such configuration at some other granularity. This led to the idea of "datatype-specific" features, to cover up to 3 initial datatypes:

  1. JsonNode (Tree Model) configuration -- since it differs a lot from POJO configuration, and is difficult to configure
  2. Enum configuration -- a few entries have already leaked into generic features, and a few configuration aspects are missing
  3. Date/Time configuration -- similar to Enums, some configurability exists in general features, as well as via @JsonFormat, but there is a need for more.

Approach

There are a few things to consider with respect to the general idea. For example:

  • Would a single DataTypeFeature enum suffice? Given that there are multiple general "datatypes", each with its own settings, not really.
    • So, multiple concrete Feature enums are needed
    • But we would like to avoid adding all the relevant plumbing for each one: a general-purpose extension mechanism is preferable
  • How similar should configuration interface (API) be compared to, say, DeserializationFeature?
    • Seems like usage should be very similar, perhaps almost identical
    • Possible to implement if we make DataTypeFeature an interface that actual Feature enum implements
  • Do these features need to be per-call, or per-mapper?
    • Since some of the settings should be per-call, it seems necessary to make them ALL per-call
    • This needs to be considered when adding new feature entries: we should avoid adding Features that do not work at per-call scope
  • Should we support pluggable, per-datatype-module way of adding DataTypeFeatures?
    • Ideally that would be great, but coordination of new types seems difficult (depending on mechanism)
    • Would likely lead to worse issues wrt. cross-version (different minor version between Datatype and Core modules)
    • At least initially the plan is NOT to support anything other than DataTypeFeatures defined by Databind module itself
  • Should there be a 3-state definition -- true, false, DEFAULT -- or a 2-state one (true, false)? The former has the benefit of allowing a distinction between the default value and an explicit "true" or "false".
    • Keeping track of 3 states adds some complexity
    • Since we need to consider both existing global defaults and possible per-call overrides, it seems necessary that we can differentiate between explicit and default values.

So, to summarize:

  1. We would have a general DataTypeFeature interface for each actual feature enum (say, JsonNodeFeature) to implement: this interface mostly exists to keep the Databind API simple and does not affect users directly (no functionality to access through it)
  2. There will be multiple Enum subtypes of DataTypeFeature, and more can (and likely will) be added over future Jackson versions
  3. Datatype Modules cannot define additional subtypes: these features will relate to either somewhat abstract/general types (Date/Time), or JDK- (Enum) and Jackson-specific (JsonNode) types
  4. We should keep track of difference between explicitly set true/false vs. default setting
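
The four points above can be sketched in plain Java. This is only an illustrative model, not Jackson's implementation: the `enabledByDefault()` method, the `DataTypeFeatures` holder class, and the `WRITE_LOWERCASE_NAMES` entry are hypothetical names introduced here. Only `DataTypeFeature`, `JsonNodeFeature`, `EnumFeature`, and the two `JsonNodeFeature` entries come from this document.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shape of the shared interface; each feature enum implements it.
interface DataTypeFeature {
    boolean enabledByDefault();
}

enum JsonNodeFeature implements DataTypeFeature {
    READ_NULL_PROPERTIES(true),
    WRITE_NULL_PROPERTIES(true);

    private final boolean defaultState;
    JsonNodeFeature(boolean defaultState) { this.defaultState = defaultState; }
    @Override public boolean enabledByDefault() { return defaultState; }
}

enum EnumFeature implements DataTypeFeature {
    WRITE_LOWERCASE_NAMES(false); // hypothetical entry, for illustration only

    private final boolean defaultState;
    EnumFeature(boolean defaultState) { this.defaultState = defaultState; }
    @Override public boolean enabledByDefault() { return defaultState; }
}

// Immutable holder: only explicitly set states are stored, so "explicit true",
// "explicit false" and "left at default" remain distinguishable.
final class DataTypeFeatures {
    private final Map<DataTypeFeature, Boolean> explicit;

    private DataTypeFeatures(Map<DataTypeFeature, Boolean> explicit) {
        this.explicit = explicit;
    }

    static DataTypeFeatures defaults() {
        return new DataTypeFeatures(new HashMap<>());
    }

    // Returns a new instance, so a per-call override never alters shared settings.
    DataTypeFeatures with(DataTypeFeature feature, boolean state) {
        Map<DataTypeFeature, Boolean> copy = new HashMap<>(explicit);
        copy.put(feature, state);
        return new DataTypeFeatures(copy);
    }

    // null means "not explicitly set".
    Boolean explicitState(DataTypeFeature feature) {
        return explicit.get(feature);
    }

    boolean isEnabled(DataTypeFeature feature) {
        Boolean state = explicit.get(feature);
        return (state == null) ? feature.enabledByDefault() : state;
    }
}
```

In this model a mapper-level instance can be shared freely, and a per-call variant is derived with `with(...)`; a real implementation would more likely use bitmasks than a map, but the three-state behavior would be the same.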

Proposed concrete Features

The initial set of likely DataTypeFeature implementations consists of the following features.

Feature: EnumFeature

This feature seems likely to be implemented for Jackson 2.14; ideas include:

  • DB#3054 -- serialize Enum values lower-cased (if using Enum.name())
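
As a rough illustration of the behavior requested in DB#3054, the following stand-alone sketch lower-cases `Enum.name()` when a flag is set. The `EnumNames` class, the `serializedName` method and the `Color` enum are hypothetical and exist only for this example; they are not part of Jackson.

```java
import java.util.Locale;

final class EnumNames {
    enum Color { RED, DARK_BLUE }

    // Returns the name to write out; the flag stands in for a would-be EnumFeature entry.
    static String serializedName(Enum<?> value, boolean lowerCase) {
        String name = value.name();
        // Locale.ROOT avoids locale-specific case mappings (such as the Turkish dotless i).
        return lowerCase ? name.toLowerCase(Locale.ROOT) : name;
    }
}
```

Since the idea applies only when `Enum.name()` is used, a serializer that relies on `toString()` or on an annotated value would presumably leave the output untouched.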

Feature: JsonNodeFeature

This feature is cleaved off of JSTEP-3 and is the one closest to actual implementation.

Implemented features are:

  • Null skipping:
    • For ObjectNode properties:
      • READ_NULL_PROPERTIES (default: true) -- are null valued properties represented as NullNodes in resulting ObjectNode or skipped? DB#3421
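
The read-side semantics can be modeled without Jackson, using a `Map` as a stand-in for ObjectNode and a Java `null` as a stand-in for NullNode. The `NullPropertyReading` class and its `read` method are hypothetical names for this sketch only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class NullPropertyReading {
    // Copies parsed properties into the result, retaining or skipping null-valued
    // ones depending on the flag (which mirrors READ_NULL_PROPERTIES).
    static Map<String, Object> read(Map<String, Object> parsed, boolean readNullProperties) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : parsed.entrySet()) {
            if (entry.getValue() == null && !readNullProperties) {
                continue; // skipped: the property does not appear in the result at all
            }
            result.put(entry.getKey(), entry.getValue());
        }
        return result;
    }
}
```

With the flag enabled (the default), a null-valued property stays present in the result; with it disabled, the result looks the same as if the property had been absent from the input.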

Features proposed for implementation:

  • Null skipping:
    • For ObjectNode properties:
      • WRITE_NULL_PROPERTIES (default: true) -- are null valued fields in ObjectNode written out as JSON or skipped?
    • For ArrayNode elements:
      • READ_NULL_ELEMENTS (default: true) -- are null elements in JSON arrays represented as NullNodes in resulting ArrayNode or skipped?
      • WRITE_NULL_ELEMENTS (default: true) -- are null elements in ArrayNode written out as JSON or skipped?
  • Property sorting (Similar to MapperFeature.SORT_PROPERTIES_ALPHABETICALLY, but for ObjectNodes)
    • SORT_KEYS_ON_READ (default: false) - use TreeMap for ObjectNode when reading
    • SORT_KEYS_ON_WRITE (default: false) - dynamically check for SortedMap, construct intermediate TreeMap
    • Could force use of TreeMap on reads; on writes, probably need to determine dynamically
  • CONVERT_POJOS_FULLY (default: ?) -- when converting values (mapper.convertValue(), treeToValue()), are opaque values contained in POJONode:
    1. Serialized explicitly by matching serializer (true) -- which will essentially transform it into non-opaque value (like Map, List etc)
    2. Written out as "writePOJO", which in case of TokenBuffer will retain opaque value exactly as is
  • Coercion/Leniency: while 2.12 added "Coercion Config", it does not work well with JsonNode. So how about:
    • LENIENT_SCALAR_CONVERSION (default: false?): do we allow more speculative coercion, like boolean from int (zero -> false, otherwise true)
    • or, maybe better yet: ALLOW_LENIENT_NUMBERS, ALLOW_LENIENT_BOOLEANS etc? We can afford granularity
  • Numeric, floating-point/decimal
    • TRIM_DECIMAL_TRAILING_ZEROES (default: true in 2.x; false for 3.x?): do we force dropping of trailing zeroes for BigDecimal valued nodes?
      • Note: hard-coded in JsonNodeFactory currently, should not be
    • Something about forcing use of BigDecimal? There is DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS already, but should we have something separate?
  • Merging: instead of relying on configOverrides, allow preventing merging
    • ALLOW_ARRAY_MERGE (default: true)
    • ALLOW_OBJECT_MERGE (default: true)
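
Two of the proposals above map onto plain JDK behavior, which the following sketch demonstrates. The `NodeFeatureSketches` class and its method names are hypothetical; the boolean parameters stand in for TRIM_DECIMAL_TRAILING_ZEROES and SORT_KEYS_ON_WRITE, and a `Map` stands in for the properties of an ObjectNode.

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class NodeFeatureSketches {
    // TRIM_DECIMAL_TRAILING_ZEROES: drop trailing zeroes only when asked to.
    static BigDecimal normalize(BigDecimal value, boolean trimTrailingZeroes) {
        return trimTrailingZeroes ? value.stripTrailingZeros() : value;
    }

    // SORT_KEYS_ON_WRITE: reuse an already sorted map, otherwise build an
    // intermediate TreeMap just for the write.
    static <V> Map<String, V> keysForWrite(Map<String, V> properties, boolean sortKeys) {
        if (!sortKeys || properties instanceof SortedMap) {
            return properties;
        }
        return new TreeMap<>(properties);
    }
}
```

Note that `stripTrailingZeros()` changes the scale, so `new BigDecimal("100")` becomes `1E+2` when printed with `toString()`; `toPlainString()` still yields `100`. This kind of visible change is one reason a trimming default may deserve a switch.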

Feature: DateTimeFeature

Last but not least, this feature would control default settings for the following Date/Time types:

  1. "Classic"/Legacy JDK types -- java.util.Date, java.util.Calendar
  2. Java 8 Date/Time API
  3. Joda Date/Time

Note that JSTEP-5 discusses aspects of Date/Time handling unification and may overlap here.

It seems unlikely that this feature will be added in time for 2.14.