Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JsonParser.getNumberTypeFP() #1149

Closed
cowtowncoder opened this issue Nov 29, 2023 · 3 comments
Closed

Add JsonParser.getNumberTypeFP() #1149

cowtowncoder opened this issue Nov 29, 2023 · 3 comments
Labels
2.17 Issues planned (at earliest) for 2.17

Comments

@cowtowncoder
Copy link
Member

cowtowncoder commented Nov 29, 2023

Currently JsonParser has method getNumberType(), with semantics that are loose for many textual formats.
Basically formats like JSON do not have similar types as programming languages: so while we have separate NumberType entries representing float (32-bit binary FP), double (64-bit "double"precision binary FP) and BigDecimal (unlimited-precision, decimal FP), there is no efficient mechanism to actually produce correct NumberType for floating-point values.
Because of this, basically all FP values claim to be of NumberType.DOUBLE for such formats.
This can be problematic if values are converted first to double, then to BigDecimal, since former cannot accurately represent all decimal numbers.

However, binary formats often have specific storage representations that can provide this type information.

The problem comes when converting to Java types: both java.lang.Number (or generally java.lang.Object) and JsonNode.
In this case we would ideally use either:

  1. Exact type if known (binary formats) OR
  2. Well-known type -- Double OR BigDecimal, based on configuration
  3. In some edge cases (not-a-number aka NaN), Double as that can represent such values.

(further complicating things, we also have secondary means of producing NaN values: value overflow for double (and theoretically, but not practically, float) which can produce +INFINITY)

Given all above confusion, I think we need a new method like getNumberTypeFP() -- with matching enum NumberTypeFP (to be able to express value UNKNOWN, no need for integer types).
That will allow deserializers to know if "true number type", and base logic on that, specifically avoiding conversions in case of Binary formats and allowing their use for Textual formats (or in general formats without explicit type information for FP numbers).

EDIT: originally thought we'd need getNumberTypeExplicit(), but since the need is specifically for floating-point numbers, let's call it getNumberTypeFP() instead; no need for non-FP types. And can indicate semantics are for strict/explicit type.

@cowtowncoder cowtowncoder added the 2.17 Issues planned (at earliest) for 2.17 label Nov 29, 2023
@cowtowncoder
Copy link
Member Author

/cc @pjfanning This with #1137 might let us solve the problems: by using combination of isNan() (now reliable) and getNumberTypeExplicit() (which will not accidentally coerce type to DOUBLE) we should avoid most problems.
There may still be some complications wrt "Double value overflow".

@cowtowncoder
Copy link
Member Author

Initial part for JsonParser itself implemented; as well as support for parser created from TokenBuffer (of jackson-databind). Next steps: add overrides by binary format parsers; then make "untyped" and JsonNode deserializers use.

@cowtowncoder
Copy link
Member Author

One concern: FasterXML/jackson-databind#4410 -- for format backends outside FasterXML there's backwards-compatibility issue. So I'll see if there's a way to improve default handling for getNumberTypeFP().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.17 Issues planned (at earliest) for 2.17
Projects
None yet
Development

No branches or pull requests

1 participant