Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deserializing ZonedDateTime with basic TZ offset notation (0000) #131

Closed
noway1979 opened this issue Aug 23, 2019 · 18 comments
Closed

Deserializing ZonedDateTime with basic TZ offset notation (0000) #131

noway1979 opened this issue Aug 23, 2019 · 18 comments
Milestone

Comments

@noway1979
Copy link

I stumbled upon deserialization issues when the given string equals to ISO8601 "basic" format.

2019-08-22T12:36:46.361+0000

Exception message:

com.fasterxml.jackson.databind.exc.InvalidFormatException: Cannot deserialize value of type `java.time.ZonedDateTime` from String "2019-08-22T12:36:46.361+0000": Failed to deserialize java.time.ZonedDateTime: (java.time.format.DateTimeParseException) Text '2019-08-22T12:36:46.361+0000' could not be parsed at index 23
 at [Source: (String)""2019-08-22T12:36:46.361+0000""; line: 1, column: 1]

This format misses out a colon in offset in comparison to extended format (as described at: https://bugs.openjdk.java.net/browse/JDK-8176547).

2019-08-22T12:36:46.361+00:00

The issue states that java.time.format.DateTimeFormatter.ISO_ZONED_DATE_TIME does not support this basic format.

However, as this format is provided by Spring exception handler I was wondering if there exists any workaround to successfully parse this format. Maybe I am simply missing the obvious here?

As current workaround I had to register a custom InstantDeserializer and set com.fasterxml.jackson.datatype.jsr310.deser.InstantDeserializer.replaceZeroOffsetAsZ to true.
Any hint is very much appreciated.

Reproduce

		ZonedDateTime zdt1 = ZonedDateTime.now();
		ObjectMapper mapper = new ObjectMapper()                                         
                                        .registerModule(new ParameterNamesModule())
					.registerModule(new Jdk8Module())
                                        .registerModule(new JavaTimeModule());

		//happy path: serialize and deserialize
		String zdt1S = mapper.writeValueAsString(zdt1);
		ZonedDateTime zdt1R = mapper.readValue(zdt1S, ZonedDateTime.class);
		assertThat(zdt1R, notNullValue());
		gLogger.info("zdt1S: " + zdt1R);

		//ZonedDateTime with basic formatted offset leads to exception
		String basicTS = "\"2019-08-22T12:36:46.361+0000\"";
		ZonedDateTime zdt2 = mapper.readValue(basicTS, ZonedDateTime.class);
		gLogger.info("zdt2S: " + zdt2);
@kupci
Copy link
Member

kupci commented Dec 4, 2019

However, as this format is provided by Spring exception handler

Note to check to see what Spring is doing here, i.e. is this a Spring bug, or a different format, but still valid, because as described in the issue 1624, the colon is correct for ISO 8601, extended format:

“:” (colon) | the “:” colon character, in extended format, separates the time scale components for “hour” and “minute”, and “minute” and “second”.
https://www.iso.org/obp/ui#iso:std:iso:8601:-1:ed-1:v1:en

@oeystein
Copy link
Contributor

oeystein commented Mar 4, 2021

This is an older post but any recommendations on best practices here? I found one of my unit tests failing after refactoring from using Joda datetime to the Java 8 libraries, it's honestly quite annoying how they only support the extended 8601 format as mentioned above. All the purposed workarounds i've managed to google myself forward to have been quite honestly nasty.

@kupci
Copy link
Member

kupci commented Mar 4, 2021

I don't think we have best practice for this yet (and there are a few other cases like this, with Spring), though how about the following workaround?

As current workaround I had to register a custom InstantDeserializer and set com.fasterxml.jackson.datatype.jsr310.deser.InstantDeserializer.replaceZeroOffsetAsZ to true.

Also would be curious what other workarounds you found, just to get some ideas on what people are doing.

@oeystein
Copy link
Contributor

oeystein commented Mar 5, 2021

Unfortunately that's only a fix that works with UTC time strings. From what i can see on the jsr310 InstantDeserializer.java class, its hardcoded to check for "+0000" specifically, any timezone other then that it will ignore and thus fail. It can technically updated to be a bit more hardened towards other timezones, i guess that could be a fair fix?

Other fixes i have looked at usually includes Using DateTimeFormater to restructure the date string, but as far as i can tell, there is not a generic solution as most of them base themselves on parsing one specific ISO 8601 format. Since i unfortunately have an "open" API with many users i'd rather not have to specify which iso 8601 format they should use.

@cowtowncoder
Copy link
Member

@oeystein I hope I am not omitting some important piece of context, but to me a work-around to allow supporting "colon-less" offsets for things other than "+0000" does sound like a good addition, and PR would be welcome. If you (or anyone else) would want to try it, branch to base would be 2.13, I think, so we would get bit better verification via release candidates (compared to patch releases for 2.12.x).

@kupci
Copy link
Member

kupci commented Mar 8, 2021

I have some time to create a PR, given we have @oeystein to hopefully help make sure this works from the Spring side of things.

A few questions on implementation
(1) should this have it's own flag, like the replaceZeroOffsetAsZ? Caveat: we are trying to avoid adding more flags of course. Maybe it can be accepted under the isLenient() feature, or does it even need that, and instead should always be accepted?
(2) I see two general ways to fix
a. Create an entirely new DateTimeFormat, that accepts the colon and "colon-less" variety, as described here: Java DateTimeFormatter for time zone with an optional colon separator?
b. Similar to how the String value is modified for the replaceZeroOffsetAsZ case, use a regex and insert the colon so the parse doesn't fail, in the case where it is basic format (no colon)?

Thinking "b" to be more resilient to future changes to the DateTimeFormatter.ISO_ZONED_DATE_TIME (unlikely, but who knows), and the rest of the code, and thus just change it for this particular scenario.

@cowtowncoder
Copy link
Member

I feel that for default format, and for 2.13(.0), it'd be fine to just accept both. Users who want stricter handling / more control could specify DateTimeFormat to their liking.
For extra points we can of course consider default to be "lenient" (that is, there is "not defined [defaulting to lenient] and explicit "lenient", "strict") which should work pretty much that way BUT also allow users easier way to prevent accepting this setting.
Either way I think defaulting to what amounts to likely working better makes sense, given that there seems to be wide-spread confusion on colon/no-colon usage.

I do not have strong preference wrt 2a/2b, going with 2b works for me.

@oeystein
Copy link
Contributor

oeystein commented Mar 9, 2021

Apologizes for the late reply from my part, i had to use the hours i had in the weekend to get an overview of the library before proposing a solution.

I think it would be smart to make it a default setting, I've seen both versions used depending on the source and it took me some reading to figure out why one was accepted and the other wasn't, especially being used to the joda module from before. But this obviously depends on the policies you guys have for the library and how it should behave.

As for question number two, the example of the DateTimeFormatter unfortunately breaks one format which includes the physical timezone, for example "2016-10-27T16:36:08.993+02:00:00[Europe/Paris]".

The DateTimeFormatter.ISO_ZONED_DATE_TIME as you brought up currently includes this. We could obviously could just alter the proposed DateTimeFormatter format to also include the Timezone location, but as you mentioned previously resiliency against changes is a huge plus in the book.

This also brings up another point. Any changes should probably both be done ZonedDateTime and OffsetDateTime. The format without a physical timezone is more accurate to be used with OffsetDateTime, however ZonedDateTime accepts it as well so i see no point of limiting this in any way.

@kupci
Copy link
Member

kupci commented Mar 9, 2021

Any changes should probably both be done ZonedDateTime and OffsetDateTime.

That is a good point, in fact there's a bug recorded for this, #38. And I see though there's an interesting solution for that bug, it also has the UTC limitation you mentioned above.

Since you have some good ideas on this, would you like to work on the PR? We can assist with code reviews of course. (I think I should've waited for your reply above.)

@oeystein
Copy link
Contributor

oeystein commented Mar 9, 2021

Since you have some good ideas on this, would you like to work on the PR? We can assist with code reviews of course. (I think I should've waited for your reply above.)

Sure, i can have a look on it. That thread adds a lot of additional interesting information so i'll see if i can manage to pull something from it and come up with a solution that seems solid.

@cowtowncoder
Copy link
Member

Finally found time for one last check & merged. Thank you for contributing this @oeystein!

One last question here: do we want to allow skipping the check if "strict" mode is enabled? That's just one line guard statement, like:

if (isLenient()) {
...
}

in new addInColonToOffsetIfMissing() method.

But I am not sure if it makes sense to add or not: default is "lenient", for what that is worth; and this check is only used if no custom pattern is specified.

@oeystein
Copy link
Contributor

oeystein commented Apr 2, 2021

Thanks for the merge @cowtowncoder! This was an issue that really started to bother me, so glad to have it fixed. And obviously always happy to contribute to a project that i have used for 5+ years.

And re: lenient vs. strict - I personally have no strong feeling either way. Meanwhile having it be bypassed on a strict setting, i feel personally that as long as a pattern can be specified manually, and someone only wants to use a truly strict pattern they can already do so in a proper way rather then taking a guesswork on what the library will do.

I also took a brief look on the code base to see where it was used before, and found this code comment from you last year, so if it's a legacy setting it really should not be used further regardless?

// 21-Jun-2020, tatu: As of 2.12, leniency considered legacy setting, // but still supported. if (!isLenient()) { return _failForNotLenient(p, ctxt, JsonToken.VALUE_STRING); }

@cowtowncoder
Copy link
Member

@oeystein That comment is bit incomplete, and should perhaps be clarified. The scope of strict/lenient setting has been limited to concern value ranges -- like, say, February 30 (lenient? Ok - strict? fail) -- but not covering type/shape coercions, since for those there is new CoercionConfig setting.
Originally I thought that maybe leniency could cover those cases too, but I think it is bit too coarse on/off setting, and then came up with CoercionConfig approach to handle that aspect in a more flexible way.

On using leniency here (vs not): yes, we can leave it out. If there is desire to use it, it can be easily added.

And there is still time to consider what to do with all of these for Jackson 3.0...

@PikaBlue107
Copy link

Pardon my commenting on a previously closed issue. I'm having this problem myself and noticed something off in the discussion:

Isn't the problematic character (idx. 23) the '+', and not the missing colon?

com.fasterxml.jackson.databind.exc.InvalidFormatException: Cannot deserialize value of type `java.time.ZonedDateTime` from String "2019-08-22T12:36:46.361+0000": Failed to deserialize java.time.ZonedDateTime: (java.time.format.DateTimeParseException) Text '2019-08-22T12:36:46.361+0000' could not be parsed at index 23

Label:  0         1         2
        0123456789012345678901234567
String: 2019-08-22T12:36:46.361+0000
                               ^
                               |

Please correct me if I'm mistaken, I could very easily be misunderstanding something.

@kupci
Copy link
Member

kupci commented Jun 8, 2021

I'm not a 100% sure on why the Java error has "could not be parsed at index 23", because you are correct, that is the '+' symbol, however I think it is that it is indicating it has trouble with that entire block, i.e. the time scale components.

What version of Jackson are you using? I see this is for ZonedDateTime, and I believe this was fixed, though in a relatively new version of Jackson (2.13) so suggestion is to try the newer release if not already. In case not, please send us a test case and we can reopen the bug.

[update] Ah, just checking and I don't think this fix has been included yet in the latest available releases which is 2.12, so likely the issue. You can subscribe to the mailing lists, which have the notifications about the new releases/release candidates, and I know it is always helpful to have a few folks testing the newer features out.

@cowtowncoder cowtowncoder added this to the 2.13.0 milestone Jun 8, 2021
@cowtowncoder
Copy link
Member

Yes, @kupci is correct. I added label and and milestone to indicate that this will be addressed in upcoming 2.13 (see https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.13).

@oeystein
Copy link
Contributor

oeystein commented Jun 8, 2021

Pardon my commenting on a previously closed issue. I'm having this problem myself and noticed something off in the discussion:

Isn't the problematic character (idx. 23) the '+', and not the missing colon?

Please correct me if I'm mistaken, I could very easily be misunderstanding something.

This is an error which is passed on from the ZonedDateTime.parse() method; which Jackson utilizes for the actual parsing. I'm not sure in what scenario that it would report the error at the incorrect place but my guess it could have something to do with java version and/or implementation.

I tested it briefly on my setup with openjdk-11, and for me it reports the error correctly at index 26.

@PikaBlue107
Copy link

PikaBlue107 commented Jun 8, 2021

Unfortunately I'm having trouble reproducing the exact bug listed in this post, and that I was (supposedly) experiencing earlier. If I run the code in the OP's initial post (changing only the log / printout and the assertion for notNull), I get the following output:

com.fasterxml.jackson.databind.exc.InvalidFormatException: Cannot deserialize value of type `java.time.ZonedDateTime` from String "2019-08-22T12:36:46.361+0000": Failed to deserialize java.time.ZonedDateTime: (java.time.format.DateTimeParseException) Text '2019-08-22T12:36:46.361+0000' could not be parsed, unparsed text found at index 26

Label:  0         1         2
        0123456789012345678901234567
String: 2019-08-22T12:36:46.361+0000
                                  ^
                                  |

which seems to be the "correct" location reported for this error.
I've only been able to reproduce similar errors from my own deserialization, as well as directly trying ZonedDateTime.parse(dateText);

I can confirm that I'm using Jackson version 2.12.3, and Java version adoptopenjdk11.
Edit: I realize now that I'm on the "Jackson for Java 8" repository, so perhaps this isn't all that helpful 😅

Sidenote: I was also able to work around the problem by deserializing the entire model object containing the ZonedDateTime under test, making use of a custom @JsonDeserializer that my organization provided for me. I'll definitely still do anything I can to continue helping w/ any lingering issues here, though!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants