Missing START_OBJECT token in complex element starting with text #442

Closed
richardsonwk opened this issue Dec 14, 2020 · 2 comments

richardsonwk commented Dec 14, 2020

In version 2.12.0, I think XmlTokenStream might be omitting START_OBJECT tokens when a complex element starts with text content. The following has two START_OBJECT tokens but three END_OBJECT tokens:

        final String example = String.join("\n", Arrays.asList(
            "<SomeXml>", // START_OBJECT
            "  <ParentElement>", // *Missing* START_OBJECT
            "     text",
            "     <ChildElement someAttribute=\"value\"/>", // START_OBJECT/END_OBJECT
            "     further text",
            "  </ParentElement>", // END_OBJECT
            "</SomeXml>")); // END_OBJECT

        final XmlFactory factory = new XmlFactory();
        final JsonParser parser = factory.createParser(example);

        while (parser.nextToken() != null) {
            System.out.println(parser.currentToken()
                + (parser.getValueAsString() != null ? "(" + parser.getValueAsString() + ")" : ""));
        }

The output is

START_OBJECT
FIELD_NAME(ParentElement)
FIELD_NAME()
VALUE_STRING(
     text
     )
FIELD_NAME(ChildElement)
START_OBJECT
FIELD_NAME(someAttribute)
VALUE_STRING(value)
END_OBJECT
FIELD_NAME()
VALUE_STRING(
     further text
  )
END_OBJECT
END_OBJECT

Maybe there's a point of configuration I'm missing?
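
The imbalance in the output above can be checked mechanically. Below is a minimal sketch that works on plain token-type names copied from that output rather than on a live parser; the class `TokenBalance` and the method `firstUnmatchedEnd` are hypothetical names used only for illustration and are not part of Jackson.

```java
import java.util.Arrays;
import java.util.List;

public class TokenBalance {
    /**
     * Returns the index of the first END_OBJECT that has no matching
     * START_OBJECT, or -1 if the sequence never closes more than it opens.
     */
    public static int firstUnmatchedEnd(List<String> tokens) {
        int depth = 0;
        for (int i = 0; i < tokens.size(); i++) {
            String type = tokens.get(i);
            if (type.equals("START_OBJECT")) {
                depth++;
            } else if (type.equals("END_OBJECT")) {
                if (depth == 0) {
                    return i; // closing an object that was never opened
                }
                depth--;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Token types in the order they appear in the reported output.
        List<String> reported = Arrays.asList(
            "START_OBJECT", "FIELD_NAME", "FIELD_NAME", "VALUE_STRING",
            "FIELD_NAME", "START_OBJECT", "FIELD_NAME", "VALUE_STRING", "END_OBJECT",
            "FIELD_NAME", "VALUE_STRING", "END_OBJECT", "END_OBJECT");
        System.out.println(firstUnmatchedEnd(reported)); // prints 12
    }
}
```

The final END_OBJECT (index 12) is the one left without a partner, which is consistent with a START_OBJECT being skipped after `FIELD_NAME(ParentElement)`.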

@cowtowncoder
Member

It does sound like a potential flaw in the token stream, yes, probably related to the 2.12 change to expose mixed content (see #405). The underlying stream is not configurable, so ideally this could be fixed without breaking any existing functionality.

@cowtowncoder cowtowncoder added 2.12 and removed 2.12 labels Dec 16, 2020
@cowtowncoder cowtowncoder added this to the 2.13.0 milestone Mar 30, 2021
@cowtowncoder
Member

Finally had a solid reason to fix this: the recent un-recursing of JsonNode deserialization made it a necessity (in the past, that code handled the "orphan FIELD_NAME" case by implicitly assuming a leading START_OBJECT; it no longer does).
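
To illustrate what "implicit assumption of leading START_OBJECT" can mean in practice, here is a rough sketch of a recursive reader that tolerates an orphan FIELD_NAME. It is purely illustrative and is not Jackson's JsonNode deserializer: `LenientReader`, `read`, and the `{type, text}` string-pair token representation are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LenientReader {
    // Each token is a {type, text} pair; text is null for structural tokens.
    private final List<String[]> tokens;
    private int pos = 0;

    private LenientReader(List<String[]> tokens) {
        this.tokens = tokens;
    }

    /** Reads one value (a String or a nested Map) from the token list. */
    public static Object read(List<String[]> tokens) {
        return new LenientReader(tokens).readValue();
    }

    private Object readValue() {
        String[] token = tokens.get(pos);
        switch (token[0]) {
            case "START_OBJECT":
                pos++; // consume the explicit START_OBJECT
                return readFields();
            case "FIELD_NAME":
                // Orphan FIELD_NAME where a value was expected:
                // behave as if a START_OBJECT had preceded it.
                return readFields();
            default: // VALUE_STRING
                pos++;
                return token[1];
        }
    }

    private Map<String, Object> readFields() {
        Map<String, Object> object = new LinkedHashMap<>();
        while (!tokens.get(pos)[0].equals("END_OBJECT")) {
            String name = tokens.get(pos++)[1]; // FIELD_NAME
            object.put(name, readValue());      // repeated names overwrite
        }
        pos++; // consume END_OBJECT
        return object;
    }

    public static void main(String[] args) {
        // The token sequence from the report, with text values trimmed.
        List<String[]> reported = Arrays.asList(
            new String[] {"START_OBJECT", null},
            new String[] {"FIELD_NAME", "ParentElement"},
            new String[] {"FIELD_NAME", ""},
            new String[] {"VALUE_STRING", "text"},
            new String[] {"FIELD_NAME", "ChildElement"},
            new String[] {"START_OBJECT", null},
            new String[] {"FIELD_NAME", "someAttribute"},
            new String[] {"VALUE_STRING", "value"},
            new String[] {"END_OBJECT", null},
            new String[] {"FIELD_NAME", ""},
            new String[] {"VALUE_STRING", "further text"},
            new String[] {"END_OBJECT", null},
            new String[] {"END_OBJECT", null});
        System.out.println(read(reported));
    }
}
```

A recursive reader like this absorbs the missing token because the extra END_OBJECT simply closes the implicitly opened object. A reader that tracks nesting explicitly has no such slack, which is presumably why the stream itself needed fixing.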
