Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ObjectMapper.readTree() methods do not return null on end-of-input #1406

Closed
fabriziocucci opened this issue Oct 8, 2016 · 8 comments
Closed
Milestone

Comments

@fabriziocucci
Copy link

Hi,

the Javadoc for the readTree(InputStream) method goes like this:

If a low-level I/O problem (missing input, network error) occurs, a IOException will be thrown. If a parsing problem occurs (invalid JSON), JsonParseException will be thrown. If no content is found from input (end-of-input), Java null will be returned.

But if I try to use it with an empty file (through the FileInputStream) I get the following exception:

com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input

I was wondering whether it is the code not to be aligned with the expected behavior or it is just the Javadoc.

Ideally, from a consumer prospective, it would have been nice to return a sort of "identity" JsonNode (e.g. an empty JsonNodeType.OBJECT or a JsonNodeType.MISSING) in case of empty file, stream, etc. But I can imagine that such a change (if it even makes sense) could cause some backward compatibility issue.

@cowtowncoder
Copy link
Member

@fabriziocucci Thank you for reporting this. My first thought here was that the documentation is wrong, since all similar read methods do this. But I also know that there is a difference in that whereas regular data-binding is limited to just null, Tree-variant could choose to return NullNode for JSON null, and Java null for missing content. So in a way both options are possible...

I will have to think about this bit more.

In the meantime, if you do need to allow for optional content, you may choose to use ObjectReader.readValues() as sometimes that does allow iteration over multiple root-level values; including cases of no content and just one value (i.e. not necessary to actually have more than 0).

@fabriziocucci
Copy link
Author

fabriziocucci commented Oct 9, 2016

@cowtowncoder thanks for your prompt reply.

I see your point but, probably, using readValues in my specific use case would be a bit misleading. In my use case, each InputStream should be the source of a single json/yaml root object which represents a configuration. Empty files should be treated as empty configurations.

If I'm not wrong, based on your suggestion, I would need to do something like this:

public static JsonNode fromInputStream(InputStream inputStream) {
    try {
        MappingIterator<JsonNode> configurations = YAML_OBJECT_MAPPER.reader(JsonNode.class).readValues(inputStream);
        if (configurations.hasNext()) {
            JsonNode configuration = configurations.next();
            if (configurations.hasNext()) {
                throw new IllegalStateException("Found multiple configurations!");
            } else {
                return configuration;
            }
        } else {
            return emptyConfiguration();
        }
    } catch (Exception exception) {
        throw new IllegalStateException(exception);
    }
}

which probably does the trick but...it looks a bit messy, doesn't it?

I was currently doing something like this:

public static JsonNode fromInputStream(InputStream inputStream) {
    try {
        return (inputStream.available() > 0) ?
                YAML_OBJECT_MAPPER.readTree(inputStream) :
                emptyConfiguration();
    } catch (Exception exception) {
        throw new IllegalStateException(exception);
    }
}

which is not good either because highly relies on how the InputStream subclass overrides the available method. In case of FileInputStream it seems to work just fine.

By the way, I'm asking because I'm kind of having fun with this library and I'm sure your insights will be more than helpful!

@cowtowncoder
Copy link
Member

I don't see why use of readValues() would have any specific meanings. I just mean that it would actually work, and do what you want -- it may seem odd to whoever reads the code, but that's what comments are for; or, creating a helper method that explains the logic similarly.

However: if you prefer alternate method, you could also create a YAMLParser via YAMLFactory, try to see if nextToken() returns null (end of content) or a token; and only call mapper.readValue(parser, type) in latter case. That is cleaner in a way.

@fabriziocucci
Copy link
Author

it may seem odd to whoever reads the code

Exactly, that was my point and yes, I prefer the second solution for now, thanks a lot for the suggestion! 😉

But while I was playing with the JsonParser API, I noticed something else (sorry, bear with me).

We said that the following line:

YAML_OBJECT_MAPPER.readTree(inputStream);

correctly throws in case of empty streams and its Javadoc will be aligned to reflect this.

But the following line:

YAML_OBJECT_MAPPER.getFactory().createParser(inputStream).readValueAsTree();

actually returns null in case of empty streams and its Javadoc is aligned with this!

Now my question is: shouldn't the two API be aligned?

@fabriziocucci
Copy link
Author

I was curious to see why that happens and I ended up swimming in debug mode! 😅

While the second option (i.e. JsonParser#readValueAsTree) moves straight to the ObjectMapper#readTree method:

public <T extends TreeNode> T readTree(JsonParser p)
    throws IOException, JsonProcessingException
{
    /* 02-Mar-2009, tatu: One twist; deserialization provider
     *   will map JSON null straight into Java null. But what
     *   we want to return is the "null node" instead.
     */
    /* 05-Aug-2011, tatu: Also, must check for EOF here before
     *   calling readValue(), since that'll choke on it otherwise
     */
    DeserializationConfig cfg = getDeserializationConfig();
    JsonToken t = p.getCurrentToken();
    if (t == null) {
        t = p.nextToken();
        if (t == null) {
            return null;
        }
    }
    JsonNode n = (JsonNode) _readValue(cfg, p, JSON_NODE_TYPE);
    if (n == null) {
        n = getNodeFactory().nullNode();
    }
    @SuppressWarnings("unchecked")
    T result = (T) n;
    return result;
}

the first one (i.e. ObjectMapper#readTree) stops by the ObjectMapper#_initForReading method:

protected JsonToken _initForReading(JsonParser p) throws IOException
{
    _deserializationConfig.initialize(p); // since 2.5

    /* First: must point to a token; if not pointing to one, advance.
     * This occurs before first read from JsonParser, as well as
     * after clearing of current token.
     */
    JsonToken t = p.getCurrentToken();
    if (t == null) {
        // and then we must get something...
        t = p.nextToken();
        if (t == null) {
            // Throw mapping exception, since it's failure to map,
            //   not an actual parsing problem
            throw JsonMappingException.from(p, "No content to map due to end-of-input");
        }
    }
    return t;
}

Now, it may be perfectly OK to have different behaviors but, as a user, I'm a bit surprised to discover this difference.

@cowtowncoder
Copy link
Member

@fabriziocucci at this point, I am sorry I ever added various JsonParser.readXxx() methods... they are problematic at many levels. Ideally behavior would be aligned, but given that they are not things get more complex since any change is, at least theoretically, breaking change. And more often than not someone somewhere observes a breakage. :)

Now. I am beginning to think that perhaps documentation-indicated behavior does make sense to support. Returning null is not ambiguous; and it seems unlikely anyone is actually relying on exception being thrown. And so while it does make behavior between readTree() and readValue()-with-JsonNode different, it seems like a lesser of evils.

@fabriziocucci
Copy link
Author

@cowtowncoder I'm sorry if my obsession for consistency gave you some headache 😅 but I understand how hard it is changing at this stage!

Although my preference is always for returning an empty JsonNode, I agree with you that returning null in this case is probably better then throwing an exception! 👍

@cowtowncoder cowtowncoder changed the title readTree(InputStream in) behavior not aligned with its contract ObjectMapper.readTree() methods do not return null on end-of-input Oct 28, 2016
@cowtowncoder cowtowncoder added this to the 2.9.0 milestone Oct 28, 2016
@cowtowncoder
Copy link
Member

Note to curious readers: there was yet more wailing and gnashing of teeth, for... #2211 was found.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants