Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow serialization of InetAddress as simple numeric host address #1605

Closed
2is10 opened this issue Apr 15, 2017 · 8 comments
Closed

Allow serialization of InetAddress as simple numeric host address #1605

2is10 opened this issue Apr 15, 2017 · 8 comments
Milestone

Comments

@2is10
Copy link

2is10 commented Apr 15, 2017

An InetAddress is defined this way in its Javadoc:

This class represents an Internet Protocol (IP) address.
An instance ... consists of an IP address and possibly its corresponding host name.

The IP address is central. The hostname is optional, peripheral.

However, Jackson’s InetAddressSerializer just uses the hostname, if present, and discards the IP address. This is critical information loss. A hostname isn’t guaranteed to be resolvable to the same IP address (or at all) where/when deserialized.

A corresponding issue with the deserializer is that hostname resolution (network access) is attempted for any input string that isn’t a valid IP address. This violates the Principle of Least Astonishment.

These seem like important issues to fix, but I’m not sure how to do it in a backwards-compatible way. The JDK’s InetAddress is partly to blame, the minefield that it is.

I suggest taking inspiration from Guava’s InetAddresses and doing some of your own parsing.

You’ll need to decide whether to serialize to just the IP address or to try to also preserve the hostname, if present. For what it’s worth, my team’s present use cases only care about the IP address. However, any information loss could be surprising to people who use this library.

@cowtowncoder
Copy link
Member

Thank you reporting this. I agree in that handling is sub-optimal, but I am not quite sure what to do about that at this point. I do not really look forward to having to add elaborate handling.
One option would be to add default (de)serializers that just throw an exception, to force users to provide handler of their own, but that would not work well wrt backwards compatibility.

@2is10
Copy link
Author

2is10 commented Apr 18, 2017

What level of backwards compatibility is required for a minor version like 2.9? What about 3.0? If we can agree on a strategy, I’d be happy to prepare a pull request.

A simple improvement would be to always serialize to the address—never the hostname. This would be a net reduction in code and would eliminate the possibility of network access upon deserialization for JSON produced using this library, at least.

If hostname preservation is important, we could serialize to the InetAddress.toString() format ("hostname/address") if a hostname is known. The deserializer could handle the / delimiter if present without much code, and otherwise behave as it does today.

@cowtowncoder
Copy link
Member

@2is10 I assume that the critical think would be to support deserialization from both textual and numeric representations. Serialization compatibility is probably little bit less critical.
One possibility regarding serialization would be to allow configuration using @JsonFormat.shape, since use of, say, Shape.NUMBER (or Shape.ARRAY) vs Shape.STRING could be used to allow switching between serializing host name, for those who prefer it despite problems there are.
I like the simple idea of just serializing address, regardless; good point about performance ramifications as well.
Compatibility wrt 2.9 vs 3.0 is interesting: we could choose to honor old defaults for 2.9 and change in 3.0, as an example.

If you are interesting in tackling this, I'd be happy to help -- I am bit overloaded with issues in general, but can help with reviewing and getting things merged. This change could still make it in 2.9: I am planning to do pr3 soon, but that does not constitute feature freeze.

2is10 added a commit to 2is10/jackson-databind that referenced this issue Apr 19, 2017
@cowtowncoder cowtowncoder modified the milestones: 2.9.0.p3, 2.9.0.pr3 Apr 25, 2017
@cowtowncoder
Copy link
Member

Actually I think that maybe leaving default serialization unchanged makes sense for 2.9.
Based on my experiences there are often users who find changes unexpected and harmful, even despite there being solid reason for change like here.

So: what I will do is to support use of @JsonFormat(shape=), and its per-type defaulting:

        ObjectMapper mapper = new ObjectMapper();
        mapper.configOverride(InetAddress.class)
            .setFormat(JsonFormat.Value.forShape(JsonFormat.Shape.NUMBER));

to allow changing output to use host address when "use number" is indicated.

And for 3.0 we can and should change the default (whether to allow textual is an open question; but amount of code for that is rather small so not a big deal.

@cowtowncoder cowtowncoder changed the title InetAddressSerializer discards IP address, InetAddressDeserializer accesses network Allow serialization of InetAddress as simple numeric host address Apr 25, 2017
@2is10
Copy link
Author

2is10 commented Apr 25, 2017

Thanks! No changes to default behavior on a minor version release seems like a reasonable policy.

It could be a little surprising that shape=NUMBER and shape=ARRAY still produce a JSON string, albeit a different one. Maybe that already occurs for other types; I just haven’t seen it before. Still, it seems like a reasonable compromise for a minor version.

Here’s some brainstorming for 3.0 formats. Depending on whether the hostname is known:

  • shape=ARRAY could serialize to ["0.0.0.0"] or ["0.0.0.0", "hostname"].
  • shape=STRING could serialize to "0.0.0.0" or "hostname/0.0.0.0".
  • shape=NUMBER could always serialize to "0.0.0.0".

I suppose shape=STRING would be the natural default.

Notes:

  • InetAddress.toString() can be used to determine whether an InetAddress contains a hostname without triggering a lookup.
  • InetAddress.getByAddress(byte[]) and InetAddress.getByAddress(String, byte[]) can be used to instantiate without triggering a lookup.

@cowtowncoder
Copy link
Member

@2is10 I agree that such choice for "number" is bit too clever, but given that JsonFormat definition is quite stable I think that'll have to do.
ARRAY could be refined as suggested or perhaps even as array of numeric components.

Interesting wrt methods that can avoid lookup. Maybe something based on those could still be used for deserialization, even for 2.9?

@2is10
Copy link
Author

2is10 commented Apr 25, 2017

Interesting wrt methods that can avoid lookup. Maybe something based on those could still be used for deserialization, even for 2.9?

Deserialization currently uses InetAddress.getByName(String), which triggers a lookup if the argument isn’t an IP address (since the type requires an IP address), and does no lookup if the argument is an IP address. As long as serialization sometimes discards the IP address, I don’t see a way to improve the deserialization.

In 2.9, if shape=NUMBER is specified, serialization always preserves the IP address, and deserialization already doesn’t do a lookup for valid addresses (yay!). Deserialization would still attempt name resolution for invalid addresses. Avoiding that behavior would require logic to test whether a string is a valid IP address. As far as I know, this functionality isn’t exposed in the Java API. Behind the scenes in Oracle’s JRE, InetAddress.getByName uses sun.net.util.IPAddressUtil, whose methods are public but shouldn’t be used. I’m not sure if rejecting invalid addresses gracefully is worth adding your own IP address validation code or the weight of a dependency. In short, I’d probably leave deserialization alone for 2.9.

@cowtowncoder
Copy link
Member

@2is10 Thank you for thorough explanation: makes sense, good to know there are heuristics so things will work sort of acceptably for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants