Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need a way to escape dots in property keys (add path separator configuration) #169

Closed
Yaytay opened this issue Dec 18, 2019 · 18 comments · Fixed by #332
Closed

Need a way to escape dots in property keys (add path separator configuration) #169

Yaytay opened this issue Dec 18, 2019 · 18 comments · Fixed by #332
Labels
Properties Issue related to (Java) Properties format backend
Milestone

Comments

@Yaytay
Copy link
Contributor

Yaytay commented Dec 18, 2019

I want to create a map from properties using JavaPropsMapper, which works as long as the keys for the map don't contain dots.

I have something like this:

    Config expResult = new Config(
            new Config.Server(17, "/path", "param1", "param2", "Wobble")
            , ImmutableMap.<String, String>builder().put("x.y", "y").put("a.b", "b").build()
    );

    System.setProperty("server.arg", "param1");
    System.setProperty("map.x.y", "y");

And when I run that through the JavaPropsMapper it falls over because it creates:
{"map":{"x":{"y":"y"}}}
when what I want is:
{"map":{"x.y":"y"}}
(I want to have domain names as keys)

It would be nice if there was some way to escape the dots in property names so that they weren't considered to be delimiters.
I could probably make this work by configuring a completely different delimiter, but that would just confuse anyone wanting to work with this.

Things I've tried, any of which would be an OK solution for me, are:

    System.setProperty("map[x.y]", "y");
    System.setProperty("map.[x.y]", "y");
    System.setProperty("map.x\\.y", "y");
    System.setProperty("map.'x.y'", "y");
    System.setProperty("map.x'.'y", "y");
@cowtowncoder
Copy link
Member

Please make sure to first read README of java properties backend at:

https://github.com/FasterXML/jackson-dataformats-text/tree/master/properties

(link to from main README of jackson-dataformats-text)

and specifically part about using JavaPropsSchema. You can change the separator from '.' to something else.

@Yaytay
Copy link
Contributor Author

Yaytay commented Dec 18, 2019

Aye, but I don't want to change the separator, because that will confuse anyone setting these properties (as they will expect them to use dots, 'cos that's what everyone does).
What I want is to be able to escape a separator in a key so that one specific instance is ignored.

I can't see any way to do that with the JavaPropsSchema as it is.

@cowtowncoder
Copy link
Member

Right, there is no escape mechanism currently, as escaping from external input is handled by default JDK Properties reader. I guess it would be nice to have configurable escaping mechanism, so if anyone wants to implement something (specified via JavaPropsSchema) I could help merge the patch.

The big problem here however is this: parsing/decoding is currently delegated to JDK Properties, and that will resolve backslash-based quoting. So to support escaped quoted separators, this would need to be re-implemented.
Second part, escaping the first character of separator (or more for some exotic separators, in theory), would seem like simpler thing.

I do not have time to work on this myself but it seems like a reasonable idea.

@cowtowncoder cowtowncoder added the Properties Issue related to (Java) Properties format backend label Dec 18, 2019
@Yaytay
Copy link
Contributor Author

Yaytay commented Sep 2, 2022

Things may have changed a lot in the last nearly three years, but looking at the props mapper now I think the issue can be resolved in the various Splitter implementations (JPropPathSplitter and its children).
It'll be more complex, because at the moment they are just based on indexOf and handling escaping properly is a pain, but it doesn't look very complex.
Have I missed something fundamental?

@Yaytay
Copy link
Contributor Author

Yaytay commented Sep 2, 2022

Ah, this is the fundamental thing I was missing:

a non-zero even number of 2n contiguous backslashes before a line terminator (or elsewhere) encodes n backslashes after escape processing

(from https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/util/Properties.html)

I'm not sure we need to worry too much about that - the user simply has to use twice as many backslashes.
So, with a small (though messy) modification to the JPropPathSplitter, this works:

    public void testMapWithBranch() throws Exception
    {
        // basically "extra" branch should become as first element, and
        // after that ordering by numeric value
        final String INPUT = "map=first\n"
                +"map.b=second\n"
                +"map.xyz=third\n"
                +"map.ab\\\\.c=fourth\n"
                ;
        MapWrapper w = MAPPER.readValue(INPUT, MapWrapper.class);
        assertNotNull(w.map);
        assertEquals(4, w.map.size());
        assertEquals("first", w.map.get(""));
        assertEquals("second", w.map.get("b"));
        assertEquals("third", w.map.get("xyz"));
        assertEquals("fourth", w.map.get("ab.c"));
    }

@cowtowncoder
Copy link
Member

PRs welcome (against 2.14), sounds like a useful improvement.

@Yaytay
Copy link
Contributor Author

Yaytay commented Sep 2, 2022

There seem to be a few areas to consider in working on an improvement and I'd like your opinion on them:

  1. indexOf is going to be faster than parsing for escaped values, how much do I try to keep the simple code path as fast as it currently is?
  2. Do we need to consider multi-character escapes (and path separators)?
    I''m inclined to think that we don't because if you are not using the default path separator you already have more freedom to solve the problem.
  3. At the moment I think the code will fail if the properties file contains surrogate code points, am I OK to expect it to come in chars rather that code points?
  4. What should I do when faced with an escape char that is not escaping the separator?
    The simplest option is to leave it completely alone, so \ is only special when followed by a dot, but that might be counterintuitive if users expect \\ to end up as .
    Any other answer means that every splitter needs to check for escapes and will have some impact on performance.

Given that this only affects the properties mapper I may be worrying too much about performance.

@Yaytay
Copy link
Contributor Author

Yaytay commented Sep 3, 2022

PR #332 as a working, minimal, attempt.

@cowtowncoder
Copy link
Member

@Yaytay great questions! Let me see.

  1. True. I am bit torn -- if easy enough, it'd be nice to have default fast path but I suspect it'd mean that handling escapes might have to be separately enablable feature. On the other hand, properties format handler is not super fast anyway since we defer to JDK default handler. If an optimized alternative was implemented it'd be different. So I am not super worried I guess but at the same time do like optimizations as long as they are maintainable
  2. Agreed: no need for multi-character escapes. Path separators... we just need to support existing functionality.
  3. I think that not breaking things is fine; as in: if surrogates do not work, PR need not support them either
  4. Good question, not 100% sure. To me leaving escape unprocessed seems wrong (I think Maven does something like that and I dislike it). Most intuitive is probably accepting escape for all characters (so include whatever follows as-is). We could have JavaPropsParser.Feature.REPORT_BAD_ESCAPES (or such) to alternatively allow parsing exception to be thrown? (could even be enabled by default)

On PR, yes, will try to have a look soon -- thank you for contributing it!

@Yaytay
Copy link
Contributor Author

Yaytay commented Sep 4, 2022

There are (at least) two problems in the first version of the PR:

  1. The default behaviour should be unchanged, so the default escape char should be invalid.
  2. I've been thinking about the escaping of escapes and my conclusion is that it should be possible, but is only necessary when the escapes precede the separator.
    This keeps the existing behaviour (which is to leave escapes alone) intact in all cases except an escape separator but still enables all possible keys to be submitted.

More tests needed.

@Yaytay
Copy link
Contributor Author

Yaytay commented Sep 4, 2022

Updated the PR.
The default escape char is now '\0', which should be a safer default.
Now it allows the escaping of escapes when the last escape immediately precedes the separator - this avoids treating escapes as special in all circumstances (which would be a lot more work) but makes all outputs possible.
I've also increased the test inputs for it quite a lot because it wasn't going to work for nested levels in my first attempts.
I've added a test case that uses '#' as the escape because it makes it a lot clearer what my code is doing than the vast chain of backslash characters in the other test.

@cowtowncoder
Copy link
Member

Hmmh. Is the idea of default being NULL character that it would be same as having none? If so I can see it as "not defined" -- which is fine; I consider that invalid within properties file -- but wanted to double check.

@Yaytay
Copy link
Contributor Author

Yaytay commented Sep 6, 2022

Yes, that's precisely the intention - I think it should not be possible for my changes to affect any existing prop files being processed without developer intervention.

I could make it a Character instead, but then I'd either have to check for null or cast to an int in CharPathOnlySplitter.splitAndAdd on each return from indexOf.
I suspect casting to an int is essentially free, so that may be the better option.

i.e. change JavaPropsSchema to make pathSeparatorEscapeChar a Character, then in CharPathOnlySplitter store an int that is either the char value or -1 if it's null, and cast the result from charAt to an int before comparing with the escape char:

    public static class CharPathOnlySplitter extends JPropPathSplitter
    {
        protected final char _pathSeparatorChar;
        protected final int _escapeChar;

        public CharPathOnlySplitter(char sepChar, Character escapeChar, boolean useIndex)
        {
            super(useIndex);
            _pathSeparatorChar = sepChar;
            _escapeChar = escapeChar == null ? -1 : escapeChar;
        }        

        @Override
        public JPropNode splitAndAdd(JPropNode parent,
                String key, String value)
        {
            JPropNode curr = parent;
            int start = 0;
            final int keyLen = key.length();
            int ix;

            while ((ix = key.indexOf(_pathSeparatorChar, start)) >= start) {
                if (ix > start) { // segment before separator
                    if ((int) key.charAt(ix - 1) == _escapeChar) { //potentially escaped, so process slowly
                      return _continueWithEscapes(curr, key, start, value);
                    }

@cowtowncoder cowtowncoder added this to the 2.14.0 milestone Sep 11, 2022
@cowtowncoder cowtowncoder changed the title No way to have properties create a map with dots in the keys Need a way to escape dots in property keys (add path separator configuration) Sep 11, 2022
cowtowncoder added a commit that referenced this issue Sep 11, 2022
@Yaytay
Copy link
Contributor Author

Yaytay commented Oct 7, 2022

Hi again,

Now that 2.14.0-rc1 is out I've just found that there is no way for me to actually use this :(
I'm using the JavaPropsMapper to update an existing object, with JavaPropsMapper#readerForUpdating, but there isn't an overload of that that can take in a schema.
JavaPropsMapper#readerForUpdating just forwards to ObjectMapper#_newReader, but that's protected so I can't call it directly.
ObjectMapper#_newReader calls the constructor for ObjectReader, but that is also protected.

Not sure what the best solution is, but can we either add an override or change the access on something so that JavaPropsMapper can create a Reader with both a custom schema and a valueToUpdate?
Alternatively I can subclass JavaPropsMapper in my code, but that would only help me.

Jim

@Yaytay
Copy link
Contributor Author

Yaytay commented Oct 7, 2022

This is my hacky subclass which does work, so I'm not blocked, but it's not nice.

public class SchemaPropsMapper extends JavaPropsMapper {

  public static Builder builder() {
    return new JavaPropsMapper.Builder(new SchemaPropsMapper());
  }

  public ObjectReader newReader(DeserializationConfig config, JavaType valueType, Object valueToUpdate, FormatSchema schema, InjectableValues injectableValues) {
    return super._newReader(config, valueType, valueToUpdate, schema, injectableValues);
  }

}

@cowtowncoder
Copy link
Member

@Yaytay ObjectReader has with(FormatSchema) method for constructing reader instance with schema.
Wouldn't that work?

@Yaytay
Copy link
Contributor Author

Yaytay commented Oct 8, 2022

Oh, I'm sorry, I'd missed all of those.
Yes, that seems to work just fine.

@cowtowncoder
Copy link
Member

Ok no problem. Glad they work for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Properties Issue related to (Java) Properties format backend
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants