Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure IonReader instances created within IonFactory are always resource-managed #325

Merged
merged 1 commit into from Jul 6, 2022

Conversation

tgregg
Copy link
Contributor

@tgregg tgregg commented May 18, 2022

This is intended to fix the memory leak issue described in #306, superseding that pull request.

This solution ensures that IonReader instances created by IonFactory are marked as "resource-managed", meaning that they are closed during IonParser.close(). Closing an IonReader causes it to close any resources it creates (e.g. a GZIPInputStream, as was the case in the original PR).

Note: IonFactory/IonParser currently ignores the StreamReadFeature.AUTO_CLOSE_SOURCE feature, which allows users to opt-in to disabling closure of resources (e.g. an InputStream) provided by the user. Before this change, user-provided InputStreams and Readers would not be closed by IonParser.close() because the IonReader created to wrap the resources was never closed. Now that that IonReader is closed, those user-provided resources will be closed transitively via IonReader.close(). Although this is technically a behavioral change, it seems unlikely to be harmful, as resuming parsing of a partially-consumed stream of Ion data without any context is only possible in specific circumstances. It seems far more likely that current users are either 1) manually closing these resources, in which case this change leads to a redundant close (not harmful in the InputStream/Reader implementations I'm aware of), or 2) forgetting to close these resources, which may leak memory, depending on the implementation. Option 2) is even more likely for users used to JsonFactory, where AUTO_CLOSE_SOURCE is (as far as I can tell) on by default. Therefore, I conclude that the benefits of this change outweigh the risks; however, I'm open to other opinions.

This solution isn't very aesthetically-pleasing, as it discards and replaces the IOContext passed in by the JsonFactory superclass. This amounts to one redundant IOContext instantiation per createParser call. This could probably be avoided, but it would require a larger refactor.

The included unit test verifies that IonReaders are always marked as resource-managed except when provided directly by the user. The test assumes correct behavior of IonParser.close(); namely, that resource-managed resources that are Closeable will be closed. Before this fix, the BYTE_ARRAY, CHAR_ARRAY, READER, and INPUT_STREAM cases failed.

@cowtowncoder
Copy link
Member

Sounds reasonable to me.

But I'd like to get a +1 from Ion maintainers. @mcliedtke @jobarr-amzn WDYT?
(also note to self: need to add actual active maintainer list somewhere, maybe pom.xml and README(s))

Copy link
Contributor

@jobarr-amzn jobarr-amzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tgregg@ is much more an Ion expert than I, see the contributors breakdown for ion-java. I'm happy to take a look though.

Thanks for doing this, Tyler! All comments optional, I tossed out a couple testing opinions you can take or leave as you like.

Comment on lines 22 to 58
private enum InputSource {
BYTE_ARRAY() {
@Override
IonParser newParser(IonFactory factory) throws IOException {
return (IonParser) factory.createParser(BINARY_INT_0);
}
},
CHAR_ARRAY() {
@Override
IonParser newParser(IonFactory factory) throws IOException {
return (IonParser) factory.createParser(TEXT_INT_0.toCharArray());
}
},
READER() {
@Override
IonParser newParser(IonFactory factory) throws IOException {
return (IonParser) factory.createParser(new StringReader(TEXT_INT_0));
}
},
INPUT_STREAM() {
@Override
IonParser newParser(IonFactory factory) throws IOException {
return (IonParser) factory.createParser(new ByteArrayInputStream(BINARY_INT_0));
}
},
ION_READER() {
@Override
IonParser newParser(IonFactory factory) {
return factory.createParser(factory.getIonSystem().newReader(BINARY_INT_0));
}
},
ION_VALUE() {
@Override
IonParser newParser(IonFactory factory) {
return factory.createParser(factory.getIonSystem().newInt(0));
}
};

abstract IonParser newParser(IonFactory factory) throws IOException;
}

@Parameterized.Parameters(name = "{0}")
public static InputSource[] parameters() {
return InputSource.values();
}

@Parameterized.Parameter
public InputSource inputSource;

@Test
public void testIonParserIsResourceManaged() throws IOException {
IonFactory factory = new IonFactory();
IonParser parser = inputSource.newParser(factory);
// When the user provides an IonReader, it is not resource-managed, meaning that the user retains the
// responsibility to close it. In all other cases, the IonReader is created internally, is resource-managed,
// and is closed automatically in IonParser.close().
assertEquals(inputSource != InputSource.ION_READER, parser._ioContext.isResourceManaged());
assertTrue(IonReader.class.isAssignableFrom(parser._ioContext.contentReference().getRawContent().getClass()));
parser.close();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private enum InputSource {
BYTE_ARRAY() {
@Override
IonParser newParser(IonFactory factory) throws IOException {
return (IonParser) factory.createParser(BINARY_INT_0);
}
},
CHAR_ARRAY() {
@Override
IonParser newParser(IonFactory factory) throws IOException {
return (IonParser) factory.createParser(TEXT_INT_0.toCharArray());
}
},
READER() {
@Override
IonParser newParser(IonFactory factory) throws IOException {
return (IonParser) factory.createParser(new StringReader(TEXT_INT_0));
}
},
INPUT_STREAM() {
@Override
IonParser newParser(IonFactory factory) throws IOException {
return (IonParser) factory.createParser(new ByteArrayInputStream(BINARY_INT_0));
}
},
ION_READER() {
@Override
IonParser newParser(IonFactory factory) {
return factory.createParser(factory.getIonSystem().newReader(BINARY_INT_0));
}
},
ION_VALUE() {
@Override
IonParser newParser(IonFactory factory) {
return factory.createParser(factory.getIonSystem().newInt(0));
}
};
abstract IonParser newParser(IonFactory factory) throws IOException;
}
@Parameterized.Parameters(name = "{0}")
public static InputSource[] parameters() {
return InputSource.values();
}
@Parameterized.Parameter
public InputSource inputSource;
@Test
public void testIonParserIsResourceManaged() throws IOException {
IonFactory factory = new IonFactory();
IonParser parser = inputSource.newParser(factory);
// When the user provides an IonReader, it is not resource-managed, meaning that the user retains the
// responsibility to close it. In all other cases, the IonReader is created internally, is resource-managed,
// and is closed automatically in IonParser.close().
assertEquals(inputSource != InputSource.ION_READER, parser._ioContext.isResourceManaged());
assertTrue(IonReader.class.isAssignableFrom(parser._ioContext.contentReference().getRawContent().getClass()));
parser.close();
}
@Test
public void byteArrayIsManaged() throws Throwable {
assertResourceManaged(true, parser(f -> f.createParser(BINARY_INT_0)));
}
@Test
public void charArrayIsManaged() throws Throwable {
assertResourceManaged(true, parser(f -> f.createParser(TEXT_INT_0.toCharArray())));
}
@Test
public void readerIsManaged() throws Throwable {
assertResourceManaged(true, parser(f -> f.createParser(new StringReader(TEXT_INT_0))));
}
@Test
public void inputStreamIsManaged() throws Throwable {
assertResourceManaged(true, parser(f -> f.createParser(new ByteArrayInputStream(BINARY_INT_0))));
}
@Test
public void ionValueIsManaged() throws Throwable {
assertResourceManaged(true, parser(f -> f.createParser(f.getIonSystem().newInt(0))));
}
@Test
public void ionReaderIsNotManaged() throws Throwable {
// When the user provides an IonReader, it is not resource-managed, meaning that the user retains the
// responsibility to close it. In all other cases, the IonReader is created internally, is resource-managed,
// and is closed automatically in IonParser.close().
assertResourceManaged(false, parser(f -> f.createParser(f.getIonSystem().newReader(BINARY_INT_0))));
}
private void assertResourceManaged(boolean expectResourceManaged, ThrowingSupplier<IonParser> supplier)
throws Throwable {
IonParser parser = supplier.get();
Assertions.assertEquals(expectResourceManaged, parser._ioContext.isResourceManaged());
Assertions.assertTrue(IonReader.class.isAssignableFrom(parser._ioContext.contentReference().getRawContent().getClass()));
parser.close();
}
private interface ThrowingFunction<T,R> {
R apply(T t) throws Throwable;
}
private static ThrowingSupplier<IonParser> parser(ThrowingFunction<IonFactory, JsonParser> f) {
return () -> (IonParser) f.apply(new IonFactory());
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can drop a dozen lines and the parameterization at the same time without increasing the maintenance burden of the test cases imo. ThrowingSupplier and Assertions here come from JUnit 5 (in use by jackson-core so it ought to be fine to bring in). If you want to stick on JUnit 4.13 then ThrowingSupplier is another 3 lines.

If we really want parameterized tests then we can use Java 8's functional features to drop 20 lines from the enum and make it more readable:

    enum InputSource {
        BYTE_ARRAY(true, parser(f -> f.createParser(BINARY_INT_0))),
        CHAR_ARRAY(true, parser(f -> f.createParser(TEXT_INT_0.toCharArray()))),
        READER(true, parser(f -> f.createParser(new StringReader(TEXT_INT_0)))),
        INPUT_STREAM(true, parser(f -> f.createParser(new ByteArrayInputStream(BINARY_INT_0)))),
        ION_VALUE(true, parser(f -> f.createParser(f.getIonSystem().newInt(0)))),

        ION_READER(false, parser(f -> f.createParser(f.getIonSystem().newReader(BINARY_INT_0)))),
        ;

        private final boolean isResourceManaged;
        private final ThrowingSupplier<IonParser> supplier;

        InputSource(boolean isResourceManaged, ThrowingSupplier<IonParser> s) {
            this.isResourceManaged = isResourceManaged;
            this.supplier = s;
        }
    }
    @ParameterizedTest
    @EnumSource(InputSource.class)
    public void testIonParserIsResourceManaged(InputSource inputSource) throws Throwable {
        assertResourceManaged(inputSource.isResourceManaged, inputSource.supplier);
    }

I tried out a few other patterns here but didn't really love any of them. I think non-parameterized is the best for this case. Your mileage may vary.

@cowtowncoder
Copy link
Member

@jobarr-amzn Thanks! While we are discussing maintainers -- I'd be happy to add links to devs you think would qualify as co-authors for the Ion format module, from release-notes/VERSION-2.x. Both for others and just for myself so I can check who to cc (with so many modules and relatively low volume here I tend to forget this over time :) ).

@cowtowncoder
Copy link
Member

@tgregg Just let me know when you are happy with this and I will merge it -- timing is great since we still have time until 2.14.

@tgregg
Copy link
Contributor Author

tgregg commented Jul 6, 2022

@cowtowncoder I incorporated @jobarr-amzn's feedback, so this is ready to merge from our perspective. Thanks for your help.

Feel free to add me to the list of contributors you're compiling.

@cowtowncoder
Copy link
Member

Thank you @tgregg ! I did add you as a maintainer and happy to add anyone else who wants to volunteer. There isn't anything specific you need to do of course, it's more information so I know who to occasionally refer from issues et.

Will now merge this.

@cowtowncoder cowtowncoder changed the title Ensure IonReader instances created within IonFactory are always resource-managed. Ensure IonReader instances created within IonFactory are always resource-managed Jul 6, 2022
@cowtowncoder cowtowncoder merged commit a2bbbdc into FasterXML:2.14 Jul 6, 2022
cowtowncoder added a commit that referenced this pull request Jul 6, 2022
@cowtowncoder cowtowncoder added this to the 2.14.0 milestone Jul 6, 2022
@cowtowncoder
Copy link
Member

Hmmh. I should have read this more closely... it does expose a problem b/w problem of closing (or not) of the underlying input source and IonReader, and general Jackson contract -- wherein InputStream is not managed (even if IonReader should be since caller has no access).
But at least there are unit tests to now codify expectation.

Also, need to figure out how to merge to master/3.0 as there's significant refactoring.

@cowtowncoder
Copy link
Member

Was able to merge fine to master/3.0, with some tweaks, test passing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants