Make ToXmlGenerator non-final #262

Closed
ghost opened this issue Sep 20, 2017 · 5 comments
@ghost

ghost commented Sep 20, 2017

Background

A YAML file contains definitions that reference each other using a simple syntax:

$ cat variables.yaml 
---
a: A
b: $a$ B
c: $b$ C $b$
d: $a$ $b$ $c$ D $a$

The goal is to read the YAML definitions, interpolate the values, then write them back out. The output formats include YAML and XML. YAML works fine. XML, however, does not.

For context, the YAML implementation intercepts the writeString method of YAMLGenerator. Note that the YAMLGenerator class is not declared final. The following code works:

  public ResolverYAMLGenerator(
    final YamlParser yamlParser,
    final IOContext ctxt,
    final int jsonFeatures,
    final int yamlFeatures,
    final ObjectCodec codec,
    final Writer out,
    final DumperOptions.Version version ) throws IOException {
    super( ctxt, jsonFeatures, yamlFeatures, codec, out, version );
    setYamlParser( yamlParser );
  }

  @Override
  public void writeString( final String text )
    throws IOException, JsonGenerationException {
    final YamlParser parser = getYamlParser();
    super.writeString( parser.substitute( text ) );
  }

ResolverYAMLGenerator subclasses YAMLGenerator and overrides the writeString method to perform variable interpolation via the substitute method call.

Problem

The equivalent XML generator class, ToXmlGenerator, is declared final. This precludes the possibility of overriding its writeString method to perform variable interpolation.

Resolution Ideas

Can ToXmlGenerator have its final declaration removed?

If not, what other mechanism exists to interpolate only element values immediately prior to writing them (one that doesn't involve composition and delegation)?

@cowtowncoder
Member

It'd be possible to remove the final modifier, but it is worth mentioning that neither the parser nor the generator classes are designed to be subclassed by users. That usage is off-label and unsupported, so breakages between minor versions are likely.

That said, there are no good extension points for variable interpolation either during parsing or generation. I have plans to support something for parsing (and there's an issue for jackson-core), but hadn't considered it an issue for generation side.

The way I would probably try to tackle this, however, would not be at the jackson-core (JsonGenerator and subclasses) level, but with a custom serializer for String.
It should be quite easy to override the handling there.

@ghost
Author

ghost commented Sep 20, 2017

For what it's worth, here's a script that interpolates and transforms a YAML document into XML:

# Java Code - Interpolate variable substitution, YAML to YAML.
java -jar ../bin/yamlp.jar --input $SRC_YAML > $DST_SUBS

# Ruby Code - Convert YAML to JSON.
ruby -ryaml -rjson -e 'puts JSON.pretty_generate(YAML.load(ARGF))' \
  < $DST_SUBS > $DST_JSON

# Python Code - Convert JSON to XML.
../bin/json2xml.py < $DST_JSON > $DST_XML

I'd like to eliminate the Ruby and Python dependencies:

# Perform interpolation of variables.
java -jar ../bin/yamlp.jar --xml < $SRC_YAML > $DST_XML

> It'd be possible to remove the final modifier,

That'd be the simplest solution.

> but hadn't considered it an issue for generation side.

It is not trivial to add interpolation during parsing. Consider:

d: $a$ $b$ $c$ D $a$
c: $b$ C $b$
b: $a$ B
a: A

Afterwards, though, interpolating the values is easy.
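
As a rough illustration of why resolving after parsing is straightforward, here is a minimal sketch that assumes the whole document has already been read into a flat map of string values. The class `InterpolationSketch` and its methods are hypothetical and are not part of yamlp or Jackson; the `$name$` pattern, the choice to leave unknown references untouched, and the cycle check are likewise assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: resolves $name$ references once every definition is known.
final class InterpolationSketch {
  // Assumption: a reference is a run of word characters wrapped in dollar signs.
  private static final Pattern REFERENCE = Pattern.compile("\\$(\\w+)\\$");

  // Returns the definitions, in their original order, with all references expanded.
  static Map<String, String> resolveAll(final Map<String, String> definitions) {
    final Map<String, String> cache = new HashMap<>();
    final Map<String, String> result = new LinkedHashMap<>();
    for (final String name : definitions.keySet()) {
      result.put(name, resolve(name, definitions, cache, new ArrayDeque<>()));
    }
    return result;
  }

  private static String resolve(final String name,
                                final Map<String, String> definitions,
                                final Map<String, String> cache,
                                final Deque<String> inProgress) {
    final String cached = cache.get(name);
    if (cached != null) {
      return cached;
    }
    if (inProgress.contains(name)) {
      throw new IllegalStateException("Circular reference involving: " + name);
    }
    inProgress.push(name);

    final Matcher matcher = REFERENCE.matcher(definitions.get(name));
    final StringBuilder text = new StringBuilder();
    while (matcher.find()) {
      final String reference = matcher.group(1);
      // Unknown references are left as written (an assumption, not yamlp behaviour).
      final String replacement = definitions.containsKey(reference)
          ? resolve(reference, definitions, cache, inProgress)
          : matcher.group();
      matcher.appendReplacement(text, Matcher.quoteReplacement(replacement));
    }
    matcher.appendTail(text);

    inProgress.pop();
    cache.put(name, text.toString());
    return text.toString();
  }

  public static void main(final String[] args) {
    // Definitions deliberately listed in the "hard" order from the example above.
    final Map<String, String> definitions = new LinkedHashMap<>();
    definitions.put("d", "$a$ $b$ $c$ D $a$");
    definitions.put("c", "$b$ C $b$");
    definitions.put("b", "$a$ B");
    definitions.put("a", "A");
    System.out.println(resolveAll(definitions).get("d")); // A A B A B C A B D A
  }
}
```

Because each value is resolved on demand and cached, the order of the definitions does not matter, which is exactly what a single streaming pass cannot guarantee.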

> custom serializer for String.

Where can I find an example of serialization that performs operations on values during write operations in a generic way (i.e., not coupled to any instance variables specific to a known YAML document structure)? Here's a JsonSerializer for String values:

public class StringJsonSerializer extends JsonSerializer<String> {

  @Override
  public void serialize(String value, JsonGenerator jgen, SerializerProvider provider) 
    throws IOException, JsonProcessingException {
      jgen.writeString(interpolate(value));
  }
}

However, there's no instance variable to which the @JsonSerialize annotation can be applied, because the document structure isn't known until runtime. How would you apply such a serializer to all values, regardless of output format (e.g., JSON, YAML, XML)?

@cowtowncoder
Member

Right, in a streaming manner some interpolation is doable, some is not. Hooks for allowing some level of modification are needed for other purposes as well (low-level coercion, for example, from loosely typed textual formats), so it's not just for this use case.

As to the serializer, I am not sure I understand your question. The context that can be used to pass configuration and settings would be SerializerProvider, but JsonGenerator (and its subtypes; it's not JSON-specific, the naming is a historical legacy) also has path information (it can traverse parent element names).
In addition, you can make a serializer configure itself based on annotations by implementing ContextualSerializer, which will be given BeanProperty; it can then construct a differently configured serializer to use for just that property.
The combination of these features allows quite flexible handling.

But I may be missing some aspects you are looking for... ?

@ghost
Author

ghost commented Sep 21, 2017

> which will be given BeanProperty

A bean implies a POJO; there can be no POJOs in a general purpose solution. There are no POJOs in my code to annotate. To be clear, here's an example file called variables.yaml:

a: A
b: $a$ B
c: $b$ C $b$
d: $a$ $b$ $c$ D $a$

I'd like to transform the contents of variables.yaml into the following XML (or JSON, or YAML) document:

<ObjectNode><a>A</a><b>A B</b><c>A B C A B</c><d>A A B A B C A B D A</d></ObjectNode>

The input file variables.yaml could contain any valid YAML data. Here's another example:

one: 1
two: 2
three: $one$ + $two$

Transforming would produce:

<ObjectNode><one>1</one><two>2</two><three>1 + 2</three></ObjectNode>
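
For a flat document such as this one, the target output can be sketched with the JDK's own StAX writer, independent of Jackson. This is only an illustration of the desired result, not a replacement for XmlMapper: the class `FlatXmlSketch` is hypothetical, the values are assumed to be interpolated already, and every key is assumed to be a valid XML element name.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper: writes a flat map of already-interpolated values as XML.
final class FlatXmlSketch {

  // Writes <rootName><key>value</key>...</rootName> with no XML declaration.
  static String toXml(final String rootName, final Map<String, String> values)
      throws XMLStreamException {
    final StringWriter out = new StringWriter();
    final XMLStreamWriter xml =
        XMLOutputFactory.newInstance().createXMLStreamWriter(out);

    xml.writeStartElement(rootName);
    for (final Map.Entry<String, String> entry : values.entrySet()) {
      xml.writeStartElement(entry.getKey());
      xml.writeCharacters(entry.getValue()); // escapes <, > and & as needed
      xml.writeEndElement();
    }
    xml.writeEndElement();

    xml.flush();
    xml.close();
    return out.toString();
  }

  public static void main(final String[] args) throws XMLStreamException {
    final Map<String, String> values = new LinkedHashMap<>();
    values.put("one", "1");
    values.put("two", "2");
    values.put("three", "1 + 2");
    System.out.println(toXml("ObjectNode", values));
  }
}
```

Nested mappings and sequences would need a recursive walk, which is where a Jackson-based solution becomes attractive.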

The following class creates a serializer:

public class StringValueSerializer extends StdSerializer<String> {

  private YamlParser parser;

  public StringValueSerializer( final YamlParser parser ) {
    super( String.class );
    setParser( parser );
  }

  @Override
  public void serialize( final String value, final JsonGenerator jgen,
                         final SerializerProvider provider )
    throws IOException, JsonProcessingException {
    
    final String fieldName = jgen.getOutputContext().getCurrentName();
    final String fieldValue = getParser().substitute( value );

    jgen.writeStringField( fieldName, fieldValue );
  }

  private YamlParser getParser() {
    return this.parser;
  }

  private void setParser( final YamlParser parser ) {
    this.parser = parser;
  }
}

The serializer class is registered as follows:

    final ObjectMapper mapper = new ObjectMapper();
    final SimpleModule module = new SimpleModule();

    module.addSerializer( new StringValueSerializer( this ) );
    mapper.registerModule( module );

    mapper.writeValue( System.out, getDocumentRoot() );

The serialize method is never called, even though the document is written to standard output. Same behaviour using the following class definitions:

public class StringValueSerializer extends StdKeySerializers.StringKeySerializer { ...
public class StringValueSerializer extends JsonSerializer<String> { ...

Even if the serialize method gets called, how do I get the field name? This looks correct:

    final String fieldName = jgen.getOutputContext().getCurrentName();

I also tried using an XmlMapper subclass:

public final class ResolverXmlMapper extends XmlMapper {

  private YamlParser yamlParser;

  public ResolverXmlMapper( final YamlParser yamlParser ) {
    setYamlParser( yamlParser );
  }

  @Override
  public void writeValue( final XMLStreamWriter output, final Object value )
    throws IOException {
    super.writeValue( output, getYamlParser().substitute( value.toString() ) );
  }

  private YamlParser getYamlParser() {
    return this.yamlParser;
  }

  private void setYamlParser( final YamlParser yamlParser ) {
    this.yamlParser = yamlParser;
  }
}

Along with the following code:

    final ObjectMapper mapper = new ResolverXmlMapper(this);
    mapper.writeValue( System.out, getDocumentRoot() );

The method writeValue is never called.
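
One plausible explanation, not confirmed anywhere in this thread, is ordinary Java overload resolution: the override targets the writeValue overload that takes an XMLStreamWriter, while the calling code passes System.out and so selects the overload that takes an OutputStream. The sketch below reproduces that situation with hypothetical stand-in classes (`MapperSketch`, `InterceptingMapperSketch`) and java.io.Writer in place of XMLStreamWriter; it does not use Jackson.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a mapper that offers several writeValue overloads.
class MapperSketch {
  final List<String> calls = new ArrayList<>();

  void writeValue(final OutputStream out, final Object value) {
    calls.add("base(OutputStream)");
  }

  void writeValue(final Writer out, final Object value) {
    calls.add("base(Writer)");
  }
}

// Overrides only ONE overload, much as ResolverXmlMapper overrides only
// the XMLStreamWriter variant.
class InterceptingMapperSketch extends MapperSketch {
  @Override
  void writeValue(final Writer out, final Object value) {
    calls.add("intercepted(Writer)");
    super.writeValue(out, value);
  }
}

final class OverloadSketch {
  public static void main(final String[] args) {
    final MapperSketch mapper = new InterceptingMapperSketch();

    // An OutputStream argument selects the overload that was not overridden.
    mapper.writeValue(new ByteArrayOutputStream(), "text");
    System.out.println(mapper.calls); // [base(OutputStream)]

    // Only a Writer argument reaches the override.
    mapper.writeValue(new StringWriter(), "text");
    System.out.println(mapper.calls);
    // [base(OutputStream), intercepted(Writer), base(Writer)]
  }
}
```

If this is what is happening, intercepting at the mapper level would require overriding the overload that is actually invoked, and even then the override sees the whole document root rather than individual values.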

Is there any other way to intercept the value immediately prior to being written?

If I'm missing something in the implementation, please let me know.

@cowtowncoder cowtowncoder changed the title ToXmlGenerator's final declaration Make ToXmlGenerator non-final May 22, 2020
@cowtowncoder cowtowncoder added this to the 2.12.0 milestone May 22, 2020
@cowtowncoder
Member

This might be too little, too late, but I figured there is no reason ToXmlGenerator needs to be final, so changing this for 2.12.

alex-bel-apica pushed a commit to ApicaSystem/jackson-dataformat-xml that referenced this issue Sep 4, 2020
# Conflicts:
#	release-notes/CREDITS-2.x
#	release-notes/VERSION-2.x