JSON Reader - Haufe-Lexware/haufe.no-frills-transformation GitHub Wiki

Flatten JSON Files into Fields

The JSON Reader plugin enables reading JSON files as data sources in No Frills Transformation. It automatically flattens nested JSON structures into tabular records with dot-notation field names.

URI Format

The JSON Reader recognizes the following URI formats:

json://<filepath> - Read a JSON file
file://<filepath>.json - Read a JSON file (must have .json extension)

JSON Structure Support

The JSON Reader supports three types of JSON root structures:

1. Array of Objects (Most Common)

[
  { "id": 1, "name": "John" },
  { "id": 2, "name": "Jane" }
]

Each array element becomes a separate record.

2. Object with Array Property

{
  "users": [
    { "id": 1, "name": "John" },
    { "id": 2, "name": "Jane" }
  ]
}

The first array property found is used as the record source. Each array element becomes a record with field names prefixed by the property name and index (e.g., users.1.id, users.1.name).

3. Single Object

{
  "id": 1,
  "name": "John"
}

The entire object is treated as a single record.

Field Name Flattening

The JSON Reader flattens nested JSON structures using dot notation:

Simple Properties

{
  "firstName": "John",
  "lastName": "Doe"
}

Fields: firstName, lastName

Arrays

{
  "tags": ["important", "urgent", "review"]
}

Fields: tags.1, tags.2, tags.3

Array elements are numbered starting from 1.

Complex Nested Structures

{
  "user": {
    "id": 123,
    "name": "John Doe",
    "addresses": [
      {
        "type": "home",
        "street": "123 Main St",
        "city": "Springfield"
      },
      {
        "type": "work",
        "street": "456 Office Blvd",
        "city": "Shelbyville"
      }
    ]
  }
}

Fields:

user.id
user.name
user.addresses.1.type
user.addresses.1.street
user.addresses.1.city
user.addresses.2.type
user.addresses.2.street
user.addresses.2.city

Field Handling

All fields are discovered automatically by scanning all records in the JSON file
Fields are sorted alphabetically
If a field doesn't exist in a particular record, it returns an empty string
All values are converted to strings
The entire JSON file is loaded into memory - there is no streaming, so be aware of this

Example Configuration

<Source>json://customers.json</Source>

Example Input File: customers.json

[
  {
    "id": 1,
    "name": "Acme Corp",
    "contact": {
      "email": "[email protected]",
      "phone": "555-1234"
    },
    "tags": ["enterprise", "active"]
  },
  {
    "id": 2,
    "name": "TechStart Inc",
    "contact": {
      "email": "[email protected]"
    },
    "tags": ["startup"]
  }
]

Resulting Fields:

contact.email
contact.phone
id
name
tags.1
tags.2

Transformation File example

<Transformation>
  <Source>json://customers.json</Source>
  <Target config="delim=';'">file://customers.csv</Target>

  <Fields appendSource="true" />
</Transformation>

This will flatten any JSON file into a CSV file.

Limitations

The entire JSON file is loaded into memory
Very large JSON files may cause memory issues
All values are converted to strings; type information is not preserved
Arrays cannot be directly used as field names (only their indexed elements)
Field names are case-sensitive