Schema Types - Altinity/parquet-regression GitHub Wiki
Schema Types in Parquet-Java
When creating a Parquet file with parquet-java, you declare each field in the schema with a repetition (required, optional, or repeated) and as either a primitive field or a group of nested fields. The available schema types are:
- requiredGroup
- repeatedGroup
- required
- repeated
- optionalGroup
- optional
Below is a brief explanation of each schema type.
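As a rough orientation, the six names match builder methods on parquet-java's `org.apache.parquet.schema.Types` API. The sketch below assumes the `parquet-column` artifact is on the classpath; the class name `SchemaTypesSketch` and every field name are hypothetical and chosen only for illustration.

```java
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName;
import org.apache.parquet.schema.Types;

// Hypothetical example class; field names are illustrative only.
final class SchemaTypesSketch {

    static MessageType buildSchema() {
        return Types.buildMessage()
            // required: exactly one value in every record
            .required(PrimitiveTypeName.INT64).named("id")
            // optional: zero or one value per record (may be null)
            .optional(PrimitiveTypeName.BINARY).named("nickname")
            // repeated: zero or more values per record
            .repeated(PrimitiveTypeName.INT32).named("scores")
            // requiredGroup: nested fields; the group itself is always present
            .requiredGroup()
                .required(PrimitiveTypeName.BINARY).named("city")
                .optional(PrimitiveTypeName.BINARY).named("zip")
            .named("address")
            // optionalGroup: nested fields; the whole group may be null
            .optionalGroup()
                .required(PrimitiveTypeName.DOUBLE).named("lat")
                .required(PrimitiveTypeName.DOUBLE).named("lon")
            .named("location")
            // repeatedGroup: zero or more nested records
            .repeatedGroup()
                .required(PrimitiveTypeName.BINARY).named("number")
            .named("phones")
            // closes the message itself
            .named("user");
    }

    public static void main(String[] args) {
        // Prints the schema in Parquet's textual schema format.
        System.out.println(buildSchema());
    }
}
```

Each primitive or group builder is closed with `named(...)`, which returns to the enclosing builder, so the indentation mirrors the nesting of the resulting schema.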
requiredGroup
A requiredGroup is a group of fields (a nested schema) that must be present in every record and cannot be null.
The fields within this group each have their own repetition (required, optional, or repeated).
repeatedGroup
A repeatedGroup represents a group that can occur zero or more times, effectively modeling a list or array of nested records.
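To make the list-of-records idea concrete, here is a small sketch that writes a repeated group in Parquet's textual schema format and parses it with `MessageTypeParser.parseMessageType`. It assumes `parquet-column` is on the classpath; the class name `RepeatedGroupSketch`, the message name `order`, and the field names are hypothetical.

```java
import org.apache.parquet.schema.GroupType;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.MessageTypeParser;

// Hypothetical example class; the schema below is illustrative only.
final class RepeatedGroupSketch {

    static MessageType parseOrderSchema() {
        // "items" may occur zero or more times per order; each occurrence
        // is a nested record with its own required and optional fields.
        return MessageTypeParser.parseMessageType(
            "message order {\n"
          + "  required int64 order_id;\n"
          + "  repeated group items {\n"
          + "    required binary sku;\n"
          + "    optional int32 quantity;\n"
          + "  }\n"
          + "}");
    }

    public static void main(String[] args) {
        MessageType schema = parseOrderSchema();
        GroupType items = schema.getType("items").asGroupType();
        // Inspect the repetition of the group and its nested field count.
        System.out.println(items.getRepetition());
        System.out.println(items.getFieldCount());
    }
}
```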
required
A required field must be present in every record and cannot be null. This ensures that the field always contains a value.
repeated
A repeated field can have zero or more values, modeling a list or array of values of the same type.
optionalGroup
An optionalGroup is a group of fields that may or may not be present in a record. The entire group can be null.
optional
An optional field may or may not be present in a record. If the field is not present, it is considered null.
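The three repetitions described above amount to a rule about how many values (or group instances) a field may hold in a single record. The following dependency-free sketch models that rule; it is not parquet-java code, and the names `RepetitionRules`, `Repetition`, and `allows` are hypothetical.

```java
// Hypothetical, dependency-free model of the rules described above.
// It does not use or mirror parquet-java's own classes.
final class RepetitionRules {

    enum Repetition { REQUIRED, OPTIONAL, REPEATED }

    /** Returns true if a field with this repetition may hold valueCount values in one record. */
    static boolean allows(Repetition repetition, int valueCount) {
        if (valueCount < 0) {
            throw new IllegalArgumentException("valueCount must not be negative");
        }
        switch (repetition) {
            case REQUIRED:
                return valueCount == 1;   // always present, never null
            case OPTIONAL:
                return valueCount <= 1;   // absent (null) or a single value
            case REPEATED:
                return true;              // zero or more values
            default:
                throw new AssertionError(repetition);
        }
    }

    public static void main(String[] args) {
        int[] counts = {0, 1, 2};
        for (Repetition repetition : Repetition.values()) {
            for (int count : counts) {
                System.out.println(repetition + " with " + count
                    + " value(s): " + allows(repetition, count));
            }
        }
    }
}
```

The same rule applies to the group variants: a requiredGroup occurs exactly once per record, an optionalGroup at most once, and a repeatedGroup any number of times.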