Specifying Row and Page Size - Altinity/parquet-regression GitHub Wiki
Row Group Size and Page Size
rowGroupSize: Defines the maximum size (in bytes) of each row group when writing data to a Parquet file.
pageSize: Defines the size (in bytes) of each page within a column chunk.
Row group: A logical horizontal partitioning of the data into rows. There is no physical structure that is guaranteed for a row group. A row group consists of a column chunk for each column in the dataset.
Page: Column chunks are divided up into pages. A page is conceptually an indivisible unit (in terms of compression and encoding). There can be multiple page types which are interleaved in a column chunk.
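To make the hierarchy above concrete (file, row groups, one column chunk per column, pages inside each chunk), here is a rough back-of-the-envelope sketch. It is not how any Parquet writer actually sizes its output: the function `estimate_layout` and its parameters are hypothetical, it assumes data is spread evenly across columns, and it ignores encoding, compression, and the fact that real writers typically check size limits only approximately.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def estimate_layout(total_bytes: int, num_columns: int,
                    row_group_size: int, page_size: int) -> dict:
    """Estimate row group, column chunk, and page counts (hypothetical helper).

    Assumes every column holds an equal share of the bytes in a row group.
    """
    if min(total_bytes, num_columns, row_group_size, page_size) <= 0:
        raise ValueError("all arguments must be positive")

    # The data is cut horizontally into row groups of at most row_group_size bytes.
    row_groups = ceil_div(total_bytes, row_group_size)

    # Each row group holds exactly one column chunk per column.
    column_chunks = row_groups * num_columns

    # Under the even-split assumption, a full row group gives each chunk this many bytes.
    chunk_bytes = ceil_div(row_group_size, num_columns)

    # A column chunk always contains at least one page.
    pages_per_chunk = max(1, ceil_div(chunk_bytes, page_size))

    return {
        "row_groups": row_groups,
        "column_chunks": column_chunks,
        "pages_per_full_chunk": pages_per_chunk,
    }


layout = estimate_layout(total_bytes=1024, num_columns=2,
                         row_group_size=256, page_size=64)
print(layout)
```

With these illustrative numbers, 1024 bytes of data split into 256-byte row groups gives 4 row groups and 8 column chunks, each holding about 2 pages of 64 bytes. Actual files will differ, since writers measure encoded data rather than raw input bytes.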
"options": {
  "rowGroupSize": 256,
  "pageSize": 1024
}
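If the options fragment is generated by a script rather than written by hand, a small helper can check the values before serializing them. The sketch below is only an illustration: `build_options` is a hypothetical function, and the only things taken from this page are the key names `options`, `rowGroupSize`, and `pageSize` and the fact that both values are sizes in bytes. Where the fragment sits in a larger file is not covered here.

```python
import json


def build_options(row_group_size: int, page_size: int) -> dict:
    """Return an "options" fragment after basic sanity checks (hypothetical helper)."""
    for name, value in (("rowGroupSize", row_group_size), ("pageSize", page_size)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(
                f"{name} must be a positive integer number of bytes, got {value!r}"
            )
    return {"options": {"rowGroupSize": row_group_size, "pageSize": page_size}}


# Same values as the example above.
fragment = build_options(row_group_size=256, page_size=1024)
print(json.dumps(fragment, indent=2))
```

The check is deliberately minimal: it only rejects non-integer and non-positive values, and does not judge whether a particular combination of sizes is sensible for a given test.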