Data Format Parquet
Parquet is a columnar, binary data storage format.
Design Goals
Interoperability
Space efficiency
Query efficiency
Language agnostic format specification
Java converters are available for the following object models
Avro
Thrift
Protocol Buffers
Pig
Hive SerDe
C++ encoding used by Impala
Columnar storage is more space-efficient because the values of a column, which all share one type, are stored together. This reduces the amount of encoding metadata needed and improves compression.
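The effect of storing homogeneous values together can be illustrated with a toy run-length encoder. The sketch below is not Parquet's actual encoder; the table contents and the names `run_length_encode`, `rows`, `row_layout`, `id_column` and `country_column` are assumptions made up for illustration. It only shows why laying out values column by column tends to produce longer runs of repeated values than interleaving them row by row.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


# Hypothetical table: 1000 rows of (id, country).
rows = [(i, "US" if i < 500 else "DE") for i in range(1000)]

# Row-oriented layout: values from different columns are interleaved,
# so neighbouring values are almost never equal.
row_layout = [value for row in rows for value in row]

# Column-oriented layout: each column's values are stored contiguously.
id_column = [row[0] for row in rows]
country_column = [row[1] for row in rows]

row_runs = run_length_encode(row_layout)
column_runs = run_length_encode(id_column) + run_length_encode(country_column)

print(len(row_runs), len(column_runs))  # prints "2000 1002"
```

In this example the `country` column collapses to two runs once its values sit next to each other, while the interleaved row layout yields one run per value. Real columnar formats combine several encodings with general-purpose compression, so actual savings depend on the data.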