Parquet Data Source

Apache Parquet is a columnar storage format for the Apache Hadoop ecosystem with support for efficient storage and encoding of data.

Spark SQL supports parquet-encoded data using ParquetFileFormat.

Parquet is the default data source format: the spark.sql.sources.default configuration property defaults to parquet, so reads and writes that do not name a format use Parquet.
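
As an illustration of the default in practice, the following is a minimal sketch, assuming a local Spark installation with the spark-sql module on the classpath. The object name ParquetDefaultDemo, the application name, the temporary directory, and the sample rows are all hypothetical and chosen only for this example. It reads the spark.sql.sources.default property, then saves and loads a small dataset without naming a format.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

// Hypothetical demo object; not part of Spark itself.
object ParquetDefaultDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")               // assumption: run locally for the demo
      .appName("parquet-default-demo")  // hypothetical application name
      .getOrCreate()
    import spark.implicits._

    // The property that decides which format is used when none is given.
    // It should report "parquet" unless the session overrides it.
    val defaultFormat = spark.conf.get("spark.sql.sources.default")
    println(s"spark.sql.sources.default = $defaultFormat")

    // A fresh, not-yet-existing output path inside a temporary directory.
    val path = Files.createTempDirectory("parquet-demo").resolve("people").toString

    // Sample data made up for this example.
    val people = Seq((1, "Ann"), (2, "Bob")).toDF("id", "name")

    // No format(...) call: save uses the default data source format.
    people.write.save(path)

    // No format(...) call on the read side either.
    val viaDefault = spark.read.load(path)

    // The same files read with the format named explicitly.
    val viaParquet = spark.read.parquet(path)

    assert(viaDefault.count() == viaParquet.count())

    spark.stop()
  }
}
```

Setting spark.sql.sources.default to another format (for example, with spark.conf.set) would change what save and load do here, which is why production code often names the format explicitly with format("parquet") or the parquet shortcut methods.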


Last update: 2021-05-23