Skip to content

Partitioning — Specification of Physical Operator's Output Partitions

Partitioning is an abstraction of partitioning schemes that hint the Spark Physical Optimizer about the number of partitions and data distribution of the output of a physical operator.

Contract

Number of Partitions

numPartitions: Int

Used when:

Implementations

BroadcastPartitioning

BroadcastPartitioning(
  mode: BroadcastMode)

compatibleWith: BroadcastPartitioning with the same BroadcastMode

guarantees: Exactly the same BroadcastPartitioning

numPartitions: 1

satisfies: BroadcastDistribution with the same BroadcastMode

DataSourcePartitioning

DataSourcePartitioning(
  partitioning: Partitioning,
  colNames: AttributeMap[String])

HashPartitioning

HashPartitioning

numPartitions: the given numPartitions

satisfies:

PartitioningCollection

PartitioningCollection(
  partitionings: Seq[Partitioning])

compatibleWith: Any Partitioning that is compatible with one of the input partitionings

guarantees: Any Partitioning that is guaranteed by any of the input partitionings

numPartitions: Number of partitions of the first Partitioning in the input partitionings

satisfies: Any Distribution that is satisfied by any of the input partitionings

RangePartitioning

RangePartitioning

compatibleWith: RangePartitioning when semantically equal (i.e. underlying expressions are deterministic and canonically equal)

guarantees: RangePartitioning when semantically equal (i.e. underlying expressions are deterministic and canonically equal)

numPartitions: the given numPartitions

satisfies:

RoundRobinPartitioning

RoundRobinPartitioning(
  numPartitions: Int)

compatibleWith: Always false

guarantees: Always false

numPartitions: the given numPartitions

satisfies: UnspecifiedDistribution

SinglePartition

compatibleWith: Any Partitioning with one partition

guarantees: Any Partitioning with one partition

numPartitions: 1

satisfies: Any Distribution except BroadcastDistribution

UnknownPartitioning

UnknownPartitioning(
  numPartitions: Int)

compatibleWith: Always false

guarantees: Always false

numPartitions: the given numPartitions

satisfies: UnspecifiedDistribution


Last update: 2020-08-24
Back to top