Skip to content

UnsafeFixedWidthAggregationMap

UnsafeFixedWidthAggregationMap is a tiny layer (extension) around Spark Core's <> to allow for <> keys and values.

Whenever requested for performance metrics (i.e. <> and <>), UnsafeFixedWidthAggregationMap simply requests the underlying <>.

UnsafeFixedWidthAggregationMap is <> when:

[[internal-registries]] .UnsafeFixedWidthAggregationMap's Internal Properties (e.g. Registries, Counters and Flags) [cols="1m,2",options="header",width="100%"] |=== | Name | Description

| currentAggregationBuffer | [[currentAggregationBuffer]] Re-used pointer (as an <> with the number of fields to match the <>) to the current aggregation buffer

Used exclusively when UnsafeFixedWidthAggregationMap is requested to <>.

| emptyAggregationBuffer | [[emptyAggregationBuffer-byte-array]] <> (encoded in UnsafeRow format)

| groupingKeyProjection | [[groupingKeyProjection]] UnsafeProjection for the <> (to encode grouping keys as UnsafeRows)

| map a| [[map]] Spark Core's BytesToBytesMap with the <>, <>, <> and performance metrics enabled |===

=== [[supportsAggregationBufferSchema]] supportsAggregationBufferSchema Static Method

boolean supportsAggregationBufferSchema(
  StructType schema)

supportsAggregationBufferSchema is a predicate that is enabled (true) unless there is a field (in the fields of the input schema) whose data type is not mutable.

[NOTE]

The mutable data types: BooleanType, ByteType, DateType, DecimalType, DoubleType, FloatType, IntegerType, LongType, NullType, ShortType and TimestampType.

Examples (possibly all) of data types that are not mutable: ArrayType, BinaryType, StringType, CalendarIntervalType, MapType, ObjectType and StructType.

[source, scala]

import org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap

import org.apache.spark.sql.types._ val schemaWithImmutableField = StructType(StructField("string", StringType) :: Nil) assert(UnsafeFixedWidthAggregationMap.supportsAggregationBufferSchema(schemaWithImmutableField) == false)

val schemaWithMutableFields = StructType( StructField("int", IntegerType) :: StructField("bool", BooleanType) :: Nil) assert(UnsafeFixedWidthAggregationMap.supportsAggregationBufferSchema(schemaWithMutableFields))


supportsAggregationBufferSchema is used when HashAggregateExec is requested to supportsAggregate.

Creating Instance

UnsafeFixedWidthAggregationMap takes the following when created:

  • [[emptyAggregationBuffer]] Empty aggregation buffer (as an InternalRow)
  • [[aggregationBufferSchema]] Aggregation buffer schema
  • [[groupingKeySchema]] Grouping key schema
  • [[taskMemoryManager]] Spark Core's TaskMemoryManager
  • [[initialCapacity]] Initial capacity
  • [[pageSizeBytes]] Page size (in bytes)

UnsafeFixedWidthAggregationMap initializes the <>.

=== [[getAggregationBufferFromUnsafeRow]] getAggregationBufferFromUnsafeRow Method

[source, scala]

UnsafeRow getAggregationBufferFromUnsafeRow(UnsafeRow key) // <1> UnsafeRow getAggregationBufferFromUnsafeRow(UnsafeRow key, int hash)


<1> Uses the hash code of the key

getAggregationBufferFromUnsafeRow requests the <> to lookup the input key (to get a BytesToBytesMap.Location).

getAggregationBufferFromUnsafeRow...FIXME

[NOTE]

getAggregationBufferFromUnsafeRow is used when:

  • TungstenAggregationIterator is requested to <> (exclusively when TungstenAggregationIterator is <>)

* (for testing only) UnsafeFixedWidthAggregationMap is requested to <>

=== [[getAggregationBuffer]] getAggregationBuffer Method

[source, java]

UnsafeRow getAggregationBuffer(InternalRow groupingKey)

getAggregationBuffer...FIXME

NOTE: getAggregationBuffer seems to be used exclusively for testing.

=== [[iterator]] Getting KVIterator -- iterator Method

[source, java]

KVIterator iterator()

iterator...FIXME

iterator is used when:

=== [[getPeakMemoryUsedBytes]] getPeakMemoryUsedBytes Method

[source, java]

long getPeakMemoryUsedBytes()

getPeakMemoryUsedBytes...FIXME

getPeakMemoryUsedBytes is used when:

=== [[getAverageProbesPerLookup]] getAverageProbesPerLookup Method

[source, java]

double getAverageProbesPerLookup()

getAverageProbesPerLookup...FIXME

getAverageProbesPerLookup is used when:

=== [[free]] free Method

[source, java]

void free()

free...FIXME

free is used when:

=== [[destructAndCreateExternalSorter]] destructAndCreateExternalSorter Method

[source, java]

UnsafeKVExternalSorter destructAndCreateExternalSorter() throws IOException

destructAndCreateExternalSorter...FIXME

destructAndCreateExternalSorter is used when:


Last update: 2021-04-13