DataSourceV2Relation Leaf Logical Operator

DataSourceV2Relation is a leaf logical operator that represents a data scan over tables with support for BATCH_READ (at the very least).

Creating Instance

DataSourceV2Relation takes the following to be created:

DataSourceV2Relation is created (indirectly) using the create utility and withMetadataColumns.

Creating DataSourceV2Relation

create(
  table: Table,
  catalog: Option[CatalogPlugin],
  identifier: Option[Identifier]): DataSourceV2Relation
create(
  table: Table,
  catalog: Option[CatalogPlugin],
  identifier: Option[Identifier],
  options: CaseInsensitiveStringMap): DataSourceV2Relation

create replaces the CharType and VarcharType types in the schema of the given Table with "annotated" StringType (the query engine does not support char/varchar natively, so the original type is kept as an annotation on the field).

In the end, create uses the new schema to create a DataSourceV2Relation.
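The type replacement described above can be pictured with a small, self-contained model. This is only a sketch: the data types, the Field class, the RawTypeKey metadata key and the replaceCharVarcharWithString helper are hypothetical stand-ins written for illustration, not Spark's own classes, and nested types (arrays, maps, structs) are ignored.

```scala
object CharVarcharSketch {
  // Hypothetical, simplified stand-ins for data types (not Spark's classes)
  sealed trait DataType
  case object StringType extends DataType
  case object IntegerType extends DataType
  final case class CharType(length: Int) extends DataType
  final case class VarcharType(length: Int) extends DataType

  // A field whose metadata can carry the "annotation" with the original type
  final case class Field(
      name: String,
      dataType: DataType,
      metadata: Map[String, String] = Map.empty)

  // Hypothetical metadata key, chosen for this sketch only
  val RawTypeKey = "rawType"

  // Replaces char/varchar with StringType and remembers the original type
  def replaceCharVarcharWithString(schema: Seq[Field]): Seq[Field] = schema.map {
    case f @ Field(_, CharType(n), md) =>
      f.copy(dataType = StringType, metadata = md + (RawTypeKey -> s"char($n)"))
    case f @ Field(_, VarcharType(n), md) =>
      f.copy(dataType = StringType, metadata = md + (RawTypeKey -> s"varchar($n)"))
    case other =>
      other // all other fields are left untouched
  }
}
```

In this model a relation would then be built from the rewritten schema, which mirrors the last step of create.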

create is used when:

MultiInstanceRelation

DataSourceV2Relation is a MultiInstanceRelation.

Metadata Columns

metadataOutput: Seq[AttributeReference]

metadataOutput is part of the LogicalPlan abstraction.

metadataOutput requests the Table for the metadata columns (if it is a SupportsMetadataColumns).

metadataOutput filters out metadata columns with the same name as regular output columns.
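The filtering step can be sketched as follows. The Attr class is a hypothetical stand-in for an attribute, and comparing names case-insensitively is an assumption of this sketch (the page does not say how names are compared).

```scala
import java.util.Locale

object MetadataOutputSketch {
  // Hypothetical stand-in for an attribute (a real attribute carries much more)
  final case class Attr(name: String)

  // Keeps only the metadata columns whose names do not clash with
  // the names of regular output columns.
  def metadataOutput(output: Seq[Attr], metadataColumns: Seq[Attr]): Seq[Attr] = {
    // Assumption: names are compared case-insensitively
    val outputNames = output.map(_.name.toLowerCase(Locale.ROOT)).toSet
    metadataColumns.filterNot { col =>
      outputNames.contains(col.name.toLowerCase(Locale.ROOT))
    }
  }
}
```

With output columns id and _file and metadata columns _file and _partition, only _partition would survive in this model.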

Creating DataSourceV2Relation with Metadata Columns

withMetadataColumns(): DataSourceV2Relation

withMetadataColumns creates a DataSourceV2Relation with the extra metadataOutput (for the output attributes) if defined.
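A minimal model of this behaviour is sketched below. The Relation class and its fields are hypothetical, and returning the same relation when there is nothing to add is an assumption about what "if defined" means here.

```scala
object WithMetadataColumnsSketch {
  // Hypothetical, simplified relation: column names only
  final case class Relation(output: Seq[String], tableMetadataColumns: Seq[String]) {

    // Metadata columns not already present among the output columns
    def metadataOutput: Seq[String] =
      tableMetadataColumns.filterNot(c => output.contains(c))

    // Appends the metadata columns to the output, if there are any to add;
    // otherwise the relation is returned unchanged (an assumption of this sketch)
    def withMetadataColumns(): Relation =
      if (metadataOutput.nonEmpty) copy(output = output ++ metadataOutput)
      else this
  }
}
```

Because metadataOutput is derived from the current output in this model, calling withMetadataColumns a second time changes nothing.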

withMetadataColumns is used when:

Required Table Capabilities

TableCapabilityCheck is used to assert the following regarding DataSourceV2Relation and the Table:

  1. Table supports BATCH_READ
  2. Table supports BATCH_WRITE or V1_BATCH_WRITE for AppendData (append in batch mode)
  3. Table supports BATCH_WRITE with OVERWRITE_DYNAMIC for OverwritePartitionsDynamic (dynamic overwrite in batch mode)
  4. Table supports BATCH_WRITE or V1_BATCH_WRITE together with OVERWRITE_BY_FILTER for OverwriteByExpression (overwrite by filter in batch mode); TRUNCATE is accepted in place of OVERWRITE_BY_FILTER when the expression deletes all rows (truncate in batch mode)
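The four assertions above can be read as a decision table. The sketch below models them with plain Scala values; the Capability and Operation types and the supports helper are hypothetical and only borrow the capability names from the list. The OverwriteByExpression branches reflect one reading of item 4 (a batch-write capability plus OVERWRITE_BY_FILTER, with TRUNCATE also accepted when everything is deleted) and should be treated as an assumption.

```scala
object CapabilityCheckSketch {
  sealed trait Capability
  case object BATCH_READ extends Capability
  case object BATCH_WRITE extends Capability
  case object V1_BATCH_WRITE extends Capability
  case object OVERWRITE_DYNAMIC extends Capability
  case object OVERWRITE_BY_FILTER extends Capability
  case object TRUNCATE extends Capability

  // Hypothetical stand-ins for the logical operators in the list
  sealed trait Operation
  case object Read extends Operation
  case object AppendData extends Operation
  case object OverwritePartitionsDynamic extends Operation
  // deleteAll = true models an overwrite that truncates the whole table
  final case class OverwriteByExpression(deleteAll: Boolean) extends Operation

  def supports(op: Operation, caps: Set[Capability]): Boolean = {
    val batchWrite = caps(BATCH_WRITE) || caps(V1_BATCH_WRITE)
    op match {
      case Read                       => caps(BATCH_READ)
      case AppendData                 => batchWrite
      case OverwritePartitionsDynamic => caps(BATCH_WRITE) && caps(OVERWRITE_DYNAMIC)
      case OverwriteByExpression(true) =>
        batchWrite && (caps(TRUNCATE) || caps(OVERWRITE_BY_FILTER))
      case OverwriteByExpression(false) =>
        batchWrite && caps(OVERWRITE_BY_FILTER)
    }
  }
}
```

For example, a table with only BATCH_READ and V1_BATCH_WRITE would pass the read and append checks in this model, but not the dynamic overwrite check.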

Name

name: String

name is part of the NamedRelation abstraction.

name requests the Table for the name.

Simple Node Description

simpleString(
  maxFields: Int): String

simpleString is part of the TreeNode abstraction.

simpleString gives the following (with the output and the name):

RelationV2[output] [name]
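As an illustration of the format, the sketch below renders such a description from a list of output column names and a table name. The helper is hypothetical, it reads the template as the output columns in square brackets followed by the bare name, and it ignores maxFields (no truncation of long column lists is modelled).

```scala
object SimpleStringSketch {
  // Renders "RelationV2[col1, col2, ...] name" (no maxFields truncation here)
  def simpleString(output: Seq[String], name: String): String =
    s"RelationV2${output.mkString("[", ", ", "]")} $name"
}
```

Under these assumptions, the columns id#0 and name#1 of a table named testcat.ns.t would be shown as RelationV2[id#0, name#1] testcat.ns.t.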