DataSourceV2Strategy Execution Planning Strategy
`DataSourceV2Strategy` is an execution planning strategy that `SparkPlanner` uses to plan the logical operators in the table below as physical operators.
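[[logical-operators]]
.DataSourceV2Strategy's Execution Planning
[cols="1,1",options="header",width="100%"]
|===
| Logical Operator
| Physical Operator

| DataSourceV2Relation
| DataSourceV2ScanExec

| StreamingDataSourceV2Relation
| <

| WriteToDataSourceV2
| <

| AppendData
| <

| WriteToContinuousDataSource
| WriteToContinuousDataSourceExec

| Repartition with a StreamingDataSourceV2Relation and a ContinuousReader
| ContinuousCoalesceExec
|===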
[[logging]]
[TIP]
====
Enable `INFO` logging level for `org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy` logger to see what happens inside.

Add the following line to `conf/log4j.properties`:

    log4j.logger.org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy=INFO

Refer to spark-logging.md[Logging].
====
Applying DataSourceV2Strategy Strategy to Logical Plan
apply(plan: LogicalPlan): Seq[SparkPlan]
`apply` branches off per the given logical operator.

`apply` is part of the `GenericStrategy` abstraction.
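The branching described above can be pictured as a pattern match with one case per supported logical operator. The following is a minimal sketch only: `LogicalOp`, `PhysicalOp`, `V2Relation`, `ContinuousWrite`, `PlanLater` and the other types are hypothetical stand-ins invented for illustration, not Spark classes, and returning an empty sequence for unsupported operators is an assumption about how such a strategy signals that it does not apply.

```scala
// Hypothetical, simplified stand-ins for logical operators (not Spark classes).
sealed trait LogicalOp
case class V2Relation(source: String) extends LogicalOp
case class ContinuousWrite(query: LogicalOp) extends LogicalOp
case class Unsupported(name: String) extends LogicalOp

// Hypothetical, simplified stand-ins for physical operators (not Spark classes).
sealed trait PhysicalOp
case class V2ScanExec(source: String) extends PhysicalOp
case class ContinuousWriteExec(child: PhysicalOp) extends PhysicalOp
// Placeholder for a child that is planned separately (an assumption of this sketch).
case class PlanLater(plan: LogicalOp) extends PhysicalOp

object StrategySketch {
  // Same shape as apply(plan: LogicalPlan): Seq[SparkPlan]:
  // one case per supported logical operator, an empty Seq otherwise.
  def apply(plan: LogicalOp): Seq[PhysicalOp] = plan match {
    case V2Relation(source)     => Seq(V2ScanExec(source))
    case ContinuousWrite(query) => Seq(ContinuousWriteExec(PlanLater(query)))
    case _                      => Seq.empty
  }
}
```

With these toy types, `StrategySketch(V2Relation("parquet"))` yields a single scan operator, while an operator the sketch does not know about yields no plan at all.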
==== [[apply-DataSourceV2Relation]] DataSourceV2Relation Logical Operator
For a `DataSourceV2Relation` logical operator, `apply` ...FIXME

`apply` then <

`apply` prints out the following INFO message to the logs:
Pushing operators to [ClassName of DataSourceV2]
Pushed Filters: [pushedFilters]
Post-Scan Filters: [postScanFilters]
Output: [output]
`apply` uses the `DataSourceV2Relation` to create a `DataSourceV2ScanExec` physical operator.

If there are any `postScanFilters`, `apply` creates a `FilterExec` physical operator with the `DataSourceV2ScanExec` physical operator as the child.
In the end, `apply` creates a < `FilterExec` with the `DataSourceV2ScanExec` or directly with the `DataSourceV2ScanExec` physical operator.
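The conditional wrapping of the scan can be sketched as follows. This is an illustration under stated assumptions, not Spark's code: `ExecNode`, `ScanNode` and `FilterNode` are hypothetical types, filters are modeled as plain strings, and combining several post-scan filters with `AND` is an assumption of this sketch.

```scala
// Hypothetical, simplified physical operators (not Spark's classes).
sealed trait ExecNode
case class ScanNode(output: Seq[String]) extends ExecNode
case class FilterNode(condition: String, child: ExecNode) extends ExecNode

object ScanPlanningSketch {
  // Wraps the scan in a filter only when post-scan filters remain.
  // Several filters are combined into one condition with AND (an assumption).
  def withPostScanFilters(scan: ScanNode, postScanFilters: Seq[String]): ExecNode =
    postScanFilters.reduceOption((l, r) => s"($l) AND ($r)") match {
      case Some(condition) => FilterNode(condition, scan)
      case None            => scan
    }
}
```

When the filter list is empty the scan is returned unchanged, which corresponds to the "directly with the `DataSourceV2ScanExec`" case in the text.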
==== [[apply-StreamingDataSourceV2Relation]] StreamingDataSourceV2Relation Logical Operator
For a `StreamingDataSourceV2Relation` logical operator, `apply` ...FIXME
==== [[apply-WriteToDataSourceV2]] WriteToDataSourceV2 Logical Operator
For a `WriteToDataSourceV2` logical operator, `apply` simply creates a <
==== [[apply-AppendData]] AppendData Logical Operator
For an `AppendData` logical operator, `apply` requests the <
==== [[apply-WriteToContinuousDataSource]] WriteToContinuousDataSource Logical Operator
For a `WriteToContinuousDataSource` logical operator, `apply` ...FIXME
==== [[apply-Repartition]] Repartition Logical Operator
For a `Repartition` logical operator, `apply` ...FIXME
=== [[pushFilters]] `pushFilters` Internal Method

[source, scala]
pushFilters(
  reader: DataSourceReader,
  filters: Seq[Expression]): (Seq[Expression], Seq[Expression])
`pushFilters` ...FIXME

In the end, `pushFilters` returns a pair of the filters that were pushed down and the filters that were not.

NOTE: `pushFilters` is used exclusively when `DataSourceV2Strategy` execution planning strategy is <
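The shape of the result, a pair of pushed and non-pushed filters, can be illustrated with a small self-contained sketch. Everything here is hypothetical: `SimpleFilter` and the `canPush` predicate stand in for Spark's expressions and for whatever a data source reader accepts, and the clean split into two disjoint groups is a simplifying assumption, since the method body is not documented above.

```scala
object PushFiltersSketch {
  // Hypothetical filter representation: a column name plus a predicate string.
  case class SimpleFilter(column: String, predicate: String)

  // Splits filters into (pushed, postScan).
  // `canPush` is a hypothetical stand-in for what the data source accepts.
  def pushFilters(
      canPush: SimpleFilter => Boolean,
      filters: Seq[SimpleFilter]): (Seq[SimpleFilter], Seq[SimpleFilter]) =
    filters.partition(canPush)
}
```

For example, a source that can only evaluate predicates on an `id` column would receive those filters, and the remaining ones would have to be evaluated after the scan.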
=== [[pruneColumns]] Column Pruning -- `pruneColumns` Internal Method

[source, scala]
pruneColumns(
  reader: DataSourceReader,
  relation: DataSourceV2Relation,
  exprs: Seq[Expression]): Seq[AttributeReference]
`pruneColumns` ...FIXME

NOTE: `pruneColumns` is used when...FIXME