WriteToDataSourceV2Exec Physical Operator

WriteToDataSourceV2Exec is a physical operator that represents an AppendData logical operator (and a deprecated WriteToDataSourceV2 logical operator) at execution time.

WriteToDataSourceV2Exec is created when DataSourceV2Strategy execution planning strategy is requested to plan an AppendData logical operator (or a deprecated WriteToDataSourceV2 logical operator).

NOTE: Although the WriteToDataSourceV2 logical operator has been deprecated since Spark SQL 2.4.0 (in favor of the AppendData logical operator), AppendData is currently used in tests only. That keeps the WriteToDataSourceV2 logical operator relevant.

[[creating-instance]] WriteToDataSourceV2Exec takes the following to be created:

  • [[writer]] FIXME
  • [[query]] Child physical operator

[[children]] When requested for the child operators, WriteToDataSourceV2Exec gives the one child physical operator.

[[output]] When requested for the output attributes, WriteToDataSourceV2Exec gives no attributes (an empty collection).
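The shape described above (two constructor arguments, a single child, an empty output) can be modelled in plain Scala. This is only an illustrative sketch: `PhysicalOperator`, `Attribute`, `Writer`, `LeafScan` and `WriteExecModel` are hypothetical stand-ins invented for this example, not Spark's classes, and the sketch assumes the child is itself a physical operator.

```scala
object OperatorModel {
  // Hypothetical stand-ins for illustration only (not Spark's own types)
  trait Attribute
  trait Writer
  trait PhysicalOperator {
    def children: Seq[PhysicalOperator]
    def output: Seq[Attribute]
  }

  // A leaf operator that produces one attribute and has no children
  case object LeafScan extends PhysicalOperator {
    override def children: Seq[PhysicalOperator] = Nil
    override def output: Seq[Attribute] = Seq(new Attribute {})
  }

  // Model of a write operator: takes a writer and a child query
  final case class WriteExecModel(writer: Writer, query: PhysicalOperator)
      extends PhysicalOperator {
    // Exactly one child: the query whose rows are to be written
    override def children: Seq[PhysicalOperator] = Seq(query)
    // A write produces no rows for a parent operator, hence no attributes
    override def output: Seq[Attribute] = Nil
  }
}
```

With this model, `WriteExecModel(new Writer {}, LeafScan).children` holds only `LeafScan`, while `output` is empty regardless of what the child produces.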

[[logging]] TIP: Enable INFO logging level for org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec=INFO

Refer to Logging.
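As an alternative to editing conf/log4j.properties, the same logger level can be set programmatically, for example from spark-shell. This is a sketch that assumes the Log4j 1.x API is on the classpath (which the log4j.properties file above suggests); it is a session-local setting and does not persist across restarts.

```scala
import org.apache.log4j.{Level, Logger}

// Raise the operator's logger to INFO for the current JVM only
Logger
  .getLogger("org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec")
  .setLevel(Level.INFO)
```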

=== [[doExecute]] Executing Physical Operator (Generating RDD[InternalRow]) -- doExecute Method

[source, scala]

doExecute(): RDD[InternalRow]

doExecute is part of the SparkPlan abstraction.

doExecute...FIXME

doExecute requests the child physical operator to execute (that triggers physical query planning and in the end generates an RDD of InternalRows).

doExecute prints out the following INFO message to the logs:

Start processing data source writer: [writer]. The input RDD has [length] partitions.

[[doExecute-runJob]] doExecute requests the SparkContext to run a Spark job with the following:

  • The RDD[InternalRow] of the child physical operator

  • A partition processing function that requests the DataWritingSparkTask object to run the writing task (of the data source writer) with or without a commit coordinator

  • A result handler function that records the result WriterCommitMessage from a successful data writer and requests FIXME
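The three-part call above (an RDD, a partition processing function, a result handler) can be tried in isolation with `SparkContext.runJob`. The following is a minimal sketch, not the operator's actual code: it assumes spark-core is on the classpath, runs on a local master, uses an `RDD[Int]` in place of `RDD[InternalRow]`, and `CommitMessage` is a hypothetical stand-in for WriterCommitMessage. The partition function merely counts rows instead of delegating to DataWritingSparkTask.

```scala
import org.apache.spark.{SparkConf, SparkContext, TaskContext}

object RunJobSketch {
  // Hypothetical stand-in for WriterCommitMessage
  final case class CommitMessage(partitionId: Int, rowsWritten: Long)

  def run(): Array[CommitMessage] = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("runJob-sketch")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(1 to 10, numSlices = 4)
      // One slot per partition, filled in on the driver by the result handler
      val messages = new Array[CommitMessage](rdd.partitions.length)

      sc.runJob(
        rdd,
        // Partition processing function: runs on executors, once per partition
        (context: TaskContext, rows: Iterator[Int]) =>
          CommitMessage(context.partitionId(), rows.size.toLong),
        rdd.partitions.indices,
        // Result handler: runs on the driver as each partition's task succeeds
        (index: Int, message: CommitMessage) => messages(index) = message
      )
      messages
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = run().foreach(println)
}
```

The result handler is invoked per partition as tasks finish, so the driver can record each message without waiting for the whole job; the array is indexed by partition so that arrival order does not matter.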

doExecute prints out the following INFO message to the logs:

Data source writer [writer] is committing.

doExecute...FIXME

In the end, doExecute prints out the following INFO message to the logs:

Data source writer [writer] committed.

In case of any error (Throwable), doExecute...FIXME


Last update: 2020-11-13