
JDBCRelation

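JDBCRelation is a BaseRelation that supports inserting or overwriting data (as an InsertableRelation) and column pruning with filter pushdown (as a PrunedFilteredScan).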

[[BaseRelation]] As a BaseRelation, JDBCRelation defines the schema of tuples (data) and the SQLContext.

[[InsertableRelation]] As an InsertableRelation, JDBCRelation supports inserting or overwriting data.

[[PrunedFilteredScan]] As a PrunedFilteredScan, JDBCRelation supports building a distributed data scan with column pruning and filter pushdown.

JDBCRelation is created when:

[[toString]] When requested for a human-friendly text representation, JDBCRelation requests the JDBCOptions for the name of the table and appends the number of partitions (if defined).

JDBCRelation([table]) [numPartitions=[number]]

JDBCRelation in web UI (Details for Query)

scala> df.explain
== Physical Plan ==
*Scan JDBCRelation(projects) [numPartitions=1] [id#0,name#1,website#2] ReadSchema: struct<id:int,name:string,website:string>
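
The text representation described above can be sketched in plain Scala. This is a simplified stand-in, not Spark's own class: the `JdbcRelationText` object, the `describe` function and its parameters are hypothetical names introduced only for illustration.

```scala
// Simplified stand-in for illustration; not Spark's JDBCRelation class.
object JdbcRelationText {
  // table: name of the JDBC table
  // numPartitions: None when no partitioning is defined
  def describe(table: String, numPartitions: Option[Int]): String = {
    // The partitioning part is appended only when it is defined
    val partitioningInfo =
      numPartitions.map(n => s" [numPartitions=$n]").getOrElse("")
    s"JDBCRelation($table)$partitioningInfo"
  }
}
```

With the table and partition count from the physical plan above, `JdbcRelationText.describe("projects", Some(1))` gives `JDBCRelation(projects) [numPartitions=1]`.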

[[sqlContext]] JDBCRelation uses the SparkSession to return a SQLContext.

[[needConversion]] JDBCRelation turns the needConversion flag off to announce that buildScan already returns an RDD[InternalRow], so the DataSourceStrategy execution planning strategy does not have to do the RDD conversion.

Creating Instance

JDBCRelation takes the following to be created:

=== [[unhandledFilters]] Finding Unhandled Filter Predicates -- unhandledFilters Method

[source, scala]

unhandledFilters(filters: Array[Filter]): Array[Filter]

unhandledFilters is part of the BaseRelation abstraction.

unhandledFilters returns the Filter predicates in the input filters that could not be converted to a SQL expression (and are therefore unhandled by the JDBC data source natively).
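
The idea of "unhandled" predicates can be sketched with a small self-contained model. Everything here is an assumption made for illustration: the `Filter` hierarchy is a hypothetical stand-in for Spark's filter classes, `compileFilter` is a made-up helper, and which predicates convert to SQL is an arbitrary choice that does not reflect what the JDBC data source actually supports.

```scala
// Simplified model for illustration; not Spark's Filter classes.
object UnhandledFiltersSketch {
  sealed trait Filter
  final case class EqualTo(attribute: String, value: Any) extends Filter
  final case class GreaterThan(attribute: String, value: Any) extends Filter
  // Hypothetical predicate that this sketch cannot express in SQL
  final case class Unsupported(description: String) extends Filter

  // Render a literal for SQL: strings are single-quoted with quotes doubled
  private def literal(value: Any): String = value match {
    case s: String => "'" + s.replace("'", "''") + "'"
    case other     => other.toString
  }

  // Some(sql) when the predicate can be converted, None otherwise
  def compileFilter(filter: Filter): Option[String] = filter match {
    case EqualTo(attr, value)     => Some(s"$attr = ${literal(value)}")
    case GreaterThan(attr, value) => Some(s"$attr > ${literal(value)}")
    case _: Unsupported           => None
  }

  // Keep only the predicates that could not be converted to SQL
  def unhandledFilters(filters: Array[Filter]): Array[Filter] =
    filters.filter(f => compileFilter(f).isEmpty)
}
```

In this model, the predicates that come back from `unhandledFilters` are the ones that would still have to be evaluated after the rows are fetched.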

=== [[schema]] Schema of Tuples (Data) -- schema Property

[source, scala]

schema: StructType

schema uses the JDBCRDD object to resolve the table (resolveTable) given the JDBCOptions. That returns the schema of the table, also known as the default table schema.

If the customSchema JDBC option is defined, schema uses JdbcUtils to replace the data types in the default table schema.
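
The type replacement can be sketched as follows. This is a minimal model under stated assumptions, not the JdbcUtils code: `Field` is a hypothetical stand-in for a schema field, data types are plain strings, and column names are matched exactly (case-sensitively).

```scala
// Simplified model for illustration; not Spark's StructType or JdbcUtils.
object CustomSchemaSketch {
  // Hypothetical stand-in for a schema field
  final case class Field(name: String, dataType: String)

  // Replace the data type of every field named in the custom schema;
  // fields the custom schema does not mention keep their default type.
  def applyCustomSchema(default: Seq[Field], custom: Map[String, String]): Seq[Field] =
    default.map { field =>
      custom.get(field.name)
        .map(newType => field.copy(dataType = newType))
        .getOrElse(field)
    }
}
```

The result keeps the column order and names of the default table schema and changes only the data types.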

schema is part of the BaseRelation abstraction.

=== [[insert]] Inserting or Overwriting Data to JDBC Table -- insert Method

[source, scala]

insert(data: DataFrame, overwrite: Boolean): Unit

insert is part of the InsertableRelation abstraction.

insert simply requests the input DataFrame for a DataFrameWriter that in turn is requested to save the data to a table using the JDBC data source (itself!) with the url, table and all options.

insert also requests the DataFrameWriter to set the save mode as Overwrite or Append per the input overwrite flag.

Note

insert uses a "trick" to reuse the code that is responsible for saving data to a JDBC table.
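
The mapping from the overwrite flag to a save mode can be sketched in isolation. The `SaveMode` values below are local stand-ins defined only for this example, and `saveModeFor` is a hypothetical helper name.

```scala
// Simplified model for illustration; these are not Spark's SaveMode values.
object InsertSketch {
  sealed trait SaveMode
  case object Overwrite extends SaveMode
  case object Append extends SaveMode

  // overwrite = true replaces the existing data, false adds to it
  def saveModeFor(overwrite: Boolean): SaveMode =
    if (overwrite) Overwrite else Append
}
```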

=== [[buildScan]] Building Distributed Data Scan with Column Pruning and Filter Pushdown -- buildScan Method

[source, scala]

buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row]

buildScan is part of the PrunedFilteredScan abstraction.

buildScan uses the JDBCRDD object to create an RDD[Row] for a distributed data scan.
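
Column pruning and filter pushdown both end up shaping the query sent to the database. The sketch below shows that effect with a hypothetical `buildQuery` helper. It is an assumption-laden illustration rather than the JDBCRDD code: it does no identifier quoting, takes the filter predicates as already-rendered SQL strings, and selects a constant when no columns are required.

```scala
// Simplified model for illustration; not Spark's JDBCRDD.
object ScanQuerySketch {
  def buildQuery(
      table: String,
      requiredColumns: Array[String],
      whereClauses: Seq[String]): String = {
    // Column pruning: select only the required columns
    // (assumption: a constant is selected when none are required)
    val columnList =
      if (requiredColumns.isEmpty) "1" else requiredColumns.mkString(", ")
    // Filter pushdown: AND together the predicates that were converted to SQL
    val where =
      if (whereClauses.isEmpty) ""
      else whereClauses.map(c => s"($c)").mkString(" WHERE ", " AND ", "")
    s"SELECT $columnList FROM $table$where"
  }
}
```

For example, `ScanQuerySketch.buildQuery("projects", Array("id", "name"), Seq("id > 1"))` produces `SELECT id, name FROM projects WHERE (id > 1)`.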


Last update: 2021-05-30