
SessionCatalog — Session-Scoped Registry of Relational Entities

SessionCatalog is a catalog of relational entities in SparkSession (e.g. databases, tables, views, partitions, and functions).

SessionCatalog is used to create Logical Analyzer and SparkOptimizer (among other things).

Creating Instance

SessionCatalog takes the following to be created:

SessionCatalog and Spark SQL Services

SessionCatalog is created (and cached for later usage) when BaseSessionStateBuilder is requested for one.

Accessing SessionCatalog

SessionCatalog is available through SessionState (of a SparkSession).

scala> :type spark
org.apache.spark.sql.SparkSession

scala> :type spark.sessionState.catalog
org.apache.spark.sql.catalyst.catalog.SessionCatalog

Default Database Name

SessionCatalog defines default as the name of the default database.


SessionCatalog uses an ExternalCatalog for the metadata of permanent entities (i.e. tables).

SessionCatalog is in fact a layer over ExternalCatalog in a SparkSession which allows for different metastores (i.e. in-memory or hive) to be used.


requireTableExists(
  name: TableIdentifier): Unit


requireTableExists is used when...FIXME


databaseExists(
  db: String): Boolean


databaseExists is used when...FIXME

=== [[listTables]] listTables Method

[source, scala]

listTables(db: String): Seq[TableIdentifier] // <1>
listTables(db: String, pattern: String): Seq[TableIdentifier]

<1> Uses "*" as the pattern
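
The relationship between the two variants can be sketched in plain Scala. This is a toy model, not Spark's implementation: `ListTablesSketch`, `tablesByDb` and `globToRegex` are hypothetical names, table names are plain strings, and the pattern is assumed to be a simple glob in which `*` matches any run of characters.

```scala
// A toy model, NOT Spark's code: shows the one-argument variant
// delegating to the two-argument one with "*" as the pattern.
object ListTablesSketch {
  // Hypothetical stand-in for the tables registered per database
  val tablesByDb: Map[String, Seq[String]] = Map(
    "default" -> Seq("t1", "t2", "sales_2020"))

  // Turns a glob into a regular expression, quoting everything but `*`
  private def globToRegex(pattern: String): String =
    pattern.split("\\*", -1).map(java.util.regex.Pattern.quote).mkString(".*")

  // The two-argument variant does the actual filtering
  // (the sketch does not model errors for a missing database)
  def listTables(db: String, pattern: String): Seq[String] = {
    val regex = globToRegex(pattern)
    tablesByDb.getOrElse(db, Seq.empty).filter(_.matches(regex))
  }

  // The one-argument variant uses "*" (match everything) as the pattern
  def listTables(db: String): Seq[String] = listTables(db, "*")
}
```

With this model, `ListTablesSketch.listTables("default", "t*")` keeps only the names that start with `t`.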



listTables is used when:

  • ShowTablesCommand logical command is requested to run (executed)

  • SessionCatalog is requested to <> (for testing)

  • CatalogImpl is requested to listTables (for testing)

=== [[isTemporaryTable]] Checking Whether Table Is Temporary View -- isTemporaryTable Method

[source, scala]

isTemporaryTable(name: TableIdentifier): Boolean


NOTE: isTemporaryTable is used when...FIXME

=== [[alterPartitions]] alterPartitions Method

[source, scala]

alterPartitions(tableName: TableIdentifier, parts: Seq[CatalogTablePartition]): Unit


NOTE: alterPartitions is used when...FIXME

=== [[listPartitions]] listPartitions Method

[source, scala]

listPartitions(
  tableName: TableIdentifier,
  partialSpec: Option[TablePartitionSpec] = None): Seq[CatalogTablePartition]


NOTE: listPartitions is used when...FIXME

=== [[listPartitionsByFilter]] listPartitionsByFilter Method

[source, scala]

listPartitionsByFilter(
  tableName: TableIdentifier,
  predicates: Seq[Expression]): Seq[CatalogTablePartition]


NOTE: listPartitionsByFilter is used when...FIXME

=== [[alterTable]] alterTable Method

[source, scala]

alterTable(tableDefinition: CatalogTable): Unit


NOTE: alterTable is used when AlterTableSetPropertiesCommand, AlterTableUnsetPropertiesCommand, AlterTableChangeColumnCommand, AlterTableSerDePropertiesCommand, AlterTableRecoverPartitionsCommand, AlterTableSetLocationCommand, and AlterViewAsCommand (for permanent views) logical commands are executed.

=== [[alterTableStats]] Altering Table Statistics in Metastore (and Invalidating Internal Cache) -- alterTableStats Method

[source, scala]

alterTableStats(identifier: TableIdentifier, newStats: Option[CatalogStatistics]): Unit

alterTableStats requests the ExternalCatalog to alter the statistics of the table (per identifier) and then invalidates the table relation cache (refreshTable).

alterTableStats reports a NoSuchDatabaseException if the database does not exist.

alterTableStats reports a NoSuchTableException if the table does not exist.

alterTableStats is used when the following logical commands are executed:

=== [[tableExists]] tableExists Method

[source, scala]

tableExists(
  name: TableIdentifier): Boolean

tableExists requests the ExternalCatalog to check whether the table exists.

tableExists assumes the current database unless one is defined in the input TableIdentifier.

NOTE: tableExists is used when...FIXME

=== [[functionExists]] functionExists Method

[source, scala]

functionExists(name: FunctionIdentifier): Boolean


functionExists is used in:

=== [[listFunctions]] listFunctions Method

[source, scala]

listFunctions(
  db: String): Seq[(FunctionIdentifier, String)]
listFunctions(
  db: String,
  pattern: String): Seq[(FunctionIdentifier, String)]


NOTE: listFunctions is used when...FIXME

=== [[refreshTable]] Invalidating Table Relation Cache (aka Refreshing Table) -- refreshTable Method

[source, scala]

refreshTable(name: TableIdentifier): Unit


NOTE: refreshTable is used when...FIXME

=== [[loadFunctionResources]] loadFunctionResources Method

[source, scala]

loadFunctionResources(resources: Seq[FunctionResource]): Unit


NOTE: loadFunctionResources is used when...FIXME

=== [[alterTempViewDefinition]] Altering (Updating) Temporary View (Logical Plan) -- alterTempViewDefinition Method

[source, scala]

alterTempViewDefinition(name: TableIdentifier, viewDefinition: LogicalPlan): Boolean

alterTempViewDefinition alters either a local temporary view (when a database is not specified and the view has already been registered) or a global temporary view (when a database is specified and it is the database for global temporary views).

NOTE: "Temporary table" and "temporary view" are synonyms.

alterTempViewDefinition returns true when an update could be executed and finished successfully.
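
The branching described above can be modelled in a few lines of plain Scala. This is only a sketch under stated assumptions: `AlterTempViewSketch` and its members are hypothetical, a logical plan is reduced to a `String`, and `global_temp` is assumed as the name of the global temporary database.

```scala
import scala.collection.mutable

// A toy model, NOT Spark's code: plans are plain strings
object AlterTempViewSketch {
  type Plan = String
  // Assumed name of the database of global temporary views
  val globalTempDb = "global_temp"
  // Hypothetical registries of local and global temporary views
  val tempViews = mutable.Map[String, Plan]()
  val globalTempViews = mutable.Map[String, Plan]()

  // Returns true only when an already-registered view was updated
  def alterTempViewDefinition(db: Option[String], table: String, plan: Plan): Boolean =
    db match {
      // No database and the view is registered: update the local view
      case None if tempViews.contains(table) =>
        tempViews(table) = plan
        true
      // The global temp database and the view is registered: update it
      case Some(`globalTempDb`) if globalTempViews.contains(table) =>
        globalTempViews(table) = plan
        true
      // Anything else is not a temporary view this method can alter
      case _ => false
    }
}
```

A `false` result leaves both registries untouched, so a caller can fall back to treating the name as a permanent view.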

NOTE: alterTempViewDefinition is used exclusively when AlterViewAsCommand logical command is executed.

=== [[createTempView]] Creating (Registering) Or Replacing Local Temporary View -- createTempView Method

[source, scala]

createTempView(
  name: String,
  tableDefinition: LogicalPlan,
  overrideIfExists: Boolean): Unit


NOTE: createTempView is used when...FIXME

=== [[createGlobalTempView]] Creating (Registering) Or Replacing Global Temporary View -- createGlobalTempView Method

[source, scala]

createGlobalTempView(
  name: String,
  viewDefinition: LogicalPlan,
  overrideIfExists: Boolean): Unit

createGlobalTempView simply requests the GlobalTempViewManager to register a global temporary view.


createGlobalTempView is used when:

  • CreateViewCommand logical command is executed (for a global temporary view, i.e. when the view type is GlobalTempView)

  • CreateTempViewUsing logical command is executed (for a global temporary view, i.e. when the global flag is enabled)

Creating Table

createTable(
  tableDefinition: CatalogTable,
  ignoreIfExists: Boolean): Unit


NOTE: createTable is used when...FIXME

Finding Function by Name (Using FunctionRegistry)

lookupFunction(
  name: FunctionIdentifier,
  children: Seq[Expression]): Expression

lookupFunction finds a function by name.

For a function with no database defined that exists in FunctionRegistry, lookupFunction requests FunctionRegistry to find the function (by its unqualified name, i.e. with no database).

If the name function has the database defined or does not exist in FunctionRegistry, lookupFunction uses the fully-qualified function name to check if the function exists in FunctionRegistry (by its fully-qualified name, i.e. with a database).

For other cases, lookupFunction requests the ExternalCatalog to find the function and loads the function resources (loadFunctionResources). It then registers the function (registerFunction) and looks up the function again.
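
The lookup order above can be sketched as a small model in plain Scala. It is an illustration under assumptions, not Spark's code: `LookupFunctionSketch` and its members are hypothetical, a function is reduced to `String => String`, permanent functions are keyed by `db.name`, and resource loading is left out.

```scala
import scala.collection.mutable

// A toy model, NOT Spark's code
object LookupFunctionSketch {
  type Fn = String => String
  // Hypothetical session-level registry (unqualified or qualified names)
  val functionRegistry = mutable.Map[String, Fn]()
  // Hypothetical metastore of permanent functions, keyed by "db.name"
  val externalCatalog = mutable.Map[String, Fn]()
  val currentDb = "default"

  def lookupFunction(db: Option[String], name: String): Fn = {
    val qualified = s"${db.getOrElse(currentDb)}.$name"
    if (db.isEmpty && functionRegistry.contains(name)) {
      // 1. No database and the unqualified name is registered
      functionRegistry(name)
    } else if (functionRegistry.contains(qualified)) {
      // 2. The fully-qualified name is already registered
      functionRegistry(qualified)
    } else {
      // 3. Find the function in the metastore, register it under the
      //    fully-qualified name, and look it up again
      val fn = externalCatalog.getOrElse(qualified,
        throw new NoSuchElementException(s"Undefined function: $qualified"))
      functionRegistry(qualified) = fn
      functionRegistry(qualified)
    }
  }
}
```

In this model the third step is what makes a second lookup of the same permanent function hit the registry (step 2) instead of the metastore.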

lookupFunction is used when:

  • ResolveFunctions logical resolution rule is executed (and resolves UnresolvedGenerator or UnresolvedFunction expressions)

  • HiveSessionCatalog is requested to lookupFunction0

=== [[lookupRelation]] Finding Relation (Table or View) in Catalogs -- lookupRelation Method

[source, scala]

lookupRelation(name: TableIdentifier): LogicalPlan

lookupRelation finds the name table in the catalogs (i.e. GlobalTempViewManager, ExternalCatalog or the registry of temporary views) and gives a SubqueryAlias per table type.

[source, scala]

scala> :type spark.sessionState.catalog
org.apache.spark.sql.catalyst.catalog.SessionCatalog

import spark.sessionState.{catalog => c}
import org.apache.spark.sql.catalyst.TableIdentifier

// Global temp view
val db = spark.sharedState.globalTempViewManager.database
// Make the example reproducible (and so "replace")
spark.range(1).createOrReplaceGlobalTempView("gv1")
val gv1 = TableIdentifier(table = "gv1", database = Some(db))
val plan = c.lookupRelation(gv1)
scala> println(plan.numberedTreeString)
00 SubqueryAlias gv1
01 +- Range (0, 1, step=1, splits=Some(8))

val metastore = spark.sharedState.externalCatalog

// Regular table
val db = spark.catalog.currentDatabase
metastore.dropTable(db, table = "t1", ignoreIfNotExists = true, purge = true)
sql("CREATE TABLE t1 (id LONG) USING parquet")
val t1 = TableIdentifier(table = "t1", database = Some(db))
val plan = c.lookupRelation(t1)
scala> println(plan.numberedTreeString)
00 'SubqueryAlias t1
01 +- 'UnresolvedCatalogRelation default.t1,

// Regular view (not temporary view!)
// Make the example reproducible
metastore.dropTable(db, table = "v1", ignoreIfNotExists = true, purge = true)
import org.apache.spark.sql.catalyst.catalog.{CatalogStorageFormat, CatalogTable, CatalogTableType}
val v1 = TableIdentifier(table = "v1", database = Some(db))
import org.apache.spark.sql.types.StructType
val schema = new StructType().add($"id".long)
val storage = CatalogStorageFormat(
  locationUri = None,
  inputFormat = None,
  outputFormat = None,
  serde = None,
  compressed = false,
  properties = Map())
val tableDef = CatalogTable(
  identifier = v1,
  tableType = CatalogTableType.VIEW,
  storage,
  schema,
  viewText = Some("SELECT 1") /** Required or RuntimeException reported */)
metastore.createTable(tableDef, ignoreIfExists = false)
val plan = c.lookupRelation(v1)
scala> println(plan.numberedTreeString)
00 'SubqueryAlias v1
01 +- View (default.v1, [id#77L])
02    +- 'Project [unresolvedalias(1, None)]
03       +- OneRowRelation

// Temporary view
spark.range(1).createOrReplaceTempView("v2")
val v2 = TableIdentifier(table = "v2", database = None)
val plan = c.lookupRelation(v2)
scala> println(plan.numberedTreeString)
00 SubqueryAlias v2
01 +- Range (0, 1, step=1, splits=Some(8))

Internally, lookupRelation looks up the name table using:

. GlobalTempViewManager when the database name of the table matches the name of GlobalTempViewManager

a. Gives SubqueryAlias or reports a NoSuchTableException

. ExternalCatalog when the database name of the table is specified explicitly or the registry of temporary views does not contain the table

a. Gives SubqueryAlias with View when the table is a view (i.e. a permanent view registered in the metastore)

b. Gives SubqueryAlias with UnresolvedCatalogRelation otherwise

. The registry of temporary views

a. Gives SubqueryAlias with the logical plan per the table as registered in the registry of temporary views

NOTE: lookupRelation uses the current database (default, unless changed) when the name table does not specify the database explicitly.
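
The order in which the three catalogs are consulted can be sketched in plain Scala. This is a toy model rather than Spark's code: `LookupRelationSketch`, its case classes and its sample data are hypothetical stand-ins, `Parsed` replaces any real logical plan, `global_temp` is assumed as the name of the global temporary database, and a missing table is reported with a plain `NoSuchElementException`.

```scala
// A toy model, NOT Spark's code
object LookupRelationSketch {
  sealed trait Plan
  case class SubqueryAlias(alias: String, child: Plan) extends Plan
  case class View(table: String, child: Plan) extends Plan
  case class UnresolvedCatalogRelation(table: String) extends Plan
  case class Parsed(text: String) extends Plan // stand-in for any other plan

  val globalTempDb = "global_temp" // assumed name
  val currentDb = "default"
  val globalTempViews: Map[String, Plan] = Map("gv1" -> Parsed("range"))
  val tempViews: Map[String, Plan] = Map("v2" -> Parsed("range"))
  // Metastore entries: Some(viewText) marks a view, None a regular table
  val externalCatalog: Map[(String, String), Option[String]] = Map(
    ("default", "t1") -> None,
    ("default", "v1") -> Some("SELECT 1"))

  def lookupRelation(database: Option[String], table: String): Plan = {
    val db = database.getOrElse(currentDb)
    if (db == globalTempDb) {
      // 1. Global temporary views
      globalTempViews.get(table).map(SubqueryAlias(table, _))
        .getOrElse(throw new NoSuchElementException(s"$db.$table"))
    } else if (database.isDefined || !tempViews.contains(table)) {
      // 2. Metastore: a view or an unresolved catalog relation
      externalCatalog.get((db, table)) match {
        case Some(Some(viewText)) =>
          SubqueryAlias(table, View(s"$db.$table", Parsed(viewText)))
        case Some(None) =>
          SubqueryAlias(table, UnresolvedCatalogRelation(s"$db.$table"))
        case None =>
          throw new NoSuchElementException(s"$db.$table")
      }
    } else {
      // 3. Local temporary views
      SubqueryAlias(table, tempViews(table))
    }
  }
}
```

Note that in this model an unqualified name is served from the local temporary views first, so a temporary view shadows a metastore table of the same name unless the database is given explicitly.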

lookupRelation is used when:

  • DescribeTableCommand logical command is executed

  • ResolveRelations logical resolution rule is requested to lookupTableFromCatalog

=== [[getTableMetadata]] Retrieving Table Metadata from External Catalog (Metastore) -- getTableMetadata Method

[source, scala]

getTableMetadata(name: TableIdentifier): CatalogTable

getTableMetadata simply requests the ExternalCatalog (metastore) for the table metadata.

Before requesting the external metastore, getTableMetadata makes sure that the database and the table (of the input TableIdentifier) both exist. If either does not exist, getTableMetadata reports a NoSuchDatabaseException or NoSuchTableException, respectively.
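
The check-then-fetch order can be illustrated with a minimal model in plain Scala. Everything here is a hypothetical stand-in, not Spark's code: `GetTableMetadataSketch`, the `NoSuchDatabase` and `NoSuchTable` exceptions, and a metastore reduced to nested maps with the table metadata as a `String`.

```scala
// A toy model, NOT Spark's code
object GetTableMetadataSketch {
  // Hypothetical stand-ins for Spark's exceptions
  final case class NoSuchDatabase(db: String)
    extends Exception(s"Database '$db' not found")
  final case class NoSuchTable(db: String, table: String)
    extends Exception(s"Table '$table' not found in database '$db'")

  // Hypothetical metastore: database -> (table -> metadata)
  val metastore: Map[String, Map[String, String]] =
    Map("default" -> Map("t1" -> "id LONG"))
  val currentDb = "default"

  def requireDbExists(db: String): Unit =
    if (!metastore.contains(db)) throw NoSuchDatabase(db)

  def requireTableExists(db: String, table: String): Unit =
    if (!metastore(db).contains(table)) throw NoSuchTable(db, table)

  def getTableMetadata(database: Option[String], table: String): String = {
    val db = database.getOrElse(currentDb)
    requireDbExists(db)            // database first...
    requireTableExists(db, table)  // ...then the table
    metastore(db)(table)           // only now ask for the metadata
  }
}
```

Checking the database before the table is what lets a caller tell the two failures apart.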

=== [[getTempViewOrPermanentTableMetadata]] Retrieving Table Metadata -- getTempViewOrPermanentTableMetadata Method

[source, scala]

getTempViewOrPermanentTableMetadata(name: TableIdentifier): CatalogTable

Internally, getTempViewOrPermanentTableMetadata branches off per database.

When a database name is not specified, getTempViewOrPermanentTableMetadata looks up the local temporary view and creates a CatalogTable (with VIEW table type and an undefined storage) or, when there is no such view, retrieves the table metadata from the external catalog (getTableMetadata).

With the database name of the GlobalTempViewManager, getTempViewOrPermanentTableMetadata requests the GlobalTempViewManager for the global view definition and creates a CatalogTable (with the name of GlobalTempViewManager in table identifier, VIEW table type and an undefined storage) or reports a NoSuchTableException.

With the database name not of GlobalTempViewManager, getTempViewOrPermanentTableMetadata simply retrieves the table metadata from the external catalog (getTableMetadata).


getTempViewOrPermanentTableMetadata is used when:

  • AlterTableAddColumnsCommand, CreateTableLikeCommand, DescribeColumnCommand, ShowColumnsCommand and <> logical commands are requested to run (executed)

=== [[requireDbExists]] Reporting NoSuchDatabaseException When Specified Database Does Not Exist -- requireDbExists Internal Method

[source, scala]

requireDbExists(db: String): Unit

requireDbExists reports a NoSuchDatabaseException if the specified database does not exist. Otherwise, requireDbExists does nothing.

=== [[reset]] reset Method

[source, scala]

reset(): Unit


NOTE: reset is used exclusively in the Spark SQL internal tests.

=== [[dropGlobalTempView]] Dropping Global Temporary View -- dropGlobalTempView Method

[source, scala]

dropGlobalTempView(name: String): Boolean

dropGlobalTempView simply requests the GlobalTempViewManager to remove the name global temporary view.

NOTE: dropGlobalTempView is used when...FIXME

=== [[dropTable]] Dropping Table -- dropTable Method

[source, scala]

dropTable(
  name: TableIdentifier,
  ignoreIfNotExists: Boolean,
  purge: Boolean): Unit


dropTable is used when:

=== [[getGlobalTempView]] Looking Up Global Temporary View by Name -- getGlobalTempView Method

[source, scala]

getGlobalTempView(
  name: String): Option[LogicalPlan]

getGlobalTempView requests the GlobalTempViewManager for the temporary view definition by the input name.

getGlobalTempView is used when CatalogImpl is requested to dropGlobalTempView.

=== [[registerFunction]] registerFunction Method

[source, scala]

registerFunction(
  funcDefinition: CatalogFunction,
  overrideIfExists: Boolean,
  functionBuilder: Option[FunctionBuilder] = None): Unit



registerFunction is used when:

  • SessionCatalog is requested to lookupFunction

  • HiveSessionCatalog is requested to lookupFunction0

  • CreateFunctionCommand logical command is executed

=== [[lookupFunctionInfo]] lookupFunctionInfo Method

[source, scala]

lookupFunctionInfo(name: FunctionIdentifier): ExpressionInfo


NOTE: lookupFunctionInfo is used when...FIXME

=== [[alterTableDataSchema]] alterTableDataSchema Method

[source, scala]

alterTableDataSchema(
  identifier: TableIdentifier,
  newDataSchema: StructType): Unit


NOTE: alterTableDataSchema is used when...FIXME

=== [[getCachedTable]] getCachedTable Method

[source, scala]

getCachedTable(
  key: QualifiedTableName): LogicalPlan


NOTE: getCachedTable is used when...FIXME

Internal Properties


currentDb is...FIXME


tableRelationCache is a cache of fully-qualified table names to table relation plans (i.e. LogicalPlan).

Used when SessionCatalog <>


tempViews is a registry of temporary views (i.e. non-global temporary tables).

Used when SessionCatalog <>

Last update: 2020-11-07