
= DescribeTableCommand Logical Command

DescribeTableCommand is a logical command that <<run, executes>> a DESCRIBE TABLE SQL statement.

DescribeTableCommand is created exclusively when SparkSqlAstBuilder is requested to parse a DESCRIBE TABLE SQL statement (with no column specified).

[[output]] DescribeTableCommand uses the following output columns (see the sketch after this list):

  • col_name as the name of the column
  • data_type as the data type of the column
  • comment as the comment of the column
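In Spark SQL 2.3 these output columns are defined as AttributeReferences (of StringType) with comment metadata. The following is a paraphrased sketch, not the exact Spark source:

[source, scala]
----
// Paraphrased sketch of the output schema; names and comments as listed above.
import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference}
import org.apache.spark.sql.types.{MetadataBuilder, StringType}

val output: Seq[Attribute] = Seq(
  AttributeReference("col_name", StringType, nullable = false,
    new MetadataBuilder().putString("comment", "name of the column").build())(),
  AttributeReference("data_type", StringType, nullable = false,
    new MetadataBuilder().putString("comment", "data type of the column").build())(),
  AttributeReference("comment", StringType, nullable = true,
    new MetadataBuilder().putString("comment", "comment of the column").build())())
----

The demo below shows these columns in action.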

[source, scala]
----
spark.range(1).createOrReplaceTempView("demo")

// DESC view
scala> sql("DESC EXTENDED demo").show
+--------+---------+-------+
|col_name|data_type|comment|
+--------+---------+-------+
|      id|   bigint|   null|
+--------+---------+-------+

// DESC table
// Make the demo reproducible
spark.sharedState.externalCatalog.dropTable(
  db = "default",
  table = "bucketed",
  ignoreIfNotExists = true,
  purge = true)
spark.range(10).write.bucketBy(5, "id").saveAsTable("bucketed")
assert(spark.catalog.tableExists("bucketed"))

// EXTENDED to include Detailed Table Information
// Note no partitions used
// Could also be FORMATTED
scala> sql("DESC EXTENDED bucketed").show(numRows = 50, truncate = false)
+----------------------------+-----------------------------------------------------------------------------+-------+
|col_name                    |data_type                                                                    |comment|
+----------------------------+-----------------------------------------------------------------------------+-------+
|id                          |bigint                                                                       |null   |
|                            |                                                                             |       |
|# Detailed Table Information|                                                                             |       |
|Database                    |default                                                                      |       |
|Table                       |bucketed                                                                     |       |
|Owner                       |jacek                                                                        |       |
|Created Time                |Sun Sep 30 20:57:22 CEST 2018                                                |       |
|Last Access                 |Thu Jan 01 01:00:00 CET 1970                                                 |       |
|Created By                  |Spark 2.3.1                                                                  |       |
|Type                        |MANAGED                                                                      |       |
|Provider                    |parquet                                                                      |       |
|Num Buckets                 |5                                                                            |       |
|Bucket Columns              |[id]                                                                         |       |
|Sort Columns                |[]                                                                           |       |
|Table Properties            |[transient_lastDdlTime=1538333842]                                           |       |
|Statistics                  |3740 bytes                                                                   |       |
|Location                    |file:/Users/jacek/dev/apps/spark-2.3.1-bin-hadoop2.7/spark-warehouse/bucketed|       |
|Serde Library               |org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe                           |       |
|InputFormat                 |org.apache.hadoop.mapred.SequenceFileInputFormat                             |       |
|OutputFormat                |org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat                    |       |
|Storage Properties          |[serialization.format=1]                                                     |       |
+----------------------------+-----------------------------------------------------------------------------+-------+

// Make the demo reproducible
val tableName = "partitioned_bucketed_sorted"
val partCol = "part"
spark.sharedState.externalCatalog.dropTable(
  db = "default",
  table = tableName,
  ignoreIfNotExists = true,
  purge = true)
spark.range(10)
  .withColumn("part", $"id" % 2) // extra column for partitions
  .write
  .partitionBy(partCol)
  .bucketBy(5, "id")
  .sortBy("id")
  .saveAsTable(tableName)
assert(spark.catalog.tableExists(tableName))

scala> sql(s"DESC EXTENDED $tableName").show(numRows = 50, truncate = false)
+----------------------------+------------------------------------------------------------------------------------------------+-------+
|col_name                    |data_type                                                                                       |comment|
+----------------------------+------------------------------------------------------------------------------------------------+-------+
|id                          |bigint                                                                                          |null   |
|part                        |bigint                                                                                          |null   |
|# Partition Information     |                                                                                                |       |
|# col_name                  |data_type                                                                                       |comment|
|part                        |bigint                                                                                          |null   |
|                            |                                                                                                |       |
|# Detailed Table Information|                                                                                                |       |
|Database                    |default                                                                                         |       |
|Table                       |partitioned_bucketed_sorted                                                                     |       |
|Owner                       |jacek                                                                                           |       |
|Created Time                |Mon Oct 01 10:05:32 CEST 2018                                                                   |       |
|Last Access                 |Thu Jan 01 01:00:00 CET 1970                                                                    |       |
|Created By                  |Spark 2.3.1                                                                                     |       |
|Type                        |MANAGED                                                                                         |       |
|Provider                    |parquet                                                                                         |       |
|Num Buckets                 |5                                                                                               |       |
|Bucket Columns              |[id]                                                                                            |       |
|Sort Columns                |[id]                                                                                            |       |
|Table Properties            |[transient_lastDdlTime=1538381132]                                                              |       |
|Location                    |file:/Users/jacek/dev/apps/spark-2.3.1-bin-hadoop2.7/spark-warehouse/partitioned_bucketed_sorted|       |
|Serde Library               |org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe                                              |       |
|InputFormat                 |org.apache.hadoop.mapred.SequenceFileInputFormat                                                |       |
|OutputFormat                |org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat                                       |       |
|Storage Properties          |[serialization.format=1]                                                                        |       |
|Partition Provider          |Catalog                                                                                         |       |
+----------------------------+------------------------------------------------------------------------------------------------+-------+

scala> sql(s"DESCRIBE EXTENDED tableName PARTITION (partCol=1)").show(numRows = 50, truncate = false) +--------------------------------+-------------------------------------------------------------------------------------------------------------------------------+-------+ |col_name |data_type |comment| +--------------------------------+-------------------------------------------------------------------------------------------------------------------------------+-------+ |id |bigint |null | |part |bigint |null | |# Partition Information | | | |# col_name |data_type |comment| |part |bigint |null | | | | | |# Detailed Partition Information| | | |Database |default | | |Table |partitioned_bucketed_sorted | | |Partition Values |[part=1] | | |Location |file:/Users/jacek/dev/apps/spark-2.3.1-bin-hadoop2.7/spark-warehouse/partitioned_bucketed_sorted/part=1 | | |Serde Library |org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | | |InputFormat |org.apache.hadoop.mapred.SequenceFileInputFormat | | |OutputFormat |org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat | | |Storage Properties |[path=file:/Users/jacek/dev/apps/spark-2.3.1-bin-hadoop2.7/spark-warehouse/partitioned_bucketed_sorted, serialization.format=1]| | |Partition Parameters |{totalSize=1870, numFiles=5, transient_lastDdlTime=1538381329} | | |Partition Statistics |1870 bytes | | | | | | |# Storage Information | | | |Num Buckets |5 | | |Bucket Columns |[id] | | |Sort Columns |[id] | | |Location |file:/Users/jacek/dev/apps/spark-2.3.1-bin-hadoop2.7/spark-warehouse/partitioned_bucketed_sorted | | |Serde Library |org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | | |InputFormat |org.apache.hadoop.mapred.SequenceFileInputFormat | | |OutputFormat |org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat | | |Storage Properties |[serialization.format=1] | | +--------------------------------+-------------------------------------------------------------------------------------------------------------------------------+-------+


=== [[run]] Executing Logical Command -- run Method

[source, scala]
----
run(sparkSession: SparkSession): Seq[Row]
----

NOTE: run is part of the RunnableCommand contract to execute (run) a logical command.

run uses the SessionCatalog (of the SessionState of the input SparkSession) and branches off per the type of the table to display.

For a temporary view, run requests the SessionCatalog to lookupRelation to access the schema, and then <<describeSchema, describes the schema>>.

For all other table types, run does the following (see the sketch after this list):

. Requests the SessionCatalog to retrieve the table metadata from the external catalog (metastore) (as a CatalogTable) and <<describeSchema, describes the schema>>

. <<describePartitionInfo, Describes the partition information>>

. <<describeDetailedPartitionInfo, Describes the detailed partition information>> if the <<partitionSpec, partition specification>> is specified, or the <<describeFormattedTableInfo, detailed table information>> when the <<isExtended, isExtended>> flag is on
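The branching can be sketched as follows. This is a paraphrase pieced together from the steps above and the helper-method signatures on this page, not the exact Spark source; table, partitionSpec and isExtended are the <<creating-instance, constructor arguments>>:

[source, scala]
----
import scala.collection.mutable.ArrayBuffer
import org.apache.spark.sql.{AnalysisException, Row, SparkSession}

// Paraphrased sketch of run; assumes the helper methods described below.
override def run(sparkSession: SparkSession): Seq[Row] = {
  val catalog = sparkSession.sessionState.catalog
  val result = new ArrayBuffer[Row]

  if (catalog.isTemporaryTable(table)) {
    if (partitionSpec.nonEmpty) {
      throw new AnalysisException(
        s"DESC PARTITION is not allowed on a temporary view: ${table.identifier}")
    }
    // Temporary view: look up the relation and describe its schema
    describeSchema(catalog.lookupRelation(table).schema, result, header = false)
  } else {
    // All other table types: fetch the CatalogTable from the metastore
    val metadata = catalog.getTableMetadata(table)
    describeSchema(metadata.schema, result, header = false)

    describePartitionInfo(metadata, result)

    if (partitionSpec.nonEmpty) {
      // Detailed partition information for DESC ... PARTITION (...)
      val partition = catalog.getPartition(table, partitionSpec)
      describeDetailedPartitionInfo(table, metadata, partition, result)
    } else if (isExtended) {
      // Detailed table information for DESC EXTENDED/FORMATTED
      describeFormattedTableInfo(metadata, result)
    }
  }

  result
}
----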

=== [[describeFormattedDetailedPartitionInfo]] Describing Detailed Partition and Storage Information -- describeFormattedDetailedPartitionInfo Internal Method

[source, scala]
----
describeFormattedDetailedPartitionInfo(
  tableIdentifier: TableIdentifier,
  table: CatalogTable,
  partition: CatalogTablePartition,
  buffer: ArrayBuffer[Row]): Unit
----

describeFormattedDetailedPartitionInfo simply adds the following entries (rows) to the input mutable buffer (see the sketch after this list):

. A new line

. # Detailed Partition Information

. Database with the database of the given table

. Table with the table of the given tableIdentifier

. Partition specification (of the CatalogTablePartition)

. A new line

. # Storage Information

. Bucketing specification of the table (if defined)

. Storage specification of the table
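A minimal sketch of the method, paraphrased rather than copied from the Spark source; append is assumed to be a small private helper that turns a (col_name, data_type, comment) triple into a Row:

[source, scala]
----
import scala.collection.mutable.ArrayBuffer
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTablePartition}

// Assumed helper: appends one output row (col_name, data_type, comment).
private def append(
    buffer: ArrayBuffer[Row], column: String, dataType: String, comment: String): Unit = {
  buffer += Row(column, dataType, comment)
}

// Paraphrased sketch; not the exact Spark source.
private def describeFormattedDetailedPartitionInfo(
    tableIdentifier: TableIdentifier,
    table: CatalogTable,
    partition: CatalogTablePartition,
    buffer: ArrayBuffer[Row]): Unit = {
  append(buffer, "", "", "")  // a new line
  append(buffer, "# Detailed Partition Information", "", "")
  append(buffer, "Database", table.database, "")
  append(buffer, "Table", tableIdentifier.table, "")
  // Partition specification of the CatalogTablePartition
  partition.toLinkedHashMap.foreach { case (key, value) => append(buffer, key, value, "") }
  append(buffer, "", "", "")  // a new line
  append(buffer, "# Storage Information", "", "")
  // Bucketing specification of the table (if defined)
  table.bucketSpec.foreach(_.toLinkedHashMap.foreach { case (k, v) => append(buffer, k, v, "") })
  // Storage specification of the table
  table.storage.toLinkedHashMap.foreach { case (key, value) => append(buffer, key, value, "") }
}
----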

NOTE: describeFormattedDetailedPartitionInfo is used exclusively when DescribeTableCommand is requested to <<describeDetailedPartitionInfo, describe the detailed partition information>> (with the <<isExtended, isExtended>> flag on).

=== [[describeFormattedTableInfo]] Describing Detailed Table Information -- describeFormattedTableInfo Internal Method

[source, scala]
----
describeFormattedTableInfo(table: CatalogTable, buffer: ArrayBuffer[Row]): Unit
----

describeFormattedTableInfo...FIXME
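Judging from the Detailed Table Information section in the demo above, a plausible sketch (reusing the append helper from the earlier sketch; not the exact Spark source):

[source, scala]
----
// Paraphrased sketch: adds a blank row, the section header, and one row per
// entry of the table metadata (Database, Table, Owner, Created Time, ...).
private def describeFormattedTableInfo(table: CatalogTable, buffer: ArrayBuffer[Row]): Unit = {
  append(buffer, "", "", "")
  append(buffer, "# Detailed Table Information", "", "")
  table.toLinkedHashMap.foreach { case (key, value) => append(buffer, key, value, "") }
}
----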

NOTE: describeFormattedTableInfo is used exclusively when DescribeTableCommand is requested to <<run, run>> for a non-temporary table with the <<isExtended, isExtended>> flag on.

=== [[describeDetailedPartitionInfo]] describeDetailedPartitionInfo Internal Method

[source, scala]
----
describeDetailedPartitionInfo(
  tableIdentifier: TableIdentifier,
  table: CatalogTable,
  partition: CatalogTablePartition,
  buffer: ArrayBuffer[Row]): Unit
----

describeDetailedPartitionInfo...FIXME
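Given the signature above and the notes on this page, a plausible sketch (not the exact Spark source):

[source, scala]
----
// Paraphrased sketch: the detailed partition rows are added only for an
// extended DESCRIBE (per the note in describeFormattedDetailedPartitionInfo).
private def describeDetailedPartitionInfo(
    tableIdentifier: TableIdentifier,
    table: CatalogTable,
    partition: CatalogTablePartition,
    buffer: ArrayBuffer[Row]): Unit = {
  if (isExtended) {
    describeFormattedDetailedPartitionInfo(tableIdentifier, table, partition, buffer)
  }
}
----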

NOTE: describeDetailedPartitionInfo is used exclusively when DescribeTableCommand is requested to <<run, run>> with a non-empty <<partitionSpec, partition specification>>.

=== [[creating-instance]] Creating DescribeTableCommand Instance

DescribeTableCommand takes the following when created:

  • [[table]] TableIdentifier
  • [[partitionSpec]] TablePartitionSpec
  • [[isExtended]] isExtended flag

DescribeTableCommand initializes the internal properties.
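For illustration only (the command is normally created by SparkSqlAstBuilder, not by hand), a DescribeTableCommand for the partitioned table from the demo above could be constructed as:

[source, scala]
----
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.execution.command.DescribeTableCommand

// TablePartitionSpec is a type alias for Map[String, String]
val cmd = DescribeTableCommand(
  table = TableIdentifier("partitioned_bucketed_sorted"),
  partitionSpec = Map("part" -> "1"),
  isExtended = true)
----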

=== [[describeSchema]] describeSchema Internal Method

[source, scala]
----
describeSchema(
  schema: StructType,
  buffer: ArrayBuffer[Row],
  header: Boolean): Unit
----

describeSchema...FIXME
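Judging from the demo output above, a plausible sketch (with output being the <<output, output columns>> and append the helper from the earlier sketch; not the exact Spark source):

[source, scala]
----
import org.apache.spark.sql.types.StructType

// Paraphrased sketch: optionally adds a '# col_name|data_type|comment' header
// row, then one row per column of the schema.
private def describeSchema(
    schema: StructType,
    buffer: ArrayBuffer[Row],
    header: Boolean): Unit = {
  if (header) {
    append(buffer, s"# ${output.head.name}", output(1).name, output(2).name)
  }
  schema.foreach { column =>
    append(buffer, column.name, column.dataType.simpleString, column.getComment().orNull)
  }
}
----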

NOTE: describeSchema is used when...FIXME

=== [[describePartitionInfo]] Describing Partition Information -- describePartitionInfo Internal Method

[source, scala]
----
describePartitionInfo(table: CatalogTable, buffer: ArrayBuffer[Row]): Unit
----

describePartitionInfo...FIXME
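Judging from the '# Partition Information' section in the demo output above, a plausible sketch (not the exact Spark source):

[source, scala]
----
// Paraphrased sketch: for a partitioned table, adds the section header and
// describes the partition columns (with the '# col_name' header row).
private def describePartitionInfo(table: CatalogTable, buffer: ArrayBuffer[Row]): Unit = {
  if (table.partitionColumnNames.nonEmpty) {
    append(buffer, "# Partition Information", "", "")
    describeSchema(table.partitionSchema, buffer, header = true)
  }
}
----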

NOTE: describePartitionInfo is used when...FIXME
