== [[AnalyzeTableCommand]] AnalyzeTableCommand Logical Command -- Computing Table-Level Statistics
`AnalyzeTableCommand` is a RunnableCommand.md[logical command] that <<run, computes table-level statistics>> (total size and row count) and stores them in a metastore.

`AnalyzeTableCommand` is <<creating-instance, created>> for the `ANALYZE TABLE` SQL command with no `PARTITION` specification and no `FOR COLUMNS` clause.
[source, scala]
----
// Seq((0, 0, "zero"), (1, 1, "one")).toDF("id", "p1", "p2").write.partitionBy("p1", "p2").saveAsTable("t1")
val sqlText = "ANALYZE TABLE t1 COMPUTE STATISTICS NOSCAN"
val plan = spark.sql(sqlText).queryExecution.logical
import org.apache.spark.sql.execution.command.AnalyzeTableCommand
val cmd = plan.asInstanceOf[AnalyzeTableCommand]
scala> println(cmd)
AnalyzeTableCommand `t1`, true
----
=== [[run]] Executing Logical Command (Computing Table-Level Statistics and Altering Metastore) -- `run` Method
[source, scala]
----
run(sparkSession: SparkSession): Seq[Row]
----
NOTE: `run` is part of the RunnableCommand.md#run[RunnableCommand contract] to execute (run) a logical command.
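For illustration, `run` can also be invoked directly on the command instance. A minimal sketch (not from the original text), assuming a spark-shell session and an existing table `t1`:

[source, scala]
----
// A sketch: constructing the command and executing it by hand
// (Spark normally does this while executing ANALYZE TABLE).
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.execution.command.AnalyzeTableCommand

val cmd = AnalyzeTableCommand(TableIdentifier("t1"))
cmd.run(spark) // returns an empty Seq[Row]; the statistics land in the metastore
----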
`run` requests the session-specific `SessionCatalog` for the metadata of the <<tableIdent, table>> and makes sure that it is not a view.
NOTE: `run` uses the input `SparkSession` to access the session-specific SparkSession.md#sessionState[SessionState] that in turn gives access to the current SessionState.md#catalog[SessionCatalog].
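The lookup can be reproduced interactively. A minimal sketch, assuming a table `t1` exists (an illustration of the idea, not the exact code `run` executes):

[source, scala]
----
// A sketch: fetching the table metadata through the session catalog
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.catalog.CatalogTableType

val catalog = spark.sessionState.catalog
val tableMeta = catalog.getTableMetadata(TableIdentifier("t1"))
assert(tableMeta.tableType != CatalogTableType.VIEW)
----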
[[total-size-stat]][[row-count-stat]] `run` computes the total size and, without the <<noscan, NOSCAN>> flag, the row count statistics of the table.
NOTE: `run` uses `SparkSession` to SparkSession.md#table[find the table] in a metastore.
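The `noscan` flag is what separates the two statistics. A sketch of the two variants (assuming the `t1` table from the example above; not from the original text):

[source, scala]
----
// A sketch: NOSCAN computes the total size only (no Spark job)
spark.sql("ANALYZE TABLE t1 COMPUTE STATISTICS NOSCAN")

// Without NOSCAN, the row count is computed as well (triggers a Spark job)
spark.sql("ANALYZE TABLE t1 COMPUTE STATISTICS")
----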
In the end, `run` alters the table statistics if they are different from the existing table statistics in the metastore.
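The recorded statistics can be read back from the metastore. A minimal sketch (assuming the `t1` table; `CatalogTable.stats` is where the statistics surface in recent Spark versions):

[source, scala]
----
// A sketch: inspecting the statistics stored for the table
import org.apache.spark.sql.catalyst.TableIdentifier

val stats = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1")).stats
stats.foreach(s => println(s"sizeInBytes=${s.sizeInBytes} rowCount=${s.rowCount}"))

// Alternatively, through SQL
spark.sql("DESC EXTENDED t1").where("col_name = 'Statistics'").show(truncate = false)
----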
`run` throws an `AnalysisException` when executed on a view:

----
ANALYZE TABLE is not supported on views.
----
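A quick way to trigger the exception (a sketch with hypothetical object names, not from the original text):

[source, scala]
----
// A sketch: ANALYZE TABLE fails on a (catalog) view
spark.sql("CREATE OR REPLACE VIEW v1 AS SELECT * FROM t1")

scala> spark.sql("ANALYZE TABLE v1 COMPUTE STATISTICS")
org.apache.spark.sql.AnalysisException: ANALYZE TABLE is not supported on views.
----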
[NOTE]
====
Computing the row count statistic triggers a Spark job to count the number of rows in the table (that happens when `ANALYZE TABLE` is executed with no `NOSCAN` flag).

[source, scala]
----
// Seq((0, 0, "zero"), (1, 1, "one")).toDF("id", "p1", "p2").write.partitionBy("p1", "p2").saveAsTable("t1")
val sqlText = "ANALYZE TABLE t1 COMPUTE STATISTICS"
val plan = spark.sql(sqlText).queryExecution.logical
import org.apache.spark.sql.execution.command.AnalyzeTableCommand
val cmd = plan.asInstanceOf[AnalyzeTableCommand]
scala> println(cmd)
AnalyzeTableCommand `t1`, false

// Execute ANALYZE TABLE
// Check out web UI's Jobs tab for the number of Spark jobs
// http://localhost:4040/jobs/
spark.sql(sqlText).show
----
====
=== [[creating-instance]] Creating AnalyzeTableCommand Instance
`AnalyzeTableCommand` takes the following when created:

- [[tableIdent]] `TableIdentifier`
- [[noscan]] `noscan` flag (enabled by default) that indicates whether the spark-sql-cost-based-optimization.md#NOSCAN[NOSCAN] option was used or not
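For illustration (a sketch, not from the original text), how the two SQL variants map to the `noscan` flag, using a small helper `cmdOf` introduced here for convenience and assuming a table `t1` exists:

[source, scala]
----
// A sketch: the parsed command carries the noscan flag
import org.apache.spark.sql.execution.command.AnalyzeTableCommand

def cmdOf(sql: String): AnalyzeTableCommand =
  spark.sql(sql).queryExecution.logical.asInstanceOf[AnalyzeTableCommand]

assert(cmdOf("ANALYZE TABLE t1 COMPUTE STATISTICS NOSCAN").noscan)  // default, i.e. true
assert(!cmdOf("ANALYZE TABLE t1 COMPUTE STATISTICS").noscan)        // false
----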