AstBuilder — ANTLR-based SQL Parser

AstBuilder converts ANTLR ParseTrees into Catalyst entities using visit callbacks.

AstBuilder is the only requirement of the AbstractSqlParser abstraction. It is used directly by CatalystSqlParser, while SparkSqlParser uses SparkSqlAstBuilder instead.
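To see AstBuilder at work, the following minimal sketch parses a SQL text with CatalystSqlParser and prints the resulting unresolved logical plan. It assumes a spark-shell session (or any Scala program with spark-catalyst on the classpath); the query and the table name t are made up for illustration.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// CatalystSqlParser hands the ANTLR parse tree over to AstBuilder,
// which builds an unresolved logical plan (no catalog lookup happens here,
// so the hypothetical table t does not have to exist).
val plan = CatalystSqlParser.parsePlan("SELECT a FROM t WHERE a > 1")

// Print the operators created by the visit callbacks.
println(plan.numberedTreeString)
```

The exact tree text depends on the Spark version, so no output is shown here.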

SqlBase.g4 — ANTLR Grammar

AstBuilder is an ANTLR AbstractParseTreeVisitor (as SqlBaseBaseVisitor) that is generated from the ANTLR grammar of Spark SQL.

SqlBaseBaseVisitor is an ANTLR-specific base class that is generated at build time from the ANTLR grammar of Spark SQL. The grammar is available in the Apache Spark repository at SqlBase.g4.

SqlBaseBaseVisitor is an AbstractParseTreeVisitor in ANTLR.

Visit Callbacks

visitAnalyze

Creates an AnalyzeTable or AnalyzeColumn logical operator

ANALYZE TABLE multipartIdentifier partitionSpec? COMPUTE STATISTICS
  (identifier | FOR COLUMNS identifierSeq | FOR ALL COLUMNS)?

ANTLR labeled alternative: #analyze

visitCommentTable

Creates a CommentOnTable logical command

COMMENT ON TABLE tableIdentifier IS ('text' | NULL)

ANTLR labeled alternative: #commentTable

visitCreateTable

Creates a CreateTableAsSelectStatement or a CreateTableStatement logical command

CREATE TEMPORARY? EXTERNAL? TABLE (IF NOT EXISTS)? [multipartIdentifier]
  ('(' [colType] (',' [colType])* ')')?
  (USING [multipartIdentifier])?
  [createTableClauses]
  (AS? query)?

ANTLR labeled alternative: #createTable

visitDeleteFromTable

Creates a DeleteFromTable logical command

DELETE FROM multipartIdentifier tableAlias whereClause?

ANTLR labeled alternative: #deleteFromTable

visitDescribeRelation

Creates a DescribeColumnStatement or DescribeRelation

(DESC | DESCRIBE) TABLE? option=(EXTENDED | FORMATTED)?
  multipartIdentifier partitionSpec? describeColName?

ANTLR labeled alternative: #describeRelation

visitExists

Creates an Exists expression

ANTLR labeled alternative: #exists

visitExplain

Creates an ExplainCommand

ANTLR rule: explain

visitFirst

Creates a First aggregate function expression

FIRST '(' expression (IGNORE NULLS)? ')'

ANTLR labeled alternative: #first

visitFromClause

Creates a LogicalPlan

FROM relation (',' relation)* lateralView* pivotClause?

Supports multiple comma-separated relations (that all together build a condition-less INNER JOIN) with optional LATERAL VIEW.

A relation can be one of the following or a combination thereof:

  • Table identifier
  • Inline table using VALUES exprs AS tableIdent
  • Table-valued function (currently only range is supported)

ANTLR rule: fromClause
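The condition-less INNER JOIN built for comma-separated relations can be observed with a short sketch like the one below. It assumes spark-catalyst on the classpath (e.g. a spark-shell session); t1 and t2 are hypothetical table names.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.catalyst.plans.Inner
import org.apache.spark.sql.catalyst.plans.logical.Join

// t1 and t2 are hypothetical tables; parsing does not require them to exist.
val plan = CatalystSqlParser.parsePlan("SELECT * FROM t1, t2")

// Collect the Join operators created for the FROM clause.
val joins = plan.collect { case j: Join => j }

// Every join should be an INNER join with no join condition.
val conditionLessInner = joins.forall(j => j.joinType == Inner && j.condition.isEmpty)
println(s"joins=${joins.size}, conditionLessInner=$conditionLessInner")
```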

visitFunctionCall

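Creates one of the following:

  • UnresolvedFunction for a bare function call (with no window specification)
  • UnresolvedWindowExpression for a function call over a named window (with a WindowSpecReference)
  • WindowExpression for a function call over an inline window specification (with a WindowSpecDefinition)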

ANTLR rule: functionCall

import spark.sessionState.sqlParser

scala> sqlParser.parseExpression("foo()")
res0: org.apache.spark.sql.catalyst.expressions.Expression = 'foo()

scala> sqlParser.parseExpression("foo() OVER windowSpecRef")
res1: org.apache.spark.sql.catalyst.expressions.Expression = unresolvedwindowexpression('foo(), WindowSpecReference(windowSpecRef))

scala> sqlParser.parseExpression("foo() OVER (CLUSTER BY field)")
res2: org.apache.spark.sql.catalyst.expressions.Expression = 'foo() windowspecdefinition('field, UnspecifiedFrame)

visitInlineTable

Creates an UnresolvedInlineTable unary logical operator (as the child of SubqueryAlias for tableAlias)

VALUES expression (',' expression)* tableAlias

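expression can be as follows:

  • A row constructor (a CreateNamedStruct expression) for multiple-column rows, e.g. VALUES (1, 'a'), (2, 'b')
  • Any other Catalyst expression for single-column rows, e.g. VALUES 1, 2, 3

Column names can be specified explicitly in tableAlias or default to colN for every column (with N starting from 1).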

ANTLR rule: inlineTable

visitInsertIntoTable

Creates an InsertIntoTable (indirectly)

A 3-element tuple with a TableIdentifier, optional partition keys and the exists flag disabled

INSERT INTO TABLE? tableIdentifier partitionSpec?

ANTLR labeled alternative: #insertIntoTable

Note

insertIntoTable is part of insertInto that is in turn used only as a helper labeled alternative in singleInsertQuery and multiInsertQueryBody ANTLR rules.

visitInsertOverwriteTable

Creates an InsertIntoTable (indirectly)

A 3-element tuple with a TableIdentifier, optional partition keys and the exists flag

INSERT OVERWRITE TABLE tableIdentifier (partitionSpec (IF NOT EXISTS)?)?

In a way, visitInsertOverwriteTable is simply a more general version of visitInsertIntoTable, with the exists flag on or off depending on whether IF NOT EXISTS is present. The main difference is that dynamic partitions can be used only without IF NOT EXISTS.

ANTLR labeled alternative: #insertOverwriteTable

Note

insertOverwriteTable is part of insertInto that is in turn used only as a helper labeled alternative in singleInsertQuery and multiInsertQueryBody ANTLR rules.

visitMergeIntoTable

Creates a MergeIntoTable logical command

MERGE INTO target targetAlias
USING (source | '(' sourceQuery ')') sourceAlias
ON mergeCondition
matchedClause*
notMatchedClause*

matchedClause
  : WHEN MATCHED (AND matchedCond)? THEN matchedAction

notMatchedClause
  : WHEN NOT MATCHED (AND notMatchedCond)? THEN notMatchedAction

matchedAction
  : DELETE
  | UPDATE SET *
  | UPDATE SET assignment (',' assignment)*

notMatchedAction
  : INSERT *
  | INSERT '(' columns ')'
    VALUES '(' expression (',' expression)* ')'

Requirements:

  1. There must be at least one WHEN clause
  2. When there is more than one MATCHED clause, only the last MATCHED clause can omit the condition
  3. When there is more than one NOT MATCHED clause, only the last NOT MATCHED clause can omit the condition

ANTLR labeled alternative: #mergeIntoTable
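The requirements above can be checked with a small sketch that parses one valid and two invalid MERGE statements. It assumes a Spark version whose grammar supports MERGE INTO (Spark 3.0 or later) and spark-catalyst on the classpath; the table names tgt and src and the column names are hypothetical.

```scala
import scala.util.Try
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// tgt and src are hypothetical tables; parsing does not look them up.
val valid = Try(CatalystSqlParser.parsePlan(
  """MERGE INTO tgt t USING src s ON t.id = s.id
    |WHEN MATCHED AND s.op = 'D' THEN DELETE
    |WHEN MATCHED THEN UPDATE SET *
    |WHEN NOT MATCHED THEN INSERT *""".stripMargin))

// Violates requirement 1: no WHEN clause at all.
val noWhen = Try(CatalystSqlParser.parsePlan(
  "MERGE INTO tgt t USING src s ON t.id = s.id"))

// Violates requirement 2: a MATCHED clause other than the last omits the condition.
val badOrder = Try(CatalystSqlParser.parsePlan(
  """MERGE INTO tgt t USING src s ON t.id = s.id
    |WHEN MATCHED THEN DELETE
    |WHEN MATCHED AND s.op = 'U' THEN UPDATE SET *""".stripMargin))

println(s"valid=${valid.isSuccess}, noWhen=${noWhen.isSuccess}, badOrder=${badOrder.isSuccess}")
```

Wrapping the calls in Try keeps the sketch independent of the exact error message, which differs between Spark versions.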

visitMultiInsertQuery

Creates a logical operator with an InsertIntoTable (and UnresolvedRelation leaf operator)

FROM relation (',' relation)* lateralView*
INSERT OVERWRITE TABLE ...

FROM relation (',' relation)* lateralView*
INSERT INTO TABLE? ...

ANTLR rule: multiInsertQueryBody

visitNamedExpression

Creates one of the following Catalyst expressions:

  • Alias (for a single alias)
  • MultiAlias (for a parenthesis enclosed alias list)
  • a bare Expression

ANTLR rule: namedExpression
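The three outcomes listed above can be reproduced with parseExpression, as in this minimal sketch. It assumes spark-catalyst on the classpath; the column names a and m are made up and are never resolved.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// A single alias, a parenthesis-enclosed alias list, and no alias at all.
val single = CatalystSqlParser.parseExpression("a + 1 AS b")
val multi  = CatalystSqlParser.parseExpression("explode(m) AS (k, v)")
val bare   = CatalystSqlParser.parseExpression("a + 1")

// nodeName is the simple class name of the top-level Catalyst expression.
Seq(single, multi, bare).foreach(e => println(s"${e.nodeName}: $e"))
```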

visitNamedQuery

Creates a SubqueryAlias

visitQuerySpecification

Creates OneRowRelation or LogicalPlan

OneRowRelation

visitQuerySpecification creates a OneRowRelation for a SELECT without a FROM clause.

val q = sql("select 1")
scala> println(q.queryExecution.logical.numberedTreeString)
00 'Project [unresolvedalias(1, None)]
01 +- OneRowRelation$

ANTLR rule: querySpecification

visitPredicated

Creates an Expression

ANTLR rule: predicated

visitRelation

Creates a LogicalPlan for a FROM clause.

ANTLR rule: relation

visitRepairTable

Creates a RepairTableStatement for the following SQL statement:

MSCK REPAIR TABLE multipartIdentifier

ANTLR labeled alternative: #repairTable

visitShowCurrentNamespace

Creates a ShowCurrentNamespaceStatement for the following SQL statement:

SHOW CURRENT NAMESPACE

ANTLR labeled alternative: #showCurrentNamespace

visitShowTblProperties

Creates a ShowTableProperties logical command

SHOW TBLPROPERTIES [multi-part table identifier]
  ('(' [dot-separated table property key] ')')?

ANTLR labeled alternative: #showTblProperties

visitShowTables

Creates a ShowTables for the following SQL statement:

SHOW TABLES ((FROM | IN) multipartIdentifier)?
  (LIKE? pattern=STRING)?

ANTLR labeled alternative: #showTables

visitSingleDataType

Creates a DataType

ANTLR rule: singleDataType

visitSingleExpression

Creates an Expression

Takes the named expression and relays to visitNamedExpression

ANTLR rule: singleExpression

visitSingleInsertQuery

Creates a LogicalPlan with an InsertIntoTable

INSERT INTO TABLE? tableIdentifier partitionSpec? #insertIntoTable

INSERT OVERWRITE TABLE tableIdentifier (partitionSpec (IF NOT EXISTS)?)? #insertOverwriteTable

ANTLR labeled alternative: #singleInsertQuery

visitSortItem

Creates a SortOrder unary expression

sortItem
    : expression ordering=(ASC | DESC)? (NULLS nullOrder=(LAST | FIRST))?
    ;

// queryOrganization
ORDER BY order+=sortItem (',' order+=sortItem)*
SORT BY sort+=sortItem (',' sort+=sortItem)*

// windowSpec
((ORDER | SORT) BY sortItem (',' sortItem)*)?

ANTLR rule: sortItem
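As an illustration of how sortItem maps to SortOrder, the sketch below parses an ORDER BY clause and prints the direction and null ordering of every sort expression. It assumes spark-catalyst on the classpath; the table t and the columns a and b are hypothetical.

```scala
import org.apache.spark.sql.catalyst.expressions.SortOrder
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.catalyst.plans.logical.Sort

// The first sort item sets both ordering and null order explicitly,
// the second one relies on the defaults.
val plan = CatalystSqlParser.parsePlan(
  "SELECT * FROM t ORDER BY a DESC NULLS FIRST, b")

// ORDER BY becomes a Sort operator that holds one SortOrder per sortItem.
val orders: Seq[SortOrder] = plan.collect { case s: Sort => s.order }.flatten

orders.foreach { o =>
  println(s"${o.child} ${o.direction.sql} ${o.nullOrdering.sql}")
}
```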

visitSingleStatement

Creates a LogicalPlan from a single SQL statement

ANTLR rule: singleStatement

visitStar

Creates an UnresolvedStar

ANTLR labeled alternative: #star

visitSubqueryExpression

Creates a ScalarSubquery

ANTLR labeled alternative: #subqueryExpression

visitUpdateTable

Creates an UpdateTable logical operator

UPDATE multipartIdentifier tableAlias setClause whereClause?

ANTLR labeled alternative: #updateTable

visitUse

Creates a UseStatement

USE NAMESPACE? multipartIdentifier

ANTLR labeled alternative: #use

visitWindowDef

Creates a WindowSpecDefinition

// CLUSTER BY with window frame
'(' CLUSTER BY partition+=expression (',' partition+=expression)* windowFrame? ')'

// PARTITION BY and ORDER BY with window frame
'(' ((PARTITION | DISTRIBUTE) BY partition+=expression (',' partition+=expression)*)?
  ((ORDER | SORT) BY sortItem (',' sortItem)*)?
  windowFrame? ')'

ANTLR labeled alternative: #windowDef

Parsing Handlers

withAggregationClause

Adds one of the following logical operators:

  • GroupingSets for GROUP BY ... GROUPING SETS (...)

  • Aggregate for GROUP BY ... (WITH CUBE | WITH ROLLUP)?

withCTE

Creates an UnresolvedWith logical operator for Common Table Expressions (in visitQuery and visitDmlStatement)

WITH namedQuery (',' namedQuery)*

namedQuery
    : name (columnAliases)? AS? '(' query ')'
    ;

withGenerate

Adds a Generate with an UnresolvedGenerator and the join flag enabled for LATERAL VIEW (in SELECT or FROM clauses).

withHavingClause

Creates an UnresolvedHaving for the following:

HAVING booleanExpression

withHints

Adds an UnresolvedHint for /*+ hint */ in SELECT queries.

Note

Note the + (plus) sign right after the opening /* (without it, the text is a regular comment).

hint is of the format name or name (param1, param2, ...).

/*+ BROADCAST (table) */
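A hint in a SELECT query can be inspected after parsing, as in this minimal sketch. It assumes spark-catalyst on the classpath; the tables t1 and t2 and the join condition are made up for the example.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.catalyst.plans.logical.UnresolvedHint

// t1 and t2 are hypothetical tables; the hint is kept unresolved by the parser.
val plan = CatalystSqlParser.parsePlan(
  "SELECT /*+ BROADCAST(t1) */ * FROM t1 JOIN t2 ON t1.id = t2.id")

// Collect the UnresolvedHint operators added for the /*+ ... */ comment.
val hints = plan.collect { case h: UnresolvedHint => h }

hints.foreach(h => println(s"name=${h.name}, parameters=${h.parameters}"))
```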

withInsertInto

Creates one of the following logical operators:

withInsertInto is used for visitMultiInsertQuery and visitSingleInsertQuery

withJoinRelations

Creates one or more Join logical operators for a FROM clause and relation.

The following join types are supported:

  • INNER (default)
  • CROSS
  • LEFT (with optional OUTER)
  • LEFT SEMI
  • RIGHT (with optional OUTER)
  • FULL (with optional OUTER)
  • ANTI (optionally prefixed with LEFT)

The following join criteria are supported:

  • ON booleanExpression
  • USING '(' identifier (',' identifier)* ')'

Joins can be NATURAL (with no join criteria)
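The join types and join criteria above end up in the joinType of the Join operator, which the following sketch prints for a few queries. It assumes spark-catalyst on the classpath; joinTypeOf is a helper defined only for this example, and the tables a and b are hypothetical.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.catalyst.plans.logical.Join

// Hypothetical helper: the join type (as text) of the first Join in the parsed plan.
def joinTypeOf(sqlText: String): String =
  CatalystSqlParser.parsePlan(sqlText).collect { case j: Join => j.joinType.toString }.head

val queries = Seq(
  "SELECT * FROM a JOIN b ON a.id = b.id",            // default join type
  "SELECT * FROM a LEFT ANTI JOIN b ON a.id = b.id",  // ANTI prefixed with LEFT
  "SELECT * FROM a FULL OUTER JOIN b USING (id)",     // USING join criteria
  "SELECT * FROM a NATURAL JOIN b")                   // no join criteria

queries.foreach(q => println(s"${joinTypeOf(q)} <- $q"))
```

USING and NATURAL joins are wrapped in dedicated join types at parse time and are only turned into regular joins later, during analysis.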

withQuerySpecification

Adds a query specification to a logical operator

For transform SELECT (with TRANSFORM, MAP or REDUCE qualifiers), withQuerySpecification does...FIXME

For regular SELECT (no TRANSFORM, MAP or REDUCE qualifiers), withQuerySpecification adds (in that order):

  1. Generate unary logical operators (if used in the parsed SQL text)

  2. Filter unary logical plan (if used in the parsed SQL text)

  3. GroupingSets or Aggregate unary logical operators (if used in the parsed SQL text)

  4. Project and/or Filter unary logical operators

  5. WithWindowDefinition unary logical operator (if used in the parsed SQL text)

  6. UnresolvedHint unary logical operator (if used in the parsed SQL text)

withPredicate

  • NOT? IN '(' query ')' adds an In predicate expression with a ListQuery subquery expression

  • NOT? IN '(' expression (',' expression)* ')' adds an In predicate expression
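For the expression-list variant, a minimal sketch such as the one below shows the In predicate and how the optional NOT is represented. It assumes spark-catalyst on the classpath; the column name a is made up. The subquery variant is left out since its representation differs between Spark versions.

```scala
import org.apache.spark.sql.catalyst.expressions.{In, Not}
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

val in    = CatalystSqlParser.parseExpression("a IN (1, 2, 3)")
val notIn = CatalystSqlParser.parseExpression("a NOT IN (1, 2, 3)")

// With NOT, the In predicate is wrapped in a Not expression.
val negated = notIn match {
  case Not(_: In) => true
  case _          => false
}

println(in)
println(notIn)
println(s"negated=$negated")
```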

withQueryResultClauses

Important

This section needs your help

withRepartitionByExpression

withRepartitionByExpression throws a ParseException:

DISTRIBUTE BY is not supported

withRepartitionByExpression is used when AstBuilder is requested to withQueryResultClauses (for DISTRIBUTE BY and CLUSTER BY SQL clauses).

withSample

Important

This section needs your help

withSelectQuerySpecification

Important

This section needs your help

withWindows

Adds a WithWindowDefinition for window aggregates (given WINDOW definitions).

Used for withQueryResultClauses and withQuerySpecification (when window definitions are present).

WINDOW identifier AS windowSpec
  (',' identifier AS windowSpec)*

aliasPlan Method

aliasPlan(
  alias: ParserRuleContext,
  plan: LogicalPlan): LogicalPlan

aliasPlan...FIXME

aliasPlan is used when...FIXME

mayApplyAliasPlan Method

mayApplyAliasPlan(
  tableAlias: TableAliasContext,
  plan: LogicalPlan): LogicalPlan

mayApplyAliasPlan...FIXME

mayApplyAliasPlan is used when...FIXME
