spark-workshop

Exercise: Standalone Spark Application to Display Spark SQL Version

In this exercise you create a standalone Spark SQL application in Scala that displays the version of Spark SQL in use.

Module: Spark SQL

Duration: 30 mins

Steps

  1. In IntelliJ IDEA, create a new sbt-managed Scala project
  2. Define the Spark SQL dependency in build.sbt
    • Add it to libraryDependencies
    • "org.apache.spark" %% "spark-sql" % "3.2.1"
  3. Write the required Spark SQL code
  4. Build an executable jar
    • Use sbt package on the command line or IDEA’s sbt view
  5. Run the application using spark-submit
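
As a starting point for step 3, here is a minimal sketch of what the application could look like. The object name SparkVersionApp and the helper versionLine are hypothetical names chosen for illustration and are not prescribed by the exercise. The sketch assumes the spark-sql dependency from step 2 is on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// SparkVersionApp is a hypothetical name chosen for this sketch.
object SparkVersionApp {

  // Pure helper (also hypothetical) that formats the line to print.
  def versionLine(version: String): String = s"Spark SQL version: $version"

  def main(args: Array[String]): Unit = {
    // No master is set here; spark-submit supplies it at launch time.
    val spark = SparkSession.builder()
      .appName("SparkVersionApp")
      .getOrCreate()
    try {
      // spark.version reports the version of the Spark runtime the application runs on.
      println(versionLine(spark.version))
    } finally {
      // Release the session's resources before the JVM exits.
      spark.stop()
    }
  }
}
```

After sbt package, the jar is written under the project's target directory. Pass that jar to spark-submit together with --class SparkVersionApp (or whatever name the main object was given). The version printed is that of the Spark installation behind spark-submit, which may differ from the 3.2.1 dependency declared in build.sbt.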