
Demo: Deploying Spark Application to Google Kubernetes Engine

This demo shows the steps to deploy a Spark application to a Google Kubernetes Engine (GKE) cluster.

Before you begin

Make sure to review the other demos (especially Demo: Running Spark Examples on Google Kubernetes Engine) to gain some experience with Spark on Kubernetes and Google Kubernetes Engine.

Build Spark Application Image

sbt clean docker:publishLocal

List the images using docker images.

$ docker images \
  --filter=reference="$GCP_CR/*:*" \
  --format "table {{.Repository}}\t{{.Tag}}"
REPOSITORY                       TAG
$GCP_CR/spark-docker-example     0.1.0
$GCP_CR/spark                    v3.0.1

Pushing Image to Container Registry

Upload the image to a registry so that your GKE cluster can download and run the container image (as described in Pushing the Docker image to Container Registry).

$ gcloud auth configure-docker
$ sbt docker:publish
[info] Built image with tags [0.1.0]
[info] The push refers to repository []
[info] Published image

View the images in the repository.

$ gcloud container images list --repository $GCP_CR
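To inspect all tags of a single image (for example, after publishing a new version), `gcloud` also offers `list-tags`; the image name below assumes the `spark-docker-example` repository used throughout this demo.

```shell
gcloud container images list-tags $GCP_CR/spark-docker-example
```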

Creating Google Kubernetes Engine Cluster

Create a GKE cluster to run the Spark application.

export CLUSTER_NAME=spark-demo-cluster
gcloud container clusters create $CLUSTER_NAME \
  --cluster-version=1.17.15-gke.800
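Once the cluster is up, fetch its credentials so `kubectl` targets it (depending on your gcloud configuration you may also need `--zone` or `--region`), and verify connectivity:

```shell
# Write the cluster's credentials into the local kubeconfig
gcloud container clusters get-credentials $CLUSTER_NAME

# Confirm kubectl now talks to the GKE cluster
kubectl cluster-info
```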

Deploying Spark Application to GKE

Let's deploy the Docker image of the Spark application to the GKE cluster.


Create the required Kubernetes resources to run Spark applications as described in Demo: Running Spark Examples on Google Kubernetes Engine
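As a minimal sketch of what that demo sets up, the commands below create a `spark-demo` namespace, a `spark` service account, and a role binding so the driver can manage executor pods. The `edit` cluster role is an assumption here; the referenced demo may use a narrower role.

```shell
# Namespace referenced by spark.kubernetes.namespace
kubectl create namespace spark-demo

# Service account referenced by spark.kubernetes.authenticate.driver.serviceAccountName
kubectl create serviceaccount spark --namespace=spark-demo

# Allow the driver's service account to create and delete executor pods
kubectl create clusterrolebinding spark-role \
  --clusterrole=edit \
  --serviceaccount=spark-demo:spark
```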

export K8S_SERVER=$(kubectl config view --output=jsonpath='{.clusters[].cluster.server}')
export DEMO_POD_NAME=spark-demo-gke
export CONTAINER_IMAGE=$GCP_CR/spark-docker-example:0.1.0
./bin/spark-submit \
  --master k8s://$K8S_SERVER \
  --deploy-mode cluster \
  --name $DEMO_POD_NAME \
  --class meetup.SparkApp \
  --conf spark.kubernetes.container.image=$CONTAINER_IMAGE \
  --conf spark.kubernetes.driver.pod.name=$DEMO_POD_NAME \
  --conf spark.kubernetes.namespace=spark-demo \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --verbose \
  <application-jar>

Watch the pods in another terminal.

k get po --namespace=spark-demo -w

Accessing Logs

Access the logs of the driver.

k logs -f $DEMO_POD_NAME --namespace=spark-demo
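If the driver never reaches the Running state, `kubectl describe` usually points at the cause (image pull errors, a missing service account, insufficient cluster resources, and so on):

```shell
# Show events and status details for the driver pod
k describe po $DEMO_POD_NAME --namespace=spark-demo
```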

Cleaning up

Delete the GKE cluster.

gcloud container clusters delete $CLUSTER_NAME --quiet
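To confirm the cluster is gone, list the remaining clusters in the project:

```shell
gcloud container clusters list
```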

Delete the images.

gcloud container images delete $GCP_CR/spark:v3.0.1 --force-delete-tags --quiet
gcloud container images delete $GCP_CR/spark-docker-example:0.1.0 --force-delete-tags --quiet