Demo: Deploying Spark Application to Google Kubernetes Engine¶
This demo shows the steps to deploy a Spark application to a Google Kubernetes Engine (GKE) cluster.
Before you begin¶
Make sure to review the other demos (especially Demo: Running Spark Examples on Google Kubernetes Engine) to gain some experience with Spark on Kubernetes and Google Kubernetes Engine.
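The commands below use $GCP_CR for the Container Registry host and project prefix set up in that demo. If it is not already in your environment, export it first (the project ID below is the one used throughout this demo; substitute your own):
# Substitute your own GCP project ID.
export GCP_PROJECT=spark-on-kubernetes-2021
export GCP_CR=eu.gcr.io/$GCP_PROJECT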
Building Spark Application Image¶
Build the Docker image of the Spark application locally.
sbt clean docker:publishLocal
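The build assumes the project enables sbt-native-packager's DockerPlugin with the Docker repository pointed at the Container Registry; roughly the following settings yield the image name listed below. This is a sketch of the assumed build.sbt, not the demo project's actual build definition:
# Prints an illustrative build.sbt fragment; the real settings live in the demo project.
cat <<'BUILD_SBT'
enablePlugins(JavaAppPackaging, DockerPlugin)
name    := "spark-docker-example"
version := "0.1.0"
dockerRepository := Some("eu.gcr.io/spark-on-kubernetes-2021")
BUILD_SBT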
List the images using docker images.
$ docker images \
--filter=reference="$GCP_CR/*:*" \
--format "table {{.Repository}}\t{{.Tag}}"
REPOSITORY                                               TAG
eu.gcr.io/spark-on-kubernetes-2021/spark-docker-example  0.1.0
eu.gcr.io/spark-on-kubernetes-2021/spark                 v3.0.1
Pushing Image to Container Registry¶
Upload the image to a registry so that the GKE cluster can pull and run it (as described in Pushing the Docker image to Container Registry).
gcloud auth configure-docker
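configure-docker registers gcloud as a Docker credential helper for the gcr.io hosts by updating ~/.docker/config.json, which is what lets the sbt publish below push to the registry. To double-check the registration:
# The credHelpers section should map gcr.io hosts to gcloud.
cat ~/.docker/config.json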
$ sbt docker:publish
...
[info] Built image eu.gcr.io/spark-on-kubernetes-2021/spark-docker-example with tags [0.1.0]
[info] The push refers to repository [eu.gcr.io/spark-on-kubernetes-2021/spark-docker-example]
...
[info] Published image eu.gcr.io/spark-on-kubernetes-2021/spark-docker-example:0.1.0
View the images in the repository.
$ gcloud container images list --repository $GCP_CR
NAME
eu.gcr.io/spark-on-kubernetes-2021/spark
eu.gcr.io/spark-on-kubernetes-2021/spark-docker-example
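Optionally, list the tags of the application image to confirm the 0.1.0 push:
gcloud container images list-tags $GCP_CR/spark-docker-example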
Creating Google Kubernetes Engine Cluster¶
Create a GKE cluster to run the Spark application.
export CLUSTER_NAME=spark-demo-cluster
gcloud container clusters create $CLUSTER_NAME \
--cluster-version=1.17.15-gke.800 \
--machine-type=c2-standard-4
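Once the cluster is up, fetch its credentials so that kubectl (and spark-submit below) talk to the new cluster.
# Merges the cluster credentials into ~/.kube/config.
gcloud container clusters get-credentials $CLUSTER_NAME
# Sanity check: the worker nodes should be Ready.
kubectl get nodes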
Deploying Spark Application to GKE¶
Let's deploy the Docker image of the Spark application to the GKE cluster.
Important
Create the required Kubernetes resources to run Spark applications, as described in Demo: Running Spark Examples on Google Kubernetes Engine.
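If you skipped that demo, the following sketch creates the namespace and service account that the spark-submit configuration below references; the edit clusterrole binding follows the RBAC example in the official Spark on Kubernetes documentation:
# Namespace and service account referenced by the spark-submit flags below.
kubectl create namespace spark-demo
kubectl create serviceaccount spark --namespace=spark-demo
# Allow the driver's service account to manage executor pods.
kubectl create clusterrolebinding spark-role \
  --clusterrole=edit \
  --serviceaccount=spark-demo:spark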
cd $SPARK_HOME
export K8S_SERVER=$(kubectl config view --output=jsonpath='{.clusters[].cluster.server}')
export DEMO_POD_NAME=spark-demo-gke
export CONTAINER_IMAGE=$GCP_CR/spark-docker-example:0.1.0
./bin/spark-submit \
--master k8s://$K8S_SERVER \
--deploy-mode cluster \
--name $DEMO_POD_NAME \
--class meetup.SparkApp \
--conf spark.kubernetes.container.image=$CONTAINER_IMAGE \
--conf spark.kubernetes.driver.pod.name=$DEMO_POD_NAME \
--conf spark.kubernetes.namespace=spark-demo \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--verbose \
local:///opt/docker/lib/meetup.spark-docker-example-0.1.0.jar
Note that the local:// scheme in the jar URI tells Spark the application jar is already available inside the container image; sbt-native-packager ships the project jars under /opt/docker/lib.
Watch the pods in another terminal (k is an alias of kubectl). The pods run in the spark-demo namespace configured above.
k get po -w -n spark-demo
Accessing Logs¶
Access the logs of the driver.
k logs -f $DEMO_POD_NAME -n spark-demo
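The driver pod is not removed when the application finishes; Spark leaves it in Completed state, so you can still inspect its final status and events afterwards:
# Review the driver pod's final status and events.
k describe po $DEMO_POD_NAME -n spark-demo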
Cleaning up¶
Delete the GKE cluster.
gcloud container clusters delete $CLUSTER_NAME --quiet
Delete the images.
gcloud container images delete $GCP_CR/spark:v3.0.1 --force-delete-tags --quiet
gcloud container images delete $GCP_CR/spark-docker-example:0.1.0 --force-delete-tags --quiet