KubernetesClusterManager¶
KubernetesClusterManager is an ExternalClusterManager (Apache Spark) that can create scheduler components for k8s master URLs.
KubernetesClusterManager is registered with Apache Spark using the META-INF/services/org.apache.spark.scheduler.ExternalClusterManager service file.
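As a rough illustration of what "k8s master URLs" means here, the check can be pictured as a simple prefix test. The sketch below is a standalone stand-in (the `MasterUrlCheck` object is hypothetical and not part of Spark), and it assumes that a master URL qualifies when it starts with `k8s`.

```scala
// Hypothetical stand-in, not Spark's actual class.
object MasterUrlCheck {
  // Assumption: a master URL is handled by the Kubernetes cluster manager
  // when it starts with the "k8s" prefix (e.g. k8s://https://host:6443).
  def canCreate(masterURL: String): Boolean = masterURL.startsWith("k8s")

  def main(args: Array[String]): Unit = {
    println(canCreate("k8s://https://192.168.99.100:8443")) // true
    println(canCreate("local[*]"))                          // false
  }
}
```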
Creating Instance¶
KubernetesClusterManager takes no arguments to be created.
KubernetesClusterManager is created when:
- SparkContext is requested for an ExternalClusterManager (when requested for a SchedulerBackend and TaskScheduler)
Creating SchedulerBackend¶
createSchedulerBackend(
  sc: SparkContext,
  masterURL: String,
  scheduler: TaskScheduler): SchedulerBackend
createSchedulerBackend is part of the ExternalClusterManager (Apache Spark) abstraction.
createSchedulerBackend creates a KubernetesClusterSchedulerBackend.
Note
createSchedulerBackend assumes that the given TaskScheduler is a TaskSchedulerImpl (Apache Spark).
createSchedulerBackend determines four internal values based on the spark.kubernetes.submitInDriver internal configuration property.
Internal value | spark.kubernetes.submitInDriver enabled (true) | spark.kubernetes.submitInDriver disabled (false) |
---|---|---|
authConfPrefix | spark.kubernetes.authenticate.driver.mounted | spark.kubernetes.authenticate |
apiServerUri | spark.kubernetes.driver.master | Master URL with no k8s:// prefix |
defaultServiceAccountToken | /var/run/secrets/kubernetes.io/serviceaccount/token | |
defaultServiceAccountCaCrt | /var/run/secrets/kubernetes.io/serviceaccount/ca.crt | |
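The selection in the table above could be sketched roughly as follows. This is a minimal, self-contained approximation rather than Spark's code: `SubmitInDriverSketch`, `Resolved` and `resolve` are hypothetical names, the configuration is modeled as a plain `Map`, the fallback API server URI is an assumption, and the blank cells of the table are assumed to mean "undefined" (`None`).

```scala
import java.io.File

// Hypothetical helper that mirrors the table above; not Spark's actual code.
object SubmitInDriverSketch {
  // Paths as listed in the table.
  val TokenPath = "/var/run/secrets/kubernetes.io/serviceaccount/token"
  val CaCrtPath = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"

  final case class Resolved(
      authConfPrefix: String,
      apiServerUri: String,
      defaultServiceAccountToken: Option[File],
      defaultServiceAccountCaCrt: Option[File])

  def resolve(conf: Map[String, String], masterURL: String): Resolved = {
    val submitInDriver =
      conf.getOrElse("spark.kubernetes.submitInDriver", "false").toBoolean
    if (submitInDriver) {
      Resolved(
        "spark.kubernetes.authenticate.driver.mounted",
        // Assumption: fall back to the in-cluster API server address when unset.
        conf.getOrElse("spark.kubernetes.driver.master", "https://kubernetes.default.svc"),
        // Assumption: the mounted files are only used when they actually exist.
        Some(new File(TokenPath)).filter(_.exists),
        Some(new File(CaCrtPath)).filter(_.exists))
    } else {
      Resolved(
        "spark.kubernetes.authenticate",
        masterURL.stripPrefix("k8s://"), // master URL with no k8s:// prefix
        None, // assumption: blank table cell means undefined
        None)
    }
  }
}
```

For example, `SubmitInDriverSketch.resolve(Map.empty, "k8s://https://host:6443")` yields the `spark.kubernetes.authenticate` prefix and `https://host:6443` as the API server URI.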
Unless already defined, createSchedulerBackend sets the spark.kubernetes.executor.podNamePrefix configuration property to a prefix based on spark.app.name.
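The "set unless already defined" behavior can be illustrated with a small sketch. Everything here is an assumption made for illustration: the configuration is a mutable `Map`, `PodNamePrefixSketch` and `prefixFrom` are hypothetical names, and the normalization rule is invented (the actual derivation of the prefix from spark.app.name may well differ, for example by appending a unique suffix).

```scala
import java.util.Locale
import scala.collection.mutable

// Hypothetical sketch; not Spark's actual code.
object PodNamePrefixSketch {
  val PodNamePrefixKey = "spark.kubernetes.executor.podNamePrefix"

  // Assumed normalization: lowercase and replace runs of other characters with "-".
  def prefixFrom(appName: String): String =
    appName.trim.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]+", "-")

  // Sets the prefix only when the user has not defined it already.
  def setDefaultPrefix(conf: mutable.Map[String, String]): Unit = {
    if (!conf.contains(PodNamePrefixKey)) {
      conf(PodNamePrefixKey) = prefixFrom(conf.getOrElse("spark.app.name", "spark"))
    }
  }
}
```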
createSchedulerBackend creates a KubernetesClient for the Driver client type and the following:
- spark.kubernetes.namespace configuration property
- apiServerUri
- authConfPrefix
- defaultServiceAccountToken
- defaultServiceAccountCaCrt
With the spark.kubernetes.executor.podTemplateFile configuration property defined, createSchedulerBackend loads the pod spec from the pod template file with the optional spark.kubernetes.executor.podTemplateContainerName configuration property.
In the end, createSchedulerBackend creates a KubernetesClusterSchedulerBackend with the following:
- Java ScheduledExecutorService with kubernetes-executor-maintenance thread name
- ExecutorPodsSnapshotsStoreImpl with a Java ScheduledExecutorService with kubernetes-executor-snapshots-subscribers thread names and 2 threads
- ExecutorPodsPollingSnapshotSource with a Java ScheduledExecutorService with kubernetes-executor-pod-polling-sync thread name
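The named scheduled executors listed above can be approximated with the plain `java.util.concurrent` API. The sketch below is not how Spark builds them internally (it has its own utilities for that); `NamedSchedulers`, `single` and `pool` are hypothetical helpers, and the use of daemon threads and a numeric suffix for pooled threads are assumptions.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, ThreadFactory}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helpers built on the standard Java concurrency API.
object NamedSchedulers {
  // Creates daemon threads named after the prefix (numbered for pools).
  private def daemonFactory(prefix: String, numbered: Boolean): ThreadFactory = {
    val counter = new AtomicInteger(0)
    new ThreadFactory {
      override def newThread(r: Runnable): Thread = {
        val name = if (numbered) s"$prefix-${counter.getAndIncrement()}" else prefix
        val t = new Thread(r, name)
        t.setDaemon(true)
        t
      }
    }
  }

  // Single-threaded scheduler, e.g. for kubernetes-executor-maintenance.
  def single(name: String): ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor(daemonFactory(name, numbered = false))

  // Fixed-size scheduler pool, e.g. 2 threads for
  // kubernetes-executor-snapshots-subscribers.
  def pool(prefix: String, threads: Int): ScheduledExecutorService =
    Executors.newScheduledThreadPool(threads, daemonFactory(prefix, numbered = true))
}
```

With these helpers, `NamedSchedulers.single("kubernetes-executor-maintenance")` and `NamedSchedulers.pool("kubernetes-executor-snapshots-subscribers", 2)` would produce executors whose thread names show up in thread dumps, which is the main practical reason to name them.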
IllegalArgumentException¶
With spark.kubernetes.submitInDriver enabled, createSchedulerBackend asserts that the name of the driver pod is configured (using the spark.kubernetes.driver.pod.name configuration property) or else throws an IllegalArgumentException:
If the application is deployed using spark-submit in cluster mode, the driver pod name must be provided.
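A minimal sketch of such an assertion, assuming the configuration is modeled as a plain `Map` and using Scala's built-in `require` (which throws an `IllegalArgumentException` and prefixes the message with `requirement failed: `). `DriverPodNameCheck` and `assertDriverPodName` are hypothetical names used only for illustration.

```scala
// Hypothetical sketch; not Spark's actual code.
object DriverPodNameCheck {
  val Message: String =
    "If the application is deployed using spark-submit in cluster mode, " +
      "the driver pod name must be provided."

  def assertDriverPodName(conf: Map[String, String]): Unit = {
    val submitInDriver =
      conf.getOrElse("spark.kubernetes.submitInDriver", "false").toBoolean
    if (submitInDriver) {
      // require throws IllegalArgumentException when the condition is false.
      require(conf.contains("spark.kubernetes.driver.pod.name"), Message)
    }
  }
}
```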
Creating TaskScheduler¶
createTaskScheduler(
  sc: SparkContext,
  masterURL: String): TaskScheduler
createTaskScheduler is part of the ExternalClusterManager (Apache Spark) abstraction.
createTaskScheduler creates a TaskSchedulerImpl (Apache Spark).
Initializing Scheduling Components¶
initialize(
  scheduler: TaskScheduler,
  backend: SchedulerBackend): Unit
initialize is part of the ExternalClusterManager (Apache Spark) abstraction.
initialize requests the given TaskSchedulerImpl (Apache Spark) to initialize with the given SchedulerBackend (Apache Spark).
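To show how the three methods fit together, here is a hedged sketch of the order in which a caller such as SparkContext might invoke them. All types are simplified stand-ins defined inside the example (the real Spark abstractions also take a SparkContext, which is omitted here), and `ClusterManagerLifecycle`, `RecordingScheduler` and `SketchManager` are hypothetical names.

```scala
// Simplified stand-ins; the real abstractions live in Apache Spark.
object ClusterManagerLifecycle {
  trait SchedulerBackend
  trait TaskScheduler { def initialize(backend: SchedulerBackend): Unit }

  trait ClusterManager {
    def canCreate(masterURL: String): Boolean
    def createTaskScheduler(masterURL: String): TaskScheduler
    def createSchedulerBackend(masterURL: String, scheduler: TaskScheduler): SchedulerBackend
    def initialize(scheduler: TaskScheduler, backend: SchedulerBackend): Unit
  }

  // Scheduler that remembers the backend it was initialized with.
  final class RecordingScheduler extends TaskScheduler {
    var backend: Option[SchedulerBackend] = None
    override def initialize(b: SchedulerBackend): Unit = {
      backend = Some(b)
    }
  }

  object SketchManager extends ClusterManager {
    override def canCreate(masterURL: String): Boolean = masterURL.startsWith("k8s")
    override def createTaskScheduler(masterURL: String): TaskScheduler =
      new RecordingScheduler
    override def createSchedulerBackend(
        masterURL: String,
        scheduler: TaskScheduler): SchedulerBackend = new SchedulerBackend {}
    // Delegates to the scheduler, as described above.
    override def initialize(scheduler: TaskScheduler, backend: SchedulerBackend): Unit =
      scheduler.initialize(backend)
  }

  // Assumed call order: task scheduler first, then backend, then initialize.
  def createSchedulingComponents(
      cm: ClusterManager,
      masterURL: String): (SchedulerBackend, TaskScheduler) = {
    require(cm.canCreate(masterURL), s"Unsupported master URL: $masterURL")
    val scheduler = cm.createTaskScheduler(masterURL)
    val backend = cm.createSchedulerBackend(masterURL, scheduler)
    cm.initialize(scheduler, backend)
    (backend, scheduler)
  }
}
```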