KubernetesClusterSchedulerBackend¶
KubernetesClusterSchedulerBackend is a CoarseGrainedSchedulerBackend (Apache Spark) for Spark on Kubernetes.
Creating Instance¶
KubernetesClusterSchedulerBackend takes the following to be created:
- TaskSchedulerImpl (Apache Spark)
- SparkContext (Apache Spark)
- KubernetesClient
- Java's ScheduledExecutorService
- ExecutorPodsSnapshotsStore
- ExecutorPodsAllocator
- ExecutorPodsLifecycleManager
- ExecutorPodsWatchSnapshotSource
- ExecutorPodsPollingSnapshotSource
KubernetesClusterSchedulerBackend is created when:
- KubernetesClusterManager is requested for a SchedulerBackend
ExecutorPodsLifecycleManager¶
KubernetesClusterSchedulerBackend is given an ExecutorPodsLifecycleManager when created.
When started, KubernetesClusterSchedulerBackend requests the ExecutorPodsLifecycleManager to start (with itself).
ExecutorPodsAllocator¶
KubernetesClusterSchedulerBackend is given an ExecutorPodsAllocator when created.
When started, KubernetesClusterSchedulerBackend requests the ExecutorPodsAllocator to setTotalExpectedExecutors to the number of initial executors and then to start (with the application Id).
When requested for the expected number of executors, KubernetesClusterSchedulerBackend requests the ExecutorPodsAllocator to setTotalExpectedExecutors to the given total number of executors.
When requested to isBlacklisted, KubernetesClusterSchedulerBackend requests the ExecutorPodsAllocator to isDeleted with the given executor.
Number of Initial Executors¶
initialExecutors: Int
KubernetesClusterSchedulerBackend calculates the initial target number of executors (cf. SchedulerBackendUtils) when created.
initialExecutors is used when KubernetesClusterSchedulerBackend is requested to start and to check whether sufficient resources are registered.
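The calculation itself is delegated to SchedulerBackendUtils and is not spelled out here. The following is only a simplified model of how such an initial target could be derived from configuration. The plain Map standing in for SparkConf, the fallback of 2 executors and the helper names are assumptions for illustration, not Spark's code.

```scala
object InitialExecutorsSketch {
  // Hypothetical stand-in for SparkConf: a plain map of configuration properties.
  type Conf = Map[String, String]

  // Assumed fallback when spark.executor.instances is not set.
  val DefaultNumExecutors = 2

  def initialTargetExecutors(conf: Conf): Int = {
    // Reads an integer property, falling back to the given default.
    def intOf(key: String, default: Int): Int =
      conf.get(key).map(_.toInt).getOrElse(default)

    val dynamicAllocation =
      conf.get("spark.dynamicAllocation.enabled").exists(_.toBoolean)

    if (dynamicAllocation) {
      // Assumption: with dynamic allocation, start from the largest configured lower bound.
      Seq(
        intOf("spark.dynamicAllocation.minExecutors", 0),
        intOf("spark.dynamicAllocation.initialExecutors", 0),
        intOf("spark.executor.instances", 0)).max
    } else {
      intOf("spark.executor.instances", DefaultNumExecutors)
    }
  }
}
```

For example, `InitialExecutorsSketch.initialTargetExecutors(Map("spark.executor.instances" -> "5"))` gives 5 in this model.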
Default ResourceProfile¶
KubernetesClusterSchedulerBackend requests the TaskSchedulerImpl for the SparkContext that is in turn requested for the ResourceProfileManager (Apache Spark) for the default ResourceProfile (Apache Spark).
When started, KubernetesClusterSchedulerBackend uses the default ResourceProfile (along with the initialExecutors) for the ExecutorPodsAllocator to setTotalExpectedExecutors.
Application Id¶
applicationId(): String
applicationId is part of the SchedulerBackend (Apache Spark) abstraction.
applicationId is the value of the spark.app.id configuration property if defined, or the default applicationId otherwise.
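The fallback can be sketched as an Option-style lookup. This is a minimal model under stated assumptions: the Map stands in for SparkConf, and defaultApplicationId is a hypothetical placeholder for the default applicationId of the SchedulerBackend abstraction (its format is assumed).

```scala
object ApplicationIdSketch {
  // Hypothetical placeholder for the default applicationId of the SchedulerBackend abstraction.
  def defaultApplicationId: String =
    "spark-application-" + System.currentTimeMillis

  // Prefer spark.app.id when defined; otherwise fall back to the default.
  def applicationId(conf: Map[String, String]): String =
    conf.getOrElse("spark.app.id", defaultApplicationId)
}
```

The default is only evaluated when spark.app.id is missing, since the second argument of getOrElse is passed by name.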
Sufficient Resources Registered¶
sufficientResourcesRegistered(): Boolean
sufficientResourcesRegistered is part of the CoarseGrainedSchedulerBackend (Apache Spark) abstraction.
sufficientResourcesRegistered holds (is true) when totalRegisteredExecutors is at least the number of initial executors multiplied by the minRegisteredRatio.
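The check amounts to a single comparison. The sketch below passes the three values in as parameters so that it is self-contained; the object name and the parameter list are assumptions for illustration, not the actual method signature.

```scala
import java.util.concurrent.atomic.AtomicInteger

object SufficientResourcesSketch {
  // True once enough executors have registered relative to the initial target.
  def sufficientResourcesRegistered(
      totalRegisteredExecutors: AtomicInteger,
      initialExecutors: Int,
      minRegisteredRatio: Double): Boolean =
    totalRegisteredExecutors.get() >= initialExecutors * minRegisteredRatio
}
```

With 10 initial executors and a ratio of 0.8, 9 registered executors are sufficient while 7 are not.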
Minimum Resources Available Ratio¶
minRegisteredRatio: Double
minRegisteredRatio is part of the CoarseGrainedSchedulerBackend (Apache Spark) abstraction.
minRegisteredRatio is 0.8 unless the spark.scheduler.minRegisteredResourcesRatio configuration property is defined.
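A minimal sketch of the default, assuming a plain Map in place of SparkConf. Any extra handling that the parent abstraction may apply to a user-provided value is deliberately left out.

```scala
object MinRegisteredRatioSketch {
  // 0.8 applies only when the property is not defined; otherwise the configured value is used.
  def minRegisteredRatio(conf: Map[String, String]): Double =
    conf
      .get("spark.scheduler.minRegisteredResourcesRatio")
      .map(_.toDouble)
      .getOrElse(0.8)
}
```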
Starting SchedulerBackend¶
start(): Unit
start is part of the CoarseGrainedSchedulerBackend (Apache Spark) abstraction.
start creates a delegation token manager.
start requests the ExecutorPodsAllocator to setTotalExpectedExecutors to initialExecutors.
start requests the ExecutorPodsLifecycleManager to start (with this KubernetesClusterSchedulerBackend).
start requests the ExecutorPodsAllocator to start (with the applicationId).
start requests the ExecutorPodsWatchSnapshotSource to start (with the applicationId).
start requests the ExecutorPodsPollingSnapshotSource to start (with the applicationId).
In the end, start sets up the config map for executors (setUpExecutorConfigMap).
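The order of the steps above can be modelled with stand-ins that only record the calls they receive. Everything in this sketch (Recorder, Backend, the log labels) is hypothetical scaffolding for illustration and does not reflect the real classes or their signatures.

```scala
import scala.collection.mutable.ListBuffer

object StartSequenceSketch {
  // Hypothetical stand-in for a collaborator: it only records the calls it receives.
  final class Recorder(name: String, log: ListBuffer[String]) {
    def call(method: String, arg: Any): Unit = log += s"$name.$method($arg)"
  }

  final class Backend(applicationId: String, initialExecutors: Int) {
    val log: ListBuffer[String] = ListBuffer.empty[String]

    private val allocator = new Recorder("allocator", log)
    private val lifecycleManager = new Recorder("lifecycleManager", log)
    private val watchSource = new Recorder("watchSource", log)
    private val pollingSource = new Recorder("pollingSource", log)

    def start(): Unit = {
      log += "createTokenManager()" // label for "creates a delegation token manager"
      allocator.call("setTotalExpectedExecutors", initialExecutors)
      lifecycleManager.call("start", "this")
      allocator.call("start", applicationId)
      watchSource.call("start", applicationId)
      pollingSource.call("start", applicationId)
      log += "setUpExecutorConfigMap()"
    }
  }
}
```

After `new StartSequenceSketch.Backend("app-1", 2).start()`, the log holds the seven steps in the order listed in this section.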
setUpExecutorConfigMap¶
setUpExecutorConfigMap(): Unit
setUpExecutorConfigMap takes the Name of Config Map for Executors and the configuration files (buildSparkConfDirFilesMap with the SparkConf).
setUpExecutorConfigMap builds a config map (buildConfigMap) with the name of the config map, the configuration files and the following labels:
| Name | Value |
|---|---|
| spark-app-selector | Application Id |
| spark-role | executor |
In the end, setUpExecutorConfigMap requests the KubernetesClient to create a new config map.
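The data that goes into the config map can be sketched without a Kubernetes client. ConfigMapModel and both helper methods below are hypothetical simplifications for illustration; only the two label names and their values come from the table above.

```scala
object ExecutorConfigMapSketch {
  // Hypothetical, simplified model of a config map: name, labels and file contents.
  final case class ConfigMapModel(
      name: String,
      labels: Map[String, String],
      data: Map[String, String])

  // The two labels attached to the executor config map.
  def executorLabels(applicationId: String): Map[String, String] =
    Map(
      "spark-app-selector" -> applicationId,
      "spark-role" -> "executor")

  // Combines the name, the labels and the configuration files into one model.
  def buildExecutorConfigMap(
      name: String,
      applicationId: String,
      confFiles: Map[String, String]): ConfigMapModel =
    ConfigMapModel(name, executorLabels(applicationId), confFiles)
}
```

In the real backend the resulting object is handed to the KubernetesClient for creation; that step is omitted here.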
Creating DriverEndpoint¶
createDriverEndpoint(): DriverEndpoint
createDriverEndpoint is part of the CoarseGrainedSchedulerBackend (Apache Spark) abstraction.
createDriverEndpoint creates a KubernetesDriverEndpoint.
Requesting Executors from Cluster Manager¶
doRequestTotalExecutors(
  requestedTotal: Int): Future[Boolean]
doRequestTotalExecutors is part of the CoarseGrainedSchedulerBackend (Apache Spark) abstraction.
doRequestTotalExecutors requests the ExecutorPodsAllocator to setTotalExpectedExecutors to the given requestedTotal.
In the end, doRequestTotalExecutors returns a completed Future with true value.
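A minimal sketch of this delegate-then-acknowledge pattern, assuming a hypothetical Allocator that merely stores the requested total. The allocator is passed as a parameter here only to keep the example self-contained.

```scala
import scala.concurrent.Future

object RequestExecutorsSketch {
  // Hypothetical stand-in for ExecutorPodsAllocator: it only remembers the target.
  final class Allocator {
    @volatile var totalExpectedExecutors: Int = 0

    def setTotalExpectedExecutors(total: Int): Unit = {
      totalExpectedExecutors = total
    }
  }

  def doRequestTotalExecutors(
      allocator: Allocator,
      requestedTotal: Int): Future[Boolean] = {
    allocator.setTotalExpectedExecutors(requestedTotal)
    // The request is only recorded, so the result is an already-completed Future.
    Future.successful(true)
  }
}
```

Future.successful needs no ExecutionContext, which fits a method that does no asynchronous work of its own.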
Stopping SchedulerBackend¶
stop(): Unit
stop is part of the CoarseGrainedSchedulerBackend (Apache Spark) abstraction.
stop ...FIXME