KubernetesClusterSchedulerBackend¶
KubernetesClusterSchedulerBackend is a CoarseGrainedSchedulerBackend (Apache Spark) for Spark on Kubernetes.
Creating Instance¶
KubernetesClusterSchedulerBackend takes the following to be created:
- TaskSchedulerImpl (Apache Spark)
- SparkContext (Apache Spark)
- KubernetesClient
- Java's ScheduledExecutorService
- ExecutorPodsSnapshotsStore
- ExecutorPodsAllocator
- ExecutorPodsLifecycleManager
- ExecutorPodsWatchSnapshotSource
- ExecutorPodsPollingSnapshotSource
KubernetesClusterSchedulerBackend is created when:
- KubernetesClusterManager is requested for a SchedulerBackend
ExecutorPodsLifecycleManager¶
KubernetesClusterSchedulerBackend is given an ExecutorPodsLifecycleManager when created.
KubernetesClusterSchedulerBackend requests the ExecutorPodsLifecycleManager to start (with itself) when started.
ExecutorPodsAllocator¶
KubernetesClusterSchedulerBackend is given an ExecutorPodsAllocator when created.
When started, KubernetesClusterSchedulerBackend requests the ExecutorPodsAllocator to setTotalExpectedExecutors to the number of initial executors and then starts it with the application Id.
When requested for the expected number of executors, KubernetesClusterSchedulerBackend requests the ExecutorPodsAllocator to setTotalExpectedExecutors to the given total number of executors.
When requested whether an executor is blacklisted (isBlacklisted), KubernetesClusterSchedulerBackend requests the ExecutorPodsAllocator whether the given executor is deleted (isDeleted).
Number of Initial Executors¶
initialExecutors: Int
KubernetesClusterSchedulerBackend calculates the initial target number of executors (cf. SchedulerBackendUtils) when created.
initialExecutors is used when KubernetesClusterSchedulerBackend is requested to start and to check whether sufficient resources have been registered.
Default ResourceProfile¶
KubernetesClusterSchedulerBackend requests the TaskSchedulerImpl for the SparkContext, which is in turn requested for the ResourceProfileManager (Apache Spark), which is finally requested for the default ResourceProfile (Apache Spark).
When started, KubernetesClusterSchedulerBackend uses the default ResourceProfile (along with the initialExecutors) for the ExecutorPodsAllocator to setTotalExpectedExecutors.
Application Id¶
applicationId(): String
applicationId is part of the SchedulerBackend (Apache Spark) abstraction.
applicationId is the value of the spark.app.id configuration property if defined; otherwise it falls back to the default applicationId.
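The lookup described above can be sketched in plain Scala. This is a minimal illustration, not the actual implementation: the configuration is modelled as a simple Map, and ApplicationIdSketch, defaultApplicationId and the "spark-application-" prefix are assumptions made for the example.

```scala
object ApplicationIdSketch {
  // Hypothetical stand-in for the default application ID of the parent backend.
  // The exact format is an assumption made for this example.
  def defaultApplicationId(startTime: Long): String =
    s"spark-application-$startTime"

  // Prefer spark.app.id when it is defined; otherwise use the default.
  def applicationId(conf: Map[String, String], startTime: Long): String =
    conf.getOrElse("spark.app.id", defaultApplicationId(startTime))
}
```

With spark.app.id set, the configured value is returned unchanged; with an empty configuration, the fallback value is used.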
Sufficient Resources Registered¶
sufficientResourcesRegistered(): Boolean
sufficientResourcesRegistered is part of the CoarseGrainedSchedulerBackend (Apache Spark) abstraction.
sufficientResourcesRegistered holds (is true) when totalRegisteredExecutors is at least the initial executors multiplied by the minRegisteredRatio.
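As a rough sketch of that check, assuming the threshold is the number of initial executors scaled by the minimum registered ratio, the condition could look as follows. SufficientResourcesSketch is a hypothetical name, and the three inputs are passed as plain parameters rather than read from the backend's state.

```scala
object SufficientResourcesSketch {
  // True once enough executors have registered, where "enough" is assumed
  // to mean at least initialExecutors * minRegisteredRatio.
  def sufficientResourcesRegistered(
      totalRegisteredExecutors: Int,
      initialExecutors: Int,
      minRegisteredRatio: Double): Boolean =
    totalRegisteredExecutors >= initialExecutors * minRegisteredRatio
}
```

For example, with 10 initial executors and a ratio of 0.8, the check passes with 8 registered executors and fails with 7.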
Minimum Resources Available Ratio¶
minRegisteredRatio: Double
minRegisteredRatio is part of the CoarseGrainedSchedulerBackend (Apache Spark) abstraction.
minRegisteredRatio is 0.8 unless the spark.scheduler.minRegisteredResourcesRatio configuration property is defined.
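A minimal sketch of this default, assuming the configuration is a plain Map of strings (MinRegisteredRatioSketch and DefaultRatio are hypothetical names; any validation or capping the real backend may apply to a user-defined value is left out):

```scala
object MinRegisteredRatioSketch {
  // Default used on Kubernetes when the property is not defined.
  val DefaultRatio: Double = 0.8

  // Use the configured ratio when present; otherwise the 0.8 default.
  def minRegisteredRatio(conf: Map[String, String]): Double =
    conf.get("spark.scheduler.minRegisteredResourcesRatio")
      .map(_.toDouble)
      .getOrElse(DefaultRatio)
}
```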
Starting SchedulerBackend¶
start(): Unit
start is part of the CoarseGrainedSchedulerBackend (Apache Spark) abstraction.
start creates a delegation token manager.
start requests the ExecutorPodsAllocator to setTotalExpectedExecutors to initialExecutors.
start requests the ExecutorPodsLifecycleManager to start (with this KubernetesClusterSchedulerBackend).
start requests the ExecutorPodsAllocator to start (with the applicationId).
start requests the ExecutorPodsWatchSnapshotSource to start (with the applicationId).
start requests the ExecutorPodsPollingSnapshotSource to start (with the applicationId).
In the end, start calls setUpExecutorConfigMap.
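The ordering of the steps above can be illustrated with a small sketch that only records which call happens when. Every name here (StartOrderSketch, Recorder, BackendSketch and the logged labels) is a hypothetical stand-in; no real Spark or Kubernetes classes are involved.

```scala
import scala.collection.mutable.ArrayBuffer

object StartOrderSketch {
  // Collects the names of the calls in the order they are made.
  final class Recorder {
    val calls: ArrayBuffer[String] = ArrayBuffer.empty[String]
    def log(call: String): Unit = { calls += call; () }
  }

  // Hypothetical backend that mirrors the start sequence described above.
  final class BackendSketch(rec: Recorder, initialExecutors: Int, appId: String) {
    def start(): Unit = {
      rec.log("createTokenManager")
      rec.log(s"allocator.setTotalExpectedExecutors($initialExecutors)")
      rec.log("lifecycleManager.start(this)")
      rec.log(s"allocator.start($appId)")
      rec.log(s"watchSource.start($appId)")
      rec.log(s"pollingSource.start($appId)")
      rec.log("setUpExecutorConfigMap()")
    }
  }
}
```

The sketch records the sequence only. It makes no claim about why the real backend orders the calls this way.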
setUpExecutorConfigMap¶
setUpExecutorConfigMap(): Unit
setUpExecutorConfigMap takes the name of the config map for executors and builds the map of Spark configuration files (buildSparkConfDirFilesMap with the SparkConf).
setUpExecutorConfigMap then builds a config map (buildConfigMap) with the name of the config map, the configuration files and the following labels:
| Name | Value |
|---|---|
| spark-app-selector | Application Id |
| spark-role | executor |
In the end, setUpExecutorConfigMap requests the KubernetesClient to create a new config map.
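To make the shape of the result concrete, here is a minimal sketch that assembles the two labels and a simplified config map value. ConfigMapSketch, executorLabels and this buildConfigMap signature are hypothetical and only model the data. They are not the Kubernetes client types or the actual helper methods.

```scala
object ExecutorConfigMapSketch {
  // Simplified, hypothetical model of a config map: a name, labels and file contents.
  final case class ConfigMapSketch(
      name: String,
      labels: Map[String, String],
      data: Map[String, String])

  // The two labels listed in the table above.
  def executorLabels(applicationId: String): Map[String, String] =
    Map(
      "spark-app-selector" -> applicationId,
      "spark-role" -> "executor")

  // Combines the name, the configuration files and the labels into one value.
  def buildConfigMap(
      name: String,
      confFiles: Map[String, String],
      labels: Map[String, String]): ConfigMapSketch =
    ConfigMapSketch(name, labels, confFiles)
}
```

In the real backend the resulting object is handed to the KubernetesClient for creation; the sketch stops before that step.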
Creating DriverEndpoint¶
createDriverEndpoint(): DriverEndpoint
createDriverEndpoint is part of the CoarseGrainedSchedulerBackend (Apache Spark) abstraction.
createDriverEndpoint creates a KubernetesDriverEndpoint.
Requesting Executors from Cluster Manager¶
doRequestTotalExecutors(
  requestedTotal: Int): Future[Boolean]
doRequestTotalExecutors is part of the CoarseGrainedSchedulerBackend (Apache Spark) abstraction.
doRequestTotalExecutors requests the ExecutorPodsAllocator to setTotalExpectedExecutors to the given requestedTotal.
In the end, doRequestTotalExecutors returns a completed Future with true value.
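The behaviour described above (update the allocator's target, then return an already-completed Future) can be sketched with the Scala standard library alone. AllocatorSketch is a hypothetical stand-in for the ExecutorPodsAllocator, and the allocator is passed in explicitly here, unlike in the real method.

```scala
import scala.concurrent.Future

object RequestExecutorsSketch {
  // Hypothetical stand-in that only remembers the requested total.
  final class AllocatorSketch {
    @volatile var totalExpectedExecutors: Int = 0
    def setTotalExpectedExecutors(total: Int): Unit = {
      totalExpectedExecutors = total
    }
  }

  // Forward the requested total to the allocator and report success immediately.
  def doRequestTotalExecutors(
      allocator: AllocatorSketch,
      requestedTotal: Int): Future[Boolean] = {
    allocator.setTotalExpectedExecutors(requestedTotal)
    Future.successful(true)
  }
}
```

Because Future.successful creates an already-completed Future, no ExecutionContext is needed and callers never block waiting for the result.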
Stopping SchedulerBackend¶
stop(): Unit
stop is part of the CoarseGrainedSchedulerBackend (Apache Spark) abstraction.
stop...FIXME