KubernetesUtils Utility

Parsing Master URL

parseMasterUrl(
  url: String): String

parseMasterUrl strips the k8s:// prefix from the given url.
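
As a minimal sketch of that behavior (not the actual Spark source), the prefix could be dropped with a plain string operation. The object name MasterUrlSketch is hypothetical and exists only for this illustration.

```scala
object MasterUrlSketch {
  // Prefix that marks a Kubernetes master URL
  private val K8sPrefix = "k8s://"

  // Returns the url without the leading k8s:// prefix.
  // Assumption of this sketch: a url without the prefix is returned unchanged.
  def parseMasterUrl(url: String): String = url.stripPrefix(K8sPrefix)
}
```

For example, MasterUrlSketch.parseMasterUrl("k8s://https://example.com:6443") gives back the plain API server address, https://example.com:6443.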

parseMasterUrl is used when:

buildPodWithServiceAccount

buildPodWithServiceAccount(
  serviceAccount: Option[String],
  pod: SparkPod): Option[Pod]

If the given service account is defined, buildPodWithServiceAccount creates a new pod spec with both the service account and the service account name set to it. Otherwise, buildPodWithServiceAccount returns None.
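
The Option-based contract can be sketched without the Kubernetes client library. PodSpec below is a hypothetical, heavily simplified stand-in for the real pod spec type, and withServiceAccount is an illustrative name, not the Spark method itself.

```scala
object ServiceAccountSketch {
  // Hypothetical, simplified stand-in for a Kubernetes pod spec
  final case class PodSpec(
      serviceAccount: Option[String] = None,
      serviceAccountName: Option[String] = None)

  // Some(updated spec) when a service account is given, None otherwise
  def withServiceAccount(
      serviceAccount: Option[String],
      spec: PodSpec): Option[PodSpec] =
    serviceAccount.map { account =>
      // Both fields are set to the same account, as described above
      spec.copy(serviceAccount = Some(account), serviceAccountName = Some(account))
    }
}
```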

buildPodWithServiceAccount is used when:

Loading Pod Spec from Template File

loadPodFromTemplate(
  kubernetesClient: KubernetesClient,
  templateFile: File,
  containerName: Option[String]): SparkPod

loadPodFromTemplate requests the given KubernetesClient to load a pod spec from the input template file.

loadPodFromTemplate selects the Spark container (from the pod spec and the input container name).

In case of an Exception, loadPodFromTemplate prints out the following ERROR message to the logs:

Encountered exception while attempting to load initial pod spec from file

loadPodFromTemplate (re)throws a SparkException:

Could not load pod from template file.
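
The log-and-rethrow behavior can be sketched as follows. This is an illustration only: TemplateLoadException is a hypothetical stand-in for SparkException, the load function parameter replaces the real KubernetesClient call, and System.err stands in for the logger.

```scala
import java.io.File

object LoadTemplateSketch {
  // Hypothetical stand-in for org.apache.spark.SparkException
  final class TemplateLoadException(message: String, cause: Throwable)
    extends RuntimeException(message, cause)

  // Runs the given loader and wraps any Exception in a TemplateLoadException
  def loadOrFail[A](templateFile: File)(load: File => A): A =
    try {
      load(templateFile)
    } catch {
      case e: Exception =>
        // Stand-in for the ERROR log message
        System.err.println(
          "ERROR Encountered exception while attempting to load initial pod spec from file")
        // The original exception is kept as the cause
        throw new TemplateLoadException("Could not load pod from template file.", e)
    }
}
```

Keeping the original exception as the cause means the underlying failure (for example, a malformed template) stays visible in the stack trace.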

loadPodFromTemplate is used when:

selectSparkContainer

selectSparkContainer(
  pod: Pod,
  containerName: Option[String]): SparkPod

selectSparkContainer creates a SparkPod based on the containers in the given Pod and the containerName.

selectSparkContainer takes the container specs from the given Pod spec and tries to find the one with the given containerName. If there is no match (or no containerName is given), selectSparkContainer takes the first container defined.

selectSparkContainer includes the other containers in the pod spec.

selectSparkContainer prints out the following WARN message to the logs when no container could be found by the given name:

specified container [name] not found on the pod template, falling back to taking the first container
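
The selection rule (by name first, then fall back to the first container) can be sketched with plain Scala collections. Container here is a hypothetical, simplified stand-in for the Kubernetes container type, and the function returns a pair instead of a SparkPod to keep the example self-contained.

```scala
object SelectContainerSketch {
  // Hypothetical, simplified stand-in for a Kubernetes container spec
  final case class Container(name: String)

  // Returns the selected container (if any) and the remaining containers
  def select(
      containers: List[Container],
      containerName: Option[String]): (Option[Container], List[Container]) = {
    val byName = containerName.flatMap { name =>
      val found = containers.find(_.name == name)
      if (found.isEmpty) {
        // Stand-in for the WARN log message
        System.err.println(
          s"WARN specified container $name not found on the pod template, " +
            "falling back to taking the first container")
      }
      found
    }
    // Fall back to the first defined container
    val selected = byName.orElse(containers.headOption)
    // Everything except the selected container
    val others = containers.diff(selected.toList)
    (selected, others)
  }
}
```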

Uploading Local Files to Hadoop DFS

uploadAndTransformFileUris(
  fileUris: Iterable[String],
  conf: Option[SparkConf] = None): Iterable[String]

uploadAndTransformFileUris uploads local files in the given fileUris to Hadoop DFS (based on spark.kubernetes.file.upload.path configuration property).

In the end, uploadAndTransformFileUris returns the target URIs.

uploadAndTransformFileUris is used when:

uploadFileUri

uploadFileUri(
  uri: String,
  conf: Option[SparkConf] = None): String

uploadFileUri resolves the given uri to a well-formed file URI.

uploadFileUri creates a new Hadoop Configuration and resolves the spark.kubernetes.file.upload.path configuration property to a Hadoop FileSystem.

uploadFileUri creates (mkdirs) the Hadoop DFS directory to upload the file to, in the following format:

[spark.kubernetes.file.upload.path]/[spark-upload-[randomUUID]]

uploadFileUri prints out the following INFO message to the logs:

Uploading file: [path] to dest: [targetUri]...

In the end, uploadFileUri uploads the file to the target location (using Hadoop DFS's FileSystem.copyFromLocalFile) and returns the target URI.
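
The upload directory format can be sketched without Hadoop on the classpath. The object and method names are hypothetical, and trimming a trailing slash from the upload path is an assumption of this sketch rather than documented behavior.

```scala
import java.util.UUID

object UploadPathSketch {
  // Builds [uploadPath]/spark-upload-[randomUUID]
  def targetDir(uploadPath: String): String = {
    // Assumption: avoid a double slash when the upload path ends with "/"
    val base = uploadPath.stripSuffix("/")
    val randomDirName = s"spark-upload-${UUID.randomUUID()}"
    s"$base/$randomDirName"
  }
}
```

The random UUID gives every upload its own directory, so files with the same name from different submissions do not overwrite each other.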

SparkExceptions

uploadFileUri throws a SparkException when:

  1. Uploading the uri fails:

    Uploading file [path] failed...
    
  2. spark.kubernetes.file.upload.path configuration property is not defined:

    Please specify spark.kubernetes.file.upload.path property.
    
  3. SparkConf is not defined:

    Spark configuration is missing...
    

renameMainAppResource

renameMainAppResource(
  resource: String,
  conf: SparkConf): String

renameMainAppResource returns the spark-internal internal name when the given resource is local and resolvable. Otherwise, renameMainAppResource returns the given resource as-is.

renameMainAppResource is used when:

  • DriverCommandFeatureStep is requested for the base driver container (for a JavaMainAppResource application)

isLocalAndResolvable

isLocalAndResolvable(
  resource: String): Boolean

isLocalAndResolvable is true when the given resource meets both conditions:

  1. It is not internal
  2. It uses either the file scheme or no URI scheme (after converting to a well-formed URI)

isLocalAndResolvable is used when:

isLocalDependency

isLocalDependency(
  uri: URI): Boolean

An input URI is a local dependency when the scheme is null (undefined) or file.
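
The three resource checks (renameMainAppResource, isLocalAndResolvable and isLocalDependency) fit together roughly as in the sketch below. It is a simplification under stated assumptions: new URI(resource) stands in for Spark's conversion to a well-formed URI (and throws URISyntaxException for strings that are not valid URIs), the SparkConf parameter is left out, and the object name is hypothetical.

```scala
import java.net.URI

object LocalResourceSketch {
  // Internal name used for resources that need no upload
  private val InternalName = "spark-internal"

  // Local when the scheme is undefined (null) or file
  def isLocalDependency(uri: URI): Boolean = uri.getScheme match {
    case null | "file" => true
    case _ => false
  }

  // Not internal and local (assumption: new URI replaces Spark's URI resolution)
  def isLocalAndResolvable(resource: String): Boolean =
    resource != InternalName && isLocalDependency(new URI(resource))

  // Local, resolvable resources are renamed; everything else is kept as-is
  def renameMainAppResource(resource: String): String =
    if (isLocalAndResolvable(resource)) InternalName else resource
}
```

With these definitions, /opt/app.jar and file:///opt/app.jar are renamed to spark-internal, while hdfs://nn/app.jar and local:///opt/app.jar are returned unchanged.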

Logging

Enable ALL logging level for org.apache.spark.deploy.k8s.KubernetesUtils logger to see what happens inside.

Add the following line to conf/log4j.properties:

log4j.logger.org.apache.spark.deploy.k8s.KubernetesUtils=ALL

Refer to Logging.


Last update: 2021-01-27