# Utils Utility

## Local URI Scheme
Utils defines a local URI scheme for files that are locally available on worker nodes in the cluster.
The local URL scheme is used when:
- Utils is used to isLocalUri
- Client (Spark on YARN) is used
## isLocalUri

```scala
isLocalUri(
  uri: String): Boolean
```
isLocalUri is true when the URI is a local: URI (the given uri starts with local: scheme).
isLocalUri is used when:
- FIXME
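The check is a simple prefix test. A minimal sketch (hedged; the actual Utils keeps the scheme name in a constant rather than a literal):

```scala
// Hedged sketch of the local: URI check described above.
val LocalScheme = "local"

def isLocalUri(uri: String): Boolean =
  uri.startsWith(s"$LocalScheme:")
```

For example, `isLocalUri("local:/opt/spark/jars/app.jar")` is `true`, while an `hdfs://` URI is not.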
## getCurrentUserName

```scala
getCurrentUserName(): String
```
getCurrentUserName computes the name of the user who started the SparkContext.md[SparkContext] instance.
NOTE: It is later available as SparkContext.md#sparkUser[SparkContext.sparkUser].
Internally, it reads the SparkContext.md#SPARK_USER[SPARK_USER] environment variable and, if not set, falls back to Hadoop Security API's UserGroupInformation.getCurrentUser().getShortUserName().
NOTE: It is another place where Spark relies on the Hadoop API for its operation.
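The lookup order can be sketched as follows. The environment lookup and the Hadoop call are passed in as parameters here (a hypothetical signature, for illustration only) so the sketch stays self-contained and does not need Hadoop on the classpath:

```scala
// Sketch of the lookup order behind getCurrentUserName:
// the SPARK_USER environment variable wins; otherwise fall back to
// Hadoop's UserGroupInformation.getCurrentUser().getShortUserName
// (represented here by the hadoopUser thunk).
def currentUserName(
    env: String => Option[String],
    hadoopUser: () => String): String =
  env("SPARK_USER").getOrElse(hadoopUser())
```

Passing the real `System.getenv` and the real Hadoop call would reproduce the behavior described above.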
## localHostName

```scala
localHostName(): String
```
localHostName computes the local host name.
It first checks the SPARK_LOCAL_HOSTNAME environment variable. If it is not defined, it resolves SPARK_LOCAL_IP to a host name (using InetAddress.getByName). If that is not defined either, it calls InetAddress.getLocalHost for the name.
NOTE: Utils.localHostName is executed while SparkContext.md#creating-instance[SparkContext is created] and also to compute the default value of spark-driver.md#spark_driver_host[spark.driver.host Spark property].
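The three-step resolution order can be sketched as follows (the environment lookup is a parameter, an assumption made here so the sketch can be exercised without touching the real environment):

```scala
import java.net.InetAddress

// Sketch of the resolution order: SPARK_LOCAL_HOSTNAME first,
// then SPARK_LOCAL_IP (resolved via InetAddress.getByName),
// then InetAddress.getLocalHost as the last resort.
def localHostName(env: String => Option[String]): String =
  env("SPARK_LOCAL_HOSTNAME")
    .orElse(env("SPARK_LOCAL_IP")
      .map(ip => InetAddress.getByName(ip).getHostName))
    .getOrElse(InetAddress.getLocalHost.getHostName)
```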
## getUserJars

```scala
getUserJars(
  conf: SparkConf): Seq[String]
```
getUserJars returns the non-empty entries of the spark.jars configuration property.
getUserJars is used when:
- SparkContext is created
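A minimal sketch of the filtering, assuming a comma-separated spark.jars value and using a plain Map in place of SparkConf to keep the example self-contained:

```scala
// Sketch of getUserJars: split the (assumed comma-separated)
// spark.jars value and keep only the non-empty entries.
def getUserJars(conf: Map[String, String]): Seq[String] =
  conf.get("spark.jars")
    .map(_.split(",").toSeq)
    .getOrElse(Seq.empty)
    .filter(_.nonEmpty)
```

For instance, a value of `"a.jar,,b.jar"` yields `Seq("a.jar", "b.jar")`.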
## extractHostPortFromSparkUrl

```scala
extractHostPortFromSparkUrl(
  sparkUrl: String): (String, Int)
```
extractHostPortFromSparkUrl creates a Java URI with the input sparkUrl and takes the host and port parts.
extractHostPortFromSparkUrl asserts that the input sparkUrl uses the spark scheme.
extractHostPortFromSparkUrl throws a SparkException for unparseable spark URLs:
```
Invalid master URL: [sparkUrl]
```
extractHostPortFromSparkUrl is used when:
- StandaloneSubmitRequestServlet is requested to buildDriverDescription
- RpcAddress is requested to extract an RpcAddress from a Spark master URL
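The parsing can be sketched with java.net.URI (hedged; the real method throws SparkException and validates a few more URI parts than shown here):

```scala
import java.net.URI

// Sketch: parse the Spark master URL, keep host and port,
// and reject anything that is not a well-formed spark:// URL.
def extractHostPort(sparkUrl: String): (String, Int) = {
  val uri  = new URI(sparkUrl)
  val host = uri.getHost
  val port = uri.getPort
  require(uri.getScheme == "spark" && host != null && port >= 0,
    s"Invalid master URL: $sparkUrl")
  (host, port)
}
```

So `extractHostPort("spark://master:7077")` yields `("master", 7077)`, while a non-spark URL fails the check.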
## isDynamicAllocationEnabled

```scala
isDynamicAllocationEnabled(
  conf: SparkConf): Boolean
```
isDynamicAllocationEnabled is true when the following hold:
- spark.dynamicAllocation.enabled configuration property is true
- spark.master is non-local
isDynamicAllocationEnabled is used when:
- SparkContext is created (to start an ExecutorAllocationManager)
- DAGScheduler is requested to checkBarrierStageWithDynamicAllocation
- SchedulerBackendUtils is requested to getInitialTargetExecutorNumber
- StandaloneSchedulerBackend (Spark Standalone) is requested to start
- ExecutorPodsAllocator (Spark on Kubernetes) is requested to onNewSnapshots
- ApplicationMaster (Spark on YARN) is created
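The two conditions above can be sketched as follows, over a plain Map in place of SparkConf (the default of false for spark.dynamicAllocation.enabled is an assumption here; the real method also honors a testing flag):

```scala
// Sketch of isDynamicAllocationEnabled: dynamic allocation must be
// switched on explicitly AND the master must not be local.
def isDynamicAllocationEnabled(conf: Map[String, String]): Boolean =
  conf.getOrElse("spark.dynamicAllocation.enabled", "false").toBoolean &&
    !conf.getOrElse("spark.master", "").startsWith("local")
```

With `spark.master` set to `yarn` and the flag enabled the check holds; with `local[4]` it does not.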
## checkAndGetK8sMasterUrl

```scala
checkAndGetK8sMasterUrl(
  rawMasterURL: String): String
```
checkAndGetK8sMasterUrl...FIXME
checkAndGetK8sMasterUrl is used when:
- SparkSubmit is requested to prepareSubmitEnvironment (for the Kubernetes cluster manager)
## getLocalDir

```scala
getLocalDir(
  conf: SparkConf): String
```
getLocalDir...FIXME
getLocalDir is used when:
- Utils is requested to <>
- SparkEnv is core:SparkEnv.md#create[created] (on the driver)
- spark-shell.md[spark-shell] is launched
- Spark on YARN's Client is requested to spark-yarn-client.md#prepareLocalResources[prepareLocalResources] and spark-yarn-client.md#createConfArchive[create ++spark_conf.zip++ archive with configuration files and Spark configuration]
- PySpark's PythonBroadcast is requested to readObject
- PySpark's EvalPythonExec is requested to doExecute
## Fetching File

```scala
fetchFile(
  url: String,
  targetDir: File,
  conf: SparkConf,
  securityMgr: SecurityManager,
  hadoopConf: Configuration,
  timestamp: Long,
  useCache: Boolean): File
```
fetchFile...FIXME
fetchFile is used when:
- SparkContext is requested to SparkContext.md#addFile[addFile]
- Executor is requested to executor:Executor.md#updateDependencies[updateDependencies]
- Spark Standalone's DriverRunner is requested to downloadUserJar
## getOrCreateLocalRootDirs

```scala
getOrCreateLocalRootDirs(
  conf: SparkConf): Array[String]
```
getOrCreateLocalRootDirs...FIXME
getOrCreateLocalRootDirs is used when:
- Utils is requested to <>
- Worker is requested to spark-standalone-worker.md#receive[handle a LaunchExecutor message]
## getOrCreateLocalRootDirsImpl

```scala
getOrCreateLocalRootDirsImpl(
  conf: SparkConf): Array[String]
```
getOrCreateLocalRootDirsImpl...FIXME
getOrCreateLocalRootDirsImpl is used when Utils is requested to getOrCreateLocalRootDirs