= BroadcastManager
BroadcastManager is a Spark service to manage Broadcast.md[broadcast variables] in a Spark application.
[Figure: BroadcastManager, SparkEnv and BroadcastFactory]
BroadcastManager assigns <
BroadcastManager is used to create a scheduler:MapOutputTrackerMaster.md#BroadcastManager[MapOutputTrackerMaster].
== [[creating-instance]] Creating Instance
BroadcastManager takes the following to be created:
- <<isDriver, isDriver>> flag
- [[conf]] SparkConf.md[SparkConf]
- [[securityManager]] SecurityManager
When created, BroadcastManager <
BroadcastManager is created when SparkEnv is core:SparkEnv.md[created] (for the driver and executors and hence the need for the <
== [[isDriver]] isDriver Flag
BroadcastManager is given the isDriver flag when <
The isDriver flag indicates whether the initialization happens on the driver (true) or executors (false).
BroadcastManager uses the flag when requested to <
== [[broadcastFactory]] TorrentBroadcastFactory
BroadcastManager manages a core:BroadcastFactory.md[BroadcastFactory]:
- It is created and initialized in <<initialize, initialize>>
- It is stopped in <<stop, stop>> (and that is all it does)
BroadcastManager uses the BroadcastFactory when requested to <
== [[cachedValues]] cachedValues Registry
[source,scala]
cachedValues: ReferenceMap
== [[nextBroadcastId]] Unique Identifiers of Broadcast Variables
BroadcastManager tracks broadcast variables and controls their identifiers.
Every <
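
A minimal sketch of how unique identifiers could be handed out, assuming a thread-safe counter that starts at 0 and grows by 1. `BroadcastIdGenerator` and `nextId` are hypothetical names used only for illustration; they are not part of Spark's API.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for the identifier bookkeeping described above.
final class BroadcastIdGenerator {
  // Assumption: identifiers start at 0 and increase by 1 per broadcast variable.
  private val nextBroadcastId = new AtomicLong(0)

  // Returns the current identifier and advances the counter atomically,
  // so concurrent callers never receive the same value.
  def nextId(): Long = nextBroadcastId.getAndIncrement()
}
```

An atomic counter avoids explicit locking while still guaranteeing that no two requests observe the same identifier.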
== [[initialize]][[initialized]] Initializing BroadcastManager
[source, scala]
initialize(): Unit
initialize creates a <
initialize turns the initialized internal flag on to guard against multiple initializations. With the initialized flag already enabled, initialize does nothing.
initialize is used once when BroadcastManager is <
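
The guard described above can be sketched as follows. This is a simplified, Spark-free model: `InitOnceService`, `setupRuns` and `timesInitialized` are hypothetical names, and the use of `synchronized` is an assumption made here to keep the example thread-safe.

```scala
// Hypothetical, simplified model of an initialize-once service.
final class InitOnceService {
  private var initialized = false
  // Counts how often the setup actually ran (for illustration only).
  private var setupRuns = 0

  def initialize(): Unit = synchronized {
    if (!initialized) {
      setupRuns += 1      // stands in for creating and initializing the factory
      initialized = true  // any later call becomes a no-op
    }
  }

  def timesInitialized: Int = synchronized(setupRuns)
}
```

Calling `initialize()` repeatedly on the same instance runs the setup only once.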
== [[stop]] Stopping BroadcastManager
[source, scala]
stop(): Unit
stop requests the <
== [[newBroadcast]] Creating Broadcast Variable
[source, scala]
newBroadcast[T](value_: T, isLocal: Boolean): Broadcast[T]
newBroadcast requests the core:BroadcastFactory.md[current BroadcastFactory for a new broadcast variable].
The BroadcastFactory is created when <
newBroadcast is used when:
- MapOutputTracker utility is used to scheduler:MapOutputTracker.md#serializeMapStatuses[serializeMapStatuses]
- SparkContext is requested for a SparkContext.md#broadcast[new broadcast variable]
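
The delegation from the manager to its factory can be illustrated with a small, Spark-free sketch. All types below (`SketchBroadcast`, `SketchBroadcastFactory`, `SimpleFactory`, `SketchBroadcastManager`) are hypothetical stand-ins, and passing a fresh identifier to the factory on every request is an assumption of this sketch rather than a statement about Spark's implementation.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-ins; the names mirror the text but are not Spark's API.
final case class SketchBroadcast[T](id: Long, value: T)

trait SketchBroadcastFactory {
  def newBroadcast[T](value: T, isLocal: Boolean, id: Long): SketchBroadcast[T]
}

// A trivial factory that simply wraps the value together with its identifier.
object SimpleFactory extends SketchBroadcastFactory {
  def newBroadcast[T](value: T, isLocal: Boolean, id: Long): SketchBroadcast[T] =
    SketchBroadcast(id, value)
}

final class SketchBroadcastManager(factory: SketchBroadcastFactory) {
  // Assumption: identifiers start at 0 and increase by 1.
  private val nextBroadcastId = new AtomicLong(0)

  // Delegates to the factory, handing over a fresh identifier per request.
  def newBroadcast[T](value: T, isLocal: Boolean): SketchBroadcast[T] =
    factory.newBroadcast(value, isLocal, nextBroadcastId.getAndIncrement())
}
```

Keeping the factory behind a trait lets the manager stay unaware of how a broadcast variable is actually built or distributed.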