-
Notifications
You must be signed in to change notification settings - Fork 247
Flow Execution
Let's say you have a "hello_world.go" file, and compiled into a "hello_word" binary file.
When "hello_word -glow" starts, it runs locally on your computer (duh...), and it will try to run a flow. It generates an optimized execution plan by merging steps, and divides the steps into task groups based on the number of partitions.
A task group needs to run on an executor. So the driver program will ask the master for available executors.
The resources, cpu cores and memory, are reported to the master by agents through heartbeats.
However the master may not always give the exact number of executors that the driver is asking for, due to resource limitation, competition with other driver programs, etc. But since all communication cross servers are pull-based and asynchronous, glow runs fine with limited resources. In extreme case, a flow can proceed even with just 1 single executor.
And even some executors start to run, the driver program will still request resources from the master if the compute demand is still high.
If some tasks are finished early, the executors are re-used directly instead of return and request again.
For each allocated server, the driver program will assign a group of tasks to the agent that manages the server. Then the agent replicates the driver binary file if not cached already, and starts to run the binary file in "task" mode, by specifying "-glow.flow.id", "-glow.taskGroup.id", etc. The driver also tells the task group about the input dataset shard locations.
The driver will try to run tasks that have inputs ready or in computing process. So if task group 3 depends on outputs of task group 5, task group 5 will run first.
Since all flows are statically defined, the tasks to run are deterministic given flow.id and taskGroup.id, and input dataset shard locations. The output is written to local agent. The inputs are retrieved from the location served by the remote agents.
A flow can also have input channels and output channels, through which the driver program will send and receive data. Ideally you should not pump a lot of data through these channels, but just data locations or succinct final results.