Manoj Awasthi edited this page Nov 21, 2016 · 5 revisions

Architecture

In a glow program, datasets are linked together by computation steps. Each dataset can be partitioned into dataset shards. Each computation step can also be partitioned into computation tasks.
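To make the partitioning concrete, here is a minimal sketch in plain Go. It does not use Glow's API: the `partition` and `runStep` helpers and the modulo partitioning rule are assumptions made for illustration. A dataset is split into shards, and one computation task runs per shard.

```go
package main

import (
	"fmt"
	"sync"
)

// partition splits a dataset into n shards by value modulo n.
// The partitioning rule is an assumption; any stable rule would do.
func partition(values []int, n int) [][]int {
	shards := make([][]int, n)
	for _, v := range values {
		i := v % n
		if i < 0 {
			i += n
		}
		shards[i] = append(shards[i], v)
	}
	return shards
}

// runStep runs one task per shard concurrently and returns each shard's sum.
// Each task writes only to its own slot, so no locking is needed.
func runStep(shards [][]int) []int {
	sums := make([]int, len(shards))
	var wg sync.WaitGroup
	for i, shard := range shards {
		wg.Add(1)
		go func(i int, shard []int) {
			defer wg.Done()
			for _, v := range shard {
				sums[i] += v
			}
		}(i, shard)
	}
	wg.Wait()
	return sums
}

func main() {
	shards := partition([]int{1, 2, 3, 4, 5, 6}, 3)
	fmt.Println(runStep(shards)) // prints [9 5 7]
}
```

The number of tasks in a step follows the number of shards, so a step's parallelism is set by how its input dataset is partitioned.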

Glow makes full use of Golang's channel feature.

In local mode, data is fed into computation tasks through input channels, and each task writes its output through a channel to a new dataset.
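The local-mode wiring can be sketched with ordinary Go channels. This is an illustration only, not Glow's actual API: `mapTask`, `sumTask`, and `runLocal` are hypothetical names. Each task reads from an input channel and writes to an output channel, and closing a channel marks the end of a dataset.

```go
package main

import "fmt"

// mapTask reads every value from in, applies f, and writes the result to out.
// It closes out once in is drained so the next step knows the dataset is complete.
func mapTask(in <-chan int, out chan<- int, f func(int) int) {
	for v := range in {
		out <- f(v)
	}
	close(out)
}

// sumTask consumes a whole dataset and returns the total.
func sumTask(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

// runLocal wires source -> map -> sum using only channels.
func runLocal(values []int) int {
	source := make(chan int)
	doubled := make(chan int)

	// Feed the source dataset, then signal the end of input.
	go func() {
		for _, v := range values {
			source <- v
		}
		close(source)
	}()
	go mapTask(source, doubled, func(v int) int { return v * 2 })

	return sumTask(doubled)
}

func main() {
	fmt.Println(runLocal([]int{1, 2, 3})) // prints 12
}
```

Because the tasks see only channels, the same task code could in principle be fed by a goroutine that reads from a remote shard instead of a local slice.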

In distributed mode, a group of tasks runs together on one server, with each task pulling its own input data from remote dataset shards. The output is streamed to local disk, from which downstream task groups pull it. All of this plumbing is hidden from the computation, since the tasks only read from their inputs and write to their output via ordinary Go channels.

Architecture in distributed mode

The components in distributed mode are the Master, Agents, the Driver, and TaskGroups.

Glow Master

The master collects resource information from agents. Resources are CPU, memory, allocated CPUs, allocated memory, etc.

Currently there is only a single master. But since the master holds only soft state reported by agents, it can easily be extended to a master cluster to avoid a single point of failure (SPOF).
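The idea of soft state can be sketched as follows. This is not Glow's implementation: the `Resource` fields, the `Master` type, and the expiry rule are all assumptions. The point is that the master stores only what agents last reported, so a fresh master can rebuild its view from the next round of heartbeats.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Resource is a hypothetical heartbeat payload; the field names are assumptions.
type Resource struct {
	CPUs          int
	MemoryMB      int
	AllocatedCPUs int
	AllocatedMB   int
}

type agentEntry struct {
	res      Resource
	lastSeen time.Time
}

// Master keeps only the latest report from each agent.
type Master struct {
	mu     sync.Mutex
	ttl    time.Duration
	agents map[string]agentEntry
}

func NewMaster(ttl time.Duration) *Master {
	return &Master{ttl: ttl, agents: make(map[string]agentEntry)}
}

// Heartbeat records the latest report from an agent.
func (m *Master) Heartbeat(agent string, r Resource, now time.Time) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.agents[agent] = agentEntry{res: r, lastSeen: now}
}

// FreeCPUs forgets agents whose last heartbeat is older than ttl and
// returns the unallocated CPUs across the remaining agents.
func (m *Master) FreeCPUs(now time.Time) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	free := 0
	for name, e := range m.agents {
		if now.Sub(e.lastSeen) > m.ttl {
			delete(m.agents, name)
			continue
		}
		free += e.res.CPUs - e.res.AllocatedCPUs
	}
	return free
}

// demoSoftState returns the free CPU count before and after one agent goes silent.
func demoSoftState() (before, after int) {
	m := NewMaster(10 * time.Second)
	t0 := time.Unix(0, 0)
	m.Heartbeat("agent-a", Resource{CPUs: 4, AllocatedCPUs: 1}, t0)
	m.Heartbeat("agent-b", Resource{CPUs: 8}, t0.Add(8*time.Second))
	before = m.FreeCPUs(t0.Add(9 * time.Second))
	after = m.FreeCPUs(t0.Add(15 * time.Second))
	return before, after
}

func main() {
	before, after := demoSoftState()
	fmt.Println(before, after) // prints 11 8
}
```

Nothing here needs to be persisted, which is what makes running several masters side by side plausible.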

Glow Agent

An agent wears several hats:

  1. reports system resources and usage to master via heartbeats.
  2. accepts tasks from driver program.
  3. fetches the binary executable from driver program.
  4. starts the executable in task mode.
  5. accepts remote reads and local writes for a dataset.

Driver program

The driver program is just the code that Glow users write. When executed with the "-glow" option, it drives the distributed execution:

  1. creates optimized execution plan, grouping tasks into task groups.
  2. requests resources from master.
  3. allocates tasks to assigned servers.
  4. if a dataset has input or output channels, it writes to or receives from those datasets.
  5. after the flow finishes, it cleans up intermediate datasets generated during run time.

TaskGroup program

Tasks can usually be grouped together. A task group program uses the same binary executable as the driver program, but runs in task mode.

  1. sets up inputs and outputs for the tasks.
  2. executes the tasks.

How are executions moved to remote servers?

One feature Go lacks is the ability to dynamically move an execution closure across the network. AFAIK it will not be possible any time soon. Given the current situation, Glow is designed to move the whole binary and run it in different modes, e.g., task mode and driver mode.
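A rough sketch of one binary running in several modes is shown below. Only the "-glow" option comes from this page. The `-task-group` flag and the `chooseMode` helper are hypothetical and stand in for whatever arguments an agent actually passes when it starts the executable in task mode.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// chooseMode inspects the command line and decides how this binary should run.
// "-glow" is the option named in this page; "-task-group" is a made-up flag
// used here only to illustrate task mode.
func chooseMode(args []string) (string, error) {
	fs := flag.NewFlagSet("app", flag.ContinueOnError)
	driver := fs.Bool("glow", false, "drive the distributed execution")
	taskGroup := fs.Int("task-group", -1, "hypothetical: id of the task group to run")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	switch {
	case *taskGroup >= 0:
		return "task", nil
	case *driver:
		return "driver", nil
	default:
		return "local", nil
	}
}

func main() {
	mode, err := chooseMode(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Println("running in", mode, "mode")
}
```

Since every copy of the binary contains the same flow definitions, the driver and the task groups agree on the flow graph without having to serialize any code.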

However, the implication is that the computation flow is static: the flow graph cannot be changed. One possible future way to allow dynamic flows is to pre-register all the flows and dynamically choose one or several of them to run.
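The pre-registration idea might look something like the sketch below. It describes a possible future design, not existing Glow behavior, and the `Flow` type, `Register`, and `Run` are hypothetical names. All flows are compiled into the binary, and the one to run is picked by name at run time.

```go
package main

import "fmt"

// Flow is a hypothetical stand-in for a complete computation flow.
type Flow func(input []int) int

// flows holds every flow compiled into the binary, keyed by name.
var flows = map[string]Flow{}

// Register adds a flow under a name; it is meant to be called at start-up.
func Register(name string, f Flow) {
	flows[name] = f
}

// Run looks up a pre-registered flow by name and executes it.
func Run(name string, input []int) (int, error) {
	f, ok := flows[name]
	if !ok {
		return 0, fmt.Errorf("flow %q is not registered", name)
	}
	return f(input), nil
}

func init() {
	Register("sum", func(input []int) int {
		total := 0
		for _, v := range input {
			total += v
		}
		return total
	})
	Register("count", func(input []int) int {
		return len(input)
	})
}

func main() {
	result, err := Run("sum", []int{1, 2, 3})
	fmt.Println(result, err) // prints 6 <nil>
}
```

Each flow is still static, but the choice among them can depend on run-time input, because the driver and every task group carry the same registry.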