resource allocation
Resource allocation happens at two levels:
- The Glow master allocates agents to the driver.
- The driver allocates tasks to its assigned agents.
All agents periodically send heartbeats to the Glow master, together with the following information:
- Number of CPU cores
- Memory Size in MB
- Number of CPU cores already allocated
- Memory Size in MB already allocated
- Data center name of the agent
- Rack name of the agent
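The heartbeat fields listed above could be carried in a struct like the one below. This is a minimal sketch: the type name `AgentHeartbeat`, its field names, and the two helper methods are hypothetical and are not taken from Glow's source.

```go
package main

import "fmt"

// AgentHeartbeat is a hypothetical payload mirroring the fields an agent
// reports to the master. Names are illustrative, not Glow's actual types.
type AgentHeartbeat struct {
	CPUCores          int    // number of CPU cores
	MemoryMB          int64  // memory size in MB
	AllocatedCPUCores int    // CPU cores already allocated
	AllocatedMemoryMB int64  // memory in MB already allocated
	DataCenter        string // data center name of the agent
	Rack              string // rack name of the agent
}

// FreeCPUCores returns the cores that have not been allocated yet.
func (h AgentHeartbeat) FreeCPUCores() int {
	return h.CPUCores - h.AllocatedCPUCores
}

// FreeMemoryMB returns the memory in MB that has not been allocated yet.
func (h AgentHeartbeat) FreeMemoryMB() int64 {
	return h.MemoryMB - h.AllocatedMemoryMB
}

func main() {
	hb := AgentHeartbeat{
		CPUCores: 8, MemoryMB: 16384,
		AllocatedCPUCores: 2, AllocatedMemoryMB: 4096,
		DataCenter: "dc1", Rack: "rack1",
	}
	fmt.Printf("free: %d cores, %d MB\n", hb.FreeCPUCores(), hb.FreeMemoryMB())
}
```

Reporting both totals and allocated amounts lets the master derive the free capacity of each agent without tracking allocations itself.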
To keep the logic simple, the Glow driver uses a market to match supply (computing resources) with demand (computing tasks), so that tasks are executed dynamically and efficiently. The algorithm is called a Continuous Double Auction (CDA) market. It sounds fancy, but here is how it works:
The Continuous Double Auction (CDA) protocol allocates the best possible resource to an arriving task and prioritizes tasks according to their score. When a Task Query object arrives at the market, the protocol searches all available resource offers and returns the first occurrence of the 'best' match, i.e. the cheapest or the fastest resource that satisfies the task's constraints. Whenever a resource becomes available and several tasks are waiting, the one with the highest score is processed first.
The score between a task group and a resource is currently calculated this way:
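The two matching rules described above can be sketched as a small in-memory market. This is an illustration under stated assumptions, not Glow's implementation: the `Market`, `Task`, `Resource`, and `ScoreFunc` types, the `AddDemand` and `AddSupply` methods, and the `Distance` field are all hypothetical names.

```go
package main

import "fmt"

// Task and Resource are hypothetical stand-ins for demand and supply.
type Task struct {
	Name string
	Bid  float64
}

type Resource struct {
	Name     string
	Distance float64 // assumed stand-in for the data locality cost
}

// ScoreFunc rates how well a task fits a resource; higher is better.
type ScoreFunc func(t Task, r Resource) float64

// Market holds unmatched tasks and unmatched resources.
type Market struct {
	score     ScoreFunc
	tasks     []Task
	resources []Resource
}

// AddDemand matches an arriving task with the best available resource,
// or queues the task when no resource is on offer.
func (m *Market) AddDemand(t Task) (Resource, bool) {
	if len(m.resources) == 0 {
		m.tasks = append(m.tasks, t)
		return Resource{}, false
	}
	best := 0
	for i := range m.resources {
		// Strict ">" keeps the first occurrence among equal scores.
		if m.score(t, m.resources[i]) > m.score(t, m.resources[best]) {
			best = i
		}
	}
	r := m.resources[best]
	m.resources = append(m.resources[:best], m.resources[best+1:]...)
	return r, true
}

// AddSupply hands an arriving resource to the highest-scoring waiting task,
// or keeps the resource on offer when no task is waiting.
func (m *Market) AddSupply(r Resource) (Task, bool) {
	if len(m.tasks) == 0 {
		m.resources = append(m.resources, r)
		return Task{}, false
	}
	best := 0
	for i := range m.tasks {
		if m.score(m.tasks[i], r) > m.score(m.tasks[best], r) {
			best = i
		}
	}
	t := m.tasks[best]
	m.tasks = append(m.tasks[:best], m.tasks[best+1:]...)
	return t, true
}

func main() {
	m := &Market{score: func(t Task, r Resource) float64 {
		return t.Bid / (1 + r.Distance)
	}}
	m.AddSupply(Resource{Name: "far", Distance: 3})
	m.AddSupply(Resource{Name: "near", Distance: 0})
	r, _ := m.AddDemand(Task{Name: "map", Bid: 10})
	fmt.Println("task map runs on", r.Name)
}
```

Both sides use the same scoring function, so the market is symmetric: whichever side arrives second triggers the match.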
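// Start at 1 so the division below is always defined.
cost := float64(1)
// Add the distance from each input shard to the candidate resource.
for _, input := range taskGroup.Inputs {
    cost += get_shard_location(input).Distance(resourceLocation)
}
// A higher bid and closer input data both raise the score.
return float64(bid) / cost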
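To make the scoring fragment concrete, here is a self-contained sketch of the same formula. The `Location` type, the distance weights (1000, 100, 10, 0), and the `score` function signature are assumptions chosen for illustration and are not taken from Glow's source; only the formula `bid / (1 + sum of distances)` comes from the fragment above.

```go
package main

import "fmt"

// Location is a hypothetical description of where a shard or resource lives.
type Location struct {
	DataCenter, Rack, Server string
}

// Distance returns an assumed locality cost between two locations.
// The weights are illustrative only.
func (a Location) Distance(b Location) float64 {
	switch {
	case a.DataCenter != b.DataCenter:
		return 1000
	case a.Rack != b.Rack:
		return 100
	case a.Server != b.Server:
		return 10
	}
	return 0
}

// score mirrors the fragment above: the bid divided by one plus the
// total distance from every input shard to the candidate resource.
func score(bid float64, inputs []Location, resource Location) float64 {
	cost := float64(1)
	for _, in := range inputs {
		cost += in.Distance(resource)
	}
	return bid / cost
}

func main() {
	inputs := []Location{
		{DataCenter: "dc1", Rack: "r1", Server: "s1"},
		{DataCenter: "dc1", Rack: "r1", Server: "s2"},
	}
	local := Location{DataCenter: "dc1", Rack: "r1", Server: "s1"}
	remote := Location{DataCenter: "dc2", Rack: "r9", Server: "s7"}
	fmt.Println("local score: ", score(100, inputs, local))
	fmt.Println("remote score:", score(100, inputs, remote))
}
```

With these assumed weights, a resource on the same server as the inputs scores far higher than one in another data center, so tasks gravitate toward their data.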
So the architecture is:
Driver => adds tasks => CDA Market <= Supplier adds resources <= Master
The driver program adds tasks to the market as demand, and a supplier adds agents, fetched from the master, to the market as supply.
Currently the supplier gets agents from the master. The agents are priced as totalBid/numberOfTaskGroups. This area may need more tweaking later.
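As a small worked example of that pricing rule, the sketch below divides the total bid evenly across task groups. The function name `pricePerAgent` and the zero guard for an empty flow are assumptions added for illustration.

```go
package main

import "fmt"

// pricePerAgent applies the rule totalBid/numberOfTaskGroups.
// Returning 0 when there are no task groups is an assumption made here
// to avoid dividing by zero.
func pricePerAgent(totalBid float64, numberOfTaskGroups int) float64 {
	if numberOfTaskGroups <= 0 {
		return 0
	}
	return totalBid / float64(numberOfTaskGroups)
}

func main() {
	// A flow with a total bid of 100 split over 4 task groups.
	fmt.Println(pricePerAgent(100, 4))
}
```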
The Glow master tries to allocate a group of agents to the driver, ideally agents in the same rack and the same data center, and falls back to other racks or data centers only when no more agents are available there.
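One way to express this locality preference is to rank agents by how close they are to a reference agent and take the closest ones first. This is a sketch under stated assumptions: the `Agent` type, `localityRank`, `pickAgents`, and the choice of the first available agent as the reference point are all hypothetical and do not describe the master's actual algorithm.

```go
package main

import (
	"fmt"
	"sort"
)

// Agent is a hypothetical view of an agent known to the master.
type Agent struct {
	Name, DataCenter, Rack string
}

// localityRank returns 0 for the same rack in the same data center,
// 1 for the same data center only, and 2 for anything else.
func localityRank(a, anchor Agent) int {
	switch {
	case a.DataCenter == anchor.DataCenter && a.Rack == anchor.Rack:
		return 0
	case a.DataCenter == anchor.DataCenter:
		return 1
	}
	return 2
}

// pickAgents selects up to n agents, preferring those co-located with the
// first available agent (an assumed anchor choice).
func pickAgents(available []Agent, n int) []Agent {
	if len(available) == 0 || n <= 0 {
		return nil
	}
	anchor := available[0]
	// Copy so the caller's slice order is left untouched.
	sorted := append([]Agent(nil), available...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return localityRank(sorted[i], anchor) < localityRank(sorted[j], anchor)
	})
	if n > len(sorted) {
		n = len(sorted)
	}
	return sorted[:n]
}

func main() {
	agents := []Agent{
		{Name: "a1", DataCenter: "dc1", Rack: "r1"},
		{Name: "b1", DataCenter: "dc2", Rack: "r5"},
		{Name: "a2", DataCenter: "dc1", Rack: "r2"},
		{Name: "a3", DataCenter: "dc1", Rack: "r1"},
	}
	for _, a := range pickAgents(agents, 3) {
		fmt.Println(a.Name, a.DataCenter, a.Rack)
	}
}
```

The stable sort keeps the original order among agents with equal rank, so agents in other racks or data centers are only chosen once closer ones run out.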
Of course, if more agents are available, the flow will be executed faster. But in theory, a flow can proceed even with just one agent. This would be helpful when there are limited resources available.