-
Notifications
You must be signed in to change notification settings - Fork 247
Common Code Structure
Golang is statically compiled. It's not planned to have a feature to dynamically move computation closure across servers, AFAIK.
So Glow has this strategy: one binary code, 3 running mode: standalone mode, driver mode and task mode.
register_one_or_more_flows //shared by all modes
...
if option "-glow.flow.id" and "-glow.taskGroup.id" is passed in {
the binary will run in task mode.
}else if option "-glow" is passed in {
the binary will run in driver mode.
}else{
starts in standalone mode.
}
This way, all the flows will be statically registered. Given a "-glow.flow.id", the task mode will know which flow to execute.
If you have just one flow, without any driver specific logic, everything can be just in the main() section.
func main(){
flag.Parse() // required for distributed execution
flow.New().Map(...).Reduce(...).Join(...).Run()
}
For many data processing job, this could be enough.
Note: when starting in task mode, all the original driver's environment variables, commandline options, are all passed to the task mode.
If we need to do additional things in driver mode, such as feed into input channels, read from output channels, we need to mark where the shared code stops and the driver/task mode should run in their own way.
We need to add this magic line:
func main(){
flag.Parse()
f1 = register_one_flow // shared by all modes
flow.Ready() // the magic line!
// start driver specific logic
...
// run in driver mode
f1.Run()
}
Actually the function name "Ready" may not correctly show its meaning. If you have any suggestions, please let me know.
This flow.Ready() function will check which mode it should run. If it is task mode, it will start to run the specified task, and stop. If it is driver mode or standalone mode, it will just continue.
When a Glow program starts, it should register the flow, or a list of flows. This is ideal for Golang's "func init()" section. So Glow's idiomatic way is to create the flows in the init sections.
// file1.go
var f1 *flow.FlowContext
func init(){
f1 = flow.New().Map(...).Reduce(...).Join(...)
}
// file1.go
var f2 *flow.FlowContext
func init(){
f2 = flow.New().Map(...).Reduce(...).Join(...)
}
// main.go
func main(){
flag.Parse()
flow.Ready()
// starts driver specific logic
...
// complicated logic, such as if/else, for loop, etc.
if something {
f1.Run()
}else{
f2.Run()
}
}