Change sampling strategy #15

tribhuvanesh · 2014-11-21T11:41:19Z

Right now, setting solverOptions.sampleFrac to, say, 0.5 makes the master node sample 50% of the training data and hand each of the K workers (50/K)% of the data.

This does not scale well: each worker's share shrinks as K grows, so adding more workers forces users to raise the sampling fraction in order to scale.
Rather, each worker should independently sample the data and then perform the computation.
With that change, adding more workers should result in faster convergence without touching sampleFrac.
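
To make the difference concrete, here is a minimal sketch in plain Python (not the project's actual code). The function names `master_side_sample` and `per_worker_sample` are hypothetical, and the sketch assumes one reading of the proposal: each worker draws its own sample of `sample_frac` of the data, so its workload does not depend on K.

```python
import random


def master_side_sample(n, num_workers, sample_frac, seed=0):
    """Current behaviour as described above: the master samples once,
    then splits that single sample across the workers."""
    rng = random.Random(seed)
    sampled = rng.sample(range(n), int(n * sample_frac))
    # Deal the sampled indices out round-robin, one share per worker.
    return [sampled[w::num_workers] for w in range(num_workers)]


def per_worker_sample(n, num_workers, sample_frac, seed=0):
    """Proposed behaviour (assumed reading): every worker samples
    independently, so its share does not shrink as workers are added."""
    shares = []
    for w in range(num_workers):
        rng = random.Random(seed + w)  # separate random stream per worker
        shares.append(rng.sample(range(n), int(n * sample_frac)))
    return shares


if __name__ == "__main__":
    n, frac = 1000, 0.5
    for k in (2, 4, 8):
        old = [len(s) for s in master_side_sample(n, k, frac)]
        new = [len(s) for s in per_worker_sample(n, k, frac)]
        print(f"K={k}: master-side share sizes {old}, per-worker share sizes {new}")
```

With master-side sampling the per-worker share is roughly n * sampleFrac / K, so it halves every time K doubles. With independent sampling it stays at n * sampleFrac, and the total number of points processed per round grows with K.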

tribhuvanesh self-assigned this Nov 21, 2014
