Distribution with YARN #790

Closed
AbdealiLoKo opened this issue Aug 6, 2017 · 6 comments

Comments

@AbdealiLoKo

I've been looking at the examples in lightgbm for a distributed run and noticed that I need to mention the machine IPs and so on.

Is there an easier way to submit a job with YARN, the way XGBoost does with dmlc-submit?
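For context, the manual setup being described looks roughly like the sketch below. It generates the two files that LightGBM's socket-based distributed mode reads: a machine list with one `ip port` pair per line, and a config that points at it. The helper name `write_distributed_config`, the IP addresses, the `data` file name, and the `objective` are placeholders chosen for illustration; only the config keys (`tree_learner`, `num_machines`, `local_listen_port`, `machine_list_file`) are LightGBM parameters.

```python
import tempfile
from pathlib import Path


def write_distributed_config(machines, workdir=".", port=12400, data="train.txt"):
    """Write mlist.txt and train.conf for a manual multi-machine LightGBM run.

    `machines` is a list of IP addresses. This helper is hypothetical and
    only illustrates the files that have to be prepared by hand.
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)

    # One "ip port" pair per line; every machine needs an identical copy.
    mlist = workdir / "mlist.txt"
    mlist.write_text("".join(f"{ip} {port}\n" for ip in machines))

    conf_lines = [
        "task = train",
        "objective = binary",            # placeholder objective
        f"data = {data}",                # placeholder training file
        "tree_learner = data",           # data-parallel tree learner
        f"num_machines = {len(machines)}",
        f"local_listen_port = {port}",
        f"machine_list_file = {mlist.name}",
    ]
    conf = workdir / "train.conf"
    conf.write_text("\n".join(conf_lines) + "\n")
    return mlist, conf


if __name__ == "__main__":
    # Placeholder addresses; replace with the real worker IPs.
    with tempfile.TemporaryDirectory() as tmp:
        mlist, conf = write_distributed_config(["10.0.0.1", "10.0.0.2"], workdir=tmp)
        print(mlist.read_text())
        print(conf.read_text())
```

Both files then have to be copied to every machine and the training command started on each one, which is the bookkeeping a YARN submitter would take over.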

@StrikerRUS changed the title from "Dsitribution with YARN" to "Distribution with YARN" on Oct 3, 2017
@zhanglistar

I am going to do this.

@StrikerRUS
Collaborator

StrikerRUS commented Jul 2, 2018

@zhanglistar Sounds great! You can open a PR marked as WIP to make it easy to track the progress and discuss questions.

@chuang39

@StrikerRUS Has anyone worked on this? We need to productionize LightGBM on YARN. If no one is working on it, we'll build a first version ourselves. Otherwise, please let me know. Thanks a lot!

@StrikerRUS
Collaborator

@chuang39 As far as I know, https://github.com/Azure/mmlspark is the only production distributed version of LightGBM.

Pinging @guolinke and @chivee, who know more about this.

@StrikerRUS
Collaborator

Closed in favor of tracking this in #2302. We decided to keep all feature requests in one place.

Contributions of this feature are welcome! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on implementing it.

@jameslamb
Collaborator

I want to note here that the new lightgbm.dask interface in the Python package might be a path to distributed training with YARN. You can run a Dask cluster on a YARN-managed cluster using the dask-yarn project.

If there's interest, we can at some point spin up an Amazon EMR cluster or some other YARN cluster and try running Dask-based training on it. Would be a fun experiment.
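A rough, untested sketch of what that experiment could look like is below. It assumes `dask-yarn`, `dask`, and `lightgbm` (3.2.0 or later, for `lightgbm.dask`) are installed, and that `environment` points at a packaged Python environment that YARN can ship to the workers. The function name `train_lightgbm_on_yarn`, the worker sizes, and the random training data are placeholders for illustration, not a recommended configuration.

```python
def train_lightgbm_on_yarn(environment, n_workers=4):
    """Start a Dask cluster on YARN and fit a LightGBM model on it.

    Hypothetical helper: `environment` is the path to a packaged
    environment (for example a conda-pack archive) available to YARN.
    """
    # Imported lazily so this module can be loaded on a machine that
    # does not have the cluster libraries installed.
    import dask.array as da
    import lightgbm as lgb
    from dask.distributed import Client
    from dask_yarn import YarnCluster

    # Ask YARN for worker containers; the sizes here are placeholders.
    cluster = YarnCluster(
        environment=environment,
        worker_vcores=2,
        worker_memory="4GiB",
    )
    cluster.scale(n_workers)
    client = Client(cluster)

    try:
        # Random placeholder data, chunked so it is spread across workers.
        X = da.random.random((100_000, 20), chunks=(10_000, 20))
        y = da.random.random((100_000,), chunks=(10_000,))

        model = lgb.DaskLGBMRegressor(n_estimators=50, client=client)
        model.fit(X, y)

        # Convert to a plain LGBMRegressor that no longer needs the cluster.
        return model.to_local()
    finally:
        client.close()
        cluster.close()
```

With this approach the Dask estimator works out where the workers are, so there is no machine list to maintain by hand. Calling `train_lightgbm_on_yarn("environment.tar.gz")` from an edge node would be the starting point, where the archive name is again a placeholder.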
