Add GPU support to NVFlare vertical demo #9552

Merged: 1 commit, Sep 6, 2023
6 changes: 3 additions & 3 deletions demo/nvflare/horizontal/README.md
@@ -85,8 +85,8 @@ shutdown server
## Training with GPUs

To demo with Federated Learning using GPUs, make sure your machine has at least 2 GPUs.
-Build XGBoost with the federated learning plugin enabled along with CUDA, but with NCCL
-turned off (see the [README](../../plugin/federated/README.md)).
+Build XGBoost with the federated learning plugin enabled along with CUDA
+(see the [README](../../plugin/federated/README.md)).

-Modify `config/config_fed_client.json` and set `use_gpus` to `true`, then repeat the steps
+Modify `../config/config_fed_client.json` and set `use_gpus` to `true`, then repeat the steps
above.
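
For reference, the `use_gpus` flag can be flipped by editing the JSON by hand or with a short script. The sketch below is a hypothetical helper: it assumes `use_gpus` sits under a component's `args` in `config_fed_client.json`, which may not match the demo's exact config layout.

```python
import json

def enable_gpus(path="../config/config_fed_client.json"):
    """Set use_gpus to true wherever it appears in the client config.

    Hypothetical helper: the exact location of "use_gpus" depends on the
    demo's config layout (assumed here to live under a component's "args").
    """
    with open(path) as f:
        config = json.load(f)
    for component in config.get("components", []):
        args = component.get("args", {})
        if "use_gpus" in args:
            args["use_gpus"] = True
    with open(path, "w") as f:
        json.dump(config, f, indent=2)

if __name__ == "__main__":
    enable_gpus()
```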
2 changes: 1 addition & 1 deletion demo/nvflare/horizontal/custom/trainer.py
@@ -67,7 +67,7 @@ def _do_training(self, fl_ctx: FLContext):
dtest = xgb.DMatrix('agaricus.txt.test?format=libsvm')

# Specify parameters via map, definition are same as c++ version
-param = {'max_depth': 2, 'eta': 1, 'objective': 'binary:logistic'}
+param = {'tree_method': 'hist', 'max_depth': 2, 'eta': 1, 'objective': 'binary:logistic'}
if self._use_gpus:
self.log_info(fl_ctx, f'Training with GPU {rank}')
param['device'] = f"cuda:{rank}"
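For context on how the `param` dict above feeds into training, here is a minimal standalone sketch with synthetic data; the demo's federated communicator, rank handling, and agaricus data loading are omitted, and `num_boost_round` is an arbitrary value chosen here.

```python
import numpy as np
import xgboost as xgb

# Standalone sketch: in the demo, dtrain/dtest come from the agaricus files
# and `rank` comes from the federated communicator.
rank = 0
use_gpus = False  # set True on a machine with GPUs

X = np.random.rand(256, 16)
y = np.random.randint(0, 2, size=256)
dtrain = xgb.DMatrix(X, label=y)

param = {'tree_method': 'hist', 'max_depth': 2, 'eta': 1,
         'objective': 'binary:logistic'}
if use_gpus:
    param['device'] = f"cuda:{rank}"  # one GPU per federated rank

bst = xgb.train(param, dtrain, num_boost_round=10, evals=[(dtrain, 'train')])
```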
7 changes: 6 additions & 1 deletion demo/nvflare/vertical/README.md
@@ -56,4 +56,9 @@ shutdown server

## Training with GPUs

-Currently GPUs are not yet supported by vertical federated XGBoost.
+To demo with Vertical Federated Learning using GPUs, make sure your machine has at least 2 GPUs.
+Build XGBoost with the federated learning plugin enabled along with CUDA
+(see the [README](../../plugin/federated/README.md)).
+
+Modify `../config/config_fed_client.json` and set `use_gpus` to `true`, then repeat the steps
+above.
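
As a quick sanity check before running the vertical demo (which needs at least 2 GPUs), the visible GPU count can be verified. A small sketch, assuming `nvidia-smi` is on the PATH:

```python
import subprocess

# Count GPUs visible to the NVIDIA driver; the vertical demo expects >= 2.
result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True, check=True)
gpus = [line for line in result.stdout.splitlines() if line.strip()]
print(f"Found {len(gpus)} GPU(s)")
assert len(gpus) >= 2, "The vertical federated GPU demo needs at least 2 GPUs"
```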
6 changes: 4 additions & 2 deletions demo/nvflare/vertical/custom/trainer.py
@@ -77,13 +77,15 @@ def _do_training(self, fl_ctx: FLContext):
'gamma': 1.0,
'max_depth': 8,
'min_child_weight': 100,
-'tree_method': 'approx',
+'tree_method': 'hist',
'grow_policy': 'depthwise',
'objective': 'binary:logistic',
'eval_metric': 'auc',
}
if self._use_gpus:
-self.log_info(fl_ctx, 'GPUs are not currently supported by vertical federated XGBoost')
+if self._use_gpus:
Comment on lines 85 to +86

Is this condition duplicate? @rongou

Contributor Author

Ah good catch. Copy paste error. Sent #9578

+self.log_info(fl_ctx, f'Training with GPU {rank}')
+param['device'] = f"cuda:{rank}"

# specify validations set to watch performance
watchlist = [(dtest, "eval"), (dtrain, "train")]
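Following the review thread above (the duplicated `if self._use_gpus:` check, confirmed as a copy-paste error and addressed in a follow-up PR), the GPU branch in the vertical trainer presumably collapses to a single check. A hedged sketch of that fragment as it would sit inside `_do_training`, not the verbatim follow-up change:

```python
# Presumed intent once the duplicated condition is removed (see review thread):
if self._use_gpus:
    self.log_info(fl_ctx, f'Training with GPU {rank}')
    param['device'] = f"cuda:{rank}"
```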