Support TensorFlow (2.0) #77

Open
zw0610 opened this issue Jul 5, 2020 · 0 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@zw0610
Contributor

zw0610 commented Jul 5, 2020

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

Status:

FTLib does not currently support TensorFlow. When FTLib was adopted in ElasticDL, we took a NumPy ndarray and wrapped it in the Tensor data structure defined by PyTorch. This approach adds overhead and is inelegant. It would be much better if FTLib supported TensorFlow natively.

Potential Approach(es):

Distribution Strategy (tf.distribute.Strategy) ships with TF 2.0. The implementation of CollectiveAllReduceStrategy suggests that we can build a new, customized strategy on top of fault-tolerant/elastic ops defined in FTLib.

Regarding the enhanced ops:

  1. the logic FTLib uses to enhance collective ops can be packaged in a new cross_device_ops library customized by FTLib
  2. the logic FTLib uses to reconfigure the member list can be built into the new distributed strategy in FTLib
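
The two items above can be illustrated with a minimal, framework-free sketch. It is not FTLib's API and does not use TensorFlow: `ElasticAllReduce`, `WorkerLostError`, `probe`, and `reconfigure` are hypothetical names invented for this example, and a lost worker is simulated by its gradient being absent from the input. In TensorFlow, the equivalent logic would presumably live in a subclass of `tf.distribute.CrossDeviceOps` that the new strategy uses.

```python
from typing import Callable, Dict, List, Sequence


class WorkerLostError(RuntimeError):
    """Hypothetical error raised when a peer is missing from a collective."""


class ElasticAllReduce:
    """Toy stand-in for an elastic collective op (all names are assumptions)."""

    def __init__(self, members: Sequence[str],
                 probe: Callable[[str], bool],
                 max_retries: int = 3) -> None:
        self.members: List[str] = list(members)
        self._probe = probe              # returns True if a worker is alive
        self._max_retries = max_retries

    def reconfigure(self) -> None:
        # Item 2: rebuild the member list from workers that still respond.
        self.members = [m for m in self.members if self._probe(m)]
        if not self.members:
            raise WorkerLostError("no live workers remain")

    def _raw_all_reduce(self, grads: Dict[str, List[float]]) -> List[float]:
        # Plain all-reduce (element-wise mean); fails if a member is missing.
        missing = [m for m in self.members if m not in grads]
        if missing:
            raise WorkerLostError(f"lost workers: {missing}")
        rows = [grads[m] for m in self.members]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def all_reduce(self, grads: Dict[str, List[float]]) -> List[float]:
        # Item 1: wrap the raw collective with reconfigure-and-retry logic.
        for _ in range(self._max_retries):
            try:
                return self._raw_all_reduce(grads)
            except WorkerLostError:
                self.reconfigure()
        raise WorkerLostError("all-reduce failed after retries")


# Simulated run: worker "w1" has dropped out, so it contributes no gradient.
alive = {"w0", "w2"}
op = ElasticAllReduce(["w0", "w1", "w2"], probe=lambda m: m in alive)
result = op.all_reduce({"w0": [1.0, 2.0], "w2": [3.0, 4.0]})
```

In this run the first attempt fails because `w1` is missing, the member list shrinks to the two live workers, and the retry averages their gradients. Keeping the retry logic in the op and the membership logic in a separate method mirrors the split between the cross_device_ops library and the strategy proposed above.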

Steps:

  1. Prepare new collective ops with elastic enhancement
  2. Create a customized distributed strategy

Potential Issues:

  1. This proposal targets TF 2.0 and cannot be applied to earlier versions.
  2. While it may look transparent to TF 2.0 users, this design is only remotely similar to what FTLib does with PyTorch and NumPy.

/cc @gaocegege @QiJune @skydoorkai

@caicloud-bot caicloud-bot added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 5, 2020