DDI training compared to not DDI training #73

Open
cantabile-kwok opened this issue Nov 12, 2022 · 1 comment

Comments

@cantabile-kwok

Hi! I'm curious why you use DDI (data-dependent initialization) here, since the program runs without errors when DDI is skipped. How does the model perform if DDI is not used at the beginning of training? Does DDI serve a specific purpose?

cantabile-kwok commented Nov 12, 2022

Also, what is the source of this method? I found a paper (https://arxiv.org/pdf/1511.06856.pdf), but it does not seem to describe the implementation used in this repo. I'd appreciate any discussion!
