batch size #150
Original question:

Hi,
Thank you for the work. May I ask: for ResNeSt-50, did you use a batch size of 8192 (from the paper) or 2048 (from pytorch-encoding)? How much will the performance change?
I was also wondering about dropout: it is mentioned in the paper but set to 0 in the pytorch-encoding training script. Does that mean the trick won't have much of an impact on performance?
Thanks again for your time.

Reply:

The models in the original paper were trained using the MXNet implementation. Typically, the larger the batch size is, the worse the performance will be; see the "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour" paper for details (a sketch of its learning-rate scaling rule follows below). Dropout only helps larger models.
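For context, the paper referenced in the reply compensates for large batches with a linear learning-rate scaling rule plus gradual warmup. A minimal PyTorch sketch of that recipe, assuming the paper's reference setting of base LR 0.1 at batch size 256; the model is a placeholder and the warmup length is illustrative, not the exact ResNeSt configuration:

```python
import torch
import torch.nn as nn

# Linear scaling rule (Goyal et al., 2017): lr = base_lr * batch_size / 256,
# where base_lr = 0.1 is the reference learning rate at batch size 256.
base_lr = 0.1
batch_size = 2048                  # e.g. the pytorch-encoding setting asked about
lr = base_lr * batch_size / 256    # -> 0.8 at batch size 2048

model = nn.Linear(8, 8)            # placeholder for a real network such as ResNeSt-50
optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                            momentum=0.9, weight_decay=1e-4)

# Gradual warmup, also from the paper: ramp the LR linearly over the first
# few epochs so the large scaled LR does not destabilize early training.
warmup_epochs = 5

def warmup_lr(epoch: int) -> float:
    """LR during the warmup phase; the post-warmup decay schedule is omitted."""
    if epoch < warmup_epochs:
        return lr * (epoch + 1) / warmup_epochs
    return lr

for epoch in range(warmup_epochs):
    for group in optimizer.param_groups:
        group["lr"] = warmup_lr(epoch)
```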
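On the dropout point: the setting discussed here is a dropout layer applied just before the final classifier, which the ResNeSt code exposes as a model argument (the name final_drop below follows that code as I recall it, but treat it and this whole head as an assumption, not the repository's actual implementation). Setting it to 0 disables the trick, which matches the reply that it only helps larger variants:

```python
import torch
import torch.nn as nn

class ClassifierHead(nn.Module):
    """Illustrative head: global pooling, optional final dropout, then FC.
    final_drop mirrors the knob discussed above; 0.0 disables dropout."""

    def __init__(self, in_channels: int, num_classes: int, final_drop: float = 0.0):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        # With final_drop = 0.0 this branch is an identity, matching the
        # training script where the trick is effectively turned off.
        self.drop = nn.Dropout(p=final_drop) if final_drop > 0.0 else nn.Identity()
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(x).flatten(1)
        return self.fc(self.drop(x))

# Hypothetical usage: 0.0 for ResNeSt-50 (as in the script), a nonzero
# value only for larger variants, per the reply above.
head = ClassifierHead(in_channels=2048, num_classes=1000, final_drop=0.2)
out = head(torch.randn(2, 2048, 7, 7))
print(out.shape)  # torch.Size([2, 1000])
```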