Replies: 1 comment
The same answer as in #54: probably all GPUs see the whole dataset in each epoch, which is why the epoch time does not drop as you add GPUs.
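For contrast, here is a minimal PyTorch DDP sketch (not this project's actual trainer code; `ToyDataset`, the batch size, and the `torchrun` launch are assumptions) of the sharded alternative: with a `DistributedSampler`, each rank iterates only `len(dataset) / world_size` samples per epoch, so epoch time shrinks as GPUs are added.

```python
# Minimal sketch, assuming a launch with `torchrun --nproc_per_node=<N>`;
# ToyDataset is a stand-in, not this project's real dataset or trainer.
import torch
import torch.distributed as dist
from torch.utils.data import DataLoader, Dataset
from torch.utils.data.distributed import DistributedSampler

class ToyDataset(Dataset):
    def __init__(self, size=10_000):
        self.size = size
    def __len__(self):
        return self.size
    def __getitem__(self, idx):
        return torch.randn(16), torch.tensor(0)

dist.init_process_group(backend="nccl")
dataset = ToyDataset()

# DistributedSampler gives each rank a disjoint 1/world_size shard per epoch,
# instead of every rank iterating the full dataset.
sampler = DistributedSampler(dataset, shuffle=True)
loader = DataLoader(dataset, batch_size=32, sampler=sampler)

for epoch in range(3):
    sampler.set_epoch(epoch)  # reshuffle the shard assignment each epoch
    for inputs, labels in loader:
        pass  # forward/backward on this rank's shard only

dist.destroy_process_group()
```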
Something is odd with the trainer. When I train on a dataset D with 2 GPUs it takes X hours; with 4 GPUs it still takes X hours, whereas I expected it to take X / 2 hours. Does the trainer share the batches between the subprocesses, or does each subprocess take its own batch?
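A quick way to check which case applies is to print the per-rank batch count with and without a `DistributedSampler` (a sketch only, assuming a `torchrun` launch; the random `TensorDataset` stands in for whatever dataset the trainer actually builds):

```python
# Sketch: compare how many batches each rank sees with and without sharding.
# Assumes a `torchrun --nproc_per_node=<N>` launch; the random TensorDataset
# is a placeholder for the dataset the trainer actually loads.
import torch
import torch.distributed as dist
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

dist.init_process_group(backend="nccl")
rank, world_size = dist.get_rank(), dist.get_world_size()

dataset = TensorDataset(torch.randn(8_000, 16))

plain_loader = DataLoader(dataset, batch_size=32)                 # no sharding
sharded_loader = DataLoader(dataset, batch_size=32,
                            sampler=DistributedSampler(dataset))  # 1/world_size per rank

# If every rank reports the full batch count, each subprocess consumes the
# whole dataset and adding GPUs will not shorten an epoch; if each rank
# reports roughly total / world_size, the work is split and the epoch
# should scale with the number of GPUs.
print(f"rank {rank}/{world_size}: plain={len(plain_loader)} batches, "
      f"sharded={len(sharded_loader)} batches")

dist.destroy_process_group()
```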