GNMT training is slow #668
Replies: 10 comments
-
Batch size=16 does not seem reasonable. Did you try increasing it?
-
@ymjiang There's no way to increase it any further. It already takes up about 10 GB of GPU memory.
-
cc @sxjscience. One way to work around the memory limit is to perform gradient accumulation. Several model zoo scripts already implement this logic, and it might be a good idea to add it to GNMT too.
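For context, this is roughly what gradient accumulation looks like in Gluon; the `net`, `loss_fn`, and `train_data` names below are placeholders for whatever the GNMT script sets up, not its actual API:

```python
# Rough sketch of gradient accumulation with MXNet Gluon. `net`, `loss_fn`,
# and `train_data` are placeholders, not objects from train_gnmt.py.
from mxnet import autograd, gluon

accumulate = 4  # number of micro-batches per parameter update

# Accumulate new gradients onto the existing ones instead of overwriting them.
params = [p for p in net.collect_params().values() if p.grad_req != 'null']
for p in params:
    p.grad_req = 'add'

trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 1e-3})

for i, (src_seq, tgt_seq) in enumerate(train_data):
    with autograd.record():
        loss = loss_fn(net(src_seq), tgt_seq)
    loss.backward()
    if (i + 1) % accumulate == 0:
        # Normalize by the total number of samples accumulated, then clear
        # the gradients for the next accumulation window.
        trainer.step(accumulate * src_seq.shape[0])
        for p in params:
            p.zero_grad()
```

This keeps the per-step memory footprint of a batch size of 16 while the effective batch size for the optimizer update is 64.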
-
Will take a look at it. |
-
@zhreshold Using
-
@szhengac Using multiple workers enables pre-fetching of batches (by default two batches per worker), and each worker fetches a batch simultaneously.
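As a small illustration, turning on worker processes is just a DataLoader argument; `dataset` and `batchify_fn` here stand in for whatever train_gnmt builds, not its actual objects:

```python
# Illustrative DataLoader setup with worker processes; `dataset` and
# `batchify_fn` are placeholders for the GNMT script's own objects.
from mxnet import gluon

train_loader = gluon.data.DataLoader(
    dataset,
    batch_size=16,
    shuffle=True,
    num_workers=2,            # batches are prepared in separate CPU processes
    batchify_fn=batchify_fn,  # e.g. padding for variable-length sequences
)
```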
-
Now I've saved GPU memory by setting share_embed, and I increased the batch size. It gives some improvement.
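To illustrate why this helps: sharing the embedding means the source and target sides reuse one weight matrix instead of allocating two. The sketch below is a generic Gluon weight-tying pattern, not the exact GNMT model code:

```python
# Generic Gluon weight-tying sketch: both embeddings resolve to the same
# underlying weight, so only one vocab_size x embed_size matrix is allocated.
from mxnet import gluon

vocab_size, embed_size = 320783, 512

src_embed = gluon.nn.Embedding(vocab_size, embed_size)
# Passing the source embedding's parameters makes the target embedding
# reuse them instead of creating its own weight matrix.
tgt_embed = gluon.nn.Embedding(vocab_size, embed_size,
                               params=src_embed.params)
```

With a vocabulary of 320783 tokens, a single embedding matrix is large, so sharing it is a noticeable saving.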
-
@zhreshold But each input batch is simply an integer matrix, which shouldn't occupy too much GPU memory.
-
@szhengac Did you mean GPU memory? The DataLoader should never occupy GPU memory.
-
@zhreshold Oh, I meant CPU memory.
-
The GPU cannot be kept busy by train_gnmt.
GPU: GTX TITAN X
CPU: 4770K
GPU load: about 10%
CPU load: one core fully used
dataset: LCSTS
vocab size: 320783
num_workers: 0 (setting it to 2 seems to improve throughput, but it takes up too much memory, about six times as much as with 0)
batch size: 16