Feature/reduce-decoder-mem-usage #84
base: develop
Conversation
Declare an empty accumulator tensor outside the for loop. The old approach of keeping both `out` and `out1` holds two copies of the array alive at once, which increases memory use; at 9km this added 6GB to peak memory usage.
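A minimal sketch of the pattern, with hypothetical names (`decode_chunked_before`, `decode_chunked_after`, and a stand-in for the per-chunk decoder work) rather than the actual mapper code:

```python
import torch


def decode_chunked_before(x: torch.Tensor, num_chunks: int) -> torch.Tensor:
    """Old pattern: two accumulation tensors (`out`, `out1`) alive at peak."""
    out = None
    for chunk in torch.tensor_split(x, num_chunks, dim=0):
        out1 = chunk * 2.0  # stand-in for the real per-chunk decoder work
        # torch.cat materialises a second full copy alongside `out1`, so
        # memory zig-zags as `out1` is created and freed on each chunk.
        out = out1 if out is None else torch.cat((out, out1), dim=0)
    return out


def decode_chunked_after(x: torch.Tensor, num_chunks: int) -> torch.Tensor:
    """New pattern: one accumulator declared before the loop, filled in place."""
    out = torch.empty_like(x)
    offset = 0
    for chunk in torch.tensor_split(x, num_chunks, dim=0):
        # Writing each chunk's result into its slice avoids ever holding
        # a second copy of the accumulated output.
        out[offset : offset + chunk.shape[0]] = chunk * 2.0
        offset += chunk.shape[0]
    return out
```

In the "after" version only the accumulator and the current chunk's result are live at any moment, which is where the peak-memory saving comes from.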
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@           Coverage Diff            @@
##           develop      #84   +/-  ##
========================================
  Coverage    99.85%   99.85%
========================================
  Files           23       23
  Lines         1350     1350
========================================
  Hits          1348     1348
  Misses           2        2
Great work. Is this from a training run or an inference run?
Inference. I haven't tried it in training because this only happens when num_chunks > 1.
Absolutely, it would just be interesting to check.
This change increases the memory saved by chunking in the mapper. At the moment we use two arrays to accumulate chunks; this replaces them with a single array. At 9km this reduces peak memory usage by 6GB.
Below are screenshots of memory usage during the chunking part of the decoder at 9km.
Before: notice the zig-zag pattern. It comes from the `out1` tensor being created and freed on every chunk.
After: the zig-zag pattern is gone and peak memory usage has decreased by 6GB.
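For anyone wanting to reproduce this kind of comparison, one way to observe the peak is PyTorch's CUDA memory counters. This is a hypothetical measurement harness, assuming a CUDA device and the sketch functions above; the 6GB figure itself came from the author's 9km run:

```python
import torch

# Reset the peak-memory counter, run the chunked decode, then read the peak.
torch.cuda.reset_peak_memory_stats()
x = torch.randn(100_000, 256, device="cuda")
y = decode_chunked_after(x, num_chunks=4)
print(f"peak allocated: {torch.cuda.max_memory_allocated() / 2**30:.3f} GiB")
```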