Switch output writing to h5py to reduce memory footprint. Update version #12
base: develop
Can we change this so that the behavior stays the same for existing code, and add a new option, "save_format", that defaults to "pkl"? As written, this would break existing code that relies on the outputs being in pkl format. Also, I've changed the base of this branch to develop, but that might require some different commits: when merging, it looks like some things are changed that should not be.
Whoops, on my phone - closed this by accident.
That makes sense. I'll add the flag and rebase to develop.
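A rough sketch of how such a backwards-compatible flag could look. `save_format` defaults to `"pkl"` so existing callers see no change, and `"h5"` is opt-in. The helper name `save_outputs`, its signature, and the assumption that the outputs are a list of array-like items are hypothetical and not taken from the repository.

```python
import pickle


def save_outputs(outputs, path, save_format="pkl"):
    """Write a list of outputs to `path` (hypothetical helper, not the repo's API)."""
    if save_format == "pkl":
        # Default: unchanged behavior, the whole list is pickled in one call.
        with open(path, "wb") as f:
            pickle.dump(outputs, f)
    elif save_format == "h5":
        # Opt-in: imported lazily so pickle-only users do not need h5py.
        import h5py

        with h5py.File(path, "w") as f:
            for i, out in enumerate(outputs):
                # Assumes each output is array-like (convertible by h5py).
                f.create_dataset(str(i), data=out)
    else:
        raise ValueError(f"Unknown save_format: {save_format!r}")
```

Code that never passes `save_format` keeps getting a pickle file, which is the compatibility property requested above.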
769a421 to 3de145c
3de145c to b254a23
Codecov Report
@@            Coverage Diff             @@
##           develop     #12     +/-   ##
===========================================
- Coverage     65.6%   65.46%   -0.15%
===========================================
  Files          143      143
  Lines         6240     6255     +15
===========================================
+ Hits          4094     4095      +1
- Misses        2146     2160     +14
while True:
    loss, outputs = self.run('default', return_outputs=True)
    grp = f.create_group(str(i))
    outputs = outputs[0]  # can we rely on this being a tuple of length 1?
@DavidMChan I'm not sure about this, but can I rely on the run output always being a tuple of length 1?
Effectively, the outputs are the forward pass of your model. This means that "outputs" can be whatever you want it to be (a numpy array, a dict of numpy arrays, a tuple of arrays, etc.). This is probably why @rmrao hijacked it for use in TAPE. It also makes it tricky to write a generic saving function for the outputs, since you have no guarantees about the data format. You do know, however, that the outputs will be the result of a forward run of the model (so they are convertible to tensors).
Perhaps, instead of adding options, it makes sense to add a callback function? Not entirely sure, but this is why we used pickle.
I like the idea of a callback function. The save_outputs option is really more about quick-and-dirty debugging than it is a real feature at the moment.
Otherwise there's not a really good, general way of saving things. It'll vary hugely. Plus a callback would let us do things other than saving them.
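To make the callback idea concrete, here is a minimal sketch. The runner hands every batch of outputs to a user-supplied function and makes no assumption about their format. The names `run_with_callback`, `step_fn`, and `on_outputs` are hypothetical, `step_fn` stands in for `self.run('default', return_outputs=True)`, and ending the loop on `StopIteration` is an assumption about how the data runs out.

```python
def run_with_callback(step_fn, on_outputs):
    """Call `step_fn` until it is exhausted, passing each result to `on_outputs`.

    `step_fn` returns a (loss, outputs) pair; `on_outputs(index, outputs)`
    decides what to do with the outputs (save, log, aggregate, ...).
    """
    i = 0
    while True:
        try:
            loss, outputs = step_fn()
        except StopIteration:  # assumed end-of-data signal
            break
        on_outputs(i, outputs)
        i += 1
    return i


# Usage: collect outputs in a list; a real callback might write to disk instead.
fake_batches = iter([(0.5, {"logits": [1, 2]}), (0.4, {"logits": [3, 4]})])
collected = []
n_batches = run_with_callback(
    lambda: next(fake_batches),
    lambda i, out: collected.append((i, out)),
)
```

Because the callback owns the data format, the runner itself never needs to know whether the outputs are an array, a dict, or a tuple.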
Collecting all data and then pickling it can crash a machine with limited memory.
For example, a 20GB result set held in memory will crash a 32GB machine, partly because of the memory overhead of pickling an object. Writing to HDF5 lets us write iteratively, which reduces the overall memory footprint and avoids the pickling overhead.
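The iterative write described above can be sketched as follows. Each batch goes into its own HDF5 group as soon as it is produced, so only one batch is held in memory at a time. The function names, the dataset name `encoder_output`, and the assumption that every batch is a dict of arrays are illustrative only; as the review discussion notes, real model outputs may have a different structure.

```python
import h5py
import numpy as np


def write_batches(path, batches):
    """Write each batch to its own HDF5 group as soon as it is produced."""
    n = 0
    with h5py.File(path, "w") as f:
        for i, batch in enumerate(batches):
            grp = f.create_group(str(i))
            # Assumes a batch is a dict mapping names to array-like values.
            for name, array in batch.items():
                grp.create_dataset(name, data=np.asarray(array))
            n += 1
    return n


def fake_batches(n_batches, batch_size=4, dim=8):
    """Generator standing in for repeated model runs (hypothetical data)."""
    for i in range(n_batches):
        yield {"encoder_output": np.full((batch_size, dim), i, dtype=np.float32)}
```

Passing a generator rather than a list is what keeps the footprint small: a batch can be garbage-collected once its datasets have been written.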