added DA in-memory generator in data.py #90
Regular data augmentation (DA) is not straightforward with sequence samples, since we don't want to rotate each frame of a single sample sequence by a different random degree. Custom DA is therefore implemented: every frame of a sequence receives the same random transform (see the sketch below).
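A minimal sketch of what this per-sequence DA could look like: one random rotation/flip is drawn per sample and applied identically to every frame, so the frames of a sequence stay consistent. The function name, shapes, and parameters below are illustrative, not the exact ones used in `data.py`.

```python
import numpy as np
from scipy.ndimage import rotate

def augment_sequence(frames, max_angle=10.0, flip_prob=0.5):
    """Apply ONE random transform to all frames of a sequence.

    `frames` is assumed to have shape (seq_length, height, width, channels);
    names and defaults are hypothetical.
    """
    angle = np.random.uniform(-max_angle, max_angle)  # one angle per sequence
    do_flip = np.random.rand() < flip_prob            # one flip decision per sequence

    augmented = []
    for frame in frames:
        frame = rotate(frame, angle, axes=(0, 1), reshape=False, mode='nearest')
        if do_flip:
            frame = frame[:, ::-1, :]                 # horizontal flip
        augmented.append(frame)
    return np.stack(augmented)
```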
Since this DA is applied on the fly, `get_all_sequences_in_memory()` won't cut it anymore; however, the generator is much slower.
This is partly counteracted by implementing a hybrid version: load all raw data into memory, and apply DA on the fly from there.
Locally this gave a 1.6-2.0x speedup compared to running the regular generator with DA added (currently the regular generator does NOT have any DA option).
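To illustrate the hybrid idea, here is a sketch of an in-memory augmenting generator. It assumes the raw sequences `X` (shape `(num_samples, seq_length, H, W, C)`) and labels `y` are already loaded as numpy arrays (e.g. by something like `get_all_sequences_in_memory()`), and only the augmentation happens on the fly; the function name and signature are hypothetical.

```python
import numpy as np

def in_memory_augmenting_generator(X, y, batch_size=32):
    """Yield augmented batches forever from data that is already in memory."""
    num_samples = len(X)
    while True:
        order = np.random.permutation(num_samples)     # reshuffle every epoch
        for start in range(0, num_samples, batch_size):
            batch_idx = order[start:start + batch_size]
            # Each sequence gets its own consistent random transform.
            batch_X = np.stack([augment_sequence(X[i]) for i in batch_idx])
            yield batch_X, y[batch_idx]
```

A generator like this can then be passed to Keras' `fit_generator` in place of the full in-memory arrays.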
NB:
I developed the code locally with quite different names. That version definitely ran, but before uploading I renamed things to match the online terminology, and I have not re-tested for naming or attribute mistakes introduced by that renaming.