Plans to incorporate #9

flokadillo · 2020-11-25T06:46:14Z

Thanks for publishing this interesting framework!
I have a question: are you planning to support "real" Python generators, or input pipelines such as tf.data.Dataset, as a DataGenerator? The input pipeline could then be much more efficient, especially for datasets that don't fit in memory.
Here is a nice blogpost about this: https://towardsdatascience.com/how-to-build-efficient-audio-data-pipelines-with-tensorflow-2-0-b3133474c3c1
Thanks and all the best!

pzinemanas · 2020-11-27T13:16:36Z

Hi! Thanks for your question and comments.

First, it would be interesting to adapt the DataGenerator to work as a Python generator. I'll read more about this.

Regarding your comment on data loading, it is clear that this process should be more efficient, and it is an important issue to work on. We have to think about how to do it, because the idea is to keep the library independent of the backend used to implement the machine learning model. Does it make sense to have a different DataGenerator for each backend, as we do with the ModelContainers? For instance, we currently have KerasDataGenerator, which converts a DataGenerator so it can be used to train Keras models. Maybe we can implement this Keras generator following the blog post.
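To make the idea concrete, here is a minimal sketch of a backend-agnostic generator with a thin per-backend adapter. It is not the library's API: `LazyBatchGenerator`, `as_endless_generator`, and the `load_fn` argument are hypothetical names used only for illustration. Features are loaded one batch at a time, so the whole dataset never has to sit in memory.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# One batch: a list of feature vectors plus the matching labels.
Batch = Tuple[List[List[float]], List[int]]


class LazyBatchGenerator:
    """Hypothetical backend-agnostic generator that loads one batch at a time."""

    def __init__(
        self,
        paths: Sequence[str],
        labels: Sequence[int],
        load_fn: Callable[[str], List[float]],
        batch_size: int = 2,
    ) -> None:
        if len(paths) != len(labels):
            raise ValueError("paths and labels must have the same length")
        self.paths = paths
        self.labels = labels
        self.load_fn = load_fn  # assumed to read and featurize a single file
        self.batch_size = batch_size

    def __len__(self) -> int:
        # Number of batches per pass, rounding up for a final partial batch.
        return (len(self.paths) + self.batch_size - 1) // self.batch_size

    def __iter__(self) -> Iterator[Batch]:
        # Nothing is loaded until a batch is actually requested.
        for start in range(0, len(self.paths), self.batch_size):
            stop = start + self.batch_size
            features = [self.load_fn(path) for path in self.paths[start:stop]]
            yield features, list(self.labels[start:stop])


def as_endless_generator(batches: LazyBatchGenerator) -> Iterator[Batch]:
    """Adapter for training loops that expect a generator that never ends."""
    while True:
        yield from batches


# A TensorFlow-specific adapter could wrap the same object with
# tf.data.Dataset.from_generator; it is left out so the sketch has no
# backend dependency.
```

The core class knows nothing about the backend, and each backend only needs a small adapter like `as_endless_generator`, much as the ModelContainers are split today.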

Perhaps @mrocamora, @pcancela-github or @ffont can comment about this.

flokadillo closed this as completed Nov 27, 2020

pzinemanas added the enhancement New feature or request label Nov 27, 2020

pzinemanas reopened this Nov 27, 2020
