
More frequent feedback from NeuralNet #138

Open
BenjaminBossan opened this issue Aug 13, 2015 · 1 comment

Comments

@BenjaminBossan (Collaborator)

As mentioned here, there are situations where the user wants more frequent feedback from the net than just after each epoch, especially with the arrival of RNNs, which are hungry for tons of data but slow to train. Having more frequent feedback also enables other neat things, for instance stopping early after, say, 2.5 epochs.

The solution proposed in the PR would solve the issue but feels a little bit like cheating, since the batch iterator will pretend the epoch is over when it really isn't.

I have an implementation lying around that adds an on_epoch_finished callback. Unfortunately, that complicates matters, since you have to synchronize the loops through train and eval (which in turn requires adjusting the batch size for eval).

So, does anybody have another solution? I would be happy to help out with the coding if necessary.

@dnouri (Owner)

dnouri commented Mar 26, 2016

An on_batch_finished handler has been added since then. But it won't cover your use case, where you do early stopping between epochs.

However, I think that @dirtysalt's MiniBatchIterator works well enough for your case. Sure, the output will say you iterated over one epoch when you didn't, but I think that can be dealt with.

I'll reproduce the MinibatchIterator class here for the record:

from nolearn.lasagne import BatchIterator


class MiniBatchIterator(BatchIterator):
    """Batch iterator that treats a fixed number of minibatch
    iterations as one 'epoch', regardless of the dataset size."""
    def __init__(self, batch_size=128, iterations=32):
        BatchIterator.__init__(self, batch_size)
        self.iterations = iterations
        self.X = None
        self.y = None
        self.cidx = 0  # index of the current minibatch
        self.midx = 0  # total number of minibatches in the dataset

    def __call__(self, X, y=None):
        # If a new dataset is passed in, start over from the beginning.
        if not (self.X is X and self.y is y):
            self.cidx = 0
            n_samples = X.shape[0]
            bs = self.batch_size
            self.midx = (n_samples + bs - 1) // bs  # ceiling division
        self.X, self.y = X, y
        return self

    def __iter__(self):
        bs = self.batch_size
        for i in range(self.iterations):
            sl = slice(self.cidx * bs, (self.cidx + 1) * bs)
            self.cidx += 1
            # Wrap around to the start of the dataset.
            if self.cidx >= self.midx:
                self.cidx = 0
            Xb = self.X[sl]
            yb = self.y[sl] if self.y is not None else None
            yield self.transform(Xb, yb)
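For the record, here is a quick way to check the wrap-around behaviour. This is a sketch, not from the thread: the `BatchIterator` below is a minimal stand-in for nolearn's real class (just a `batch_size` attribute and a pass-through `transform`), so the snippet runs without nolearn installed; the sizes and counts are illustrative choices.

```python
import numpy as np


class BatchIterator:
    """Minimal stand-in for nolearn.lasagne.BatchIterator -- just
    enough for MiniBatchIterator below; the real class does more."""
    def __init__(self, batch_size):
        self.batch_size = batch_size

    def transform(self, Xb, yb):
        return Xb, yb


class MiniBatchIterator(BatchIterator):
    def __init__(self, batch_size=128, iterations=32):
        BatchIterator.__init__(self, batch_size)
        self.iterations = iterations
        self.X, self.y = None, None
        self.cidx = 0  # index of the current minibatch
        self.midx = 0  # total number of minibatches in the dataset

    def __call__(self, X, y=None):
        if not (self.X is X and self.y is y):
            # New dataset: start over and recompute the batch count.
            self.cidx = 0
            self.midx = (X.shape[0] + self.batch_size - 1) // self.batch_size
        self.X, self.y = X, y
        return self

    def __iter__(self):
        bs = self.batch_size
        for _ in range(self.iterations):
            sl = slice(self.cidx * bs, (self.cidx + 1) * bs)
            self.cidx += 1
            if self.cidx >= self.midx:
                self.cidx = 0  # wrap around to the start of the data
            Xb = self.X[sl]
            yb = self.y[sl] if self.y is not None else None
            yield self.transform(Xb, yb)


# 10 samples with batch_size=4 means 3 real minibatches (4, 4, 2),
# but one "epoch" is defined as 5 iterations.
X = np.arange(10).reshape(10, 1)
y = np.arange(10)
it = MiniBatchIterator(batch_size=4, iterations=5)

sizes = [len(Xb) for Xb, yb in it(X, y)]
print(sizes)  # [4, 4, 2, 4, 4]

# Iterating again over the *same* arrays does not reset the position;
# the second "epoch" continues where the first left off.
sizes2 = [len(Xb) for Xb, yb in it(X, y)]
print(sizes2)  # [2, 4, 4, 2, 4]
```

Note the second call: because `X` and `y` are the same objects, `cidx` is not reset, which is what makes successive "epochs" sweep through the whole dataset over time rather than always restarting at sample 0.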
