score broken if batch_iterator_test doesn't yield same order / size as y #269

Open
cancan101 opened this issue May 25, 2016 · 0 comments

cancan101 commented May 25, 2016

Right now, score assumes that the values of X yielded by batch_iterator_test are in the same order, and of the same length, as the y passed in.
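
For illustration only, here is a toy sketch (not nolearn code; all names below are made up) of how a batch_iterator_test that reorders its batches silently misaligns the predictions with the original y:

    # Hypothetical illustration; nothing here comes from nolearn itself.
    import numpy as np
    from sklearn.metrics import accuracy_score

    y = np.array([0, 1, 0, 1, 0, 1])
    X = y.copy()  # a "perfect" model would simply echo X back

    def shuffling_test_iterator(X, y=None):
        # Yield batches in a different order than the input, the way a
        # shuffling / augmenting batch_iterator_test might.
        order = np.array([3, 4, 5, 0, 1, 2])
        yield X[order], None if y is None else y[order]

    # Current behaviour: predictions come back in iterator order but are
    # scored against y in its original order.
    preds = np.hstack([Xb for Xb, _ in shuffling_test_iterator(X)])
    print(accuracy_score(y, preds))  # 0.0, despite a "perfect" model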

Instead, I suggest the following modifications, which use the ys yielded by the iterator:

    # Assumes np (numpy) and sklearn.metrics' mean_squared_error /
    # accuracy_score are in scope, as they are in nolearn.lasagne.base.
    def predict_proba(self, X, y=None):
        probas = []
        ys = []
        # Collect the y's actually yielded by the iterator so that they
        # stay aligned with the predictions, even if the iterator
        # reorders or drops samples.
        for Xb, yb in self.batch_iterator_test(X, y):
            probas.append(self.apply_batch_func(self.predict_iter_, Xb))
            ys.append(yb)
        return np.vstack(probas), np.hstack(ys)

    def predict(self, X, y=None):
        if self.regression:
            return self.predict_proba(X, y)
        else:
            predictions, y_actual = self.predict_proba(X, y)
            y_pred = np.argmax(predictions, axis=1)
            if self.use_label_encoder:
                y_pred = self.enc_.inverse_transform(y_pred)
            return y_pred, y_actual

    def score(self, X, y):
        # Score against the y's yielded by the iterator, not the y passed in.
        score = mean_squared_error if self.regression else accuracy_score
        return float(score(*self.predict(X, y)))

In order to match the old signature, we might want to have the return of predict_proba look something like:

    if y is not None:
        return np.vstack(probas), np.hstack(ys)
    else:
        return np.vstack(probas)
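
With that fallback, callers that do not pass y keep the old single-array return, while score still gets the aligned pair. Hypothetical usage, assuming net is a fitted NeuralNet carrying the modified methods above:

    probas = net.predict_proba(X_test)                      # old signature: probabilities only
    probas, y_aligned = net.predict_proba(X_test, y_test)   # new: probabilities plus matching y's
    print(net.score(X_test, y_test))                        # compares predictions with y_aligned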