Keep predictions from valid_sets using lgb.train #1309

Closed
dkivaranovic opened this issue Apr 11, 2018 · 3 comments


@dkivaranovic

I am running lgb.train with early stopping and want to save the validation-set predictions from the best round. However, this does not seem to be possible at the moment. I am working with large datasets, and loading the data and predicting again is very time-consuming, so such an option would save me a lot of time.

My question is related to #283

@dkivaranovic changed the title from "Keep predictions from valid_sets" to "Keep predictions from valid_sets using lgb.train" on Apr 11, 2018
@guolinke
Collaborator

You can use this function (https://github.com/Microsoft/LightGBM/blob/master/python-package/lightgbm/basic.py#L1923) to get the predictions.
data_idx=1 means the first validation dataset, data_idx=2 means the second one, and so on.

You should also set keep_training_booster=True when calling lgb.train.

@dkivaranovic
Author

Thank you @guolinke! I'm not sure if I'm missing something, but when I run

import numpy as np
import lightgbm as lgb

Xtrain = np.random.randn(100, 10)
ytrain = np.where((np.sum(Xtrain, axis = 1) + np.random.randn(100)) > 0, 1, 0)
Xtest = np.random.randn(50, 10)
ytest = np.where((np.sum(Xtest, axis = 1) + np.random.randn(50)) > 0, 1, 0)
dtrain = lgb.Dataset(Xtrain, label = ytrain)
dtest = lgb.Dataset(Xtest, label = ytest)

params = {
    'boosting_type': 'gbdt',
    'objective':     'binary',
    'metric':        'binary_logloss',
    'learning_rate': 0.01,
    'num_leaves':    10,
    'min_data_in_leaf': 5,
    'verbosity':       0
}

max_rounds = 1000
early_stopping = 10

gbm = lgb.train(params,
                dtrain,
                num_boost_round = max_rounds,
                early_stopping_rounds = early_stopping,
                valid_sets = dtest,
                verbose_eval = 10,
                keep_training_booster=True)

pred = gbm.__inner_predict(data_idx = 1)

I get the following error:
Traceback (most recent call last):
File "", line 1, in
AttributeError: 'Booster' object has no attribute '__inner_predict'

I am using Ubuntu 16.04.4 LTS, Python 3.6.0 :: Anaconda 4.3.1 (64-bit), and LightGBM 2.1.0.

@sbensoussan

Hi, @dkivaranovic.

You can retrieve the predictions using gbm._Booster__inner_predict and the data_idx.

- pred = gbm.__inner_predict(data_idx = 1)
+ pred = gbm._Booster__inner_predict(data_idx=1)
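
The reason the plain name fails is most likely Python's private name mangling: a method whose name starts with two underscores and is defined inside a class body is stored as `_ClassName__method`, so it is only reachable from outside under that mangled name. The sketch below illustrates this with a hypothetical stand-in class that merely shares the name `Booster`; it is not LightGBM's implementation, and the stored predictions and the `data_idx - 1` lookup are made up for illustration.

```python
class Booster:
    """Hypothetical stand-in for illustration only; not lightgbm.Booster."""

    def __init__(self, valid_preds):
        # One list of predictions per validation set (made-up data).
        self._valid_preds = valid_preds

    def __inner_predict(self, data_idx):
        # Defined inside the class body, so Python stores this method
        # under the mangled name _Booster__inner_predict.
        # Assumed convention for this sketch: data_idx=1 is the first
        # validation set.
        return self._valid_preds[data_idx - 1]


gbm = Booster(valid_preds=[[0.2, 0.8, 0.5]])

# Outside the class, the plain double-underscore name is not mangled,
# so the attribute lookup fails.
try:
    gbm.__inner_predict(data_idx=1)
    plain_name_error = None
except AttributeError as exc:
    plain_name_error = str(exc)

# The mangled name resolves to the method.
pred = gbm._Booster__inner_predict(data_idx=1)

print(plain_name_error)
print(pred)
```

Because the method is private, it is not part of the public API and may be renamed or removed in other LightGBM versions, so it is worth checking that the attribute exists in the installed version before relying on it.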

The lock bot locked this issue as resolved and limited conversation to collaborators on Mar 11, 2020