Keep predictions from valid_sets using lgb.train #1309

dkivaranovic · 2018-04-11T12:22:46Z

I am running lgb.train with early stopping and want to save the validation-set predictions from the best round. However, this does not seem to be possible at the moment. I am working with large data sets, and loading the data and predicting again is very time consuming. Such an option would save me a lot of time.

My question is related to #283

guolinke · 2018-04-12T01:39:46Z

You can use this function (https://github.com/Microsoft/LightGBM/blob/master/python-package/lightgbm/basic.py#L1923) to get the predictions.
data_idx=1 means the first validation dataset, data_idx=2 means the second one, and so on.

You should also set keep_training_booster=True when calling lgb.train.

dkivaranovic · 2018-04-12T10:56:38Z

Thank you @guolinke! Not sure if I'm missing something but when I run

import numpy as np
import lightgbm as lgb

Xtrain = np.random.randn(100, 10)
ytrain = np.where((np.sum(Xtrain, axis = 1) + np.random.randn(100)) > 0, 1, 0)
Xtest = np.random.randn(50, 10)
ytest = np.where((np.sum(Xtest, axis = 1) + np.random.randn(50)) > 0, 1, 0)
dtrain = lgb.Dataset(Xtrain, label = ytrain)
dtest = lgb.Dataset(Xtest, label = ytest)

params = {
    'boosting_type': 'gbdt',
    'objective':     'binary',
    'metric':        'binary_logloss',
    'learning_rate': 0.01,
    'num_leaves':    10,
    'min_data_in_leaf': 5,
    'verbosity':       0
}

max_rounds = 1000
early_stopping = 10

gbm = lgb.train(params,
                dtrain,
                num_boost_round = max_rounds,
                early_stopping_rounds = early_stopping,
                valid_sets = dtest,
                verbose_eval = 10,
                keep_training_booster=True)

pred = gbm.__inner_predict(data_idx = 1)

I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Booster' object has no attribute '__inner_predict'

I am using Ubuntu 16.04.4 LTS, Python 3.6.0 :: Anaconda 4.3.1 (64-bit) and LightGBM 2.1.0.

sbensoussan · 2018-10-03T12:49:55Z

Hi, @dkivaranovic.

You can retrieve the predictions using gbm._Booster__inner_predict and the data_idx.

- pred = gbm.__inner_predict(data_idx = 1)
+ pred = gbm._Booster__inner_predict(data_idx=1)
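
The reason the prefixed name works is Python's name mangling: a method whose name starts with two underscores is stored on the class as _ClassName__method, so it cannot be reached under its plain name from outside the class. The sketch below shows this with a hypothetical stand-in class named Booster. It is not LightGBM's implementation, and the cached scores are made-up values chosen only for illustration.

```python
class Booster:
    """Hypothetical stand-in for lgb.Booster, used only to show name mangling."""

    def __init__(self, cached_scores):
        # One list of (made-up) scores per dataset; the index plays the role of data_idx.
        self._cached_scores = list(cached_scores)

    def __inner_predict(self, data_idx):
        # Inside the class body Python rewrites this name to
        # _Booster__inner_predict.
        return self._cached_scores[data_idx]


gbm = Booster([[0.1, 0.9], [0.4, 0.6]])

# Outside the class the plain name is not rewritten, so the lookup fails.
try:
    gbm.__inner_predict(data_idx=1)
    error = None
except AttributeError as exc:
    error = str(exc)

# The mangled name is an ordinary attribute and can be called directly.
pred = gbm._Booster__inner_predict(data_idx=1)

print(error)
print(pred)
```

Because the method is private, its name and behavior may change between LightGBM releases, so code that relies on it is best pinned to a known version.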

dkivaranovic changed the title ~~Keep predictions from valid_sets~~ Keep predictions from valid_sets using lgb.train Apr 11, 2018

guolinke closed this as completed Apr 12, 2018

lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020
