[R package question] Is that possible to retrieve the score for each sample from lgb.cv? #808

Closed
ajing opened this issue Aug 9, 2017 · 11 comments

Comments

@ajing

ajing commented Aug 9, 2017

I'm just wondering whether this is possible. Having the score for each sample would help analyze errors during training and guide feature engineering.

@ajing ajing changed the title [R package] Is that possible to retrieve the score for each sample from lgb.cv? [R package question] Is that possible to retrieve the score for each sample from lgb.cv? Aug 9, 2017
@blewy

blewy commented Aug 10, 2017

  • Hi ajing, I can't get lgb.cv to work for modeling a continuous variable with regression. Were you able to make it work?
    lgb.train worked just fine with a validation dataset, but lgb.cv still puzzles me.

  • R gets stuck and the following warning shows:
    "Warning messages:
    1: In foldVector[y == dimnames(numInClass)$y[i]] <- sample(seqVector) :
    number of items to replace is not a multiple of replacement length "

It seems it's searching for a class label. I used the following code:

params <- list(objective = "regression", metric = "l2")
model <- lgb.cv(params,
                dtrain,
                10,
                nfold = 5,
                min_data = 1,
                learning_rate = 1,
                early_stopping_rounds = 10)

@ajing
Author

ajing commented Aug 10, 2017

What is in your dtrain? What does your y look like?

@blewy

blewy commented Aug 10, 2017

It's a continuous variable.

I created dtrain like this:

  • dtrain <- lgb.Dataset(as.matrix(train.dataset), label = train.target)

and a summary from train.target:

  • summary(train.target)
         Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
         0.67   1204.49   2116.89   3032.27   3859.20 106863.00

@ajing
Author

ajing commented Aug 11, 2017

Is train.dataset a data.frame? Does it have any missing values?

@blewy

blewy commented Aug 11, 2017 via email

@ajing
Author

ajing commented Aug 11, 2017

Yes, it worked.

@blewy

blewy commented Aug 11, 2017 via email

@blewy

blewy commented Aug 11, 2017 via email

@ajing
Author

ajing commented Aug 12, 2017

Have you tried this example:

library(lightgbm)
data(agaricus.train, package='lightgbm')
train <- agaricus.train
dtrain <- lgb.Dataset(train$data, label=train$label)
params <- list(objective="regression", metric="l2")
model <- lgb.cv(params, dtrain, 10, nfold=5, min_data=1, learning_rate=1, early_stopping_rounds=10)

The agaricus data is a classification problem, but you can still fit the model with a regression objective.
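
On the original question of per-sample scores: one possible workaround, sketched here under assumptions rather than taken from how lgb.cv works internally, is to run the folds by hand and keep the prediction made for each row while it was held out. The fold assignment (`fold_id`), the `oof_pred` vector, and the choice of squared error are all illustrative names and choices, not part of the lightgbm API; only `lgb.Dataset`, `lgb.train`, and `predict` are library calls.

```r
library(lightgbm)
library(Matrix)  # the agaricus features are a sparse dgCMatrix

# Same demo data and parameters as in the example above
data(agaricus.train, package = "lightgbm")
X <- agaricus.train$data
y <- agaricus.train$label
params <- list(objective = "regression", metric = "l2",
               min_data = 1, learning_rate = 1)

# Hypothetical manual fold assignment (an assumption, not lgb.cv's own split)
set.seed(1)
nfold <- 5
fold_id <- sample(rep(seq_len(nfold), length.out = nrow(X)))

# One out-of-fold prediction per training row
oof_pred <- numeric(nrow(X))
for (k in seq_len(nfold)) {
  held_out <- which(fold_id == k)
  # Train on every row except the held-out fold
  dtrain_k <- lgb.Dataset(X[-held_out, ], label = y[-held_out])
  model_k <- lgb.train(params = params, data = dtrain_k, nrounds = 10)
  # Score only the rows this model never saw
  oof_pred[held_out] <- predict(model_k, X[held_out, ])
}

# Per-sample squared error; the largest values are the rows worth inspecting
sq_err <- (y - oof_pred)^2
worst_rows <- head(order(sq_err, decreasing = TRUE))
```

Each row gets exactly one prediction, from the model that did not train on it, so `sq_err` can be joined back to the original features for error analysis. The folds will not match the ones lgb.cv draws, so aggregate metrics may differ slightly from its output.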

@blewy

blewy commented Aug 12, 2017 via email

@guolinke
Collaborator

Duplicate; refer to #283.

@lock lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020