Describe the bug
When calling predict, I obtain the error:
Error in factor(predictions, labels = labels) :
invalid 'labels'; length n should be 1 or k
where n > k every time (n and k stand for the integers reported in the message).
To Reproduce
I believe the error occurs when predictions contains no predictions for some class that exists in the training data; the way the factor is constructed then fails. A minimal reproducible example of the flaw in how the predictions are assigned class labels:
x <- rep(letters[1:5], 3) # x has only 5 unique elements
factor(x, labels=LETTERS[1:10]) # note that there are more labels than unique elements of x
Error in factor(x, labels = LETTERS[1:10]) :
invalid 'labels'; length 10 should be 1 or 5
I noticed this bug with a training set in which a single class was extremely sparsely represented (30 samples of 10,000). My guess is that this class is simply never predicted, and hence the error is thrown.
Expected behavior
The predictions are returned.
Desktop (please complete the following information):
OS: Ubuntu 18.04
Language: R
Version 2.0.4
Additional context
It would appear this issue can be fixed by simply passing the full set of levels explicitly:
x <- rep(letters[1:5], 3) # x has only 5 unique elements
factor(x, levels=letters[1:10], labels=LETTERS[1:10]) # levels lists every possible value of x, so labels may have length 10
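To make the idea concrete, here is a minimal sketch of how predictions could be labeled so that classes which are never predicted do not trigger the error. It assumes the predictions are integer class indices and that the class labels from training are available as a vector; `label_predictions`, `pred_idx`, and `class_labels` are hypothetical names for illustration, not the package's actual code.

```r
# Hypothetical helper (an assumption, not the package's implementation):
# map integer class indices to class labels while keeping every known
# class as a factor level, even if it never appears in the predictions.
label_predictions <- function(pred_idx, class_labels) {
  factor(pred_idx, levels = seq_along(class_labels), labels = class_labels)
}

class_labels <- c("a", "b", "c")
pred_idx <- c(1, 1, 2, 1)  # class "c" is never predicted

preds <- label_predictions(pred_idx, class_labels)

levels(preds)  # every class is retained as a level
table(preds)   # the unpredicted class shows up with a count of zero
```

Because `levels` enumerates every possible value up front, the length of `labels` always matches the number of levels, regardless of which classes actually occur in the predictions.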
ebridge2 changed the title from "Issue with Predict()" to "Issue with Predict() when number of unique predicted labels is less than number of possible labels" on Dec 5, 2019