the Random Forest classifies everything to be 1 #68

AlanSpencer2 · 2021-12-28T20:17:04Z

I am new to thundergbm and am just trying to get a simple Random Forest classifier going. But the classifier predicts 1 for every single sample: not one of the 188244 samples is classified as 0. No other classifier I have tried behaves like this. I also tried different numbers of trees, depths, etc., but it still classifies everything as 1. Is there something wrong with the following code?

from thundergbm import TGBMClassifier
clf = TGBMClassifier(depth=6, n_trees=1, n_parallel_trees=100, bagging=1)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

#y_pred classifies everything in the test set (X_test) to one.
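One way to confirm the symptom described above is to compare the class counts of the true labels with those of the predictions. The helper below is a minimal sketch, not part of thundergbm: `summarize_predictions` is a hypothetical name, and the toy lists stand in for `y_test` and `y_pred`, which are not shown in the report.

```python
from collections import Counter


def summarize_predictions(y_true, y_pred):
    """Count classes in labels and predictions and flag a constant predictor."""
    # Cast to int so that float outputs such as 1.0 are counted as class 1.
    true_counts = Counter(int(v) for v in y_true)
    pred_counts = Counter(int(v) for v in y_pred)
    return {
        "true_counts": dict(true_counts),
        "pred_counts": dict(pred_counts),
        # True when the model emitted only one distinct class.
        "constant_prediction": len(pred_counts) == 1,
    }


# Toy stand-ins for y_test and y_pred (assumed values, for illustration only).
y_test_demo = [0, 1, 0, 0, 1, 0]
y_pred_demo = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

summary = summarize_predictions(y_test_demo, y_pred_demo)
print(summary)
```

If `constant_prediction` is true while `true_counts` contains both classes, the model output is degenerate rather than a reflection of a one-class test set.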

Kurt-Liuhf · 2021-12-29T09:32:24Z

@AlanSpencer2 Hi, I used the classifier with the same parameters to fit the covtype data set from sklearn, but I could not reproduce your results. The predictions seem to be correct. Could you provide a subset of your data set so that we can reproduce the problem?
Thanks.

AlanSpencer2 · 2021-12-29T19:22:57Z

Hi, the problem occurs with binary classification, that is, when the target variable is 0 or 1 (True or False). Can you please try a binary classification problem? (Not regression, and not multiclass classification.)

Here is the Iris dataset with 3 different flower types. The target/label variable is 1 if the flower is Setosa, and 0 for the other 2 flower types:
iris_data.csv

--------------------------------Python code------------------------------------
import pandas as pd
from sklearn.model_selection import train_test_split
from thundergbm import TGBMClassifier

# Load the attached CSV; the first column is used as the row index.
df = pd.read_csv(r'C:\Python\iris_data.csv', encoding='ISO-8859-1', low_memory=False, index_col=0)

# Features are the four measurements; 'Label' is 1 for Setosa and 0 otherwise.
X = df[['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']]
y = df['Label']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# Same Random Forest settings as in the original report.
clf = TGBMClassifier(depth=6, n_trees=1, n_parallel_trees=100, bagging=1)
clf.fit(X_train, y_train)
pred_test = clf.predict(X_test)
# The predictions are never a mixture of 1s and 0s: either all are 0 or all are 1.
--------------------------------end of code------------------------------------

P.S. I have tried several other datasets, and they all show the same issue.
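For anyone without the attached CSV, the binary Setosa-versus-rest label can be rebuilt from scikit-learn's bundled Iris data. The sketch below makes several assumptions: it uses `load_iris` instead of `iris_data.csv`, fixes `random_state` and stratifies the split for repeatability, and trains scikit-learn's `RandomForestClassifier` as a reference point. It does not call thundergbm. A `TGBMClassifier` could be fitted on the same split for comparison.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data
# Class index 0 is Setosa in scikit-learn's Iris data; label it 1, the rest 0.
y = (iris.target == 0).astype(int)

# Stratify so both classes appear in the test set (an assumption, not in the report).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Reference model from scikit-learn, used only as a sanity check.
baseline = RandomForestClassifier(n_estimators=100, max_depth=6, random_state=0)
baseline.fit(X_train, y_train)
baseline_pred = baseline.predict(X_test)

# Fraction of test samples predicted correctly.
baseline_accuracy = float((baseline_pred == y_test).mean())
print(sorted(set(baseline_pred.tolist())), baseline_accuracy)
```

If the reference model predicts both classes on this split while another classifier predicts only one, the data can be ruled out as the cause.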

AlanSpencer2 changed the title ~~the Random Forest doesn't work~~ the Random Forest classifies everything to be 1 Dec 28, 2021
