I ran into some unexpected behavior and would like to understand the reason for it: in a regression setup, the order of the columns affects the model's predictions. I’ve seen similar questions on this topic and tried the suggestions from them to get deterministic results, but without success.
Below is a toy example with:
- Two sets of features
- Two sets of hyperparameters
With the default hyperparameters (params 1), I get the same results regardless of column order. With the second set (params 2), the results are still the same for feature set 1, but they differ for feature set 2: only one observation in the test set returns a different prediction.
Could you please help me understand where the difference is coming from? In my actual use case, the discrepancies are larger than in this toy dataset.
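To make the kind of check described above concrete, here is a minimal, self-contained sketch of a column-order comparison. It is not the toy example from this issue and it does not use LightGBM: `fit_stump` is a hypothetical stand-in learner (a single-split regression tree written in plain Python), and the data is made up for illustration. It shows one way column order can matter for tree learners in general, namely two features with exactly equal split gain on the training data, where the first column scanned wins the tie. Whether that is what happens in LightGBM in this case is an assumption, not something confirmed here.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]


def fit_stump(X: List[Row], y: List[float]) -> Callable[[Row], float]:
    """Hypothetical stand-in learner: a one-split regression tree.

    Scans columns left to right and keeps the split with the lowest squared
    error. The comparison is strict, so on an exact tie the column that
    comes first in the column order wins.
    """
    best = None  # (sse, column index, threshold, left mean, right mean)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[j] <= thr]
            right = [t for row, t in zip(X, y) if row[j] > thr]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = sum((t - left_mean) ** 2 for t in left)
            sse += sum((t - right_mean) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, left_mean, right_mean)
    _, j, thr, left_mean, right_mean = best
    return lambda row: left_mean if row[j] <= thr else right_mean


def predict_with_order(X_train, y_train, X_test, order):
    """Reorder the columns of train and test the same way, fit, and predict."""
    def reorder(X):
        return [[row[j] for j in order] for row in X]

    model = fit_stump(reorder(X_train), y_train)
    return [model(row) for row in reorder(X_test)]


# Made-up data: both features separate the training targets equally well,
# but they disagree on the last test row.
X_train = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
y_train = [0.0, 0.0, 10.0, 10.0]
X_test = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]

pred_original = predict_with_order(X_train, y_train, X_test, order=[0, 1])
pred_swapped = predict_with_order(X_train, y_train, X_test, order=[1, 0])

# Indices of test rows whose prediction changes with the column order.
differing = [i for i, (a, b) in enumerate(zip(pred_original, pred_swapped)) if a != b]
print(differing)  # [2]
```

The same harness shape (reorder train and test identically, refit, compare predictions row by row) could be applied to the real model to find which test rows are affected.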
If you need any further details regarding the environment, please let me know :)
erykml changed the title from "Column order impacting the prediction of LGBM (regression)" to "Column order impacting the predictions of LGBM (regression)" on Oct 10, 2024
Env:
Toy example: