What's wrong?
Running PCA on the Titanic dataset and then discretizing the output makes the discretization round the values strangely. As a result, multiple values end up with the same "name".
This is the workflow I have (I have also included the .ows):
On the left is the Data Table showing the results of the PCA (pay attention to the PC7 attribute). On the right are the results of the Discretize widget: the discretized PC7 attribute has been rounded strangely, and several PC7 values share the same "name" (highlighted).
How can we reproduce the problem?
Zip of the workflow: discretize_bug.zip
To reproduce the problem, set the PCA components to 8 in the provided workflow.
What's your environment?
Operating system: Windows 10
Orange version: 3.38
How you installed Orange: Using pip in a conda environment
I think we could (and probably should) simply add bins_ = np.unique(bins_) after rounding.
Decimal binning doesn't guarantee the exact number of intervals specified by the user; it returns the closest match across different possible "nice" thresholds. If rounding followed by unique decreases the number of intervals for a certain bin width, the method may choose another (smaller) width, or return a smaller number of bins; both are OK.
If you wish, change this (don't forget to add a simple test).
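To illustrate the suggestion above, here is a minimal sketch of how rounding can collapse distinct thresholds into equal values and how `np.unique` removes the duplicates. The function name `round_and_dedupe`, the sample thresholds, and the number of decimals are made up for illustration; this is not Orange's actual binning code.

```python
import numpy as np


def round_and_dedupe(bins, decimals):
    """Hypothetical helper: round thresholds, then drop duplicates.

    np.unique returns the sorted distinct values, so thresholds that
    became equal after rounding are merged into one.
    """
    rounded = np.round(bins, decimals)
    return np.unique(rounded)


# Made-up thresholds: the first three differ only beyond the 2nd decimal.
bins = np.array([0.0011, 0.0012, 0.0014, 0.25, 0.5])

naive = np.round(bins, 2)              # still 5 thresholds, 3 of them equal
deduped = round_and_dedupe(bins, 2)    # duplicates merged

print(len(naive), len(deduped))
```

With the duplicates merged, the result has fewer intervals than the unrounded thresholds would give, which matches the point above that decimal binning may return fewer bins than requested.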
janezd added the "bug" (A bug confirmed by the core team) and "snack" (This will take an hour or two) labels and removed the "bug report" (Bug is reported by user, not yet confirmed by the core team) label on Aug 21, 2024.