This issue is related to the Discretization issue #6876.
If the input to the Continuize widget contains an attribute with several values that share the same name (see issue #6876 for a fuller explanation), one-hot encoding creates multiple attributes with the same name, which raises an exception.
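The failure mode can be illustrated without Orange. The sketch below is not the widget's code: it assumes that indicator columns are named `<attribute>=<value>`, and the helper names and the sample values are made up for illustration.

```python
from collections import Counter


def one_hot_names(attr_name, values):
    # Assumed naming scheme: one indicator column per value,
    # named "<attribute>=<value>".
    return [f"{attr_name}={value}" for value in values]


def duplicated(names):
    # Names that occur more than once, in sorted order.
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Hypothetical categorical attribute whose value list contains a repeat.
values = ["< 1", "1 - 2", "1 - 2", ">= 2"]
names = one_hot_names("PC1", values)
dups = duplicated(names)
print(dups)
```

Any non-empty result means that a domain built from these names would contain two variables with the same name, which is the condition the exception reports.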
Workflow I used (an extension of the workflow from issue #6876):
Exception:
Traceback (most recent call last):
  File "C:\Users\zanme\work\orange3\orange3\Orange\widgets\data\owcontinuize.py", line 458, in _on_radio_clicked
    self.commit.deferred()
  File "C:\Users\zanme\miniconda3\envs\orange3\Lib\site-packages\orangewidget\gui.py", line 2006, in conditional_commit
    do_commit()
  File "C:\Users\zanme\miniconda3\envs\orange3\Lib\site-packages\orangewidget\gui.py", line 2014, in do_commit
    commit.call()
  File "C:\Users\zanme\miniconda3\envs\orange3\Lib\site-packages\orangewidget\gui.py", line 1879, in call
    acting_func(instance)
  File "C:\Users\zanme\work\orange3\orange3\Orange\widgets\data\owcontinuize.py", line 517, in commit
    self.Outputs.data.send(self._prepare_output())
                           ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zanme\work\orange3\orange3\Orange\widgets\data\owcontinuize.py", line 534, in _prepare_output
    return self.data.transform(Domain(attrs, class_vars, metas))
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\zanme\work\orange3\orange3\Orange\data\domain.py", line 154, in __init__
    raise Exception('All variables in the domain should have'
Exception: All variables in the domain should have unique names.
Screenshot of the raised exception and the two attributes with the same name:
Note
Because of this issue, a test was failing for the ScoringSheet widget. I have temporarily excluded the widget from the test, but it should be included again when the issue is resolved.
Code may assume that values of categorical variables are unique. The bug is thus in discretization. Adding np.unique, as I suggested in a comment in #6876, resolves it.
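As a rough illustration of the np.unique idea: the sketch below assumes that duplicated values come from repeated cut points, which this thread does not spell out. The label format and the function name are hypothetical and are not Orange's discretization code.

```python
import numpy as np


def labels_from_cut_points(cut_points):
    # Build one interval label per bin from the given cut points.
    # Hypothetical label format, used only for this illustration.
    points = [float(p) for p in cut_points]
    if not points:
        return []
    labels = [f"< {points[0]:g}"]
    labels += [f"{lo:g} - {hi:g}" for lo, hi in zip(points, points[1:])]
    labels.append(f">= {points[-1]:g}")
    return labels


raw = [0.5, 1.0, 1.0, 1.0, 2.0]  # repeated cut point (assumed cause)
with_dups = labels_from_cut_points(raw)

# np.unique sorts the cut points and drops the repeats.
deduped = labels_from_cut_points(np.unique(raw))

print(with_dups)
print(deduped)
```

With the repeats removed before the labels are built, every bin gets a distinct label, so downstream code that assumes unique values keeps working.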
I nevertheless made #6878 to prevent construction of variables with duplicated values, so any future bugs that result in duplicated values will be reported earlier, at the appropriate place.
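The idea of failing early can be sketched as a constructor check. This is a minimal illustration only: `CategoricalSpec` is a hypothetical stand-in, not Orange's variable class, and it does not reproduce what #6878 actually changes.

```python
from collections import Counter


class CategoricalSpec:
    """Hypothetical stand-in for a categorical variable definition."""

    def __init__(self, name, values):
        values = tuple(values)
        # Reject duplicated values at construction time, so the error
        # points at the code that produced them.
        dups = sorted(v for v, count in Counter(values).items() if count > 1)
        if dups:
            raise ValueError(
                f"variable {name!r} has duplicated values: {', '.join(dups)}"
            )
        self.name = name
        self.values = values


try:
    CategoricalSpec("PC1", ["< 1", "1 - 2", "1 - 2"])
except ValueError as err:
    print(err)
```

With a check like this, the error is raised where the faulty value list is created rather than later, when a widget builds a domain from it.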
Failing test (see the note above):
Orange.tests.test_classification.LearnerAccessibility.test_all_models_work_after_unpickling_pca
How can we reproduce the problem?
Zip of the workflow: continuize_bug.zip
To reproduce the problem, set the PCA components to 8 in the provided workflow.
What's your environment?