It is unclear how `perturbation_stats` should handle multiple Subkeys with the same origin (and thus the same column name in `df`).
Currently, attempting to group on a duplicated column throws `ValueError: Grouper for 'subsample' not 1-dimensional`.
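For reference, the error can be reproduced with plain pandas, independent of the library. This is a minimal sketch: the frame contents and the `out` column are hypothetical and only stand in for the `df` that `perturbation_stats` builds.

```python
import pandas as pd

# Hypothetical frame: two columns share the label "subsample",
# as happens when two Subkeys have the same origin.
df = pd.DataFrame(
    [
        ["s0", "s0", 0.1],
        ["s0", "s1", 0.3],
        ["s1", "s0", 0.2],
        ["s1", "s1", 0.4],
    ],
    columns=["subsample", "subsample", "out"],
)

error_message = None
try:
    # Selecting "subsample" yields a two-column frame rather than a
    # single Series, so pandas cannot build a 1-dimensional grouper.
    df.groupby("subsample")["out"].mean()
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The failure comes from the duplicated label itself, so any fix has to either make the labels unique or tell the grouping step which of the columns to use.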
An illustrative example of this issue arises if we take the exact example pipeline from #35 and use a single subsample Vset with `output_matching=False` (so the X_trains/X_tests will match properly) instead of the two. If we then want to predict with uncertainty over subsamples, it is unclear what this means. I think there are 2 cases:
1. My initial thought is that we could implement a way to distinguish identical mismatched Subkeys (maybe by appending `-i`).
2. Alternatively/additionally, we could try to support multidimensional grouping in `perturbation_stats`.
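The first option could look roughly like the sketch below. The helper name `dedupe_labels` is hypothetical and not part of the library. It assumes the suffix is a zero-based occurrence index, and it does not guard against a label such as `subsample-0` already being present.

```python
from collections import Counter


def dedupe_labels(labels):
    """Append "-i" to every label that occurs more than once.

    Hypothetical helper: unique labels are left untouched, while
    repeated labels get a zero-based occurrence index as a suffix.
    """
    totals = Counter(labels)  # how often each label occurs overall
    seen = Counter()          # how many times each label was seen so far
    result = []
    for label in labels:
        if totals[label] > 1:
            result.append(f"{label}-{seen[label]}")
            seen[label] += 1
        else:
            result.append(label)
    return result


columns = dedupe_labels(["subsample", "subsample", "out"])
print(columns)  # ['subsample-0', 'subsample-1', 'out']
```

With a pandas frame this could be applied as `df.columns = dedupe_labels(list(df.columns))` before grouping, after which `subsample-0` and `subsample-1` can be grouped on separately or together.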