Bug fix for float precision calculation using categorical data with trailing 0s #1125
categorical_series = pd.Series(
    [202209, 202210, 202211], dtype="category"
).apply(str)
float_profiler = FloatColumn("Name")
float_profiler.update(categorical_series)
I'm testing this locally, so I may have additional comments, but my initial thought is that there should definitely be some assert statements here to validate that this actually works as intended post-change. @SchadtJ @scottiegarcia
thoughts @scottiegarcia @SchadtJ?
reverting this accidental merge -- @SchadtJ please reopen into |
* Replace snappy with cramjam (#1091)
* add downloads tile (#1085)
* Replace snappy with cramjam
* Delete test_no_snappy
---------
Co-authored-by: Taylor Turner <[email protected]>
* Quick fix for dependency max pins (#1120)
* Fix dask_expr
* Keras and Tensorflow version fix
* Keras and Tensorflow version fix
* Fix keras bug
* pre-commit fix (#1122)
* docs: update test link to latest version (#1114)
* docs: add contributor notes on where to find documentation branches (#1113)
* docs: add contributor notes on where to find documentation branches
* docs: update documentation wording to spell out why `dev-gh-pages` and `gh-pages` branches exist for staging content
* docs: add note on fork
Co-authored-by: Taylor Turner <[email protected]>
* Update .github/CONTRIBUTING.md
Co-authored-by: Taylor Turner <[email protected]>
---------
Co-authored-by: Taylor Turner <[email protected]>
* update black version (#1131)
* Add memray max version (#1132)
* Bug fix for float precision calculation using categorical data with trailing zeros. (#1125)
* Revert "Bug fix for float precision calculation using categorical data with t…" (#1133)
This reverts commit d3159bd.
* fix
* make up to date
* yep, shouldn't change
* bump version
---------
Co-authored-by: Gábor Lipták <[email protected]>
Co-authored-by: abajpai15 <[email protected]>
Co-authored-by: Patrick Carlson <[email protected]>
Co-authored-by: James Schadt <[email protected]>
* Replace snappy with cramjam (#1091)
* add downloads tile (#1085)
* Replace snappy with cramjam
* Delete test_no_snappy
---------
Co-authored-by: Taylor Turner <[email protected]>
* pre-commit fix (#1122)
* Bug fix for float precision calculation using categorical data with trailing zeros. (#1125)
* Revert "Bug fix for float precision calculation using categorical data with t…" (#1133)
This reverts commit d3159bd.
* refactor: move layers outside of class
* refactor: update model to keras 3.0
* fix: manifest
* fix: bugs in compile and train
* fix: bug in load_from_library
* fix: bugs in CharCNN
* refactor: loading tf model labeler
* fix: bug in data_labeler identification
* fix: update model to use proper softmax layer names
* fix: formatting
* fix: remove unused line
* refactor: drop support for 3.8
* fix: comments
* fix: comment
---------
Co-authored-by: Gábor Lipták <[email protected]>
Co-authored-by: Taylor Turner <[email protected]>
Co-authored-by: James Schadt <[email protected]>
* refactor: Upgrade the models to use keras 3.0 (#1138)
* Replace snappy with cramjam (#1091)
* add downloads tile (#1085)
* Replace snappy with cramjam
* Delete test_no_snappy
---------
Co-authored-by: Taylor Turner <[email protected]>
* pre-commit fix (#1122)
* Bug fix for float precision calculation using categorical data with trailing zeros. (#1125)
* Revert "Bug fix for float precision calculation using categorical data with t…" (#1133)
This reverts commit d3159bd.
* refactor: move layers outside of class
* refactor: update model to keras 3.0
* fix: manifest
* fix: bugs in compile and train
* fix: bug in load_from_library
* fix: bugs in CharCNN
* refactor: loading tf model labeler
* fix: bug in data_labeler identification
* fix: update model to use proper softmax layer names
* fix: formatting
* fix: remove unused line
* refactor: drop support for 3.8
* fix: comments
* fix: comment
---------
Co-authored-by: Gábor Lipták <[email protected]>
Co-authored-by: Taylor Turner <[email protected]>
Co-authored-by: James Schadt <[email protected]>
* Fix Tox (#1143)
* tox new
* update
* update
* update
* update
* update
* update
* update
* update tox.ini
* update
* update
* remove docs
* empty retrigger
* update (#1146)
* bump version
* update 3.11
* remove dist/
---------
Co-authored-by: JGSweets <[email protected]>
Co-authored-by: Gábor Lipták <[email protected]>
Co-authored-by: James Schadt <[email protected]>
Bug fix for #1048 (comment).
The float precision calculation raises an error for categorical data when one of the values has leading or trailing zeros. The regex operation strips these zeros, so the resulting value falls outside the categorical's list of possible values.
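
A minimal sketch of the failure mode, using only the standard library. The pattern `STRIP_ZEROS` and the helper `values_outside_categories` are hypothetical stand-ins for illustration; the profiler's actual precision regex may differ. The sketch only shows why a zero-stripped value can no longer be matched against the original categories:

```python
import re

# Hypothetical stand-in for the profiler's precision regex: it strips
# leading and trailing zeros. The library's real pattern may differ.
STRIP_ZEROS = re.compile(r"^0+|0+$")


def values_outside_categories(values):
    """Return stripped values that are not among the original categories."""
    categories = set(values)
    stripped = [STRIP_ZEROS.sub("", v) for v in values]
    return [s for s in stripped if s not in categories]


# Same values as the test series in this PR, already converted to strings.
values = ["202209", "202210", "202211"]
outside = values_outside_categories(values)
print(outside)  # ['20221']
```

Here `"202210"` loses its trailing zero and becomes `"20221"`, which is not one of the three original categories, so a replacement that is restricted to existing categories has nowhere to put it.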
Passing tests: