tableone.TableOne with categorical pandas DataFrame column raises TypeError #177

Closed
eroell opened this issue Jun 15, 2024 · 4 comments

Comments

@eroell

eroell commented Jun 15, 2024

Hi,

With the new 0.9.0 release, I get the following error when a pandas DataFrame contains a column of categorical dtype:

import tableone
import pandas as pd

dummy_table = pd.DataFrame(
    {
        "age": [70, 80, 90, 85],
        "sex": ["m", "f", "m", "f"]
    }
)
dummy_table["sex"] = dummy_table["sex"].astype("category")

tableone.TableOne(dummy_table)

raises

TypeError: Cannot setitem on a Categorical with a new category (None), set the categories first

The same example works fine when the line dummy_table["sex"] = dummy_table["sex"].astype("category") is omitted, that is, when the column dtype is "object".
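The error message itself is raised by pandas, so the behaviour can be reproduced without tableone. The sketch below assumes that the failure comes from filling missing values in a categorical column with a placeholder label that is not among its categories; the thread does not show tableone's internals, so this is an illustration of the pandas behaviour rather than the library's actual code. The helper name fill_missing and the label "None" are hypothetical.

```python
import pandas as pd

# A categorical column with one missing value.
sex = pd.Series(["m", "f", "m", None], dtype="category")


def fill_missing(series, label="None"):
    """Fill missing values with `label`, registering it as a category first.

    Hypothetical helper for illustration; not part of tableone.
    """
    is_categorical = isinstance(series.dtype, pd.CategoricalDtype)
    if is_categorical and label not in series.cat.categories:
        # A Categorical only accepts values from its declared categories,
        # so the new label has to be added before it can be written.
        series = series.cat.add_categories([label])
    return series.fillna(label)


# Filling directly with an unknown label is rejected by pandas.
try:
    sex.fillna("None")
    direct_fill_failed = False
except (TypeError, ValueError) as err:  # exception type differs across pandas versions
    direct_fill_failed = True
    print(f"direct fill failed: {err}")

# Registering the category first makes the same fill succeed.
filled = fill_missing(sex)
print(filled.tolist())
```

An object-dtype column has no fixed set of allowed values, which is consistent with the example working once the astype("category") line is left out.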

Environment: Python 3.11.9. Output of pip list:

Package         Version
--------------- -----------
et-xmlfile      1.1.0
Jinja2          3.1.4
MarkupSafe      2.1.5
numpy           1.26.4
openpyxl        3.1.4
packaging       24.1
pandas          2.2.2
patsy           0.5.6
pip             24.0
python-dateutil 2.9.0.post0
pytz            2024.1
scipy           1.13.1
setuptools      65.5.0
six             1.16.0
statsmodels     0.14.2
tableone        0.9.0
tabulate        0.9.0
tzdata          2024.1

I have not yet looked into why this happens, but the same example works with tableone 0.8.0. Both the working setup (tableone 0.8.0) and the failing setup (tableone 0.9.0) use pandas 2.2.2.

Is this a bug, or has the input rule been made stricter for a reason?

Best,

@tompollard
Owner

Apologies, thanks for flagging this!

Is this a bug, or has the input rule been made stricter for a reason?

It's a bug, which I think was caused by the introduction of the include_null argument in #175.

I'll fix it today or tomorrow, but in the meantime you may find that setting include_null=False resolves the issue. This switches back to the old behaviour.
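For anyone stuck on 0.9.0, the thread points to two interim workarounds: passing include_null=False, as suggested above, or handing TableOne object-dtype columns, which the original report says works. A minimal sketch of the second option follows; the helper names categoricals_to_object and table_with_workaround are hypothetical, and whether either workaround is sufficient for a given dataset has not been verified here.

```python
import pandas as pd


def categoricals_to_object(df):
    """Return a copy of `df` with every categorical column cast to object dtype."""
    out = df.copy()
    for col in out.columns:
        if isinstance(out[col].dtype, pd.CategoricalDtype):
            out[col] = out[col].astype(object)
    return out


def table_with_workaround(df, **kwargs):
    """Build a TableOne from a copy of `df` without categorical columns.

    Extra keyword arguments are forwarded unchanged, e.g. include_null=False
    (an argument that, per the thread, exists from tableone 0.9.0 onwards).
    """
    import tableone  # imported lazily so the helper above works without it

    return tableone.TableOne(categoricals_to_object(df), **kwargs)


# Same data as in the original report.
dummy_table = pd.DataFrame(
    {
        "age": [70, 80, 90, 85],
        "sex": ["m", "f", "m", "f"],
    }
)
dummy_table["sex"] = dummy_table["sex"].astype("category")

converted = categoricals_to_object(dummy_table)
print(converted.dtypes)
```

The cast works on a copy, so the caller's DataFrame keeps its categorical dtype for any other code that relies on it.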

@tompollard
Owner

Thanks again, @eroell. This should be fixed if you bump the version to 0.9.1:
https://pypi.org/project/tableone/0.9.1/
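To check whether an environment still has the affected release, the installed version can be read with the standard library. This is a rough sketch based only on what the thread reports (0.8.0 works, 0.9.0 fails, 0.9.1 is fixed); the helper names are hypothetical and the version parsing is deliberately simple.

```python
import re
from importlib import metadata

# Per this thread, only 0.9.0 is reported as affected.
AFFECTED_RELEASE = (0, 9, 0)


def release_tuple(version):
    """Parse the leading numeric part of a version string into a 3-tuple or longer."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    # Pad short versions such as "0.9" with zeros.
    return parts + (0,) * (3 - len(parts))


def is_affected(version):
    """Return True if `version` is the release reported as broken in this thread."""
    return release_tuple(version) == AFFECTED_RELEASE


def installed_tableone_version():
    """Return the installed tableone version string, or None if it is not installed."""
    try:
        return metadata.version("tableone")
    except metadata.PackageNotFoundError:
        return None


installed = installed_tableone_version()
if installed is None:
    print("tableone is not installed")
elif is_affected(installed):
    print(f"tableone {installed} is affected; consider upgrading to 0.9.1")
else:
    print(f"tableone {installed} is not the release reported in this thread")
```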

@eroell
Author

eroell commented Jun 16, 2024

Thanks a lot for the fast fix, @tompollard! I can confirm that bumping to 0.9.1 resolved this issue.

@tompollard
Owner

Thanks! Feel free to raise issues if there are other bug fixes or features that you'd like to see.
