BUG (string dtype): let fillna with invalid value upcast to object dtype #60296

Merged

Conversation

jorisvandenbossche
Member

@jorisvandenbossche jorisvandenbossche commented Nov 13, 2024

As with replace() (issue #60282 / PR #60285), it turns out that fillna() also generally upcasts to object dtype when the fill value does not fit the dtype:

>>> import pandas as pd
>>> ser = pd.Series([1, 2, None], dtype="float")
>>> ser.fillna("a")
0    1.0
1    2.0
2      a
dtype: object

As with replace(), this was raising an error for StringDtype.
The reason is that for generic (external) ExtensionArrays we are strict: fillna is simply dispatched to the EA, with no fallback to object dtype. For all of our own default extension dtypes, by contrast, we have custom code that ensures a fallback (but not for the masked dtypes and ArrowDtype).
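
To make the StringDtype case concrete, here is a minimal sketch (not taken from the PR's tests) that fills a string Series with a non-string value. Which branch runs depends on the installed pandas version, and the exception types caught are an assumption meant to cover what the string arrays raise without the fallback:

```python
import pandas as pd

# A Series using the string extension dtype, with one missing value.
ser = pd.Series(["a", "b", None], dtype="string")

try:
    # 1 is not a valid element for a string array.
    result = ser.fillna(1)
except (TypeError, ValueError) as exc:
    # No fallback: fillna is dispatched to the array, which rejects the value.
    result = None
    print(f"fillna raised {type(exc).__name__}: {exc}")
else:
    # With the fallback: the result is upcast to object dtype.
    print(result.dtype)
    print(result.tolist())
```

On a version that includes this change, the `else` branch should run and report object dtype.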

Given that this is currently the default behaviour for other dtypes (and so we also haven't deprecated it for the current usage of strings in object dtype), I would preserve this fallback for now, to avoid introducing another breaking change in 3.0.
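
For comparison, this is the existing behaviour for strings stored in object dtype that the fallback keeps compatible. It is a small illustrative sketch, with the example values chosen arbitrarily:

```python
import pandas as pd

# Strings held in plain object dtype, with one missing value.
ser = pd.Series(["a", None, "c"], dtype=object)

# A non-string fill value is accepted; the Series stays object dtype.
filled = ser.fillna(0)
print(filled.dtype)
print(filled.tolist())
```

Code written against object-dtype strings like this would start raising if the string dtype rejected the fill value instead of upcasting.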

xref #54792

@jorisvandenbossche jorisvandenbossche added the Strings (String extension data type and string data) and API - Consistency (Internal Consistency of API/Behavior) labels Nov 13, 2024
@jorisvandenbossche jorisvandenbossche added this to the 2.3 milestone Nov 13, 2024
@jorisvandenbossche jorisvandenbossche marked this pull request as ready for review November 14, 2024 10:59
@mroeschke mroeschke merged commit 34c39e9 into pandas-dev:main Nov 14, 2024
51 checks passed
@mroeschke
Member

Thanks @jorisvandenbossche

lumberbot-app bot commented Nov 14, 2024

Owee, I'm MrMeeseeks, Look at me.

There seems to be a conflict; please backport manually. Here are approximate instructions:

  1. Check out the backport branch and update it.
git checkout 2.3.x
git pull
  2. Cherry-pick the first parent branch of this PR on top of the older branch:
git cherry-pick -x -m1 34c39e9078ea8af12871a92bdcea2058553c9869
  3. You will likely have some merge/cherry-pick conflicts here; fix them and commit:
git commit -am 'Backport PR #60296: BUG (string dtype): let fillna with invalid value upcast to object dtype'
  4. Push to a named branch:
git push YOURFORK 2.3.x:auto-backport-of-pr-60296-on-2.3.x
  5. Create a PR against branch 2.3.x. I would have named this PR:

"Backport PR #60296 on branch 2.3.x (BUG (string dtype): let fillna with invalid value upcast to object dtype)"

And apply the correct labels and milestones.

Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon!

Remember to remove the Still Needs Manual Backport label once the PR gets merged.

If these instructions are inaccurate, feel free to suggest an improvement.

@jorisvandenbossche jorisvandenbossche deleted the string-dtype-fillna-object-upcast branch November 14, 2024 19:14
jorisvandenbossche added a commit to jorisvandenbossche/pandas that referenced this pull request Nov 14, 2024
…ype (pandas-dev#60296)

* BUG (string dtype): let fillna with invalid value upcast to object dtype

* fix fillna limit case + update tests for no longer raising

(cherry picked from commit 34c39e9)
@jorisvandenbossche
Member Author

Manual backport -> #60316

mroeschke pushed a commit that referenced this pull request Nov 14, 2024
…cast to object dtype (#60296) (#60316)

BUG (string dtype): let fillna with invalid value upcast to object dtype (#60296)

* BUG (string dtype): let fillna with invalid value upcast to object dtype

* fix fillna limit case + update tests for no longer raising

(cherry picked from commit 34c39e9)
Labels
API - Consistency: Internal Consistency of API/Behavior
Strings: String extension data type and string data
2 participants