Unexpected Behavior: AnnData Allows Indexing with Float Arrays Without Error #1735
Comments
Hmm, @yubin-ai I am not sure this is unexpected, but perhaps strange. For example:
import anndata as ad
import pandas as pd
adata = ad.AnnData(obs=pd.DataFrame({'a': ['c', 'b']}, index=[1.2, 1.3]))
adata[[1.2]] works, and should work, but
would error. In general, I think floating-point numbers as an index are probably fine, but this raises a question of ambiguity with integers as well. If someone has an integer index, how should we interpret those integers? pandas has mechanisms for disambiguating the purpose of an indexing object, whether it's label-based (as above) or positional. @flying-sheep thoughts here?
@ilan-gold Indeed, it’s strange. In your case, using floats as a query seems logical because the index itself contains floats. In my example, however, none of the indices are floats. It appears that adata, or perhaps just the underlying pandas DataFrame, rounds the float to an integer and then selects the rounded index, which is concerning since it returns entries that don’t actually exist or match. My title might have to be modified a bit to describe this more clearly. In my case, the float input was due to an error on my end (the wrong variable was passed), and I didn’t catch it because of how smoothly it was handled. A quick assertion or error check to confirm that the query index exists in adata.obs.index could help prevent issues like this.
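A minimal sketch of the kind of check suggested above, in plain Python. The helper name `validate_obs_query` and its rules (integers treated as positions, strings as labels, floats rejected) are assumptions made for illustration; this is not anndata's API and not necessarily how the maintainers would implement the fix.

```python
import numbers


def validate_obs_query(query, obs_names):
    """Hypothetical helper: reject float indexers and unknown labels."""
    known = set(obs_names)
    for item in query:
        # bool is a subclass of int, so handle it before the integer check.
        if isinstance(item, bool):
            raise IndexError(f"boolean {item!r} is not supported in this sketch")
        if isinstance(item, numbers.Integral):
            # Treat integers as positions and bounds-check them.
            if not -len(obs_names) <= item < len(obs_names):
                raise IndexError(f"position {item} is out of bounds")
        elif isinstance(item, numbers.Real):
            # Floats are rejected instead of being silently cast to integers.
            raise IndexError(f"float {item!r} is not a valid index")
        elif item not in known:
            raise KeyError(f"label {item!r} not found in the index")
    return list(query)
```

With a check like this, a query such as `[1.0, 2.0]` would raise immediately instead of quietly selecting the observations at positions 1 and 2.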
I agree @yubin-ai - I misread your original print statement too, so I didn't catch that it was actually downcasting. So yes, we should check for that.
@yubin-ai I have just been informed that we only accept string indices, so my example doesn't really work. So I really think we should just error out then. Thanks for the issue.
@ilan-gold Erroring out sounds great. Thanks a lot for working on it!
Report
I encountered unexpected behavior when indexing an AnnData object using a list of float values. While indexing with a single float correctly raises an IndexError, indexing with a list or array of floats does not raise an error and proceeds as if the floats were valid indices. This seems inconsistent and could lead to unintended results.
Code:
Traceback:
Versions
anndata 0.10.9
numpy 1.26.4
pandas 2.2.3
session_info 1.0.0
torch 2.5.0+cu124
tqdm 4.66.5
...
Python 3.11.2 (main, Aug 26 2024, 07:20:54) [GCC 12.2.0]
Linux-5.10.226-214.880.amzn2.x86_64-x86_64-with-glibc2.36