Unexpected Behavior: AnnData Allows Indexing with Float Arrays Without Error #1735

yubin-ai · 2024-10-31T02:43:19Z

Please make sure these conditions are met

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of anndata.
(optional) I have confirmed this bug exists on the master branch of anndata.

Report

I encountered unexpected behavior when indexing an AnnData object using a list of float values. While indexing with a single float correctly raises an IndexError, indexing with a list or array of floats does not raise an error and proceeds as if the floats were valid indices. This seems inconsistent and could lead to unintended results.

Code:

import anndata
import numpy as np

# Create a sample AnnData object
adata = anndata.AnnData(np.random.rand(100, 10))

# Indexing with a single float value raises an error (as expected)
try:
    adata[43.4, :].obs
except IndexError as e:
    print(f"Single float index error (expected): {e}")

# Indexing with a list of floats does not raise an error (unexpected behavior)
float_indices = [42.85256014, 62.04391223, 26.08972756, 54.38563822, 90.45806554, 78.73412668]
try:
    result = adata[float_indices, :].obs
    print("Indexing with float list succeeded (unexpected):")
    print(result)
except IndexError as e:
    print(f"Float list index error (expected): {e}")

Traceback:

Single float index error (expected): Unknown indexer 43.4 of type <class 'float'>
Indexing with float list succeeded (unexpected):
Empty DataFrameView
Columns: []
Index: [42, 62, 26, 54, 90, 78]

Versions

anndata 0.10.9
numpy 1.26.4
pandas 2.2.3
session_info 1.0.0
torch 2.5.0+cu124
tqdm 4.66.5
...
Python 3.11.2 (main, Aug 26 2024, 07:20:54) [GCC 12.2.0]
Linux-5.10.226-214.880.amzn2.x86_64-x86_64-with-glibc2.36

ilan-gold · 2024-11-04T11:17:39Z

Hmm, @yubin-ai I am not sure this is unexpected, but perhaps strange. For example:

import anndata as ad
import pandas as pd

adata = ad.AnnData(obs=pd.DataFrame({ 'a': ['c', 'b']}, index=[1.2, 1.3]))
adata[[1.2]]

works, and should work, but

adata[1.2]

would error. In general, I think floating-point numbers as an index are probably fine, but this raises a question of ambiguity with integers as well. If someone has an integer index, how should we interpret those integers? pandas has mechanisms for disambiguating the purpose of an indexing object, whether it's label-based (as above) or positional.
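For reference, a minimal sketch of the pandas disambiguation mentioned above: `.loc` treats the indexer as labels, `.iloc` treats it as positions. The DataFrame and variable names here are made up for illustration and are not taken from anndata.

```python
import pandas as pd

# Integer labels deliberately out of positional order, so that
# "label 0" and "position 0" refer to different rows.
df = pd.DataFrame({"a": ["c", "b", "d"]}, index=[2, 0, 1])

by_label = df.loc[0, "a"]    # row whose label is 0
by_position = df.iloc[0, 0]  # first row, first column

print(by_label, by_position)
```

With an integer index the two lookups return different rows, which is the ambiguity in question.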

@flying-sheep thoughts here?

yubin-ai · 2024-11-04T15:07:19Z

@ilan-gold Indeed, it's strange. In your case, using floats as the query seems logical, since the index itself contains floats. However, in my example, none of the indices are floats. It appears that adata, or perhaps just the underlying pandas DataFrame, converts each float to an integer and then selects the resulting position, which is concerning since it returns entries that don't actually match the query. My title might have to be modified a bit to describe this more clearly.

In my case, the float input was due to an error on my end—a wrong variable was passed—and I didn’t catch it because of how smoothly it was handled. A quick assertion or error check to confirm the query index exists in adata.obs.index could help prevent issues like this.
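A user-side version of that check might look like the sketch below. `find_missing_labels` and `require_labels` are hypothetical helpers, not part of anndata, and the list of string labels only stands in for `adata.obs_names` (assumed here to be the default string index "0" to "99").

```python
def find_missing_labels(labels, known_labels):
    """Return the requested labels that are absent from known_labels."""
    known = set(known_labels)
    return [label for label in labels if label not in known]


def require_labels(labels, known_labels):
    """Raise KeyError if any requested label is not a known label."""
    missing = find_missing_labels(labels, known_labels)
    if missing:
        raise KeyError(f"labels not found in index: {missing}")


# Stand-in for adata.obs_names (assumption: default string labels).
obs_names = [str(i) for i in range(100)]

require_labels(["42", "62"], obs_names)  # passes silently

try:
    require_labels([42.85256014, 62.04391223], obs_names)
except KeyError as e:
    print(f"Rejected: {e}")
```

Calling something like `require_labels(query, adata.obs_names)` before indexing would have surfaced the wrong variable immediately, at the cost of restricting the query to label-based lookups.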

ilan-gold · 2024-11-04T15:40:35Z

I agree @yubin-ai - I misread your original print statement too, so didn't catch that it was actually downcasting. So yes, we should then check that.
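The printed index in the report (42, 62, 26, 54, 90, 78) is consistent with truncation toward zero rather than rounding to nearest. The sketch below reproduces those values with NumPy's `astype(int)`; whether anndata performs exactly this cast internally is an assumption, not something established in this thread.

```python
import numpy as np

float_indices = np.array(
    [42.85256014, 62.04391223, 26.08972756, 54.38563822, 90.45806554, 78.73412668]
)

# Casting to int drops the fractional part (truncation toward zero).
truncated = float_indices.astype(int).tolist()

# Rounding to nearest would give different values for 42.85 and 78.73.
rounded = np.round(float_indices).astype(int).tolist()

print("truncated:", truncated)
print("rounded:  ", rounded)
```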

ilan-gold · 2024-11-08T09:49:06Z

@yubin-ai I have just been informed we only accept string indices, so my example doesn't really work. So I really think we should just error out then. Thanks for the issue
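One possible shape for such a check is sketched below. This is not the implementation from the linked pull request; `normalize_positional_index` is a hypothetical name, and the choice to still accept integer-valued floats such as `2.0` is an assumption based only on the PR title.

```python
import numpy as np


def normalize_positional_index(indexer):
    """Convert an iterable indexer to an integer array.

    Floats are accepted only if every value is integer-valued
    (e.g. 2.0); anything with a fractional part, or NaN, raises.
    """
    arr = np.asarray(indexer)
    if np.issubdtype(arr.dtype, np.floating):
        # np.mod(x, 1) is 0 only for integer-valued floats; NaN fails too.
        if not np.all(np.mod(arr, 1) == 0):
            raise IndexError(
                f"Indexer {indexer!r} has floating point values that are not integers."
            )
        arr = arr.astype(int)
    return arr


print(normalize_positional_index([1.0, 2.0]))  # integer-valued floats pass

try:
    normalize_positional_index([42.85256014, 62.04391223])
except IndexError as e:
    print(f"Rejected: {e}")
```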

yubin-ai · 2024-11-08T17:03:39Z

@ilan-gold Erroring out sounds great. Thanks a lot for working on it!

yubin-ai added Bug 🐛 Triage 🩺 labels Oct 31, 2024

ilan-gold removed the Triage 🩺 label Nov 4, 2024

ilan-gold added this to the 0.11.1 milestone Nov 8, 2024

ilan-gold linked a pull request Nov 8, 2024 that will close this issue

(fix): raise error on non-integer floating types in iterables #1746

Open

3 tasks

ilan-gold modified the milestones: 0.11.1, 0.11.2 Nov 12, 2024
