(fix): cache arrays in `BaseCompressedSparseDataset` #1744

Open

ilan-gold wants to merge 31 commits into main from ig/cache_arrays

+119 −45

(fix): dont generate coo matrices

Azure Pipelines / scverse.anndata failed Nov 21, 2024 in 27m 33s

Build #20241121.3 had test failures

1 errors / 0 warnings

Details

Tests

Failed: 1 (0.00%)
Passed: 17,404 (85.96%)
Other: 2,841 (14.03%)
Total: 20,246

Code coverage

5056 of 6083 lines covered (83.12%)

Annotations

Check failure on line 351 in Build log

azure-pipelines / scverse.anndata

Build log #L351

Bash exited with code '1'.

Check failure on line 1 in test_concat_size_0_axis[obs-outer-same-no_var]

azure-pipelines / scverse.anndata

test_concat_size_0_axis[obs-outer-same-no_var]

FutureWarning: Mismatched null-like values <NA> and nan found. In a future version, pandas equality-testing functions (e.g. assert_frame_equal) will consider these not-matching and raise.

Error raised from element 'obsm/df'.

Raw output


            axis_name = 'obs', join_type = 'outer', merge_strategy = 'same', shape = (8, 0)

    @pytest.mark.parametrize(
        "shape", [pytest.param((8, 0), id="no_var"), pytest.param((0, 10), id="no_obs")]
    )
    def test_concat_size_0_axis(axis_name, join_type, merge_strategy, shape):
        """Regression test for https://github.com/scverse/anndata/issues/526"""
        axis, axis_name = merge._resolve_axis(axis_name)
        alt_axis = 1 - axis
        col_dtypes = (*DEFAULT_COL_TYPES, pd.StringDtype)
        a = gen_adata((5, 7), obs_dtypes=col_dtypes, var_dtypes=col_dtypes)
        b = gen_adata(shape, obs_dtypes=col_dtypes, var_dtypes=col_dtypes)
    
        expected_size = expected_shape(a, b, axis=axis, join=join_type)
    
        ctx_concat_empty = (
            pytest.warns(
                FutureWarning,
                match=r"The behavior of DataFrame concatenation with empty or all-NA entries is deprecated",
            )
            if shape[axis] == 0 and Version(pd.__version__) >= Version("2.1")
            else nullcontext()
        )
        with ctx_concat_empty:
            result = concat(
                {"a": a, "b": b},
                axis=axis,
                join=join_type,
                merge=merge_strategy,
                pairwise=True,
                index_unique="-",
            )
        assert result.shape == expected_size
    
        if join_type == "outer":
            # Check new entries along axis of concatenation
            axis_new_inds = axis_labels(result, axis).str.endswith("-b")
            altaxis_new_inds = ~axis_labels(result, alt_axis).isin(axis_labels(a, alt_axis))
            axis_idx = make_idx_tuple(axis_new_inds, axis)
            altaxis_idx = make_idx_tuple(altaxis_new_inds, 1 - axis)
    
            check_filled_like(result.X[axis_idx], elem_name="X")
            check_filled_like(result.X[altaxis_idx], elem_name="X")
            for k, elem in getattr(result, "layers").items():
                check_filled_like(elem[axis_idx], elem_name=f"layers/{k}")
                check_filled_like(elem[altaxis_idx], elem_name=f"layers/{k}")
    
            if shape[axis] > 0:
                b_result = result[axis_idx].copy()
                mapping_elem = f"{axis_name}m"
                setattr(b_result, f"{axis_name}_names", getattr(b, f"{axis_name}_names"))
                for k, result_elem in getattr(b_result, mapping_elem).items():
                    elem_name = f"{mapping_elem}/{k}"
                    # pd.concat can have unintuitive return types. is similar to numpy promotion
                    if isinstance(result_elem, pd.DataFrame):
>                       assert_equal(
                            getattr(b, mapping_elem)[k].astype(object),
                            result_elem.astype(object),
                            elem_name=elem_name,
                        )

tests/test_concatenate.py:1427: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/functools.py:907: in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/anndata/tests/helpers.py:635: in are_equal_dataframe
    report_name(pd.testing.assert_frame_equal)(
/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/anndata/tests/helpers.py:552: in func_wrapper
    raise e
/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/anndata/tests/helpers.py:541: in func_wrapper
    return func(*args, **kwargs)
testing.pyx:55: in pandas._libs.testing.assert_almost_equal
    ???
testing.pyx:160: in pandas._libs.testing.assert_almost_equal
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   FutureWarning: Mismatched null-like values <NA> and nan found. In a future version, pandas equality-testing functions (e.g. assert_frame_equal) will consider these not-matching and raise.
E
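
The warning above is raised when two object-dtype frames hold different null sentinels (`pd.NA` on one side, `nan` on the other) in the same position. A minimal sketch of one way to compare such frames is shown below; it is not necessarily the approach this PR or anndata takes. `normalize_nulls` is a hypothetical helper, not part of pandas or anndata, and it assumes every cell holds a scalar.

```python
import numpy as np
import pandas as pd


def normalize_nulls(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: cast to object and replace every null-like
    scalar (pd.NA, None, NaN) with float NaN so both sides agree."""
    as_object = df.astype(object)
    return as_object.apply(
        lambda col: col.map(lambda v: np.nan if pd.isna(v) else v).astype(object)
    )


# A string-dtype column stores its missing value as pd.NA ...
left = pd.DataFrame({"s": pd.array(["x", None], dtype="string")})
# ... while an object column built with NaN keeps float NaN.
right = pd.DataFrame({"s": ["x", np.nan]}, dtype=object)

# After normalization both frames hold the same sentinel, so the
# comparison no longer depends on pd.NA and NaN being treated as equal.
pd.testing.assert_frame_equal(normalize_nulls(left), normalize_nulls(right))
```

Casting with `.astype(object)` alone, as the failing assertion does, preserves whichever sentinel each frame already carried, which is why the mismatch surfaces there.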
