(fix): cache arrays in `BaseCompressedSparseDataset` #1744

ilan-gold · 2024-11-08T08:45:37Z

Noticed while creating the screenshot tutorial for the announcement
Tests added
Release note added (or unnecessary)

codecov · 2024-11-08T09:03:49Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.55%. Comparing base (d61e09c) to head (1dcf7ad).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1744      +/-   ##
==========================================
- Coverage   87.01%   84.55%   -2.46%     
==========================================
  Files          40       40              
  Lines        6059     6075      +16     
==========================================
- Hits         5272     5137     -135     
- Misses        787      938     +151

| Files with missing lines | Coverage Δ | |
| --- | --- | --- |
| src/anndata/_core/sparse_dataset.py | `93.50% <100.00%> (+0.48%)` | ⬆️ |
| src/anndata/_io/specs/lazy_methods.py | `100.00% <100.00%> (ø)` | |

... and 8 files with indirect coverage changes

flying-sheep

Does what it promises and has nice tests.

However I think Isaac only cached the longest array intentionally. I forgot the reason though.

Sure, we now have much bigger data so maybe if there was a tradeoff, things are now different. But maybe worth checking.

src/anndata/tests/helpers.py

ilan-gold · 2024-11-12T09:04:26Z

> However I think Isaac only cached the longest array intentionally. I forgot the reason though.

Previously we only cached `indptr`, and we cached it as an `np.ndarray`, i.e., we read it into memory, because it needs to be read in anyway every time you make an access.

As for the other arrays, I don't think there was a good reason not to cache them. I think we just wanted to start with `indptr`, but I have asked on Zulip.
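To illustrate the distinction described above, here is a minimal sketch. It is not the actual anndata implementation: `CountingGroup` and `CachedCompressedSparse` are hypothetical names, and a plain `dict` stands in for an on-disk HDF5 or Zarr group. The sketch assumes `indptr` is read fully into memory once, while `indices` and `data` are cached only as handles so that the backing store is not looked up again on every access.

```python
from functools import cached_property

import numpy as np


class CountingGroup(dict):
    """Hypothetical stand-in for an on-disk group; counts key lookups."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = 0

    def __getitem__(self, key):
        self.lookups += 1
        return super().__getitem__(key)


class CachedCompressedSparse:
    """Illustrative CSR-style wrapper; not anndata's BaseCompressedSparseDataset."""

    def __init__(self, group):
        self.group = group

    @cached_property
    def indptr(self) -> np.ndarray:
        # Needed on every access, so materialize it in memory once.
        return np.asarray(self.group["indptr"])

    @cached_property
    def indices(self):
        # Cache only the handle returned by the store; slicing happens later.
        return self.group["indices"]

    @cached_property
    def data(self):
        # Same as indices: keep the handle, do not look it up again.
        return self.group["data"]

    def row(self, i: int):
        """Return (column indices, values) of row i."""
        start, stop = self.indptr[i], self.indptr[i + 1]
        return self.indices[start:stop], self.data[start:stop]


group = CountingGroup(
    indptr=np.array([0, 2, 3]),
    indices=np.array([0, 2, 1]),
    data=np.array([1.0, 2.0, 3.0]),
)
ds = CachedCompressedSparse(group)
ds.row(0)
ds.row(1)
print(group.lookups)  # 3: one lookup per array, however many rows are read
```

Without the `cached_property` decorators, every call to `row` in this sketch would hit the store three more times.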

ilan-gold · 2024-11-12T09:04:37Z

(I was the one who started doing the caching)

flying-sheep · 2024-11-12T09:41:12Z

Hm, since these arrays are storage classes anyway, and probably don’t cost any noticeable amount of memory, I don’t foresee a problem.

ilan-gold · 2024-11-21T09:35:28Z

Let's review this in the absence of an answer

ilan-gold · 2024-11-21T09:35:58Z

(Or rather, give approval since you already seem to have reviewed)

src/anndata/_core/sparse_dataset.py

ilan-gold added 2 commits November 8, 2024 09:41

(fix): lazy chunking respects -1

be4be30

(fix): cache arrays in BaseCompressedSparseDataset

2115298

ilan-gold added the type: sparse 🫥, type: dask array, and Bug 🐛 labels Nov 8, 2024

ilan-gold added this to the 0.11.1 milestone Nov 8, 2024

ilan-gold added the skip-gpu-ci label Nov 8, 2024

ilan-gold added 16 commits November 8, 2024 10:13

(fix): clean up typing

2edabe2

(fix): doctest double >>>

2860116

(chore): add tests

fa96348

(fix): more typing updates

a0e2d52

(chore): add tests

dc01a3a

Merge branch 'ig/fix_chunking' into ig/cache_arrays

1ba4920

(fix): remove extra >>>

37aba1b

Merge branch 'ig/fix_chunking' into ig/cache_arrays

5eb3a6c

(fix): spelling

32fbef9

Merge branch 'ig/fix_chunking' into ig/cache_arrays

fcebbf7

Merge branch 'main' into ig/fix_chunking

0d3278e

(chore): release note

ceb70b4

Merge branch 'ig/fix_chunking' into ig/cache_arrays

e7d14ae

(chore): release note

fedd827

(fix): support None and -1

5960331

Merge branch 'ig/fix_chunking' into ig/cache_arrays

f59e5ca

ilan-gold marked this pull request as ready for review November 8, 2024 11:19

ilan-gold marked this pull request as draft November 8, 2024 11:19

ilan-gold added 4 commits November 8, 2024 12:24

(chore): typing

e652c44

Merge branch 'ig/fix_chunking' into ig/cache_arrays

fc8495f

(chore): add cache bust test

59849a8

(chore): type

41bd62e

ilan-gold added 4 commits November 8, 2024 17:05

(chore): types

0304d31

(chore): better name

76ecda5

(Fix): overload type

e538a12

(chore): bring back test comment

b513217

Base automatically changed from ig/fix_chunking to main November 8, 2024 17:45

Merge branch 'main' into ig/cache_arrays

0cca2fe

ilan-gold marked this pull request as ready for review November 11, 2024 14:28

ilan-gold requested a review from flying-sheep November 11, 2024 14:56

Update 1744.bugfix.md

d2d9f55

flying-sheep reviewed Nov 11, 2024

src/anndata/tests/helpers.py (outdated, resolved)

ilan-gold modified the milestones: 0.11.1, 0.11.2 Nov 12, 2024

(fix): revert erroneous change

3dc0ddd

ilan-gold requested a review from flying-sheep November 21, 2024 09:33

Merge branch 'main' into ig/cache_arrays

05c16ff

(fix): dont generate coo matrices

1dcf7ad

flying-sheep reviewed Nov 21, 2024

src/anndata/_core/sparse_dataset.py (resolved)

flying-sheep approved these changes Nov 21, 2024
