Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LocalCache: limit allowed blob size #69

Merged
merged 3 commits into from
May 21, 2024
Merged

LocalCache: limit allowed blob size #69

merged 3 commits into from
May 21, 2024

Conversation

magland
Copy link
Collaborator

@magland magland commented May 17, 2024

I found a situation where the chunk size was larger than 1 GB and the sqllite save failed. This particular dataset was not chunked, and this should not be an issue for most dandisets where chunk sizes are reasonable.

So with this PR I limit the size of chunks in local cache to 900 MB.

See https://www.sqlite.org/limits.html

(For some reason, despite what it says in that limits.html, it seems that 1000 MB doesn't work whereas 900 MB does)

For LindiH5ZarrStore and LindiReferenceFileSystemStore, I display a warning if the chunk size is too large, and then skip storing it in local cache. For LindiRemfile I raise an exception because this should never happen in that context.

I added a test.

@magland magland requested a review from rly May 17, 2024 13:06
@rly
Copy link
Contributor

rly commented May 18, 2024

Since the 900 MB limit is specific to the sqlite cache and I would like to keep the magic numbers in one place rather than four, could you move the if statement to LocalCacheSQLiteClient.put_remote_chunk and raise an exception there if it is too big? We can catch the exception and raise a warning instead in LindiH5ZarrStore and LindiReferenceFileSystemStore.

I'm also happy to refactor that.

@magland
Copy link
Collaborator Author

magland commented May 20, 2024

Since the 900 MB limit is specific to the sqlite cache and I would like to keep the magic numbers in one place rather than four, could you move the if statement to LocalCacheSQLiteClient.put_remote_chunk and raise an exception there if it is too big? We can catch the exception and raise a warning instead in LindiH5ZarrStore and LindiReferenceFileSystemStore.

I'm also happy to refactor that.

Sounds good. I'll work on that.

@codecov-commenter
Copy link

codecov-commenter commented May 21, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 8 lines in your changes are missing coverage. Please review.

Project coverage is 79.41%. Comparing base (797869d) to head (1535202).

Files Patch % Lines
lindi/LindiH5ZarrStore/LindiH5ZarrStore.py 20.00% 4 Missing ⚠️
...ndi/LindiH5pyFile/LindiReferenceFileSystemStore.py 20.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #69      +/-   ##
==========================================
- Coverage   79.58%   79.41%   -0.17%     
==========================================
  Files          30       30              
  Lines        2243     2254      +11     
==========================================
+ Hits         1785     1790       +5     
- Misses        458      464       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@magland
Copy link
Collaborator Author

magland commented May 21, 2024

@rly Is this what you had in mind?

@rly
Copy link
Contributor

rly commented May 21, 2024

Yes, awesome. Thanks for the change!

@magland magland merged commit 197a388 into main May 21, 2024
6 checks passed
@magland magland deleted the local-cache-blob-size branch May 21, 2024 10:23
@magland
Copy link
Collaborator Author

magland commented May 21, 2024

on pypi as 0.3.7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants