[WIP] adding an option to read file from s3 #1322
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #1322 +/- ##
==========================================
- Coverage 85.70% 83.50% -2.20%
==========================================
Files 34 34
Lines 5450 5451 +1
==========================================
- Hits 4671 4552 -119
- Misses 779 899 +120
Hi, I think enabling HDF5-on-S3 through such a small change is pretty cool. But I think it would make sense to open an issue first, to discuss support for other formats and how to tackle them.
@flying-sheep - this is somewhat related to #634. I commented there at some point but didn't get an answer. I will open a fresh issue to get some new feedback.
Maybe @Koncopd has a comment here, too?
Yes, we use
Do you have integration of How can I use
@djarecka https://github.com/laminlabs/lamindb/blob/bf1c81e7ae8c177a7338d560521cc5b0c1cdc1aa/lamindb/dev/storage/file.py#L84
Recently I've been testing an option to access files directly from S3 using remfile. remfile provides a file-like object for reading a remote file over HTTP, optimized for use with h5py.
I've added a small change to the code and a simple test that accesses a big file directly from S3 and is able to check the file size without reading the entire content. I'd be happy to expand the tests and changes if the package maintainers are interested in the addition.
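For readers unfamiliar with the approach, here is a minimal stdlib-only sketch of the general idea behind a file-like object that fetches byte ranges on demand. It is not remfile's actual implementation and not the code in this PR: `RangeReader`, `fetch_range`, and the in-memory `remote` blob are hypothetical names used for illustration, and a real reader would issue HTTP range requests instead of slicing bytes.

```python
import io


class RangeReader(io.RawIOBase):
    """Read-only, seekable file-like object backed by a byte-range fetcher (illustrative only)."""

    def __init__(self, fetch_range, size):
        # fetch_range(start, end) returns the bytes in the half-open range [start, end).
        # Over HTTP this would correspond to a GET with a "Range" header (assumption).
        self._fetch = fetch_range
        self._size = size
        self._pos = 0
        self.bytes_fetched = 0  # counts how much data was actually transferred

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        # Seeking only moves the cursor; no data is fetched.
        if whence == io.SEEK_SET:
            pos = offset
        elif whence == io.SEEK_CUR:
            pos = self._pos + offset
        elif whence == io.SEEK_END:
            pos = self._size + offset
        else:
            raise ValueError(f"invalid whence: {whence}")
        if pos < 0:
            raise ValueError("negative seek position")
        self._pos = pos
        return self._pos

    def readinto(self, buffer):
        # Fetch only the bytes the caller asked for, clipped to the end of the file.
        end = min(self._pos + len(buffer), self._size)
        if end <= self._pos:
            return 0  # end of file
        data = self._fetch(self._pos, end)
        n = len(data)
        buffer[:n] = data
        self._pos += n
        self.bytes_fetched += n
        return n


# Simulated "remote" object: the 8-byte HDF5 signature followed by 1 MB of zeros.
remote = b"\x89HDF\r\n\x1a\n" + bytes(1_000_000)


def fetch(start, end):
    return remote[start:end]


f = RangeReader(fetch, len(remote))
signature = f.read(8)                # transfers 8 bytes only
total_size = f.seek(0, io.SEEK_END)  # file size, without reading the content
```

With remfile itself, the usage documented in its README is, as far as I know, to pass `remfile.File(url)` to `h5py.File(..., "r")`. The sketch above only shows why such an object lets h5py inspect a large file without downloading all of it.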