Ways to share the hdf5 files using amazon web services #3

Open
tomsing1 opened this issue Apr 11, 2018 · 2 comments

tomsing1 commented Apr 11, 2018

Some random thoughts on how to potentially make the hdf5 files / data available within Denali.

Note: Every time data is retrieved from AWS, there is a small transfer fee (per GB). Probably not an issue right now, but good to know.

  • Keep a copy of the hdf5 files on AWS S3 and provide functions that download them the first time they are needed (and then keep them cached on the user's computer). This is similar to downloading them from the authors' website, but probably faster because of our fast uplink to AWS. The SRAdb package is an example of providing a function to download the required data.
  • Use AWS Elastic File System (EFS), which would make the data available to EC2 instances. (There seem to be workarounds to mount EFS on non-EC2 computers, but that might be too much of a hassle.)
  • HDF Server implements a REST service. This slide deck is interesting, too, and points to a GitHub repo from the hdfgroup. Not sure if h5serv and hdfserver refer to the same thing.
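
To make the first option more concrete, here is a rough Python sketch of the download-once-then-cache idea. It is only an illustration, not anything that exists in this project: the names `cached_hdf5` and `fetch_url` are hypothetical, and it assumes the file is reachable over plain HTTPS (for example a public or presigned S3 URL). An R version for the package itself would follow the same pattern.

```python
import os
import shutil
import tempfile
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import urlsplit


def fetch_url(url, dest):
    """Default fetcher (assumption: the file is reachable over HTTP(S))."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)


def cached_hdf5(url, cache_dir, fetch=fetch_url):
    """Return a local path for `url`, downloading only if it is not cached yet."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Name the cached file after the last path component, ignoring any query string.
    target = cache_dir / PurePosixPath(urlsplit(url).path).name
    if target.exists():
        return target  # cache hit: no transfer, so no transfer fee

    # Download to a temporary file first, so an interrupted transfer
    # never leaves a truncated file under the final name.
    fd, tmp_name = tempfile.mkstemp(dir=cache_dir, suffix=".part")
    os.close(fd)
    tmp = Path(tmp_name)
    try:
        fetch(url, tmp)
        tmp.replace(target)
    finally:
        tmp.unlink(missing_ok=True)  # cleans up only if the download failed
    return target
```

The `fetch` argument is there so the transport can be swapped out (say, for an authenticated S3 client) without touching the caching logic. This sketch does not check whether the remote file has changed since it was cached.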
@lianos lianos changed the title Sharing the hdf5 files within Denali Ways you might share the hdf5 files using amazon web services Apr 11, 2018
@lianos lianos changed the title Ways you might share the hdf5 files using amazon web services Ways to share the hdf5 files using amazon web services Apr 11, 2018

trahloff commented Mar 2, 2022

Hi Thomas! Have you found an approach to conveniently store HDF5 files on AWS and work with them?


tomsing1 commented Mar 2, 2022

@trahloff: I toyed with the delayed arrays from Bioconductor's HDF5Array package and got to the proof-of-concept stage with help from folks on the Bioconductor support forum: https://support.bioconductor.org/p/9135005/ and https://support.bioconductor.org/p/9134972/. But I never got to use it for anything real. Right now, I am still copying the HDF5 files from S3 to a local volume.
