Context
This is driven by the need to process large bioscientific datasets for the Catalyst partner communities.
We propose a data transfer workflow like the following (a Python sketch follows below):
a) users should stage their 'input' datasets in object storage buckets
b) if a workflow supports reading directly from object storage, use that; otherwise make a local copy from object storage to /tmp
c) use /tmp for any intermediate files created during a workflow pipeline
d) push 'output' datasets to object storage for persistence
e) strongly encourage community users to keep home directory storage to under 1GB per user
f) discourage use of the shared directory except for smaller datasets (100 GB total per community)
See 2i2c-org/infrastructure#4213
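
A minimal sketch of the proposed workflow, assuming a Google Cloud hub with `fsspec`, `gcsfs`, and `pandas` available in the user image. The bucket URL, file names, and the `SCRATCH_BUCKET` environment variable are illustrative assumptions; adjust them to the hub's actual setup.

```python
import os
import fsspec
import pandas as pd

# Hypothetical scratch bucket: many hubs expose a per-user prefix via the
# SCRATCH_BUCKET environment variable; adjust the URL and paths to your hub.
scratch = os.environ.get("SCRATCH_BUCKET", "gs://example-community-scratch/my-user")

input_url = f"{scratch}/input/samples.csv"      # (a) inputs staged in object storage
local_input = "/tmp/samples.csv"                # (b) local copy for tools that need one
intermediate = "/tmp/samples-summary.csv"       # (c) intermediates live in /tmp
output_url = f"{scratch}/output/summary.csv"    # (d) outputs pushed back for persistence

fs = fsspec.filesystem("gs")  # use "s3" instead on AWS-backed hubs

# (b) Tools that understand object storage URLs (e.g. pandas with gcsfs/s3fs
#     installed) can read the input directly, with no local copy at all:
df = pd.read_csv(input_url)

# (b) Tools that only accept local paths get a copy staged into /tmp instead:
fs.get(input_url, local_input)

# (c) Any intermediate files produced by the pipeline go to /tmp,
#     not to the home directory:
df.describe().to_csv(intermediate)

# (d) The final output is pushed back to object storage so it persists
#     after the user server shuts down:
fs.put(intermediate, output_url)
```

Keeping inputs and outputs in buckets and intermediates in /tmp is what keeps home directory usage within the ~1 GB per-user guideline above.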
Proposal
Document the recommended workflow above as a tutorial that guides hub admins and end users through each step.