Allow bucket to be mounted to an arbitrary logical path #2196
Comments
One way of satisfying this feature request would be to create a new archive naming policy, e.g.,

iadmin mkresc \
    myBucketResc \
    s3 \
    "$(hostname)":/my_bucket/prefix/in/bucket \
    'S3_DEFAULT_HOSTNAME=s3.us-east-1.amazonaws.com;S3_AUTH_FILE=/var/lib/irods/my_bucket.keypair;ARCHIVE_NAMING_POLICY=chroot;ROOT_COLL=/zone/home/project/s3-bucket'

Here's an implementation of this for version 4.2.11: main...tedgin:irods_resource_plugin_s3:main
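To make the shape of that context string concrete, here is a minimal Python sketch that splits it into settings. It is not the plugin's parser. It only assumes the `KEY=value;KEY=value` layout visible in the command above, and the `chroot` policy value and `ROOT_COLL` key are the ones proposed in this comment, not settings of the released plugin.

```python
def parse_context(context: str) -> dict[str, str]:
    """Split a 'KEY=value;KEY=value' context string into a dict.

    Illustrative only: mirrors the layout of the example command,
    not the plugin's own parsing code.
    """
    settings: dict[str, str] = {}
    for item in context.split(";"):
        if not item:
            continue  # tolerate a trailing or doubled ';'
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in context entry: {item!r}")
        settings[key.strip()] = value.strip()
    return settings


context = (
    "S3_DEFAULT_HOSTNAME=s3.us-east-1.amazonaws.com;"
    "S3_AUTH_FILE=/var/lib/irods/my_bucket.keypair;"
    "ARCHIVE_NAMING_POLICY=chroot;"  # proposed value, per the comment above
    "ROOT_COLL=/zone/home/project/s3-bucket"  # proposed key, per the comment above
)
settings = parse_context(context)
print(settings["ARCHIVE_NAMING_POLICY"], settings["ROOT_COLL"])
```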
Very interesting. We'll look into it following UGM.
For posterity... This was discussed at length during the May 2024 S3 Working Group. Minutes have not yet been published.
I think this is a subset of today's functionality of the S3 plugin. This is a restriction of which logical_path(s) are allowed to be stored on this resource. So... potentially a new context string setting... Could be enough to let us implement the requested feature and 'pin' a resource to a certain subset of the logical namespace. If there is existing data in a newly 'mounted' bucket, it would need to be 'scanned' or 'registered' for that data to be visible via the catalog. Could be via Lambda, could be ingest tool, etc.
But that new idea wouldn't allow management/updating of the physical path in the bucket itself.
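As a rough illustration of the 'scan' or 'register' step mentioned in the comments above, the following Python sketch turns a list of existing S3 object keys into (logical path, physical path) pairs that a registration tool could act on. It makes no calls to iRODS or S3. The function name `plan_registrations`, its parameters, and the `/bucket/key` physical-path form are assumptions made for this example.

```python
def plan_registrations(keys, bucket, root_coll, bucket_prefix=""):
    """Pair each existing object key with the logical path it would get
    if the bucket (or a prefix inside it) were pinned to root_coll.

    Hypothetical helper: it only computes names and registers nothing.
    """
    root = root_coll.rstrip("/")
    prefix = bucket_prefix.strip("/")
    plan = []
    for key in keys:
        if key.endswith("/"):
            continue  # skip folder placeholder keys
        if prefix:
            if not key.startswith(prefix + "/"):
                continue  # outside the mounted part of the bucket
            relative = key[len(prefix) + 1:]
        else:
            relative = key
        plan.append((f"{root}/{relative}", f"/{bucket}/{key}"))
    return plan


existing_keys = ["a.txt", "dir/b.txt", "dir/"]
for logical, physical in plan_registrations(
    existing_keys, "my_bucket", "/zone/home/project/s3-bucket"
):
    print(logical, "<-", physical)
```

Whether the pairs are then fed to a Lambda, an ingest tool, or something else is left open, as in the comment.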
I'm requesting that the iRODS S3 storage resource plugin be able to mount (attach, graft) a bucket to an arbitrary logical path in an iRODS zone.
Currently, a bucket path, e.g., /my_bucket/, is mounted at the zone logical path, e.g., /zone/. This means a data object added at /zone/home/user/object gets the name home/user/object in /my_bucket/. This is fine when a bucket is being used as a zone-wide storage resource and the data in the bucket will primarily be accessed through iRODS. If the data will primarily be accessed outside of iRODS, but on occasion still needs to be accessed through iRODS, forcing an object in the bucket to be prefixed with something like home/user, or to be accessed in iRODS at the base of the zone, i.e., /zone/object, is inconvenient.

Pretend that /my_bucket/ already has thousands of objects in it when an iRODS storage resource is created for it. Furthermore, there are mature workflows that add and access objects in the bucket outside of iRODS following specific naming conventions. Renaming existing objects so they don't show up directly under the zone would be difficult. This gets worse if one of the S3 objects has the name of an existing iRODS collection or data object, like home or home/tedgin/teletubbies.jpg. If the bucket path could be mounted to an arbitrary logical path, e.g., /zone/home/project/s3-bucket/, then existing S3 objects wouldn't need to be renamed.

Having a bucket be able to be mounted to an arbitrary logical path also opens up the possibility of a user or project being able to access data from an S3 bucket that they own (and pay for) from within an iRODS zone without the S3 bucket becoming usable by everyone else in the zone. Supporting this is outside the scope of this feature request.
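The difference between the current naming and the requested one can be sketched in a few lines of Python. This is an illustration of the mapping described in this request, not the plugin's code. Both function names and the optional `bucket_prefix` parameter are made up for the example.

```python
def zone_rooted_key(logical_path: str, zone: str) -> str:
    """Current behaviour as described above: the bucket is grafted at /zone/."""
    root = f"/{zone}/"
    if not logical_path.startswith(root):
        raise ValueError(f"{logical_path!r} is not inside zone {zone!r}")
    return logical_path[len(root):]


def mounted_key(logical_path: str, root_coll: str, bucket_prefix: str = "") -> str:
    """Requested behaviour: the bucket is grafted at an arbitrary collection."""
    root = root_coll.rstrip("/") + "/"
    if not logical_path.startswith(root):
        raise ValueError(f"{logical_path!r} is not under {root_coll!r}")
    relative = logical_path[len(root):]
    prefix = bucket_prefix.strip("/")
    return f"{prefix}/{relative}" if prefix else relative


# Today: the object name carries the full path below the zone.
print(zone_rooted_key("/zone/home/user/object", "zone"))
# Requested: the object name is relative to the mount collection.
print(mounted_key("/zone/home/project/s3-bucket/object", "/zone/home/project/s3-bucket"))
```

With the mount collection stripped from the name, an object written outside iRODS as `object` and the data object at /zone/home/project/s3-bucket/object refer to the same key, so neither side needs renaming.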