Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support S3 request rewriting #147

Merged
merged 1 commit into from
Aug 27, 2024
Merged

Conversation

vagaerg
Copy link
Member

@vagaerg vagaerg commented Aug 16, 2024

I turned TestPresigningHeaders and TestPresignedRequests into abstract test classes, so unfortunately this results in a large diff. Diffing these files with their Abstract counterparts should provide useful insight.

This PR introduces functionality to redirect the bucket and key of an S3 request, in order to provide a virtualised view of a bucket.

Since the proxy does not decode the responses it gets back from S3, listing files or listing buckets will result in the real names being returned.

The authorisation logic happens with the original bucket and key and not the rewritten one. I believe this is a powerful way of letting users build abstractions on top of existing storage buckets, but it would need to be documented to ensure people understand how request rewriting changes the authorization flow.

@cla-bot cla-bot bot added the cla-signed label Aug 16, 2024
}
}

Optional<S3RewriteResult> rewrite(Credentials credentials, ParsedS3Request request);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider returning Optional<ParsedS3Request>? This would be more flexible as the plugin could adjust anything it wanted. wdyt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered this, but the ParsedS3Request class has lots of fields and it could become error prone. Users would need to understand the different input streams the class can have for instance.
There's also the headers logic, where we rely on some headers having been stripped out of the ParsedS3Request so we don't send them to S3 (for instance, Authorization).

I feel that it is easier to expand the S3RewriteResult class in the future if we want to make it more extensible, than have to consider all the possible modifications people may make to our internal parsed request record.

Wdyt?

@vagaerg vagaerg merged commit f870663 into trinodb:main Aug 27, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants