[C++] Support the SSE-C(server-side encryption with customer-provided keys) for the AWS s3 filesystem read and write #43535

Closed
ripplehang opened this issue Aug 2, 2024 · 1 comment

Comments

@ripplehang
Contributor

Describe the enhancement requested

According to https://docs.aws.amazon.com/AmazonS3/latest/userguide/ServerSideEncryptionCustomerKeys.html#specifying-s3-c-encryption
AWS S3 already supports SSE-C. Would it be possible for Arrow C++ to provide an API for reading and writing data with SSE-C on the AWS S3 filesystem?
Thanks

Component(s)

C++

ripplehang pushed a commit to ripplehang/arrow that referenced this issue Aug 23, 2024
ripplehang pushed a commit to ripplehang/arrow that referenced this issue Aug 27, 2024
ripplehang pushed a commit to ripplehang/arrow that referenced this issue Aug 27, 2024
@kou kou changed the title Support the SSE-C(server-side encryption with customer-provided keys) for the AWS s3 filesystem read and write [C++] Support the SSE-C(server-side encryption with customer-provided keys) for the AWS s3 filesystem read and write Aug 31, 2024
ripplehang pushed a commit to ripplehang/arrow that referenced this issue Sep 20, 2024
ripplehang pushed a commit to ripplehang/arrow that referenced this issue Sep 20, 2024
ripplehang pushed a commit to ripplehang/arrow that referenced this issue Sep 20, 2024
ripplehang pushed a commit to ripplehang/arrow that referenced this issue Sep 27, 2024
ripplehang pushed a commit to ripplehang/arrow that referenced this issue Sep 29, 2024
ripplehang pushed a commit to ripplehang/arrow that referenced this issue Oct 8, 2024
ripplehang pushed a commit to ripplehang/arrow that referenced this issue Oct 16, 2024
pitrou pushed a commit to ripplehang/arrow that referenced this issue Oct 17, 2024
pitrou pushed a commit to ripplehang/arrow that referenced this issue Nov 3, 2024
pitrou added a commit that referenced this issue Nov 3, 2024
### Rationale for this change
[Server-side encryption with customer-provided keys](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ServerSideEncryptionCustomerKeys.html) (SSE-C) is an important security feature of AWS S3. It is useful when users want to manage the encryption keys themselves, for example when they do not want the data to be exposed to AWS system administrators, and it keeps objects safe even if the ACCESS_KEY and SECRET_KEY are somehow leaked.
A comparison of S3 encryption options:
https://www.linkedin.com/pulse/delusion-s3-encryption-benefits-ravi-ivaturi/
### What changes are included in this PR?

1. Add the **sse_customer_key** member for S3Options to support [server-side encryption with customer-provided keys](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ServerSideEncryptionCustomerKeys.html) (SSE-C keys).
    - The sse_customer_key is expected to be 256 bits (32 bytes) long, according to the [AWS doc](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ServerSideEncryptionCustomerKeys.html#specifying-s3-c-encryption).
    - The sse_customer_key is expected to be the raw key rather than a base64-encoded value; Arrow calculates the base64 encoding and the MD5 digest on the fly.
    - By default the sse_customer_key is empty, and an empty sse_customer_key has no impact on the existing workflow. When the sse_customer_key is configured, an AWS SDK version newer than 1.9.201 is required.

2. Add the **tls_ca_file_path**, **tls_ca_dir_path** and **tls_verify_certificates**  members for S3Options.
   - The tls_ca_file_path and tls_ca_dir_path members of S3Options override the values configured by arrow::fs::FileSystemGlobalOptions.
   - For S3, according to the [AWS SDK doc](https://docs.aws.amazon.com/sdk-for-cpp/v1/developer-guide/client-config.html), tls_ca_file_path and tls_ca_dir_path only take effect on Linux. To support connecting to storage servers such as MinIO with self-signed certificates on non-Linux platforms, tls_verify_certificates is exposed as well.

3. Refine the unit tests to start the MinIO server with a self-signed certificate on Linux, so that the tests cover the HTTPS case on Linux and the HTTP case on non-Linux platforms.
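
To illustrate what "calculate the base64 and MD5 on the fly" in item 1 means, here is a minimal Python sketch of how the three SSE-C request headers documented by AWS can be derived from a raw 32-byte key. This is not Arrow's implementation (Arrow does this in C++ through the AWS SDK); the helper name `sse_c_headers` is hypothetical and used only for illustration.

```python
import base64
import hashlib
import os


def sse_c_headers(raw_key: bytes) -> dict:
    """Derive the SSE-C request headers from a raw (not base64-encoded) key.

    Hypothetical helper for illustration; header names follow the AWS S3
    SSE-C documentation.
    """
    # SSE-C uses AES-256, so the key must be exactly 256 bits (32 bytes).
    if len(raw_key) != 32:
        raise ValueError("SSE-C requires a 256-bit (32-byte) key")

    # The key itself is sent base64-encoded.
    key_b64 = base64.b64encode(raw_key).decode("ascii")
    # The integrity check is the base64-encoded MD5 digest of the raw key.
    key_md5_b64 = base64.b64encode(hashlib.md5(raw_key).digest()).decode("ascii")

    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": key_b64,
        "x-amz-server-side-encryption-customer-key-MD5": key_md5_b64,
    }


# Example: generate a random 32-byte key and derive the headers from it.
raw_key = os.urandom(32)
headers = sse_c_headers(raw_key)
```

Since S3Options takes the raw key, callers would pass the 32 raw bytes and leave the encoding steps above to the library.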

### Are these changes tested?
Yes

### Are there any user-facing changes?

Only additional members to S3Options.

* GitHub Issue: #43535

Lead-authored-by: Hang Zheng <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
@pitrou pitrou added this to the 19.0.0 milestone Nov 3, 2024
@pitrou
Member

pitrou commented Nov 3, 2024

Issue resolved by pull request 43601
#43601

@pitrou pitrou closed this as completed Nov 3, 2024