Plugins that help to pass credentials for S3 and GCS to remote cluster workers #438

Open
dbalabka opened this issue Oct 4, 2024 · 1 comment
Labels
provider/aws/ec2 (Cluster provider for AWS EC2 Instances), provider/gcp/vm (Cluster provider for GCP Instances), question (Further information is requested)

Comments

@dbalabka
Contributor

dbalabka commented Oct 4, 2024

I didn't find a simple way to pass credentials for storage services such as S3 and GCS to remote workers, even though both are widely used to store data frames.
Within the scope of this ticket, I propose creating plugins that distribute the required keys to remote workers.

GCP credentials
The path to the GCP credentials file is stored in the GOOGLE_APPLICATION_CREDENTIALS environment variable. The plugin has to create that file on each remote worker and set the environment variable there to the file's path on the worker.
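A rough sketch of what such a plugin could look like, assuming the standard Dask `WorkerPlugin` interface (`setup` / `teardown`). The class name `GCPCredentialsPlugin` is made up for illustration and is not necessarily what the PR implements; the key contents are read on the client side and travel with the pickled plugin.

```python
import os
import tempfile

try:
    from distributed import WorkerPlugin
except ImportError:
    # Fallback so the sketch can be run without Dask installed.
    WorkerPlugin = object


class GCPCredentialsPlugin(WorkerPlugin):  # hypothetical name
    """Write a service-account key on the worker and point
    GOOGLE_APPLICATION_CREDENTIALS at it."""

    def __init__(self, credentials_json: str):
        # Contents of the local key file, shipped to workers with the plugin.
        self.credentials_json = credentials_json
        self.path = None

    def setup(self, worker=None):
        # mkstemp creates the file readable and writable by the owner only.
        fd, self.path = tempfile.mkstemp(prefix="gcp-credentials-", suffix=".json")
        with os.fdopen(fd, "w") as f:
            f.write(self.credentials_json)
        os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = self.path

    def teardown(self, worker=None):
        # Remove the key file and the variable when the worker shuts down.
        if self.path and os.path.exists(self.path):
            os.remove(self.path)
        os.environ.pop("GOOGLE_APPLICATION_CREDENTIALS", None)
```

On the client this would be registered with something like `client.register_plugin(GCPCredentialsPlugin(open(local_key_path).read()))` on recent `distributed` versions, where `local_key_path` is a placeholder. Since the key is sent over the wire, this only makes sense on a cluster with encrypted connections.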

S3 credentials
As with GCP, the credential files must be created or updated on each worker.
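For S3, one possible shape is a helper that writes an AWS shared credentials file on the worker and points `AWS_SHARED_CREDENTIALS_FILE` at it, so the worker's home directory is left alone. This is only a sketch: `install_aws_credentials` is a hypothetical name, and it assumes the AWS SDK in use on the workers honours that environment variable (boto3 and the AWS CLI do).

```python
import configparser
import os
import tempfile


def install_aws_credentials(access_key_id, secret_access_key, session_token=None):
    """Write a shared credentials file with a [default] profile and
    point AWS_SHARED_CREDENTIALS_FILE at it. Returns the file path."""
    # interpolation=None keeps '%' characters in secrets from being interpreted.
    config = configparser.ConfigParser(interpolation=None)
    config["default"] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    if session_token:
        config["default"]["aws_session_token"] = session_token

    # mkstemp creates the file readable and writable by the owner only.
    fd, path = tempfile.mkstemp(prefix="aws-credentials-")
    with os.fdopen(fd, "w") as f:
        config.write(f)
    os.environ["AWS_SHARED_CREDENTIALS_FILE"] = path
    return path
```

It could be pushed to the workers that are currently connected with `client.run(install_aws_credentials, key_id, secret)`; workers that join later would not get it, which is the gap a plugin would cover.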

PR: #439

@jacobtomlinson
Member

jacobtomlinson commented Oct 7, 2024

Usually you would create an IAM instance role and profile that can access S3, then configure workers to have this role via the iam_instance_profile keyword argument.

The GCP equivalent is to create a service account that can access GCS and configure that with the service_account kwarg.

This way you don't have to pass credentials around. Is there a reason why you aren't doing it this way?
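For reference, the role-based setup described above might look roughly like this. The profile name and service account email are placeholders, and the `{"Name": ...}` shape for `iam_instance_profile` is an assumption based on how boto3 describes instance profiles; the cluster classes are imported lazily so the helper functions can be used without cloud credentials.

```python
def ec2_worker_kwargs(profile_name):
    """Keyword arguments that attach an IAM instance profile to EC2 workers."""
    # Assumed shape: a dict with "Name" (or "Arn"), as in boto3's run_instances.
    return {"iam_instance_profile": {"Name": profile_name}}


def gcp_worker_kwargs(service_account_email):
    """Keyword arguments that run GCP worker VMs under a service account."""
    return {"service_account": service_account_email}


def make_ec2_cluster(profile_name, **extra):
    # Requires dask-cloudprovider and AWS access; not executed on import.
    from dask_cloudprovider.aws import EC2Cluster
    return EC2Cluster(**ec2_worker_kwargs(profile_name), **extra)


def make_gcp_cluster(service_account_email, **extra):
    # Requires dask-cloudprovider and GCP access; not executed on import.
    from dask_cloudprovider.gcp import GCPCluster
    return GCPCluster(**gcp_worker_kwargs(service_account_email), **extra)
```

With this, the workers pick up credentials from the instance metadata service, so no key files have to be copied.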

@jacobtomlinson jacobtomlinson added question Further information is requested provider/gcp/vm Cluster provider for GCP Instances provider/aws/ec2 Cluster provider for AWS EC2 Instances labels Oct 7, 2024