Sign blob URL using workload identity instead of common service account credentials #1237

Closed
devnicolas1 opened this issue Mar 13, 2024 · 11 comments
Labels: api: storage (Issues related to the googleapis/python-storage API), type: question (Request for information or clarification. Not an issue.)

Comments

@devnicolas1

devnicolas1 commented Mar 13, 2024

I currently have a setup where, locally, I use a file sa-credentials.json for my credentials, and its path is set in the environment variable GOOGLE_APPLICATION_CREDENTIALS.

This works just fine, but when pushing to development/production, the authentication method is workload identity for GKE (described in this documentation).

Currently, my code is:

blob = bucket.blob(name)
url = blob.generate_signed_url(
    version="v4",
    expiration=300,
    method="GET",
)

It works absolutely perfectly locally with the JSON file, but when using it on development/production environments, I get the following error:

[image: screenshot of the error, not preserved]

From everything I found while searching, it seems that signing URLs using workload identity isn't as trivial as doing it with a simple JSON file.

I would truly appreciate any help on this, as I'm not very experienced with GCP and don't know where I should go from here.

Edit: I didn't mention it, but of course, if there is a native way of doing this without having to go through IAM signing manually, that would also be great! I couldn't find anything about it though.

@parthea parthea transferred this issue from googleapis/google-cloud-python Mar 13, 2024
@product-auto-label product-auto-label bot added the api: storage Issues related to the googleapis/python-storage API. label Mar 13, 2024
@devnicolas1
Author

Following this example for manually signing URLs, I still don't quite understand what exactly my google_credentials should be. Using both the Credentials object suggested in the example and the impersonated credentials (auth.impersonated_credentials) still resulted in the error "no private key". Still trying.

@ddelgrosso1 ddelgrosso1 added the type: question Request for information or clarification. Not an issue. label Mar 14, 2024
@devnicolas1
Author

So, after a lot of trying, this is the furthest I think I got:

from google.auth import default
from googleapiclient import discovery
import base64
  
bucket_name = "bucket_name"
object_name = "object_name"
expiration_time = 3600
credentials, _ = default(scopes=['https://www.googleapis.com/auth/cloud-platform'])
service_account_email = '[email protected]'
iam_service = discovery.build('iam', 'v1', credentials=credentials)
string_to_sign = f"GET\n\n\n\n{expiration_time}\n/storage/v1/b/{bucket_name}/o/{object_name}"
body = base64.b64encode(string_to_sign.encode("utf-8")).decode("utf-8")
namePath = 'projects/-/serviceAccounts/[email protected]'


response = iam_service.projects().serviceAccounts().signBlob(
    name=namePath,
    body={"bytesToSign": body},
).execute()

Still gives me the error:

googleapiclient.errors.HttpError: <HttpError 403 when requesting https://iam.googleapis.com/v1/'projects/-/serviceAccounts/[email protected]:signBlob?alt=json returned "Permission 'iam.serviceAccounts.signBlob' denied on resource (or it may not exist).". Details: "[{'@type': 'type.googleapis.com/google.rpc.ErrorInfo', 'reason': 'IAM_PERMISSION_DENIED', 'domain': 'iam.googleapis.com', 'metadata': {'permission': 'iam.serviceAccounts.signBlob'}}]">

I have a feeling that using default() for getting my credentials isn't exactly what I should be doing, but I don't really know what other way I have of getting it.
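For context on what manual signing involves: the string-to-sign in the attempt above resembles the older V2 layout, while the `generate_signed_url(version="v4", ...)` call from the first comment uses the V4 process. Below is a minimal standard-library sketch of the V4 string-to-sign for a GET request, based on my reading of the Cloud Storage V4 signing documentation. The function name, the `auto` location, and the placeholder signer address are assumptions for illustration. The resulting bytes are what would be handed to a signer such as the IAM `signBlob` call, which still requires the `iam.serviceAccounts.signBlob` permission.

```python
import datetime
import hashlib
from urllib.parse import quote


def build_v4_string_to_sign(bucket_name, object_name, signer_email, expires_seconds, now):
    """Return (canonical_query_string, string_to_sign) for a V4 GET signed URL.

    Hypothetical helper: `now` is expected to be a UTC datetime.
    """
    host = "storage.googleapis.com"
    request_timestamp = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    # "auto" is used as the location component (an assumption; a region also works).
    credential_scope = f"{datestamp}/auto/storage/goog4_request"

    # Path-style resource: /bucket/object, with the object name percent-encoded.
    canonical_uri = f"/{bucket_name}/{quote(object_name, safe='/~')}"

    query = {
        "X-Goog-Algorithm": "GOOG4-RSA-SHA256",
        "X-Goog-Credential": f"{signer_email}/{credential_scope}",
        "X-Goog-Date": request_timestamp,
        "X-Goog-Expires": str(expires_seconds),
        "X-Goog-SignedHeaders": "host",
    }
    # Query parameters are percent-encoded and sorted by name.
    canonical_query_string = "&".join(
        f"{quote(key, safe='')}={quote(value, safe='')}"
        for key, value in sorted(query.items())
    )

    canonical_request = "\n".join([
        "GET",
        canonical_uri,
        canonical_query_string,
        f"host:{host}\n",    # canonical headers, each terminated by a newline
        "host",              # signed header names
        "UNSIGNED-PAYLOAD",
    ])
    hashed_request = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()

    string_to_sign = "\n".join([
        "GOOG4-RSA-SHA256",
        request_timestamp,
        credential_scope,
        hashed_request,
    ])
    return canonical_query_string, string_to_sign


# Placeholder values, not taken from the thread.
example_now = datetime.datetime(2024, 3, 13, 12, 0, 0, tzinfo=datetime.timezone.utc)
example_query, example_string_to_sign = build_v4_string_to_sign(
    "bucket_name",
    "object_name",
    "signer@example-project.iam.gserviceaccount.com",
    300,
    example_now,
)
```

The final URL would append the hex-encoded signature of `example_string_to_sign` as an `X-Goog-Signature` query parameter after `example_query`.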

@frankyn
Member

frankyn commented Mar 14, 2024

Hi @devnicolas1,

You need to grant the roles/iam.serviceAccountTokenCreator role to [email protected].

The client will sign through IAM on your behalf if you supply an access token and the service account email; can you try the following?

    import datetime

    from google.api_core import path_template  # needed for path_template.expand below
    from google.cloud import iam_credentials_v1
    from google.cloud import storage

    client = iam_credentials_v1.IAMCredentialsClient()
    service_account_email = "[email protected]"
    name = path_template.expand(
        "projects/{project}/serviceAccounts/{service_account}",
        project="-",
        service_account=service_account_email,
    )
    scope = [
        "https://www.googleapis.com/auth/devstorage.read_write",
        "https://www.googleapis.com/auth/iam",
    ]
    response = client.generate_access_token(name=name, scope=scope)
    
    storage_client = storage.Client()
    bucket = storage_client.bucket("bucket-name")
    blob = bucket.blob("object-name")
    # https://cloud.google.com/python/docs/reference/storage/latest/google.cloud.storage.blob.Blob#google_cloud_storage_blob_Blob_generate_signed_url

    url = blob.generate_signed_url(
        version="v4",
        # This URL is valid for 15 minutes
        expiration=datetime.timedelta(minutes=15),
        method="GET",
        service_account_email=service_account_email,
        access_token=response.access_token
    )
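A variant that is often suggested for metadata-server credentials (Compute Engine, GKE workload identity) reuses the ambient credentials' own token instead of calling `generate_access_token`, so `iam.serviceAccounts.getAccessToken` is not needed. The sketch below is untested against GKE Autopilot and rests on two assumptions: that `google.auth.default()` returns credentials exposing `service_account_email` and `token` after a refresh, and that the service account is allowed `iam.serviceAccounts.signBlob` on itself. Both function names are hypothetical.

```python
import datetime


def build_signing_kwargs(service_account_email, access_token, minutes=15):
    """Keyword arguments for Blob.generate_signed_url when signing through IAM."""
    return {
        "version": "v4",
        "expiration": datetime.timedelta(minutes=minutes),
        "method": "GET",
        "service_account_email": service_account_email,
        "access_token": access_token,
    }


def sign_with_ambient_credentials(bucket_name, object_name, minutes=15):
    """Sign a GET URL using Application Default Credentials (no key file)."""
    # Imported lazily so build_signing_kwargs works without the Google libraries.
    import google.auth
    from google.auth.transport import requests
    from google.cloud import storage

    credentials, _ = google.auth.default()
    # Refreshing fetches an access token. For metadata-server credentials it is
    # also assumed to fill in the real service account email.
    credentials.refresh(requests.Request())

    blob = storage.Client().bucket(bucket_name).blob(object_name)
    return blob.generate_signed_url(
        **build_signing_kwargs(
            credentials.service_account_email, credentials.token, minutes
        )
    )
```

Inside a pod this would be called as `sign_with_ambient_credentials("bucket-name", "object-name")`. With user credentials, which have no `service_account_email`, it would fail with an `AttributeError`.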

@devnicolas1
Author

Hey @frankyn! First of all, thank you so much for your help!

When trying to execute response = client.generate_access_token(name=name, scope=scope), I got the error:

  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.11/site-packages/google/cloud/iam_credentials_v1/services/iam_credentials/client.py", line 825, in generate_access_token
    response = rpc(
               ^^^^
  File "/usr/local/lib/python3.11/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 293, in retry_wrapped_func
    return retry_target(
           ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 153, in retry_target
    _retry_error_helper(
  File "/usr/local/lib/python3.11/site-packages/google/api_core/retry/retry_base.py", line 212, in _retry_error_helper
    raise final_exc from source_exc
  File "/usr/local/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 144, in retry_target
    result = target()
             ^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/api_core/timeout.py", line 120, in func_with_timeout
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 78, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.PermissionDenied: 403 Permission 'iam.serviceAccounts.getAccessToken' denied on resource (or it may not exist). [reason: "IAM_PERMISSION_DENIED"
domain: "iam.googleapis.com"
metadata {
  key: "permission"
  value: "iam.serviceAccounts.getAccessToken"
}
]

Interestingly enough, I had confirmation earlier from our devops guy that all permissions were correctly configured, but this does get me thinking. I'll confirm everything is configured the way it should be, but I'll at least leave the traceback here for the record.

Once again, thanks for the help! I will come back with updates as soon as I can.
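Both 403 errors in this thread name the missing permission in their message, and both permissions (`iam.serviceAccounts.signBlob` and `iam.serviceAccounts.getAccessToken`) are included in `roles/iam.serviceAccountTokenCreator`. As a small debugging aid, here is a sketch that pulls the permission out of such a message and looks up that role. The helper names and the lookup table are illustrative assumptions, and the table covers only the two permissions seen here.

```python
import re

# Only the two permissions that appear in this thread; both belong to the
# Service Account Token Creator role.
ROLE_FOR_PERMISSION = {
    "iam.serviceAccounts.signBlob": "roles/iam.serviceAccountTokenCreator",
    "iam.serviceAccounts.getAccessToken": "roles/iam.serviceAccountTokenCreator",
}

_PERMISSION_RE = re.compile(r"Permission '([\w.]+)' denied")


def missing_permission(error_message):
    """Return the permission named in an IAM 403 message, or None."""
    match = _PERMISSION_RE.search(error_message)
    return match.group(1) if match else None


def suggest_role(error_message):
    """Return a role known to contain the missing permission, or None."""
    return ROLE_FOR_PERMISSION.get(missing_permission(error_message))


traceback_message = (
    "403 Permission 'iam.serviceAccounts.getAccessToken' denied on resource "
    "(or it may not exist)."
)
suggested = suggest_role(traceback_message)
```

The role has to be granted to the identity that makes the call (under workload identity, the Google service account bound to the pod) on the target service account, which can be the same account.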

@devnicolas1
Author

After a lot of testing and trying different things, we opted for a slightly different approach.

We generated a pair of HMAC keys and used Boto3 for signing URLs. It worked out perfectly and was easier to make it work for both development/devops teams.

Still, thank you so much @frankyn for your help! Your comment surely contributed to our success in the end, even if we opted for a different solution.

If anyone finds this issue in the future: the link I added is an easy way of achieving the desired result with HMAC keys. I can't speak to using workload identity alone, since we opted for another way.

Cheers!

@frankyn
Member

frankyn commented Mar 18, 2024

@devnicolas1 are you using boto3 with GCS for signed URLs?

@devnicolas1
Author

devnicolas1 commented Mar 18, 2024

Correct. I don't know in detail how it works, but we got the idea from this example we found while searching. We did a quick test and everything worked flawlessly, and it is still running in production right now.

Here is what the code looks like:

from os import environ

from boto3 import client as boto3Client

# GCS HMAC key pair, used through the S3-compatible endpoint
client = boto3Client(
    "s3",
    region_name="southamerica-east1",
    endpoint_url="https://storage.googleapis.com",
    aws_access_key_id=environ['ACCESS_KEY_ID'],
    aws_secret_access_key=environ['ACCESS_KEY_SECRET'],
)

# Excerpt from inside the function that builds the URL
return client.generate_presigned_url(
    'get_object',
    Params={
        'Bucket': bucket.name,
        'Key': name,
    },
    ExpiresIn=convertTimeStringToSeconds(availabilityTime),
)

Edit: just to be extra clear @frankyn, bucket is still the google.cloud.storage.Bucket object, and was retrieved using the GCP library as usual. Literally the only thing in the whole code that uses Boto3 is the URL signing.

@vishal2-wiai

@devnicolas1 nice workaround with using boto3.

Is this working for others?

@frankyn I'm still facing the same issue when trying to generate a signed URL from a pod in a GKE Autopilot cluster that has Workload Identity Federation (enabled by default with no option to disable). I have given the service account the Service Account Token Creator role.

@zukwung

zukwung commented Jun 28, 2024

@frankyn is it okay if we open this as a feature request for Python? There are similar requests in other clients, like these issues:
googleapis/google-api-dotnet-client#2410
googleapis/google-cloud-java#10464
googleapis/google-cloud-go#4604

As @vishal2-wiai and @devnicolas1 allude to, using workload identity is a common and secure way to authenticate to Google Cloud, whether that's in GKE or, as in my use case, GitHub Actions.

@maalucf

maalucf commented Oct 15, 2024

@devnicolas1 when you connect to the boto3 client, do you use GCP credentials or AWS ones?
I've managed to create a URL, but I cannot access it due to this error: "The request signature we calculated does not match the signature you provided. Check your Google secret key and signing method."
I was authenticating with my AWS credentials.

@devnicolas1
Author

@maalucf I'm using GCP HMAC keys, so the variables environ['ACCESS_KEY_ID'] and environ['ACCESS_KEY_SECRET'] (present in the snippet I shared above) are generated on GCP. I cannot go into specific details on how to generate them, since I wasn't the one working in the Google Cloud Console myself, but I remember seeing it being done and I'm almost sure it's just a few button clicks away.

There is absolutely no interaction with AWS at all. In fact, in the snippet above you can see I even change the endpoint URL, passing the parameter endpoint_url="https://storage.googleapis.com" to the Boto3 constructor. Nothing else was done using Boto3/AWS, only the snippet above. I think we eventually implemented a "does this object exist" check somewhere that also uses Boto3, but that's not relevant to signing the object itself.
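The signature mismatch reported above is what one would expect when AWS keys are sent to the Cloud Storage endpoint. A quick sanity check on the access key ID can catch that before a URL is generated. The sketch below relies on prefix conventions as I understand them (Cloud Storage HMAC access IDs beginning with `GOOG`, AWS access key IDs beginning with `AKIA` or `ASIA`), so treat it as a heuristic. The function name and sample values are hypothetical.

```python
def classify_access_key_id(access_key_id):
    """Guess which provider issued an access key ID, based on its prefix only."""
    if access_key_id.startswith("GOOG"):
        return "gcs-hmac"
    if access_key_id.startswith(("AKIA", "ASIA")):
        return "aws"
    return "unknown"


# Sample values for illustration only; neither is a real key.
gcs_guess = classify_access_key_id("GOOG1EXAMPLEACCESSID")
aws_guess = classify_access_key_id("AKIAIOSFODNN7EXAMPLE")
```

In the Boto3 snippet, this could be run on `environ['ACCESS_KEY_ID']` before building the client, failing early if the result is not `"gcs-hmac"`.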
