Support specifying GCP account credentials as a config option. #4855
This pull request has been linked to Shortcut Story #44515: Support specifying GCP account credentials as a config option.
I don't know why the tests are failing; they succeed on my local Windows machine in both Release and Debug mode.
I should be able to get this tested sometime today, so will get back to you with information about whether everything works end-to-end once I have done so.
In the meantime, I have some minor thoughts about naming, to be consistent with the way Google names and documents things. None of these are requirements, just ideas to consider.
- `vfs.gcs.service_account_credentials` might be better named `vfs.gcs.[service_?]account_key`, since Google Cloud describes these files as "account keys". (As for the inclusion or exclusion of `service_`, I think you might be able to provide a user account JSON string here and have it work, but I'm not positive. Even then, it's probably OK to include the `service_` since that’s the intended use case, but it could be omitted to keep the config name shorter.)
- `vfs.gcs.external_account_credentials`: maybe `workload_identity_config[uration]`, `federated_identity_config[uration]`, or `identity_pool_config[uration]`, since those better match the public naming and documentation of workload identity federation. It's not really a credential in and of itself; it's more like a configuration that tells you how to fetch a credential. (That said, it is sometimes described as an "external account" in code. Naming is hard ¯\_(°_o)_/¯.)

Out of everything, I'm most confident that we should use `_key` for the service account key configuration field. Beyond that, it's just vibes.
I took the names from the Google Cloud functions
I checked it and unfortunately this logic exists only for Application Default Credentials (the file in |
The way I look at it, it’s taking some input (the key, the federation info) to make the credentials; i.e., the credentials are the output, not the input. Also see the specific doc comment on MakeServiceAccountCredentials where the input is referred to specifically as a key. |
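To make the key-versus-configuration distinction concrete, here is a minimal sketch that inspects the `type` field of a Google Cloud JSON document and reports which of the two options under discussion it would belong to. Service account key files carry `"type": "service_account"` and workload identity federation configurations carry `"type": "external_account"`. The helper name `option_for_credential_json` is hypothetical and not part of TileDB, and the option names are the ones quoted in the naming comment above, before any rename.

```python
import json

# Option names as quoted in the naming discussion above (pre-rename).
# The mapping itself is illustrative, not TileDB behavior.
OPTION_FOR_TYPE = {
    "service_account": "vfs.gcs.service_account_credentials",
    "external_account": "vfs.gcs.external_account_credentials",
}


def option_for_credential_json(text: str) -> str:
    """Return the config option a Google Cloud JSON document belongs to.

    Hypothetical helper: it only looks at the top-level "type" field.
    """
    doc = json.loads(text)
    if not isinstance(doc, dict):
        raise ValueError("expected a JSON object")
    kind = doc.get("type")
    if kind not in OPTION_FOR_TYPE:
        # For example "authorized_user", the type used by user ADC files.
        raise ValueError(f"unsupported credential type: {kind!r}")
    return OPTION_FOR_TYPE[kind]
```

A user account file (`"type": "authorized_user"`) is rejected by this sketch; whether TileDB would accept one in the service account option is left open in the comment above.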
I can confirm that everything seems to work here! I built this version of TileDB, pointed the Python library at it in a venv, and used this code:

    #!/usr/bin/env python
    import argparse
    import shutil

    import tiledb


    def main() -> None:
        parser = argparse.ArgumentParser()
        parser.add_argument("--output", help="file to write to", required=True)
        parser.add_argument(
            "--impersonate", help="impersonate these accounts to open the file"
        )
        group = parser.add_mutually_exclusive_group()
        group.add_argument(
            "--federated-config",
            help="file with a federated workflow configuration",
            type=argparse.FileType(),
        )
        group.add_argument(
            "--service-account-key",
            help="file with a service account key",
            type=argparse.FileType(),
        )
        parser.add_argument("input_file", help="file to read from")
        parsed = parser.parse_args()
        tdb_cfg = {}
        if fedfile := parsed.federated_config:
            with fedfile:
                tdb_cfg["vfs.gcs.external_account_credentials"] = fedfile.read()
        if keyfile := parsed.service_account_key:
            with keyfile:
                tdb_cfg["vfs.gcs.service_account_credentials"] = keyfile.read()
        if parsed.impersonate:
            tdb_cfg["vfs.gcs.impersonate_service_account"] = parsed.impersonate
        vfs = tiledb.VFS(tdb_cfg)
        with vfs.open(parsed.input_file, "rb") as infile:
            with vfs.open(parsed.output, "wb") as outfile:
                shutil.copyfileobj(infile, outfile)


    if __name__ == "__main__":
        main()

Cases I tested it with:

    $ ./open-with-tiledb.py \
    >   --output key-output.png \
    >   --service-account-key ./gcp-account-key-impersonator.json \
    >   --impersonate [email protected] \
    >   gcs://some-bucket/some-file.png
    $ ./open-with-tiledb.py \
    >   --output federated-output.png \
    >   --federated-config ./impersonate-from-url.json \
    >   gcs://some-bucket/some-file.png

Weirdly, it seems like using application-default credentials didn’t work right (i.e., specifying neither credentials option). Here is the relevant output:
|
I will rename them tomorrow to |
I am happy with both of those names but feel free to use another if you come up with something you think is better. We have time to decide before this is finally merged. |
It was a me problem. There is a difference between being logged in to your project within the Google Cloud CLI and actually having application-default credentials. You are logged into Google Cloud services within the Google Cloud command-line utilities when you run After doing so, my ADCs were created and all three test cases worked perfectly. |
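Whether application-default credentials actually exist on a machine can be checked without calling any Google library. The sketch below assumes the documented ADC search order for local files: the `GOOGLE_APPLICATION_CREDENTIALS` environment variable first, then the gcloud well-known file (`application_default_credentials.json` under `~/.config/gcloud`, or under `%APPDATA%\gcloud` on Windows), which is the file `gcloud auth application-default login` writes. The function name `find_adc_file` is hypothetical, and the sketch ignores the metadata-server fallback that applies on Google Cloud machines.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_adc_file(
    env: Mapping[str, str], home: Path, windows: bool = False
) -> Optional[Path]:
    """Return the local ADC JSON file that would be found, or None.

    Hypothetical helper; it only checks for files, not their validity.
    """
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        # An explicit path wins over the well-known location.
        path = Path(explicit)
        return path if path.is_file() else None
    if windows:
        appdata = env.get("APPDATA")
        if not appdata:
            return None
        base = Path(appdata) / "gcloud"
    else:
        base = home / ".config" / "gcloud"
    candidate = base / "application_default_credentials.json"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_adc_file(os.environ, Path.home(), windows=(os.name == "nt"))
    print(found if found else "no local ADC file found")
```

Being logged in to the gcloud CLI alone does not create this file, which matches the behavior described in the comment above.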
Force-pushed from 3d89cf1 to a2037b6.
Test failures were fixed. |
…nt_credentials` config options.
Fixes tests in CI. We set the `CLOUD_STORAGE_ENDPOINT` environment variable, which prevented the credentials config options from having effect. Because setting the credentials config options is a more explicit action, it should take precedence over the environment variable.
Force-pushed from 3e6922b to 8f09cb1.
Looks great! There are a couple comments and strings that should probably be updated to use the new terminology but functionally everything is in order.
tiledb/sm/cpp_api/config.h (Outdated)

     * neither is specified, Application Default Credentials will be used. <br>
     * **Default**: ""
     * - `vfs.gcs.workload_identity_configuration` <br>
     * Set the JSON string with Workload Identity Federation credentials.
nit: `credentials` to `configuration` here.
Done, thanks!
tiledb/sm/filesystem/gcs.cc (Outdated)

    "Both GCS service account credentials and external account "
    "credentials were specified; picking the former");
Another string to update with the new terminology (maybe use the exact config key name?). I also might phrase this as something like "service account key set; ignoring workload identity configuration" to make it a little clearer (at least in my opinion; you are of course free to leave it as-is).
Done, thanks!
I'd usually prefer pinning down the precedence of the two configs plus the environment variable in a test, since that makes it clear the behavior is intentional and supported. I'm also OK with merging this as is.
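The precedence discussed in this review can be written down as a small, testable function. This is only an illustration in Python; the real logic lives in the C++ file `tiledb/sm/filesystem/gcs.cc` and may differ. It assumes the option names `vfs.gcs.service_account_key` (inferred from the naming discussion, not quoted verbatim in this thread) and `vfs.gcs.workload_identity_configuration`, treats an empty string as unset, and uses the label `"environment"` as a placeholder for whatever happens when only `CLOUD_STORAGE_ENDPOINT` is set, which the thread does not spell out.

```python
import logging
from typing import Mapping

logger = logging.getLogger("gcs_auth_sketch")

# Assumed option names; see the lead-in for where they come from.
KEY_OPT = "vfs.gcs.service_account_key"
WIF_OPT = "vfs.gcs.workload_identity_configuration"


def choose_gcs_auth(config: Mapping[str, str], env: Mapping[str, str]) -> str:
    """Return a label for the credential source that would be used.

    Hypothetical sketch: explicit config options win over the environment,
    and the service account key wins over the workload identity config.
    """
    key = config.get(KEY_OPT, "")
    wif = config.get(WIF_OPT, "")
    if key:
        if wif:
            logger.warning("%s set; ignoring %s", KEY_OPT, WIF_OPT)
        return "service_account_key"
    if wif:
        return "workload_identity_configuration"
    if env.get("CLOUD_STORAGE_ENDPOINT"):
        # Placeholder label; the thread does not describe this branch.
        return "environment"
    return "application_default_credentials"
```

A test along these lines would call `choose_gcs_auth` with each combination of the two options and the environment variable and assert on the returned label.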
Force-pushed from cacae86 to ec33fe3.
[SC-44515](https://app.shortcut.com/tiledb-inc/story/44515/support-specifying-gcp-account-credentials-as-a-config-option)

---
TYPE: CONFIG
DESC: Add `vfs.gcs.service_account_credential` config option that specifies a Google Cloud service account credential JSON string.

---
TYPE: CONFIG
DESC: Add `vfs.gcs.external_account_credential` config option that specifies a Google Cloud Workload Identity Federation credential JSON string.
… a config option. (#4871)

Backport 0ae11e2 from #4855.

---
TYPE: CONFIG
DESC: Add `vfs.gcs.impersonate_service_account` option that specifies a service account to impersonate, or a comma-separated list for delegated impersonation.

---
TYPE: IMPROVEMENT
DESC: Stop using deprecated Google Cloud SDK APIs.
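As a rough illustration of what a comma-separated value for delegated impersonation means, the sketch below splits such a string into a target account and a delegate chain. It assumes the same convention as the gcloud `--impersonate-service-account` flag, where the last entry is the account finally impersonated and the earlier entries are delegates; whether TileDB parses the value exactly this way (including whitespace handling) is an assumption. The function name `split_impersonation_chain` is hypothetical.

```python
from typing import List, Tuple


def split_impersonation_chain(value: str) -> Tuple[str, List[str]]:
    """Split "a,b,c" into (target, delegates).

    Assumption: the last account is the target and the preceding
    accounts form the delegation chain, in order.
    """
    accounts = [part.strip() for part in value.split(",") if part.strip()]
    if not accounts:
        raise ValueError("no service account given")
    *delegates, target = accounts
    return target, delegates
```

With a single account the delegate list is empty, which corresponds to plain (non-delegated) impersonation.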