Add artifact docker image data source #9521
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Hello! I am a robot. It looks like you are a Community Contributor. @roaks3, a repository maintainer, has been assigned to review your changes. If you have not received review feedback within 2 business days, please leave a comment on this PR asking them to take a look. You can help make sure that review is quick by doing a self-review and by running impacted tests locally.
Hey @roaks3, when you get a chance, can you take a look at my PR?
@dibunker sorry for the delay on this. I've taken a first pass and this mostly looks good. Running tests now, but I'm going to need to spend more time on this one verifying the fields and patterns being used, since this is a handwritten data source.
I took a first pass (and haven't looked at tests yet), but the main callout I have is the field names being used, since they don't exactly match the API. We should just be careful we are using intuitive terms.
Optional: true,
Description: `Project ID of the project.`,
},
"repository": {
Did you find precedent for this sort of naming? The API itself uses a mix of terms: `repository_name`, `location` (and `registry_location`), and `docker_image` (and `image_id` and `digest`), and I mostly want to make sure that these fields will be clear to users, especially if we are parsing an API field into multiple parts.
I agree, there are a lot of similar field names going on. I was thinking about it this morning: would it be better to replace `project`, `repository`, and `region` with something like `artifact_registry_repository_id`, and have users reference the `id` attribute from a `google_artifact_registry_repository` data source or resource?
Hey @roaks3, I wrote a commit that removes the `project` and `location` fields and changes the `repository` field to be the fully-qualified identity of the repository.
This is in line with how the `repository` field is used in the google_artifact_registry_repository resource, where `repository` appears within the `upstream_policies` block.
This is definitely reasonable, but I think I would stick with the original split, simply because after looking through a large number of GCP's data sources, this seems to be the far more common pattern. I think in particular it benefits users that have a project/region set up at the provider level. So I would suggest `project`, `location`, and `repository_id`. IIRC `location` implies a region or zone can be used, so you may want to check if a zone is valid, and then decide between `location` and `region`. `repository_id` matches the repository resource field name.
You could still make an argument for the `artifact_registry_repository_id` you're suggesting, but for consistency I think this is worth changing.
For the `docker_image` field, that name seems fine to me (`image_name` might also work), but I'm a little concerned about the `digest` field. It seems that term is more often used for the whole name + sha, so it could be ambiguous in this data source, but TBH I'm not familiar enough with how these terms are used.
@Subserial as a member of the service team, could you weigh in on the field names used here? Do they reasonably match the API?
Comparing this to the repository data source, `project`, `location`, and `repository_id` are best for consistency.
`image`, `digest`, and `tag` describe the image, but I think it would be best, as mentioned below, to use a single `image_name` value, then if necessary parse out the tag to get the digest. This additionally removes the ambiguity with the computed `tags` field.
@roaks3 After changing to the `image_name` field from the conversation below, I've split apart the `repository` argument to be as @Subserial described: a `project`, `location`, and `repository_id`.
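For illustration, a minimal sketch of what that split could look like in the data source schema — the field names follow the discussion above, but this is an assumption rather than the exact code in this PR:

```go
package artifactregistry

import (
	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// dockerImageDataSourceArguments sketches the agreed-upon argument split
// (project / location / repository_id plus image_name); illustrative only.
func dockerImageDataSourceArguments() map[string]*schema.Schema {
	return map[string]*schema.Schema{
		"project": {
			Type:        schema.TypeString,
			Optional:    true, // falls back to the provider-level project
			Description: `Project ID of the project.`,
		},
		"location": {
			Type:        schema.TypeString,
			Required:    true,
			Description: `The location of the Artifact Registry repository.`,
		},
		"repository_id": {
			Type:        schema.TypeString,
			Required:    true,
			Description: `The last part of the repository name.`,
		},
		"image_name": {
			Type:        schema.TypeString,
			Required:    true,
			Description: `The image name, optionally followed by a tag or digest.`,
		},
	}
}
```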
...hird_party/terraform/services/artifactregistry/data_source_artifact_registry_docker_image.go
Optional: true,
Description: `The image digest to fetch. This cannot be used if tag is provided.`,
},
"tag": {
Regarding the field names, this one in particular is a bit ambiguous, because the resource already has a list of `tags`, and this is really more like a query/filter parameter. You may be able to find precedent for something like this as well, but if not I'll try to look around for similar data source patterns in the provider.
Yea, `tag` and `tags` would easily be mixed up. The intent of the issue was to have a data source that would return the URI of a particular image:tag docker image, so I think it's fine to keep those two field names. I added `tags` simply because it was available information from the REST API -- it wasn't specifically asked for in the issue, but I could see someone using the tags information when populating the `description` field for a compute instance or Cloud Run service.
I'd think renaming the `tags` field to something less ambiguous would be good, maybe `all_image_tags` or `all_tags`?
@roaks3 I found that the google_compute_regions data source uses the `status` field as a filter parameter. Could this be sufficient precedent for the `tag` field used in this data source?
Yea, I think that makes sense in terms of precedent. I don't see this come up very much in existing data sources, so the `tag` + `tags` split might be the best we can do in terms of modeling.
However, looking at https://registry.terraform.io/providers/calxus/docker/latest/docs/data-sources/image, we may want to consider using just `image_name`, and having the tag and digest inferred from that value. This would be particularly appealing if we don't need to parse anything client-side and can send the name up to the server in each case, because then we can rely on the server logic and the client has fewer ways to break or surprise users. Based on your current implementation, though, I suspect the server doesn't recognize a tag in the image name?
@roaks3 Sounds good! I've implemented the logic to infer the tag and digest from `image_name`, and removed those arguments from the data source.
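As a rough sketch of what that inference might look like (the helper name and exact splitting rules here are assumptions, not necessarily what the commit does):

```go
package artifactregistry

import "strings"

// parseImageName is an illustrative helper that splits a user-supplied
// image_name into its base name plus an optional tag or digest:
//   "nginx"                -> base "nginx"
//   "nginx:latest"         -> base "nginx", tag "latest"
//   "nginx@sha256:e2f3..." -> base "nginx", digest "sha256:e2f3..."
func parseImageName(imageName string) (base, tag, digest string) {
	if i := strings.LastIndex(imageName, "@"); i != -1 {
		return imageName[:i], "", imageName[i+1:]
	}
	if i := strings.LastIndex(imageName, ":"); i != -1 {
		return imageName[:i], imageName[i+1:], ""
	}
	return imageName, "", ""
}
```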
return fmt.Errorf("Error setting api endpoint")
}

urlRequest, err = transport_tpg.AddQueryParams(urlRequest, map[string]string{"orderBy": "update_time desc"})
Just verifying, is this sorting going to be relevant for all (or most) use cases?
It's more of an optimization: since there can be only one image with a particular label, it's most likely going to be one that was updated recently.
Ah, that makes sense.
I think I was also asking about this case (from your doc change): "If a digest or tag is not provided, then the last updated version of the image will be fetched." Is the most recently updated version typically going to be what users want?
@roaks3 Yes, I think that's a reasonable expectation that users have. It's also the behavior of other API calls, for example when calling the image of a family, the latest image is returned.
if err != nil {
return err
}
out:
Could we refactor this into a function (or multiple functions) instead of using a break label? I think it would be more clear, and we barely make use of this construct in the provider.
Yes, I can refactor the for loop into a function that returns once an image is found, instead of using the break-out, if that's more in line with the other constructions in the provider.
@roaks3 I've pushed a commit that refactors the code, replacing the break label construction with a function.
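A minimal sketch of that shape of refactor, assuming a DockerImage struct with Digest and Tags fields (the helper name and fields are assumptions, not the committed code): the search moves into a function that returns as soon as a match is found, so no labeled break is needed.

```go
package artifactregistry

// DockerImage mirrors the struct added in this PR; the exact fields are assumed.
type DockerImage struct {
	Name   string
	Digest string
	Tags   []string
}

// findDockerImage returns the first image matching the requested tag or
// digest. Returning early replaces the previous labeled-break construction.
func findDockerImage(images []DockerImage, tag, digest string) (DockerImage, bool) {
	for _, img := range images {
		if digest != "" && img.Digest == digest {
			return img, true
		}
		if tag != "" {
			for _, t := range img.Tags {
				if t == tag {
					return img, true
				}
			}
		}
	}
	return DockerImage{}, false
}
```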
return dockerImages
}

func convertResponseToStruct(res map[string]interface{}) DockerImage {
On first read, I'm wondering if this function is needed. If you didn't have it and tweaked the `d.Set(...)` calls, that might be slightly more straightforward / consistent with other resources.
I added it, because on a previous iteration I was getting compiler errors when iterating through the list of tags from the REST API response -- unmarshaling the response into a struct made the compiler happy, so that's what I went with.
I think you could do a `d.Set()` call instead of the `dockerImage.tags = stringTags` line you have, but keep the rest.
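For example (a hedged sketch — the surrounding variable names such as `res` are assumptions), the tags could be converted and handed straight to d.Set inside the read function instead of being stored on the struct:

```go
package artifactregistry

import (
	"fmt"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// setImageTags sketches the suggestion above: convert the []interface{}
// tags from the API response into []string and pass them to d.Set directly,
// rather than assigning them to the DockerImage struct first.
func setImageTags(d *schema.ResourceData, res map[string]interface{}) error {
	rawTags, ok := res["tags"].([]interface{})
	if !ok {
		return nil // no tags in the response
	}
	stringTags := make([]string, 0, len(rawTags))
	for _, t := range rawTags {
		if s, ok := t.(string); ok {
			stringTags = append(stringTags, s)
		}
	}
	if err := d.Set("tags", stringTags); err != nil {
		return fmt.Errorf("error setting tags: %s", err)
	}
	return nil
}
```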
I'm not sure I follow what you're suggesting. Isn't `d.Set()` specific to the `*schema.ResourceData` type? Here I'm setting fields in the `DockerImage` struct.
@roaks3 just pinging in case this fell off your radar. Do the latest changes I submitted address all your concerns?
/gcbrun
@dibunker thank you for the ping, this did fall off of my plate and I apologize for the delay.
I was going to kick off the tests, but when I went to look, I think they'll need to be adjusted to work in our various test environments.
Beyond that, I do want to review the code again, but ran out of time this week. I will plan to do that early next week, and then we should have this close to complete.
// https://console.cloud.google.com/artifacts/docker/go-containerregistry/us/gcr.io
// Currently, gcr.io does not provide an imageSizeBytes or buildTime field in the JSON response
const testAccDataSourceArtifactRegistryDockerImageConfig = `
provider "google" {
I think this test config might cause issues in our test environment:
- You should remove the provider block
- When run in our test environments, we won't have these projects already created. You'll want to use the primary test project (which means you shouldn't need to specify it), but then you'll need to seed test data there. Maybe you can create what you need with another TF resource (it looks like other tests have done so)?
Yea, I can remove the provider block. I just had it to make sure that the data blocks for `testTag` and `testDigest` were using the project defined within a provider block.
If that's not necessary, I can remove the provider block and define the project fields for the `testTag` and `testDigest` data sources.
Ah ok, I just checked and it does look like the block won't break things, but we have no other tests that define a provider-level default for project. I would stay consistent with those and specify project on each resource/data-source that needs it
@roaks3 Alright, removed the provider block and updated the tests to include the project field.
Sorry, it looks like my second point from above got lost in the mix. We don't have a cloudrun project already created in our test environment, so I'd expect the config to fail. You probably want to pass `envvar.GetTestProjectFromEnv()` into your test, which is the ID of our primary test project, but then you'd need to add config to your test to set it up so these data sources can retrieve data.
Does that make sense? What you have here now looks like it is written against resources that you created ahead of time, and we just need to replicate that.
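A sketch of that pattern — the import path, data source name, and resource names are all assumptions rather than taken from this PR: the primary test project from `envvar.GetTestProjectFromEnv()` is interpolated into the config, and the repository the data source reads from is created in the same config (the image itself would still need to be seeded outside Terraform).

```go
package artifactregistry_test

import (
	"fmt"

	// Import path assumed; envvar is where GetTestProjectFromEnv lives in the provider.
	"github.com/hashicorp/terraform-provider-google/google/envvar"
)

// testAccDataSourceArtifactRegistryDockerImageBasicConfig sketches a test
// config that targets the primary test project and seeds the repository it
// reads from, instead of relying on a pre-created project.
func testAccDataSourceArtifactRegistryDockerImageBasicConfig() string {
	project := envvar.GetTestProjectFromEnv()
	return fmt.Sprintf(`
resource "google_artifact_registry_repository" "test" {
  project       = "%[1]s"
  location      = "us-central1"
  repository_id = "tf-test-docker-repo"
  format        = "DOCKER"
}

data "google_artifact_registry_docker_image" "test" {
  project       = "%[1]s"
  location      = google_artifact_registry_repository.test.location
  repository_id = google_artifact_registry_repository.test.repository_id
  image_name    = "example-image:latest"
}
`, project)
}
```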
@dibunker, this PR is waiting for action from you. Please address any comments or change requests, or re-request review from a core reviewer if no action is required. If no action is taken, this PR will be closed in 14 days. This notification can be disabled with the
/gcbrun
I'm going to try running tests just to make sure we get the error I'm expecting, but I think this is basically blocked on tests running due to #9521 (comment) (the test project resource needs to be included in the test config).
Looks like I can't actually run the checks (the CI pipeline has undergone a number of changes over the past month or two).
@dibunker sorry for the inconvenience, but this will need to be rebased to move forward.
No worries @roaks3, I should be able to look into it later this week.
@roaks3 Were you saying that the
If
- Filter the list response by image name - Added the `DockerImage` struct.
Test for a data source using a tag, and one for using a digest
- Created a yellow box for the Oauth scopes required - Edited the attribute reference examples
Refactored the code a bit, too
The `buildTime` and `imageSizeBytes` are available in this repo.
Replace the break out logic with a function
Removing these in favor of using a fully-qualified repository ID. This is in-line with the API usage of `repository` within the `upstream_policies` block.
- Parse `image_name` to detect a digest, tag, or none.
- Remove `digest` and `tag` - Change `image` to `image_name` and updated description
- Splits into `project`, `location`, and `repository_id`
9387906 to d9c4979
@roaks3 Okay, rebased the branch and the acceptance test worked on my local environment. The
Awesome! I'll run tests now and see how it does.
Tests look good, and code looks good after scanning through again. Thanks for your patience with this one, will get it merged later today.
Closes hashicorp/terraform-provider-google#16557, and is similar to hashicorp/terraform-provider-google#4317, which requested the same kind of data source but for GCR; GCR is deprecated and that issue is still open.
This PR adds a data source that makes it easier to obtain information about docker images stored in an Artifact Registry repo.
I've added:
Acceptance test showing the data source can retrieve the docker image information when provided the following scenarios
Release Note Template for Downstream PRs (will be copied)