-
Notifications
You must be signed in to change notification settings - Fork 428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use community gallery for default VM images #5167
Use community gallery for default VM images #5167
Conversation
Skipping CI for Draft Pull Request. |
/test pull-cluster-api-provider-azure-ci-entrypoint |
/test pull-cluster-api-provider-azure-ci-entrypoint Published Kubernetes v1.29.5 images and retrying. Update: passed! |
/test pull-cluster-api-provider-azure-e2e |
/test pull-cluster-api-provider-azure-e2e Edit: AKS tests passed, with BYO nodepool restored. |
/test pull-cluster-api-provider-azure-e2e Published v1.28.14 images. |
Note to self: this also needs some documentation updates.
|
Since the means of looking up a particular image is basically by convention:
|
d66f4df
to
2a4e9ef
Compare
/test pull-cluster-api-provider-azure-build |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5167 +/- ##
==========================================
- Coverage 52.98% 52.73% -0.25%
==========================================
Files 273 270 -3
Lines 29197 29014 -183
==========================================
- Hits 15469 15301 -168
+ Misses 12926 12922 -4
+ Partials 802 791 -11 ☔ View full report in Codecov by Sentry. |
/test pull-cluster-api-provider-azure-build |
/test pull-cluster-api-provider-azure-e2e flake |
Flakes look mostly related to #5100. I think the last time I dug into that, I was only seeing failures in tests with multiple control plane nodes, where I see at least one test here flaking with only one control plane node. Not sure how much weight to put behind that observation though. |
/test pull-cluster-api-provider-azure-e2e |
3e0dda8
to
d94159b
Compare
/test pull-cluster-api-provider-azure-e2e |
@mboersma Can we try rebuilding the Windows images using the same version of Windows that we're currently using in tests? It looks like currently we're using "Windows Server 2019 Datacenter" build 17763 where this PR makes it so we're using an undated "Windows Server Datacenter" build 20348. |
/test pull-cluster-api-provider-azure-e2e-optional |
/retest "Standard SKU IP limit reached in this region" |
Looks like this error is cropping up on the ci-entrypoint job:
And should that be using Windows 2019 instead of 2022? |
If it doesn't run the curl-to-ilb test, we can probably get away with 2022 here. Thanks for finding that--indeed I hadn't populated that image yet, but I'll fill those in and retry. |
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
@@ -56,6 +56,12 @@ const ( | |||
DefaultImagePublisherID = "cncf-upstream" | |||
// LatestVersion is the image version latest. | |||
LatestVersion = "latest" | |||
// DefaultPublicGalleryName is the default Azure Compute Gallery. | |||
DefaultPublicGalleryName = "ClusterAPI-f72ceb4f-5159-4c26-a0fe-2ea738f0d019" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parroting @jsturtevant, are you planning to open a k8s.io PR to add this new gallery to the Terraform?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes indeed, but I haven't started that yet. That's the third piece of the puzzle, in addition to this and kubernetes-sigs/image-builder#1578.
Edit: terraform PR is at kubernetes/k8s.io#7461
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this gallery name relate to anything in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't--I don't think there's a way to recreate a community gallery with the same unique name if it were to be deleted. It gets created for you, starting with your specified prefix, when you actually share the gallery to the world.
So AFAICT it isn't / shouldn't be part of the terraform for the gallery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implies if the gallery were to be deleted, we'd have to change code to point to a new gallery with the same name prefix. That's not great. I can't add a "Delete" lock on the resource, because that has the side effect of disallowing new image definitions to be added or old versions to be deleted, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So once we apply the terraform we will change the value here to reflect the unique name that it gets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's already got a unique name by virtue of me recreating everything in the CNCF community sub and then reverse-engineering the terraform. AFAICT, the unique name it gets is stored under its sharingProfile
which isn't in the terraform I generated...but maybe can be added?
% az sig show -g cluster-api-gallery --gallery-name community_gallery -o yaml
description: Shared image gallery for Cluster API Provider Azure
id: /subscriptions/46678f10-4bbb-447e-98e8-d2829589f2d8/resourceGroups/cluster-api-gallery/providers/Microsoft.Compute/galleries/community_gallery
identifier:
uniqueName: 46678f10-4bbb-447e-98e8-d2829589f2d8-COMMUNITY_GALLERY
location: northcentralus
name: community_gallery
provisioningState: Succeeded
resourceGroup: cluster-api-gallery
sharingProfile:
communityGalleryInfo:
communityGalleryEnabled: true
eula: https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-azure/main/LICENSE
publicNamePrefix: ClusterAPI
publicNames:
- ClusterAPI-f72ceb4f-5159-4c26-a0fe-2ea738f0d019
publisherContact: [email protected]
publisherUri: https://github.com/kubernetes-sigs/cluster-api-provider-azure
groups: null
permissions: Community
sharingStatus: null
softDeletePolicy: null
tags:
DO-NOT-DELETE: UpstreamInfra
DateCreated: 10/24/2024
creationTimestamp: '2024-10-24T17:36:37Z'
jobName: image-builder-sig-ubuntu-2404
type: Microsoft.Compute/galleries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But from the point of view of CAPZ's code here, this is the unique identifier we need to access images. It shouldn't change except in a disaster recovery case where we have to set up a new community gallery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could try to refactor this away from being a constant into something created at runtime that the user could override if needed. (Or I can follow up with that change after this merges.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If users can set the gallery name in the API, I don't think we need a separate toggle to change the default for now.
LGTM label has been added. Git tree hash: 35cefabcb48d7a1af907593bc20d4c4223cf677c
|
I'd like to add that overall, working with Azure compute galleries has been much easier than publishing to the Azure Marketplace.
|
This reverts commit 02e5ad2.
a1f5f46
to
f05ef21
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
LGTM label has been added. Git tree hash: 1cd4768c6b1902007483a4cb4d2107857b62c9cc
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: nojnhuh The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/kind feature
What this PR does / why we need it:
Use an Azure community image gallery as the default source for VM images.
Which issue(s) this PR fixes:
See also:
Special notes for your reviewer:
TODOs:
Release note: