Use of private_endpoint_subnetwork in private GKE stops deployment with error #20429
Comments
Hi @juangascon! I tried to replicate this issue with the following configuration, but everything worked without errors. Could you review it and try again?
Hello @ggtisc. Again, I will try and come back to you. :-)
Hello @ggtisc. As soon as I use `private_endpoint_subnetwork`, the error comes back. The source configuration runs on the versions listed in the issue description.

My configuration with the latest version of Terraform and the Google provider is as follows:

```hcl
resource "google_compute_network" "prototype" {
  name                     = "${var.name_root}-${random_string.vpc_suffix.result}"
  auto_create_subnetworks  = "false"
  enable_ula_internal_ipv6 = true # fixes CKV_GCP_76 - side effect - needed to allow internal IPv6
}

resource "google_compute_subnetwork" "prototype" {
  lifecycle {
    ignore_changes = [
      secondary_ip_range,
    ]
  }
  name                     = local.subnetwork_prototype_name
  region                   = var.region
  network                  = google_compute_network.prototype.name
  private_ip_google_access = true
  ip_cidr_range            = "10.40.1.0/24"
  secondary_ip_range {
    range_name    = local.pods_range_name
    ip_cidr_range = "10.240.0.0/14"
  }
  secondary_ip_range {
    range_name    = local.services_range_name
    ip_cidr_range = "10.244.0.0/20"
  }
  stack_type                 = "IPV4_IPV6"
  private_ipv6_google_access = "ENABLE_OUTBOUND_VM_ACCESS_TO_GOOGLE"
  ipv6_access_type           = "INTERNAL"
  log_config { # fixes CKV_GCP_26
    aggregation_interval = "INTERVAL_10_MIN"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}

resource "google_compute_subnetwork" "cluster_control_plane" {
  name                       = local.control_plane_private_endpoint_subnet_name
  region                     = var.region
  network                    = google_compute_network.prototype.name
  private_ip_google_access   = true
  ip_cidr_range              = "192.168.40.16/28"
  stack_type                 = "IPV4_IPV6"
  private_ipv6_google_access = "ENABLE_OUTBOUND_VM_ACCESS_TO_GOOGLE"
  ipv6_access_type           = "INTERNAL"
  log_config { # fixes CKV_GCP_26
    aggregation_interval = "INTERVAL_10_MIN"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}

resource "google_container_cluster" "prototype" {
  lifecycle {
    ignore_changes = [
      node_config,
      ip_allocation_policy,
    ]
  }
  deletion_protection      = false
  name                     = local.cluster_prototype_name
  location                 = local.cluster_zone
  remove_default_node_pool = true
  initial_node_count       = 1
  network                  = google_compute_network.prototype.name
  subnetwork               = google_compute_subnetwork.prototype.name
  min_master_version       = data.google_container_engine_versions.prototype.release_channel_latest_version[var.cluster_release_channel] # fixes CKV_GCP_67
  release_channel {
    channel = var.cluster_release_channel
  }
  master_auth {
    client_certificate_config {
      issue_client_certificate = false
    }
  }
  network_policy { # fixes CKV_GCP_12 - first part
    enabled = true
  }
  addons_config { # fixes CKV_GCP_12 - second part
    network_policy_config {
      disabled = false
    }
  }
  ip_allocation_policy {
    # Needed in a private node setup.
    # If the parameters are commented out or not written, or
    # if their values are empty (null), GCP will allocate them.
    cluster_secondary_range_name  = local.pods_range_name
    services_secondary_range_name = local.services_range_name
  }
  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = null # Explicitly set to null to avoid a provider bug that
                                   # unexpectedly changes the attribute from null to false
                                   # during the apply phase
    private_endpoint_subnetwork = google_compute_subnetwork.cluster_control_plane.name
    master_global_access_config {
      enabled = false
    }
  }
  binary_authorization { # fixes CKV_GCP_66 with updated parameter
    evaluation_mode = var.cluster_binary_authorization ? "PROJECT_SINGLETON_POLICY_ENFORCE" : "DISABLED"
  }
  enable_intranode_visibility = true # fixes CKV_GCP_61
  enable_shielded_nodes       = true
  node_config {
    shielded_instance_config {
      enable_integrity_monitoring = true
      enable_secure_boot          = true
    }
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]
    labels = {
      name      = "catalyser"
      lifecycle = "ephemeral"
    }
    workload_metadata_config { # fixes CKV_GCP_69
      mode = "GKE_METADATA"
    }
    tags = ["ephemeral", "catalyser"]
  }
  vertical_pod_autoscaling {
    enabled = var.cluster_enable_vertical_pod_autoscaling
  }
  workload_identity_config { # fixes CKV_GCP_69 - side effect
    workload_pool = "${var.project_id}.svc.id.goog"
  }
  resource_labels = {
    use_case = lower(var.use_case)
    creator  = lower(var.cluster_creator)
    scope    = "private"
  }
}
```
I'm ready to try again, but some values are missing that I need in order to replicate your configuration. Could you provide the following values, or confirm whether I can use any values and configurations?
For sensitive data you can use placeholder examples like the following, or specify that we can use any value and configuration:
@ggtisc
Hello @ggtisc

In the subnetwork "prototype", the `range_name` of each `secondary_ip_range` is the same as the corresponding secondary range name in the cluster:

```hcl
secondary_ip_range {
  range_name    = "gke-tf-poc-oj1u-pods"
  ip_cidr_range = "10.240.0.0/14"
}
secondary_ip_range {
  range_name    = "gke-tf-poc-oj1u-services"
  ip_cidr_range = "10.244.0.0/20"
}
```

This is the code for `data.tf`:

```hcl
# Data fetched from the GCP resources

# Get the available zones in the project region
data "google_compute_zones" "available" {
  project = var.project_id
  region  = var.region
}

# Obtain the available cluster versions in the zone
data "google_container_engine_versions" "prototype" {
  project  = var.project_id
  location = local.cluster_zone
}
```

and a part of `locals.tf`:

```hcl
resource "random_string" "cluster_suffix" {
  # Suffix for the cluster name
  length  = 4
  lower   = true
  upper   = false
  numeric = true
  special = false
}

locals {
  # Build the cluster name
  cluster_prototype_name = "tf-poc-${random_string.cluster_suffix.result}"
  # Define the zone where the cluster will be deployed:
  # the cluster will be deployed in the first available zone in the region
  cluster_zone = data.google_compute_zones.available.names[0]
  # Define the private control plane's endpoint subnet name
  control_plane_private_endpoint_subnet_name = "gke-${local.cluster_prototype_name}-cp-subnet"
}
```
I can't replicate this issue, so I'm passing it to the next on-call member @NickElliot
OK. I do not understand what is different on my side that creates this issue.
I'm not sure I understand the issue -- are you receiving the "unexpectedly changed" error when you apply your .tf config file? Could you provide a log of your terminal from when you type `terraform apply` to when you receive the error message?
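For what it's worth, such a log can be captured with Terraform's standard logging environment variables (an illustrative invocation, not from the original thread):

```shell
# Write a DEBUG-level trace of the whole run to apply.log
TF_LOG=DEBUG TF_LOG_PATH=./apply.log terraform apply
```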
Sorry to answer this late. A very tragic personal issue hit my family 15 days ago and I did not check on this. Sorry. The problem comes when we have `private_endpoint_subnetwork` in the "private_cluster_config" section. My configuration is in this post above, though, for the tests, you have to change the project- and environment-specific values. Thanks a lot for taking care of this.
Community Note
Terraform Version & Provider Version(s)
Terraform v1.9.8
on linux_amd64
This also happens in both versions 5.44.2 and 6.1.0 of the provider.
The issue has probably been present since v5.18, when the attribute `private_endpoint_subnetwork` became Optional instead of Read-Only. I do not know if this is linked to issue #15422.
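To reproduce against a specific provider version, a pin like the following can be used (a sketch; the version constraints are the only assumption):

```hcl
terraform {
  required_version = ">= 1.9.8"
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "6.1.0" # the error also reproduces with 5.44.2
    }
  }
}
```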
Affected Resource(s)
This happens in the `google_container_cluster` resource when it is configured as private and the IP range of the control-plane subnetwork is set with the attribute `private_endpoint_subnetwork` instead of `master_ipv4_cidr_block`. We have two ways of giving the CIDR range for the control plane endpoint:
Reading the GCP documentation "Create a cluster and select the control plane IP address range", it is said:
So, if your organization's security constraints force you to activate VPC Flow Logs in all subnetworks, you, and not GCP, have to create the subnet in order to toggle the feature on with Terraform. Terraform can't modify (it is REALLY complicated) the parameters of a resource created outside of its scope.
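Sketched side by side (resource names, network, and CIDRs are illustrative, not from the original configuration), the two alternatives look like this:

```hcl
# Option 1: give only a CIDR block; GCP creates and owns the control-plane
# subnet, so its parameters (e.g. flow logs) cannot be managed from Terraform.
resource "google_container_cluster" "cidr_variant" {
  name     = "example-cidr"
  location = "europe-west1-b"
  private_cluster_config {
    enable_private_nodes   = true
    master_ipv4_cidr_block = "172.16.0.0/28"
  }
}

# Option 2: create the subnet yourself (so you can enable flow logs on it)
# and reference it; this is the path that triggers the error in this issue.
resource "google_compute_subnetwork" "control_plane" {
  name          = "example-cp-subnet"
  region        = "europe-west1"
  network       = "example-network"
  ip_cidr_range = "172.16.0.16/28"
  log_config {
    aggregation_interval = "INTERVAL_10_MIN"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}

resource "google_container_cluster" "subnet_variant" {
  name     = "example-subnet"
  location = "europe-west1-b"
  private_cluster_config {
    enable_private_nodes        = true
    private_endpoint_subnetwork = google_compute_subnetwork.control_plane.name
  }
}
```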
Though, if I create a subnet and put its name as the value of the `private_endpoint_subnetwork` attribute, I get the following error:
Gemini 1.5 Pro explains the error:
The error message indicates that the `google_container_cluster` resource's `private_cluster_config.enable_private_endpoint` attribute is unexpectedly changing from `null` to `false` during the apply phase, even though it's not explicitly defined in your configuration.
Claude Sonnet 3.5 details even more:
The problem seems to be in how the provider handles the `private_cluster_config` state during the plan and apply phases, specifically around PSC (Private Service Connect) clusters.
Until the bug is corrected in the provider, a quick working bypass proposed by Gemini 1.5 Pro is to explicitly set `enable_private_endpoint` to `null` in your configuration:
Terraform Configuration
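A minimal sketch of that bypass, inside the cluster resource (names reused from the configuration earlier in the thread):

```hcl
private_cluster_config {
  enable_private_nodes        = true
  enable_private_endpoint     = null # explicit null avoids the unexpected null -> false change on apply
  private_endpoint_subnetwork = google_compute_subnetwork.cluster_control_plane.name
}
```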
Debug Output
No response
Expected Behavior
The GKE deployment should complete correctly with the custom user-created subnetwork associated with the private control plane.
Actual Behavior
The deployment stops with the following error:
Steps to reproduce
1. Configure a private GKE cluster whose control-plane subnetwork is given through the attribute `private_endpoint_subnetwork` instead of declaring a CIDR IP range with the attribute `master_ipv4_cidr_block`.
2. Run `terraform apply`.
Important Factoids
It seems that the file where the bug exists is `resource_container_cluster.go`.
I am not a Golang coder, so I do not know whether the solution proposed by Claude Sonnet 3.5 via GitHub Copilot is good, nor can I submit a pull request.
Claude Sonnet 3.5 says:
References
No response