GCP Terraform dataproc - ignores internal_ip_only value as False #17436

Open
tenstriker opened this issue Feb 28, 2024 · 12 comments

@tenstriker

tenstriker commented Feb 28, 2024

Terraform Version

google provider 4.75.0 and 5.17.0
Terraform v1.7.4

Affected Resource(s)

google_dataproc_cluster

Terraform Configuration

Happens with an unchanged plan; I'm just re-applying the same configuration. The only thing that changes on re-apply is the version of Terraform itself.

resource "google_dataproc_cluster" "my-cluster" {
  name    = local.cluster_name
  project = local.project.gcp_project_id
  region  = "us-central1"

  cluster_config {

    gce_cluster_config {
      service_account_scopes = ["useraccounts-ro", "storage-rw", "logging-write", "cloud-platform"]
    }

    master_config {
      num_instances = local.dataproc.master_config.num_instances
      machine_type  = local.dataproc.master_config.machine_type
    }

    worker_config {
      num_instances = local.dataproc.worker_config.num_instances
      machine_type  = local.dataproc.worker_config.machine_type
    }

    software_config {
      image_version = "2.2.0-RC3-debian11"
      override_properties = {
        "dataproc:dataproc.logging.stackdriver.enable"            = "true"
        "dataproc:jobs.file-backed-output.enable"                 = "true"
        "dataproc:dataproc.logging.stackdriver.job.driver.enable" = "true"
        "dataproc:dataproc.logging.stackdriver.job.yarn.container.enable" = "true"
        "spark:spark.history.fs.update.interval"                  = "900"
        "spark:spark.history.fs.cleaner.enabled"                  = "true"
        "spark:spark.history.fs.cleaner.interval"                 = "1d"
        "spark:spark.history.fs.cleaner.maxAge"                   = "30d"      
      }
      # optional_components = [
      #   "JUPYTER"
      # ]
    }

    endpoint_config {
      enable_http_port_access = "true"
    }
    dynamic "autoscaling_config" {
      for_each = local.workspace == "prod" ? [1] : []
      content {
        policy_uri = google_dataproc_autoscaling_policy.asp.name
      }
    }
  }
}

resource "google_dataproc_autoscaling_policy" "asp" {
  policy_id = "default-policy"
  location  = "us-central1"

  worker_config {
    max_instances = local.dataproc_asp.worker_config.max_instances
  }

  secondary_worker_config {
    max_instances = local.dataproc_asp.secondary_worker_config.max_instances
  }

  basic_algorithm {
    cooldown_period = "120s"
    yarn_config {
      graceful_decommission_timeout = "7200s"
      scale_up_factor               = 0.5
      scale_down_factor             = 0.5
    }
  }
}

Debug Output

Error message:
INVALID_ARGUMENT: Subnetwork 'default' does not support Private Google Access which is required for Dataproc clusters when 'internal_ip_only' is set to 'true'. Enable Private Google Access on subnetwork 'default' or set 'internal_ip_only' to 'false'.

I'm only pasting a snippet, as the debug output contains a lot of confidential info. The issue is that the default or explicit value of internal_ip_only is not respected when it is false (and it is false by default); instead, on terraform apply it is treated as true (based on the error message).

You can see from the debug log that internal_ip_only is completely missing from the request; the GCP Terraform provider swallows it. I assume the GCP backend marks it as true when it's not part of the request payload and then fails the whole request.

https://gist.github.com/tenstriker/de36db2baf3ae0d309f73485fefb769c
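For reference, here is a sketch of how the explicit setting looks in gce_cluster_config; this is the value that ends up missing from the request payload:

gce_cluster_config {
  # Explicitly pinned to false; per the debug log above, the provider
  # still omits this field from the create request.
  internal_ip_only       = false
  service_account_scopes = ["useraccounts-ro", "storage-rw", "logging-write", "cloud-platform"]
}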


Expected Behavior

The GCP Terraform provider should send internal_ip_only as false by default, or at least send it when it is set explicitly.

Actual Behavior

It throws a 400 because the backend assumes internal_ip_only is set to true while the network is default.

Steps to reproduce

  1. terraform apply

Important Factoids

No response

References

No response

FYI, cluster creation works fine with the gcloud CLI using a similar configuration, and an external IP gets assigned as well, since I'm using the default subnet.

Update:
It seems dataproc image versions 2.2.* introduce this breakage. The issue doesn't surface with dataproc image version 2.1.* (see my last comment).
b/327455169
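As a stopgap, pinning software_config to a 2.1 image sidesteps the failure. A sketch, where the exact 2.1.* sub-version and OS suffix are whatever fits your setup:

software_config {
  # Workaround sketch: 2.1.* images do not default internalIpOnly to true
  image_version = "2.1-debian11"
}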

Update 03/01/2024:
The gcloud CLI also doesn't work with dataproc image version 2.2.* after I updated the gcloud CLI itself using gcloud components update. The message on update was:

Your current Google Cloud CLI version is: 450.0.0
You will be upgraded to version: 466.0.0

So it was working in gcloud CLI version 450 but breaks in 466, at least.

Also, newly created projects, which get a default network by default, now have Private Google Access turned off on all of the subnetworks; that was not the case previously.

@tenstriker tenstriker added the bug label Feb 28, 2024
@github-actions github-actions bot added forward/review In review; remove label to forward service/dataproc labels Feb 28, 2024
@zli82016
Collaborator

zli82016 commented Feb 28, 2024

@tenstriker , can you please provide the configuration to reproduce the issue?

After upgrading google provider from 4.75.0 to 5.17.0 and then running the command terraform apply, an error occurs. Is that the issue?

@tenstriker
Author

tenstriker commented Feb 28, 2024

@tenstriker , can you please provide the configuration to reproduce the issue?

After upgrading google provider from 4.75.0 to 5.17.0 and then running the command terraform apply, an error occurs. Is that the issue?

No, it actually happens with both. It used to work with 4.75.0 but suddenly started failing, so I tried the latest version and it still fails. I wonder if the GCP backend API changed: maybe Terraform was never sending the value, and the backend used to consider it false by default but recently started considering it true? The debug log was taken with version 5.17, which clearly doesn't send this parameter's value in the request body.

@zli82016
Collaborator

Thanks for the information, @tenstriker. Is it possible to provide the configuration?

@tenstriker
Author

Sure, I added it in the OP. I tried tweaking some cluster_config parameters, like removing autoscaling and optional_components and changing the image version, to no avail.

@zli82016
Collaborator

Forwarding this issue to the service team to check the reason for the error message.

@zli82016 zli82016 removed the forward/review In review; remove label to forward label Feb 28, 2024
@tenstriker
Author

Thanks. FYI, I ran both versions (4.75.0 and 5.17.0) with Terraform v1.7.4, and neither of them seems to include internal_ip_only in the request body.

@tenstriker
Author

I think this might also have to do with the image version I am using: 2.2.0-RC3-debian11.

@tenstriker
Author

tenstriker commented Feb 29, 2024

Tested another 2.2.* image, 2.2.3-debian12, and it fails too. It works with dataproc image version 2.1.* without any issue, so it seems the dataproc 2.2.* image versions may have something to do with this.

@tenstriker
Author

I added some more updates in the OP.

@cnauroth

cnauroth commented Aug 2, 2024

I don't think this is a bug in Terraform, and perhaps we can close this issue. Instead, I think this was a change introduced in Dataproc for stronger security defaults:

https://cloud.google.com/dataproc/docs/release-notes#February_16_2024

Dataproc on Compute Engine: The internalIpOnly cluster configuration setting now defaults to true for clusters created with 2.2 image versions. Also see Create a Dataproc cluster with internal IP addresses only.

The timing of that release roughly correlates with the date this issue was created.
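If that's the cause, the user-side fix would be to enable Private Google Access on the subnetwork (or to explicitly request external IPs). A sketch for a Terraform-managed subnetwork, with a hypothetical name and CIDR range; the auto-created default subnet would instead have to be edited out of band or imported:

resource "google_compute_subnetwork" "dataproc" {
  name                     = "dataproc-subnet" # hypothetical
  ip_cidr_range            = "10.2.0.0/16"     # hypothetical
  region                   = "us-central1"
  network                  = "default"
  private_ip_google_access = true # required when internalIpOnly is true
}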

@cnauroth

cnauroth commented Aug 2, 2024

...although #18503 suggests that even if you explicitly set internal_ip_only = false, it's not respected. That part would definitely be a bug.

@zli82016
Collaborator

zli82016 commented Aug 6, 2024

...although #18503 suggests that even if you explicitly set internal_ip_only = false, it's not respected. That part would definitely be a bug.

This is still a bug for this case and needs to be fixed. I will leave this GitHub issue open.
