google_service_networking_connection uses servicenetworking.delete when google console utilizes networks.removePeering #18834
Comments
@joekiller I can't find the original issue with details, but we decided to go with abandon state because removePeering has its own set of problems. When a peering is removed, it is not deleted; it sits in an in-between state. There is a limit on the number of stale peerings that can exist, and customers were getting API errors because too many peerings existed (every time they ran terraform apply and terraform destroy, a new one was created but the old one was not deleted). While abandon doesn't "fix" the problem, it gives the Terraform pipeline an opportunity to succeed without leaving any resources in an in-between state. A true fix would probably have to come from the product API side in CloudSQL that prevents immediate deletion (which would create another set of problems lol)
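For readers following along, here is a minimal sketch of what opting into the abandon behaviour might look like, assuming it corresponds to the `deletion_policy` argument on `google_service_networking_connection`. The resource names and the referenced network and address range are hypothetical placeholders, not taken from this thread.

```hcl
# Hypothetical names; the network and the reserved range are assumed to be
# defined elsewhere in the same configuration.
resource "google_service_networking_connection" "private_vpc_connection" {
  network                 = google_compute_network.example.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.private_ip_range.name]

  # Assumption: this is the "abandon" option discussed above. On destroy,
  # Terraform stops managing the connection instead of calling the delete API.
  deletion_policy = "ABANDON"
}
```

With this setting the peering itself is left in place on the Google side, which is why it is described above as not really fixing the problem.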
100% reproducible using To reproduce:
Error message:
Terraform config:
@mike-callahan, I understand what you are saying, and while I agree vis-à-vis ensuring that servicenetworking.delete works correctly, I do not agree that choosing abandon and removing the networks.removePeering call is a better resolution.
As I mentioned originally, Google uses the networks.removePeering call when you click the delete peering button in the console. I had deployed and fully removed a CloudSQL instance from the project and was only able to remove the peering via the console, which used networks.removePeering. There was no association left.
I agree that Google should fix the apparently dangling association. However, if the choice is between relying on servicenetworking.delete instead of networks.removePeering and abandoning the connection, when even Google doesn't use servicenetworking.delete, I do not think abandoning is the best course of action. I advocate that networks.removePeering be used.
I imagine that the in-between situation, where the peering connection is in flux because infrastructure is still using it, only existed when the depends_on attribute was not properly applied. The documentation specifies that this relationship needs to be declared explicitly when using a private IP. I'm happy to change my opinion if you can surface situations outside of that case.
My deployment was properly using depends_on and deleted the CloudSQL instance before attempting to delete the peering. Regardless of whether I ran servicenetworking.delete two hours later or manually, it always failed. As soon as I used networks.removePeering, the peering was cleaned up and did not linger in a transient state.
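The ordering described above can be sketched roughly as follows. This is an illustrative configuration, not the one from #16275: every name, the region, the database version, and the tier are assumed placeholder values.

```hcl
# All names and values below are hypothetical placeholders.
resource "google_compute_network" "example" {
  name = "example-network"
}

# Internal range reserved for the service producer's peered network.
resource "google_compute_global_address" "private_ip_range" {
  name          = "example-private-ip-range"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.example.id
}

resource "google_service_networking_connection" "private_vpc_connection" {
  network                 = google_compute_network.example.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.private_ip_range.name]
}

resource "google_sql_database_instance" "example" {
  name             = "example-instance"
  region           = "us-central1"
  database_version = "POSTGRES_15"

  settings {
    tier = "db-f1-micro"
    ip_configuration {
      ipv4_enabled    = false
      private_network = google_compute_network.example.id
    }
  }

  # The instance has no attribute reference to the connection, so the
  # dependency is declared explicitly: created after, destroyed before.
  depends_on = [google_service_networking_connection.private_vpc_connection]
}
```

Because of the explicit `depends_on`, `terraform destroy` removes the instance before it attempts to remove the connection, which is the sequence the comment above says was followed.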
Hi @joekiller! I tried to replicate this issue with this example, but everything works fine: the resources are created and destroyed without errors. The only difference is that I added this property: This is the Terraform code I used; if yours is different, please share it with us so we can try to replicate the issue again:
@ggtisc, I agree that it works in the limited context that you have cited. The removal fails if you try to reproduce it using the example Johnan provided earlier or the config I referenced, #16275 (where the decision to abandon vs utilize My argument is that, since the Google console still uses networks.removePeering instead of servicenetworking.delete, it is reasonable for Terraform to also still use the Yes this is very much in confluence with
referencing possible linked issues #16735 |
Thanks for your thoughts, @joekiller. I think it would be good to solve this "correctly", or at least have Terraform mirror the functionality of the console. I will investigate further. It would likely be a breaking change.
Thanks for fishing up those edge cases Mike. 😅 Seems like patching was definitely an improvement for those update situations. Hopefully a combination of patch for update, abandon options, and delete otherwise could generate the most satisfying conclusion. |
@rileykarson @zli82016 @c2thorn @roaks3 Service networking connection has been a persistent issue for a while now, with various changes having unintended side effects. We have used a lot of workarounds, like abandon state and the latest update peering ranges fix. I propose we coordinate a comprehensive solution to this.
First, I want to document the behavior of a service networking connection flow with CloudSQL so we are all in agreement. Private Service Access for CloudSQL and others works as follows (https://cloud.google.com/service-infrastructure/docs/enabling-private-services-access):
The consumer
From the consumer perspective step 3 does more than one thing including:
The producer
Because Google managed services are deployed as service projects to a peered host project, only one service networking connection is required. New CloudSQL instances and other PSA services will deploy as service projects. That causes the following:
Users should be able to:
I propose some changes/additions to address these items:
Example
If you delete the VPC peering directly (the equivalent of removePeering) from the "VPC NETWORK PEERING" pane instead of the service networking pane, it will "delete" the service networking connection. The connection is likely still there, though, and this is a GUI bug (it might be the cause of #15260 and b/305256825?).
Is it good to forward this issue, or just leave it as it is for now? |
Sure thanks |
Yea, let's forward this to the service team so that they can weigh in on the proposed solution (FWIW the 3 changes do make sense to me, and don't look like breaking changes). @mike-callahan I'm going to unassign you for now, and if you end up working on an implementation to solve this, we can add you back |
Community Note
Terraform Version & Provider Version(s)
Terraform v1.9.0
on darwin_arm64
Affected Resource(s)
google_service_networking_connection
Terraform Configuration
The details in #16275 (comment) are still applicable.
Debug Output
No response
Expected Behavior
No response
Actual Behavior
No response
Steps to reproduce
terraform apply
Important Factoids
No response
References
The issue "google_service_networking_connection destroy calls appear to always fail in 5.x despite guidance" (#16275) was closed by GoogleCloudPlatform/magic-modules#9765 despite that being a bandaid and not a real fix.
Researching this issue, it appears that the Google console still uses networks.removePeering, as the struct indicates it is monitoring
gceRemoveNetworkPeering
@mike-callahan suggested that the situation dictates
and the former is not a good solution. I advocate implementing the latter, given that even Google's console uses that technique.
There is a certain irony that even Google found this API call to not work and just worked around it instead of getting it fixed: GoogleCloudPlatform/magic-modules#8904
b/362749609