Investigate an issue with VPC deletion #470

Open
slysunkin opened this issue Oct 10, 2024 · 4 comments

Comments

@slysunkin
Contributor

Issue with cluster deletion: the VPC can't be deleted because of its VPC interface endpoints. The endpoints are used by EKS for communication, and since they appear to be unmanaged, this looks like a bug in CAPA.
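
A minimal sketch of how the interface endpoints left in a VPC could be listed, to check which ones block the deletion. It assumes a boto3 EC2 client is passed in as `ec2`; the function name `find_interface_endpoints` is hypothetical, and pagination is left out for brevity.

```python
def find_interface_endpoints(ec2, vpc_id):
    """Return (endpoint id, service name, tags) for each Interface endpoint in a VPC.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # Pagination via NextToken is omitted, so this only covers the first page.
    response = ec2.describe_vpc_endpoints(
        Filters=[
            {"Name": "vpc-id", "Values": [vpc_id]},
            {"Name": "vpc-endpoint-type", "Values": ["Interface"]},
        ]
    )
    found = []
    for endpoint in response.get("VpcEndpoints", []):
        # Flatten the AWS tag list into a plain dict; endpoints may have no tags.
        tags = {tag["Key"]: tag["Value"] for tag in endpoint.get("Tags", [])}
        found.append((endpoint["VpcEndpointId"], endpoint["ServiceName"], tags))
    return found


# Possible usage (requires boto3 and AWS credentials):
#   import boto3
#   for endpoint in find_interface_endpoints(boto3.client("ec2"), "vpc-01910d574b9ba221d"):
#       print(endpoint)
```

The tags are returned alongside each endpoint because they may help to judge whether anything owns it; an endpoint with no ownership tags would be consistent with the "unmanaged" observation.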

@slysunkin slysunkin added the bug Something isn't working label Oct 10, 2024
@slysunkin slysunkin self-assigned this Oct 10, 2024
@slysunkin
Contributor Author

The cause of the bug may be the same as the one we fixed for the AWSCluster/Machine finalizers (#217).

@slysunkin
Contributor Author

While investigating how the control plane is removed, I ran:
k describe AWSManagedControlPlane eks-dev-slava-14-cp -n hmc-system
and found these events:

  Normal   SuccessfulDeleteEKSCluster        26m  aws-controller  Deleted EKS Cluster hmc-system_eks-dev-slava-14-cp
  Normal   SuccessfulDeleteSecurityGroup     26m  aws-controller  Deleted cluster managed SecurityGroup "sg-0bfc9b15c6f45fe42"
  Normal   SuccessfulDisassociateRouteTable  26m  aws-controller  Disassociated managed RouteTable "rtb-0c11f339e13d2dcdd" from subnet "subnet-0c3932f4c41af37e1"
  Normal   SuccessfulDeleteRouteTable        26m  aws-controller  Deleted managed RouteTable "rtb-0c11f339e13d2dcdd"
  Normal   SuccessfulDisassociateRouteTable  26m  aws-controller  Disassociated managed RouteTable "rtb-06e62cdd483b7b37d" from subnet "subnet-047a351e31211afc1"
  Normal   SuccessfulDeleteRouteTable        26m  aws-controller  Deleted managed RouteTable "rtb-06e62cdd483b7b37d"
  Normal   SuccessfulDisassociateRouteTable  26m  aws-controller  Disassociated managed RouteTable "rtb-01dc8d0ed98558e75" from subnet "subnet-042f8dd2a5f76781a"
  Normal   SuccessfulDeleteRouteTable        26m  aws-controller  Deleted managed RouteTable "rtb-01dc8d0ed98558e75"
  Normal   SuccessfulDisassociateRouteTable  26m  aws-controller  Disassociated managed RouteTable "rtb-0ea1440ffa4b40a23" from subnet "subnet-036136b1b16dcd4d1"
  Normal   SuccessfulDeleteRouteTable        26m  aws-controller  Deleted managed RouteTable "rtb-0ea1440ffa4b40a23"
  Normal   SuccessfulDisassociateRouteTable  26m  aws-controller  Disassociated managed RouteTable "rtb-02b074d4db4ff4f89" from subnet "subnet-051ba44b89b0b0db8"
  Normal   SuccessfulDeleteRouteTable        26m  aws-controller  Deleted managed RouteTable "rtb-02b074d4db4ff4f89"
  Normal   SuccessfulDisassociateRouteTable  26m  aws-controller  Disassociated managed RouteTable "rtb-047a2a339dec488f5" from subnet "subnet-078b08ee3a55db9a8"
  Normal   SuccessfulDeleteRouteTable        26m  aws-controller  Deleted managed RouteTable "rtb-047a2a339dec488f5"
  Normal   SuccessfulDeleteNATGateway        26m  aws-controller  Deleted NAT Gateway "nat-070f4b940ef366695" previously attached to VPC "vpc-01910d574b9ba221d"
  Normal   SuccessfulDeleteNATGateway        26m  aws-controller  Deleted NAT Gateway "nat-039f6acd381cefc92" previously attached to VPC "vpc-01910d574b9ba221d"
  Normal   SuccessfulDeleteNATGateway        26m  aws-controller  Deleted NAT Gateway "nat-03fc12dc502489202" previously attached to VPC "vpc-01910d574b9ba221d"
  Normal   SuccessfulDetachInternetGateway   25m  aws-controller  Detached Internet Gateway "igw-09becef7ddc7c6353" from VPC "vpc-01910d574b9ba221d"
  Normal   SuccessfulDeleteInternetGateway   25m  aws-controller  Deleted Internet Gateway "igw-09becef7ddc7c6353" previously attached to VPC "vpc-01910d574b9ba221d"
  Warning  FailedDeleteSubnet                25m  aws-controller  Failed to delete managed Subnet "subnet-042f8dd2a5f76781a": DependencyViolation: The subnet 'subnet-042f8dd2a5f76781a' has dependencies and cannot be deleted.
           status code: 400, request id: d85e10dd-6997-4b6f-be79-13e2b8b2c976
  Warning  FailedDeleteSubnet  25m  aws-controller  Failed to delete managed Subnet "subnet-042f8dd2a5f76781a": DependencyViolation: The subnet 'subnet-042f8dd2a5f76781a' has dependencies and cannot be deleted.
           status code: 400, request id: 644580c9-4cdd-46ea-9d98-023b595ce29a
  Warning  FailedDeleteSubnet  25m  aws-controller  Failed to delete managed Subnet "subnet-042f8dd2a5f76781a": DependencyViolation: The subnet 'subnet-042f8dd2a5f76781a' has dependencies and cannot be deleted.
           status code: 400, request id: 7fd6235e-dc2d-491e-9e74-5106f1b80db6
  Warning  FailedDeleteSubnet  25m  aws-controller  Failed to delete managed Subnet "subnet-042f8dd2a5f76781a": DependencyViolation: The subnet 'subnet-042f8dd2a5f76781a' has dependencies and cannot be deleted.
           status code: 400, request id: e2a7b45d-55e7-446f-bb12-ea4f14861e8b
  Warning  FailedDeleteSubnet  24m  aws-controller  Failed to delete managed Subnet "subnet-042f8dd2a5f76781a": DependencyViolation: The subnet 'subnet-042f8dd2a5f76781a' has dependencies and cannot be deleted.
           status code: 400, request id: db4eeb21-d466-47ee-a343-779b3bddc1e7
  Warning  FailedDeleteSubnet  24m  aws-controller  Failed to delete managed Subnet "subnet-042f8dd2a5f76781a": DependencyViolation: The subnet 'subnet-042f8dd2a5f76781a' has dependencies and cannot be deleted.
           status code: 400, request id: 563d9c5c-7f75-4682-b722-7144fda4d1d5
  Warning  FailedDeleteSubnet  24m  aws-controller  Failed to delete managed Subnet "subnet-042f8dd2a5f76781a": DependencyViolation: The subnet 'subnet-042f8dd2a5f76781a' has dependencies and cannot be deleted.
           status code: 400, request id: 73921edf-be53-4792-b9c6-3f379e6fc23b
  Warning  FailedDeleteSubnet  24m  aws-controller  Failed to delete managed Subnet "subnet-042f8dd2a5f76781a": DependencyViolation: The subnet 'subnet-042f8dd2a5f76781a' has dependencies and cannot be deleted.
           status code: 400, request id: 58b007f8-ee9f-4dff-92f4-585aa9d4dedb
  Warning  FailedDeleteSubnet  24m  aws-controller  Failed to delete managed Subnet "subnet-042f8dd2a5f76781a": DependencyViolation: The subnet 'subnet-042f8dd2a5f76781a' has dependencies and cannot be deleted.
           status code: 400, request id: cab174f1-3468-4cfd-8db8-6f895f86085f
  Warning  FailedDeleteSubnet  86s (x779 over 24m)  aws-controller  (combined from similar events): Failed to delete managed Subnet "subnet-042f8dd2a5f76781a": DependencyViolation: The subnet 'subnet-042f8dd2a5f76781a' has dependencies and cannot be deleted.
           status code: 400, request id: a64e80b4-7c92-4662-ada9-851427d121b0

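This is the cause of the problem with VPC deletion: the subnet still has dependencies, so it cannot be deleted, and the VPC cannot be deleted while the subnet exists.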
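
To see what the subnet's remaining dependencies actually are, the network interfaces still sitting in it could be listed. This is a rough sketch under the assumption that leftover network interfaces are what triggers the DependencyViolation; `ec2` is assumed to be a boto3 EC2 client and `list_subnet_network_interfaces` is a hypothetical helper name.

```python
def list_subnet_network_interfaces(ec2, subnet_id):
    """Summarize the network interfaces that still exist in a subnet.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # Pagination via NextToken is omitted, so this only covers the first page.
    response = ec2.describe_network_interfaces(
        Filters=[{"Name": "subnet-id", "Values": [subnet_id]}]
    )
    return [
        {
            "id": eni["NetworkInterfaceId"],
            # InterfaceType tells apart e.g. endpoint and NAT gateway interfaces.
            "type": eni.get("InterfaceType"),
            "status": eni.get("Status"),
            "description": eni.get("Description", ""),
        }
        for eni in response.get("NetworkInterfaces", [])
    ]


# Possible usage (requires boto3 and AWS credentials):
#   import boto3
#   for eni in list_subnet_network_interfaces(boto3.client("ec2"), "subnet-042f8dd2a5f76781a"):
#       print(eni)
```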

@slysunkin
Contributor Author

Created a bug report: kubernetes-sigs/cluster-api-provider-aws#5150

@slysunkin
Contributor Author

Further investigation showed that deleting the VPC can be quite problematic. The cause may lie in the AWS infrastructure and delays in its garbage collection. Sometimes aws-nuke manages to delete the VPC by retrying more times, but in some situations it still shows this after 30 retries:

┌────────────────────────────────────────────────────────────────────────────────────────┐
| Identifier               | Resource Type | Deleted Successfully                        |
| subnet-0bba42de2de2219a2 | Subnet        | ❌ DependencyViolation: The subnet 'subnet- |
| -------------------------------------------------------------------------------------- |
| subnet-01bcb1bcf457de8fe | Subnet        | ❌ DependencyViolation: The subnet 'subnet- |
| -------------------------------------------------------------------------------------- |
| vpc-0904283dda6216f67    | VPC           | ❌ 'Waiting for all Network interfaces to b |
| -------------------------------------------------------------------------------------- |
| subnet-09449bf09e07c6028 | Subnet        | ❌ DependencyViolation: The subnet 'subnet- |
└────────────────────────────────────────────────────────────────────────────────────────┘

The resources can then be freed a couple of hours later.
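
Since the dependency sometimes clears on its own, a cleanup script could retry the deletion with a growing delay instead of a tight loop. The sketch below is not how aws-nuke or CAPA do it; `delete_with_retries` and its parameters are hypothetical, and it assumes the failure is raised as a botocore-style exception carrying `response["Error"]["Code"]`. The default of 30 attempts only mirrors the retry count mentioned above.

```python
import time


def delete_with_retries(delete_fn, max_attempts=30, base_delay=1.0,
                        max_delay=60.0, sleep=time.sleep):
    """Call delete_fn until it succeeds, retrying only on DependencyViolation.

    Returns the number of attempts used. Any other error, or a
    DependencyViolation on the last attempt, is re-raised to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            delete_fn()
            return attempt
        except Exception as exc:
            # botocore's ClientError exposes the AWS error code this way.
            response = getattr(exc, "response", None)
            code = None
            if isinstance(response, dict):
                code = response.get("Error", {}).get("Code")
            if code != "DependencyViolation" or attempt == max_attempts:
                raise
            # Exponential backoff, capped so a single wait never exceeds max_delay.
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))


# Possible usage (requires boto3 and AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2")
#   delete_with_retries(lambda: ec2.delete_subnet(SubnetId="subnet-0bba42de2de2219a2"))
```

With a cap of 60 seconds, 30 attempts cover well under an hour, so this would still not help when the resources are only released a couple of hours later.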

Projects
Status: In Progress