feat: Add dns record delete validation #221
Closed
closes #181
fix e2e multi record test issues
feat: Add dns record delete validation
Adds validation around the deletion of DNSRecord resources to reduce the chance of unexpected records being left in the external DNS provider (AWS Route53 etc.) after the record has been garbage collected. It now takes into account the possibility that multiple records could be manipulating the same zone record set simultaneously, which can happen if a significant number of DNSRecord resources were contributing to the same rootHost record set in the provider and were all deleted at roughly the same time.
The deletion logic was changed to always run validation checks, at random intervals, for a period of time (currently 15 seconds) after the last detected change. A change here is either an update that had to be made to the provider itself or a change to some status on the record, the most important one being Status.DomainOwners. If either of these changes occurs, the 15 second delete validation loop is restarted, and the finalizer is only removed once 15 seconds have passed without any changes having been made.

Note: Some providers, such as Google, handle this better as they require API requests to know the current state of the record being updated or deleted before the request is accepted. Others, such as AWS (Route53), unfortunately do not, meaning that something one record just deleted may sometimes get added back by another, depending on when it last queried the provider's zone records.
Validation
Terminal 1:
Setup local env with multiple instances and watch all dnsrecords:
Terminal 2:
Tail all operator logs:
Terminal 3:
Run multi record e2e with 20 instances (AWS)
Run multi record e2e with 20 instances (GCP)