Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(delta): force eds response after cds change #8

Closed
wants to merge 1 commit into from

Conversation

lukidzi
Copy link

@lukidzi lukidzi commented Nov 22, 2024

When there is a CDS update, we need to send an EDS response to transition clusters from the warming phase. If there is no update, Envoy can end up waiting indefinitely for an EDS response. In Envoy 1.32, an EDS cache is enabled by default, and after the initial_fetch_timeout elapses, Envoy uses the cached data. However, if this timeout is disabled, the issue reoccurs. Additionally, waiting for the initial_fetch_timeout to expire slows down the propagation of changes.

Upstream issue: envoyproxy#1035
Upstream PR (no review): envoyproxy#981

@lukidzi
Copy link
Author

lukidzi commented Nov 22, 2024

Seems like sometimes there is still issue, I need to investigate more :/

Copy link

@slonka slonka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of questions but overall looks alright I think

}

if count := c.GetStatusInfo(key).GetNumDeltaWatches(); count != len(testTypes) {
t.Errorf("watches should be created for the latest version, saw %d watches expected %d", count, len(testTypes))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be useful to print the types that were missing?

Comment on lines +365 to +367
if count := c.GetStatusInfo(key).GetNumDeltaWatches(); count != len(testTypes)-2 {
t.Errorf("watches should be preserved for all but two, got: %d open watches instead of the expected %d open watches", count, len(testTypes)-1)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get this, we check equality for -2 but we print -1 why? also, nitpick can we have 1/2 as a const and name it descriptively?


// validate response for endpoints
select {
case out := <-watches[testTypes[1]]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we assign this testTypes[1] to a variable so I know what type that is? is this rsrc.ClusterType?

assertResourceMapEqual(t, cache.IndexRawResourcesByName(out.(*cache.RawDeltaResponse).Resources), snapshot2.GetResources(rsrc.ClusterType))
vMap := out.GetNextVersionMap()
versionMap[testTypes[1]] = vMap
case out := <-watches[testTypes[0]]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above ☝️

snapshot2.Resources[types.Cluster] = cache.NewResources(fixture.version2, []types.Resource{cluster})
assertResourceMapEqual(t, cache.IndexRawResourcesByName(out.(*cache.RawDeltaResponse).Resources), snapshot2.GetResources(rsrc.ClusterType))
vMap := out.GetNextVersionMap()
versionMap[testTypes[1]] = vMap
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we writing something when this code block is supposed to validate things? 🤔

@@ -496,6 +505,15 @@ func (cache *snapshotCache) respond(ctx context.Context, request *Request, value
}
}

func getForcePushEdsResourcesNames(typ types.ResponseType, response *RawDeltaResponse) ([]string, types.ResponseType, bool) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need type? it's either endpoint or unknown and we already have the last boolean 🤔

@@ -43,6 +43,14 @@ type StreamState struct { // nolint:golint,revive

// Ordered indicates whether we want an ordered ADS stream or not
ordered bool

// ForcePush indicates if should push the response even when the version is the same
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// ForcePush indicates if should push the response even when the version is the same
// ForcePush indicates if the control plane should push the response even when the version is the same


// ForcePush indicates if should push the response even when the version is the same
// https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol#resource-warming
// This is required in the situation:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This is required in the situation:
// This is required in the following situations:

// This is required in the situation:
// 1) There is a Cluster change and control plane responds with CDS
// 2) Envoy has a cluster in the warming phase until there is a EDS response, if endpoints haven't changed
// there is no EDS send and changes to the clusters are blocked
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// there is no EDS send and changes to the clusters are blocked
// there is no EDS sent and changes to the clusters are blocked

Copy link

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot added the stale label Dec 26, 2024
Copy link

github-actions bot commented Jan 2, 2025

This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot closed this Jan 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants