Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Linear cache] rework watches and storage in the cache to make code more common between delta and sotw #11

Merged
merged 2 commits into from
Feb 16, 2024

Conversation

valerian-roche
Copy link
Collaborator

Following the previous PRs, the implementation of linear cache has become more common between sotw and delta watches. This PR makes it clearer and reorganize the setup:

  • reorganize resource cache and watch tracking per resource, instead of per mode. This will enable changes like sotw using resource versions
  • avoid multiple marshaling of the resource when the version is already computed through marshaling (not in this PR, as potentially impactful on memory)
  • sotw and delta watches use a common model for their handling, with common code to compute subscription impact vs. cache. Only the generation of the response is different

pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
}
}
}

func (cache *LinearCache) CreateDeltaWatch(request *DeltaRequest, sub Subscription, value chan DeltaResponse) (func(), error) {
if request.GetTypeUrl() != cache.typeURL {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most changes in delta is to align with CreateWatch. The content is now the same except for the potential versions initialization

pkg/cache/v3/linear.go Show resolved Hide resolved
Copy link

@zhiyanfoo zhiyanfoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Just some questions about the way stuff are implemented. Also I don't think I'm familiar enough with the go-control-plane to verify correctness tbh.

pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
…sourceVersionsComputed to make it state rather than action
pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
Comment on lines 128 to 129
// WithComputeStableVersions ensures the cache tracks the resources stable versions from the beginning,
// avoiding the first watch stalling until the computation has been done on all resources.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, but I really don't understand this comment. Would you mind elaborating a bit? What situation might happen in the beginning that I should be aware of and that is controlled by this flag?

My intuition reading the comment is that I'm not sure we need this flag at all, maybe it should just be the default.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the comment. I will actually probably completely invert the logic, which will greatly improve the performance on non-wildcard watches (so all grpc watches and most usecases of linear for envoy)

pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
pkg/cache/v3/linear.go Outdated Show resolved Hide resolved
Copy link

@atollena atollena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice work on this.

}
return out
}

func (cache *LinearCache) computeSotwResponse(watch ResponseWatch, ignoreReturnedResources bool) *RawResponse {
func (cache *LinearCache) computeResourceChange(sub Subscription, ignoreReturnedResources, useStableVersion bool) (updated, removed []string, err error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small suggestion: use alwaysReturnAllResources for the first boolean. I find that more clear than refering to "returned resources".

I think this method does enough to warrant a godoc comment explaining what it does.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the name here and in other places where it is used
Also added a godoc

var changedResources []string
var removedResources []string

knownVersions := watch.subscription.ReturnedResources()
knownVersions := sub.ReturnedResources()
if ignoreReturnedResources {
// The response will include all resources, with no regards of resources potentially already returned.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is LDS and CDS in SOtW as hardcoded in the protocol spec. So I would suggest mentioning this in this comment rather making it sound like this could apply to any resource type in any mode. This special "ignoreReturnedResources" will immediately make sense if you explain this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this context it does not consider whether the response is full state or not. The new comment should hopefully make it clearer

Comment on lines 592 to 599
cache.log.Warnf("[linear cache] error computing watch response: %s", err)
return nil, fmt.Errorf("failed to compute the watch respnse: %w", err)
Copy link

@atollena atollena Feb 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging and also returning the error is almost never the right thing to do, since there is high chances it will get logged twice. Here IIUC at least for ADS it will also severe the stream with an gRPC status.Unknown error because of https://github.com/envoyproxy/go-control-plane/blob/652ffb49f6ff95ea4c81dc20eaf6d2f36e0ba75d/pkg/server/sotw/v3/ads.go#L115.

The logic is cluttered with error handling that, IIUC, can only happen due to protobuf marshalling error, which I don't know how to make them happen. So maybe a better option would be to log and either swallow or panic directly inside the function that marshalls protobuf. I think for most users of go control plane, the protobuf structs are going to be generated directly inside the system, so it is not possible to make those marshalling errors happen outside really obvious programming errors.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the comment. I added them as it's quite painful to understand unit tests when they fail in a way otherwise, as the error does not get inlined with the logs.
On the subject of whether we can just panic or ignore the error when marshaling fails, I'm not sure what actual error cases do exist in protobuf. I wouldn't be surprised they're irrelevant, but I'd rather keep further changes to another PR in this case

@valerian-roche valerian-roche merged commit 8271c4b into dd/sotw-fixes Feb 16, 2024
3 of 4 checks passed
valerian-roche added a commit that referenced this pull request Feb 16, 2024
…ore common between delta and sotw (#11)

Following previous PRs on linear cache fixes in sotw, the implementation of linear cache has become more common between sotw and delta watches. This PR makes it clearer and reorganize the setup:
 - reorganize resource cache and watch tracking per resource, instead of per mode. This will enable changes like sotw using resource versions
 - avoid multiple marshaling of the resource when the version is already computed through marshaling (not in this PR, as potentially impactful on memory)
 - sotw and delta watches use a common model for their handling, with common code to compute subscription impact vs. cache. Only the generation of the response is different

Signed-off-by: Valerian Roche <[email protected]>
valerian-roche added a commit that referenced this pull request Feb 16, 2024
…ore common between delta and sotw (#11)

Following previous PRs on linear cache fixes in sotw, the implementation of linear cache has become more common between sotw and delta watches. This PR makes it clearer and reorganize the setup:
 - reorganize resource cache and watch tracking per resource, instead of per mode. This will enable changes like sotw using resource versions
 - avoid multiple marshaling of the resource when the version is already computed through marshaling (not in this PR, as potentially impactful on memory)
 - sotw and delta watches use a common model for their handling, with common code to compute subscription impact vs. cache. Only the generation of the response is different

Signed-off-by: Valerian Roche <[email protected]>
@valerian-roche valerian-roche deleted the vr/linear-reorganize branch February 16, 2024 20:34
valerian-roche added a commit that referenced this pull request Feb 23, 2024
…ore common between delta and sotw (#11)

Following previous PRs on linear cache fixes in sotw, the implementation of linear cache has become more common between sotw and delta watches. This PR makes it clearer and reorganize the setup:
 - reorganize resource cache and watch tracking per resource, instead of per mode. This will enable changes like sotw using resource versions
 - avoid multiple marshaling of the resource when the version is already computed through marshaling (not in this PR, as potentially impactful on memory)
 - sotw and delta watches use a common model for their handling, with common code to compute subscription impact vs. cache. Only the generation of the response is different

Signed-off-by: Valerian Roche <[email protected]>
zhiyanfoo pushed a commit that referenced this pull request Apr 10, 2024
…ore common between delta and sotw (#11)

Following previous PRs on linear cache fixes in sotw, the implementation of linear cache has become more common between sotw and delta watches. This PR makes it clearer and reorganize the setup:
 - reorganize resource cache and watch tracking per resource, instead of per mode. This will enable changes like sotw using resource versions
 - avoid multiple marshaling of the resource when the version is already computed through marshaling (not in this PR, as potentially impactful on memory)
 - sotw and delta watches use a common model for their handling, with common code to compute subscription impact vs. cache. Only the generation of the response is different

Signed-off-by: Valerian Roche <[email protected]>
valerian-roche added a commit that referenced this pull request Apr 10, 2024
…ore common between delta and sotw (#11)

Following previous PRs on linear cache fixes in sotw, the implementation of linear cache has become more common between sotw and delta watches. This PR makes it clearer and reorganize the setup:
 - reorganize resource cache and watch tracking per resource, instead of per mode. This will enable changes like sotw using resource versions
 - avoid multiple marshaling of the resource when the version is already computed through marshaling (not in this PR, as potentially impactful on memory)
 - sotw and delta watches use a common model for their handling, with common code to compute subscription impact vs. cache. Only the generation of the response is different

Signed-off-by: Valerian Roche <[email protected]>
valerian-roche added a commit that referenced this pull request May 9, 2024
…ore common between delta and sotw (#11)

Following previous PRs on linear cache fixes in sotw, the implementation of linear cache has become more common between sotw and delta watches. This PR makes it clearer and reorganize the setup:
 - reorganize resource cache and watch tracking per resource, instead of per mode. This will enable changes like sotw using resource versions
 - avoid multiple marshaling of the resource when the version is already computed through marshaling (not in this PR, as potentially impactful on memory)
 - sotw and delta watches use a common model for their handling, with common code to compute subscription impact vs. cache. Only the generation of the response is different

Signed-off-by: Valerian Roche <[email protected]>
valerian-roche added a commit that referenced this pull request Aug 8, 2024
…ore common between delta and sotw (#11)

Following previous PRs on linear cache fixes in sotw, the implementation of linear cache has become more common between sotw and delta watches. This PR makes it clearer and reorganize the setup:
 - reorganize resource cache and watch tracking per resource, instead of per mode. This will enable changes like sotw using resource versions
 - avoid multiple marshaling of the resource when the version is already computed through marshaling (not in this PR, as potentially impactful on memory)
 - sotw and delta watches use a common model for their handling, with common code to compute subscription impact vs. cache. Only the generation of the response is different

Signed-off-by: Valerian Roche <[email protected]>
shamdor pushed a commit that referenced this pull request Sep 23, 2024
…ore common between delta and sotw (#11)

Following previous PRs on linear cache fixes in sotw, the implementation of linear cache has become more common between sotw and delta watches. This PR makes it clearer and reorganize the setup:
 - reorganize resource cache and watch tracking per resource, instead of per mode. This will enable changes like sotw using resource versions
 - avoid multiple marshaling of the resource when the version is already computed through marshaling (not in this PR, as potentially impactful on memory)
 - sotw and delta watches use a common model for their handling, with common code to compute subscription impact vs. cache. Only the generation of the response is different

Signed-off-by: Valerian Roche <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants