[Linear cache] rework watches and storage in the cache to make code more common between delta and sotw #11

valerian-roche · 2024-02-05T23:49:51Z

Following the previous PRs, the linear cache implementation has become more common between sotw and delta watches. This PR makes that clearer and reorganizes the setup:

- Reorganize resource cache and watch tracking per resource, instead of per mode. This will enable changes such as sotw using resource versions.
- Avoid marshaling a resource multiple times when its version has already been computed through marshaling (not in this PR, as it could impact memory usage).
- Sotw and delta watches use a common model for their handling, with common code to compute the impact of a subscription against the cache. Only the generation of the response differs.
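As a rough illustration of tracking watches per resource rather than per mode, here is a minimal sketch. The `linearCache`, `watchID`, `watch`, and `update` names are hypothetical and are not the types used in `pkg/cache/v3/linear.go`; the point is only that a single per-resource index can serve both sotw and delta watches.

```go
package main

import (
	"fmt"
	"sort"
)

// watchID identifies a watch. Hypothetical type, for illustration only.
type watchID int

// linearCache is a toy stand-in for the real cache.
type linearCache struct {
	// versions maps a resource name to its current version.
	versions map[string]string
	// watchers maps a resource name to the watches interested in it,
	// regardless of whether the watch is sotw or delta.
	watchers map[string]map[watchID]struct{}
}

func newLinearCache() *linearCache {
	return &linearCache{
		versions: map[string]string{},
		watchers: map[string]map[watchID]struct{}{},
	}
}

// watch registers a watch on the given resource names.
func (c *linearCache) watch(id watchID, names ...string) {
	for _, name := range names {
		if c.watchers[name] == nil {
			c.watchers[name] = map[watchID]struct{}{}
		}
		c.watchers[name][id] = struct{}{}
	}
}

// update stores a new version and returns the watches to notify, sorted by ID.
func (c *linearCache) update(name, version string) []watchID {
	c.versions[name] = version
	ids := make([]watchID, 0, len(c.watchers[name]))
	for id := range c.watchers[name] {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	c := newLinearCache()
	c.watch(1, "a", "b") // could be a sotw watch
	c.watch(2, "b")      // could be a delta watch
	fmt.Println(c.update("b", "v2"))
}
```

With this layout, only the step that turns the notified watches into a response would need to know which protocol variant a watch uses.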

valerian-roche · 2024-02-06T16:03:32Z

pkg/cache/v3/linear.go

 }
 }
 }

 func (cache *LinearCache) CreateDeltaWatch(request *DeltaRequest, sub Subscription, value chan DeltaResponse) (func(), error) {
+ if request.GetTypeUrl() != cache.typeURL {


Most changes in delta are to align with CreateWatch. The content is now the same, except for the potential initialization of versions.

zhiyanfoo

Overall LGTM. Just some questions about the way things are implemented. Also, I don't think I'm familiar enough with go-control-plane to verify correctness tbh.

atollena · 2024-02-08T09:58:19Z

pkg/cache/v3/linear.go

+// WithComputeStableVersions ensures the cache tracks the resources stable versions from the beginning,
+// avoiding the first watch stalling until the computation has been done on all resources.


I'm sorry, but I really don't understand this comment. Would you mind elaborating a bit? What situation might happen in the beginning that I should be aware of and that is controlled by this flag?

My intuition reading the comment is that I'm not sure we need this flag at all, maybe it should just be the default.

Updated the comment. I will actually probably completely invert the logic, which will greatly improve the performance on non-wildcard watches (so all gRPC watches and most use cases of linear for Envoy).
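One possible reading of inverting the logic is to compute stable versions lazily, on first use, instead of eagerly for every resource. The sketch below assumes that a stable version is a hash of the serialized resource; the `cachedResource` type and `getStableVersion` method are hypothetical names, not the cache's actual API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cachedResource is a hypothetical cache entry.
type cachedResource struct {
	// payload stands in for the marshaled protobuf resource.
	payload []byte
	// stableVersion stays empty until it is first requested.
	stableVersion string
}

// getStableVersion hashes the payload on first call and reuses the result afterwards.
func (r *cachedResource) getStableVersion() string {
	if r.stableVersion == "" {
		sum := sha256.Sum256(r.payload)
		r.stableVersion = hex.EncodeToString(sum[:])
	}
	return r.stableVersion
}

func main() {
	r := &cachedResource{payload: []byte("cluster-a")}
	// Nothing is hashed until a watch actually needs the version.
	fmt.Println(r.getStableVersion())
}
```

Under this assumption, a watch on a handful of named resources only pays for hashing those resources, which matches the expected gain for non-wildcard watches.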

atollena

Really nice work on this.

atollena · 2024-02-16T16:24:02Z

pkg/cache/v3/linear.go

 }
 return out
 }

-func (cache *LinearCache) computeSotwResponse(watch ResponseWatch, ignoreReturnedResources bool) *RawResponse {
+func (cache *LinearCache) computeResourceChange(sub Subscription, ignoreReturnedResources, useStableVersion bool) (updated, removed []string, err error) {


Small suggestion: use alwaysReturnAllResources for the first boolean. I find that clearer than referring to "returned resources".

I think this method does enough to warrant a godoc comment explaining what it does.

Updated the name here and in the other places where it is used.
Also added a godoc.

atollena · 2024-02-16T16:24:30Z

pkg/cache/v3/linear.go

 var changedResources []string
 var removedResources []string

- knownVersions := watch.subscription.ReturnedResources()
+ knownVersions := sub.ReturnedResources()
 if ignoreReturnedResources {
 // The response will include all resources, with no regards of resources potentially already returned.


This is LDS and CDS in SotW, as hardcoded in the protocol spec. So I would suggest mentioning this in the comment rather than making it sound like this could apply to any resource type in any mode. This special "ignoreReturnedResources" will immediately make sense if you explain this.

In this context it does not consider whether the response is full state or not. The new comment should hopefully make it clearer
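The comparison discussed in this thread can be sketched as follows. This is a simplified illustration, not the actual `computeResourceChange`: it assumes a wildcard subscription, represents both the cache and the subscription's known versions as plain maps, and uses the hypothetical name `computeChange`.

```go
package main

import (
	"fmt"
	"sort"
)

// computeChange compares what a subscription already knows with the cache content.
// cache maps resource names to current versions; known maps resource names to the
// versions previously returned to the subscription.
// When alwaysReturnAll is true, known versions are ignored when building updated,
// so every resource in the cache is reported.
func computeChange(cache, known map[string]string, alwaysReturnAll bool) (updated, removed []string) {
	for name, version := range cache {
		knownVersion, ok := known[name]
		if alwaysReturnAll || !ok || knownVersion != version {
			updated = append(updated, name)
		}
	}
	// Resources the subscription knows about but the cache no longer holds.
	for name := range known {
		if _, ok := cache[name]; !ok {
			removed = append(removed, name)
		}
	}
	sort.Strings(updated)
	sort.Strings(removed)
	return updated, removed
}

func main() {
	cache := map[string]string{"a": "1", "b": "2"}
	known := map[string]string{"a": "1", "b": "1", "c": "1"}

	updated, removed := computeChange(cache, known, false)
	fmt.Println(updated, removed)

	updated, removed = computeChange(cache, known, true)
	fmt.Println(updated, removed)
}
```

In this sketch the flag only widens the updated set; the removed set is computed the same way in both cases.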

atollena · 2024-02-16T16:48:45Z

pkg/cache/v3/linear.go

+ cache.log.Warnf("[linear cache] error computing watch response: %s", err)
+ return nil, fmt.Errorf("failed to compute the watch respnse: %w", err)


Logging and also returning the error is almost never the right thing to do, since there is a high chance it will get logged twice. Here, IIUC, at least for ADS it will also sever the stream with a gRPC status.Unknown error because of https://github.com/envoyproxy/go-control-plane/blob/652ffb49f6ff95ea4c81dc20eaf6d2f36e0ba75d/pkg/server/sotw/v3/ads.go#L115.

The logic is cluttered with handling for errors that, IIUC, can only come from protobuf marshalling, and I don't know how to make those happen. So maybe a better option would be to log and either swallow the error or panic directly inside the function that marshals protobuf. I think for most users of go-control-plane, the protobuf structs are going to be generated directly inside the system, so it is not possible to trigger those marshalling errors outside of really obvious programming errors.

Removed the comment. I added them because unit tests are otherwise quite painful to understand when they fail, as the error does not get inlined with the logs.
On the subject of whether we can just panic or ignore the error when marshaling fails, I'm not sure what actual error cases exist in protobuf. I wouldn't be surprised if they're irrelevant, but I'd rather keep further changes for another PR in this case.
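For reference, the pattern the review points toward (wrap and return, then log once at the boundary) might look like the sketch below. The `computeResponse` and `createWatch` functions and the `errMarshal` sentinel are hypothetical and only illustrate the control flow; they are not the cache's actual functions.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// errMarshal is a hypothetical sentinel standing in for a protobuf marshaling failure.
var errMarshal = errors.New("marshal failed")

// computeResponse fails on demand so the error path can be exercised.
func computeResponse(fail bool) (string, error) {
	if fail {
		return "", fmt.Errorf("failed to compute the watch response: %w", errMarshal)
	}
	return "ok", nil
}

// createWatch adds context by wrapping the error and does not log it,
// leaving the decision to log to the caller.
func createWatch(fail bool) (string, error) {
	resp, err := computeResponse(fail)
	if err != nil {
		return "", fmt.Errorf("create watch: %w", err)
	}
	return resp, nil
}

func main() {
	if _, err := createWatch(true); err != nil {
		// The error is logged exactly once, at the outermost caller.
		log.Printf("[linear cache] %v", err)
	}
}
```

Because each layer wraps with `%w`, the single log line at the boundary still carries the full context, and callers can match the cause with `errors.Is`.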

…ore common between delta and sotw (#11)

Following previous PRs on linear cache fixes in sotw, the implementation of linear cache has become more common between sotw and delta watches. This PR makes it clearer and reorganize the setup:
- reorganize resource cache and watch tracking per resource, instead of per mode. This will enable changes like sotw using resource versions
- avoid multiple marshaling of the resource when the version is already computed through marshaling (not in this PR, as potentially impactful on memory)
- sotw and delta watches use a common model for their handling, with common code to compute subscription impact vs. cache. Only the generation of the response is different

Signed-off-by: Valerian Roche <[email protected]>

valerian-roche force-pushed the vr/linear-reorganize branch from e1b22c6 to e76a085 Compare February 6, 2024 15:55

valerian-roche commented Feb 6, 2024

zhiyanfoo approved these changes Feb 7, 2024

zhiyanfoo reviewed Feb 7, 2024

Improve comments and rename c.computeResourceVersion to c.areStableResourceVersionsComputed to make it state rather than action (81b2834)

valerian-roche force-pushed the vr/linear-reorganize branch from e76a085 to 81b2834 Compare February 7, 2024 15:06

atollena reviewed Feb 8, 2024

valerian-roche force-pushed the vr/linear-reorganize branch from c870956 to 81b2834 Compare February 8, 2024 16:33

valerian-roche mentioned this pull request Feb 8, 2024

Use lazy computation of stable versions in linear cache #13

Merged

valerian-roche added the to-upstream label Feb 9, 2024

valerian-roche requested a review from atollena February 15, 2024 00:59

atollena approved these changes Feb 16, 2024

atollena reviewed Feb 16, 2024

Rename attributes to better reflect the impact and add godoc for computeResourceChange (3094890)

valerian-roche force-pushed the vr/linear-reorganize branch from 8268981 to 3094890 Compare February 16, 2024 19:51

valerian-roche merged commit 8271c4b into dd/sotw-fixes Feb 16, 2024
3 of 4 checks passed

valerian-roche deleted the vr/linear-reorganize branch February 16, 2024 20:34
