Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pickfirst: Register a health listener when used as a leaf policy #7832

Merged
merged 11 commits into from
Dec 5, 2024

Conversation

arjan-bal
Copy link
Contributor

As part of the dualstack changes described in A61, pickfirst will become the universal leaf policy. This PR provides a EnableHealthListener function for petiole policies to inform pickfirst when it's functioning as a leaf policy.

When functioning as a leaf policy, pickfirst will subscribe to health updates using the SubConn.RegisterHealthListener API introduced in #7780 once the SubConn connectivity state becomes READY. The health state will be used to update the ClientConn state as long as the SubConn's connectivity state remains READY.

RELEASE NOTES: N/A

@arjan-bal arjan-bal added Type: Feature New features or improvements in behavior Area: Resolvers/Balancers Includes LB policy & NR APIs, resolver/balancer/picker wrappers, LB policy impls and utilities. labels Nov 13, 2024
@arjan-bal arjan-bal added this to the 1.69 Release milestone Nov 13, 2024
@arjan-bal arjan-bal requested review from easwars and dfawley November 13, 2024 07:34
Copy link

codecov bot commented Nov 13, 2024

Codecov Report

Attention: Patch coverage is 81.52174% with 17 lines in your changes missing coverage. Please review.

Project coverage is 82.08%. Comparing base (4c07bca) to head (886ad2f).
Report is 9 commits behind head on master.

Files with missing lines Patch % Lines
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go 81.52% 14 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7832      +/-   ##
==========================================
+ Coverage   81.84%   82.08%   +0.23%     
==========================================
  Files         377      377              
  Lines       38120    38179      +59     
==========================================
+ Hits        31201    31339     +138     
+ Misses       5603     5541      -62     
+ Partials     1316     1299      -17     
Files with missing lines Coverage Δ
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go 86.62% <81.52%> (-1.18%) ⬇️

... and 34 files with indirect coverage changes

@arjan-bal arjan-bal modified the milestone: 1.69 Release Nov 13, 2024
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
@easwars easwars assigned arjan-bal and unassigned easwars Nov 15, 2024
@arjan-bal arjan-bal assigned easwars and unassigned easwars Nov 18, 2024
@arjan-bal arjan-bal assigned easwars and unassigned arjan-bal Nov 19, 2024
@easwars easwars assigned arjan-bal and unassigned easwars Nov 19, 2024
Comment on lines 119 to 123
// EnableHealthListener updates the state to configure pickfirst for using a
// generic health listener.
func EnableHealthListener(attrs *attributes.Attributes) *attributes.Attributes {
return attrs.WithValue(enableHealthListenerKeyType{}, enableHealthListenerValue)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these types of functions we prefer to make them operate on the thing that contains the attributes. In this case, that would be a resolver.State.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the function to accept resolver.State.

mu sync.Mutex
state connectivity.State
mu sync.Mutex
connectivityState connectivity.State
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this deserves a comment, too, since there are now two very similar fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this kind of tracking be done in the subconn struct (scData) and not here? I would expect the lb policy only has the concludedState and each subchannel needs to track its real state and its effective state, accounting for sticky-TF and health reporting? It seems confusing to me that the LB policy itself is tracking two different states, but I'm willing to believe it's simpler this way if you tried it the other way already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was referring the Java implementation of Pickfirst and they were handling the sticky TF behaviour in the LB Policy. I don't see any issue with handling sticky TF in the subchannel state. I've update the PR to reflect the suggestions.

balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
@dfawley dfawley removed their assignment Nov 19, 2024
@arjan-bal arjan-bal force-pushed the pickfirst-health-listener branch from 2700efe to cf153cb Compare November 20, 2024 15:18
@arjan-bal arjan-bal assigned easwars and dfawley and unassigned arjan-bal Nov 20, 2024
@easwars easwars removed their assignment Nov 20, 2024
@arjan-bal arjan-bal force-pushed the pickfirst-health-listener branch from 64e7a8a to 13ad224 Compare November 21, 2024 07:52
@arjan-bal arjan-bal assigned easwars and unassigned arjan-bal Nov 21, 2024
@easwars easwars assigned arjan-bal and unassigned easwars Nov 21, 2024
@arjan-bal arjan-bal removed their assignment Nov 22, 2024
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
mu sync.Mutex
state connectivity.State
mu sync.Mutex
connectivityState connectivity.State
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this kind of tracking be done in the subconn struct (scData) and not here? I would expect the lb policy only has the concludedState and each subchannel needs to track its real state and its effective state, accounting for sticky-TF and health reporting? It seems confusing to me that the LB policy itself is tracking two different states, but I'm willing to believe it's simpler this way if you tried it the other way already.

balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
@dfawley dfawley assigned arjan-bal and unassigned dfawley Nov 22, 2024
@arjan-bal arjan-bal assigned dfawley and unassigned arjan-bal Nov 25, 2024
@@ -600,11 +639,11 @@ func (b *pickfirstBalancer) updateSubConnState(sd *scData, newState balancer.Sub
// a transport is successfully created, but the connection fails
// before the SubConn can send the notification for READY. We treat
// this as a successful connection and transition to IDLE.
if (b.state == connectivity.Ready && newState.ConnectivityState != connectivity.Ready) || (oldState == connectivity.Connecting && newState.ConnectivityState == connectivity.Idle) {
if (oldState == connectivity.Ready && newState.ConnectivityState != connectivity.Ready) || (oldState == connectivity.Connecting && newState.ConnectivityState == connectivity.Idle) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If oldState is Ready, doesn't newState have to be Idle?

And if newState is Idle, the previous state must have been Ready, right? We may have not gotten the update (because of the Connecting->Ready->Idle race), but....can this whole thing be replaced by just if newState.ConnectivityState == connectivity.Idle?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If oldState is Ready, doesn't newState have to be Idle?

Yes, this is true because the health state is not sent via the connectivity listener anymore. So READY->TF->READY transitions are not possible and no duplicate READY updates are sent by addrConn.

And if newState is Idle, the previous state must have been Ready, right?

This is false as a transition from TF->IDLE is possible after the subchannel completes its backoff.

We can simplify the condition to oldState != TF && newState == IDLE, along with a comment explaining the assumptions that lead to this simplification. IMO, this may make it harder to understand. Do you think I should make the change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, please change to look like we want it to look (if we didn't have the bug) and add a clause that covers the bug:

if oldState == Ready || (oldState == Connecting && newState == Idle) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and add a TODO to remove the second half when the bug is fixed please)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
balancer/pickfirst/pickfirstleaf/pickfirstleaf.go Outdated Show resolved Hide resolved
@dfawley dfawley assigned arjan-bal and unassigned dfawley Dec 2, 2024
@arjan-bal arjan-bal assigned dfawley and unassigned arjan-bal Dec 3, 2024
@dfawley dfawley assigned arjan-bal and unassigned dfawley Dec 4, 2024
@arjan-bal arjan-bal merged commit 317271b into grpc:master Dec 5, 2024
15 checks passed
@arjan-bal arjan-bal deleted the pickfirst-health-listener branch December 5, 2024 05:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Resolvers/Balancers Includes LB policy & NR APIs, resolver/balancer/picker wrappers, LB policy impls and utilities. Type: Feature New features or improvements in behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants