Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: LeaseTimeToLive returns error if leader changed #17642

Merged
merged 1 commit into from
Apr 2, 2024

Conversation

fuweid
Copy link
Member

@fuweid fuweid commented Mar 25, 2024

The old leader demotes lessor and all the leases' expire time will be updated. Instead of returning incorrect remaining TTL, we should return errors to force client retry.

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

Fixes: #17506

@k8s-ci-robot
Copy link

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@fuweid
Copy link
Member Author

fuweid commented Mar 26, 2024

The old leader demotes lessor and all the leases' expire time will be
updated. Instead of returning incorrect remaining TTL, we should return
errors to force client retry.

Signed-off-by: Wei Fu <[email protected]>
@siyuanfoundation
Copy link
Contributor

/lgtm

Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

It's also a minor bug, which we should backport to 3.4 and 3.5.

@serathius
Copy link
Member

Not sure about a solution here, what are we trying to fix? The flaking TestLeaseGrantKeepAliveOnce or fact that LeaseTimeToLive is not not linearizable?

If it's the first problem, then we should just retry in the test.

If it's the second, we need to make the request linearizable in similar way we did for Lease expiration #16822, we need to do a quorum read. I think the presented solution just gives a illusion of things of problem being solved.

What are the chances that a leader is changed between checking isLeader() and returning response, pretty small. The only blocking operation is the waitAppliedIndex as it has 1 second timeout. It's long enough to consider it, however as we learned from lease expiration, there is no guarantee that leader will notice that demotion at all. In your test case you added a clean sleep that provides you a clean leader change, however in reality it doesn't happen like this. In etcd default configuration it should take exactly 1 second to members to decide that leader election is needed.

Copy link
Member

@serathius serathius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to discuss the issue more.

@fuweid
Copy link
Member Author

fuweid commented Mar 27, 2024

Not sure about a solution here, what are we trying to fix? The flaking TestLeaseGrantKeepAliveOnce or fact that LeaseTimeToLive is not not linearizable?

I would like to say it's data race issue. The TestLeaseGrantKeepAliveOnce case shows us that if leader changed after s.isLeader(), the response is not correct. Since the raft node is running in background, the leader can be changed at any time. The TestLeaseGrantKeepAliveOnce hits it.

// from test case log
2024-02-28T11:10:09.1624683Z     logger.go:130: 2024-02-28T11:05:04.811Z	INFO	m1.raft	62d1ff821e702f1 became follower at term 3	{"member": "m1"}
2024-02-28T11:10:09.1626973Z     logger.go:130: 2024-02-28T11:05:04.811Z	INFO	m1.raft	raft.node: 62d1ff821e702f1 lost leader 62d1ff821e702f1 at term 3	{"member": "m1"}
2024-02-28T11:10:09.1628294Z     lease_test.go:180: 
2024-02-28T11:10:09.1629600Z         	Error Trace:	/home/runner/actions-runner/_work/etcd/etcd/tests/common/lease_test.go:180
2024-02-28T11:10:09.1631871Z         	            				/home/runner/actions-runner/_work/etcd/etcd/tests/framework/testutils/execute.go:38
2024-02-28T11:10:09.1634051Z         	            				/home/runner/actions-runner/_work/_tool/go/1.21.6/arm64/src/runtime/asm_arm64.s:1197
2024-02-28T11:10:09.1635263Z         	Error:      	"2" is not greater than "9223372036"
2024-02-28T11:10:09.1636251Z         	Test:       	TestLeaseGrantKeepAliveOnce/PeerAutoTLS

What are the chances that a leader is changed between checking isLeader() and returning response, pretty small.

Well, CI runner is resource-limited vm and I remember we hit it many times. It's small but it's data race issue actually.
The Demoted check is to make sure that the response is correct even if leader changed after that.

In your test case you added a clean sleep that provides you a clean leader change, however in reality it doesn't happen like this.

It's hard to write regression test case for data race issue. The sleep is just used to create the race timing condition for test purpose. There is possible in production since there are two goroutines running background. The older leader's raft node processes message and demotes all the leases.

For lease remaining TTL (Granted TTL is 10s), there are several possible responses:

  • 404
  • timeout (because of no leader)
  • <= 10s

It should not be 9223372036. I don't think we should ignore it. It's minor bug but it doesn't make sense to rerun the failure cases.

@serathius
Copy link
Member

I would like to say it's data race issue. The TestLeaseGrantKeepAliveOnce case shows us that if leader changed after s.isLeader(), the response is not correct. Since the raft node is running in background, the leader can be changed at any time. The TestLeaseGrantKeepAliveOnce hits it.

So this is a test issue, we should retry. As you said leader can change at any moment, by adding second check you just increase the window where we would detect leader change, but leader can still change after le.Demoted() is called.

What I'm trying to say is that, you can add as many check to le.Demoted() as you want, it doesn't fix the underlying issue of badly written test.

@fuweid
Copy link
Member Author

fuweid commented Mar 28, 2024

So this is a test issue, we should retry. As you said leader can change at any moment, by adding second check you just increase the window where we would detect leader change, but leader can still change after le.Demoted() is called.

If we choose to retry, what is condition for retry? I don't think it should be 9223372036 because user can set this value.
And we don't have doc for this behavior.

Based on current design, the lease remaining TTL will be reset after leader changed.
The le.Demoted is called to make sure that current call is correct. If we detect that leader has been changed, we should return unavailable (ErrLeaderChanged) and etcd client sdk will retry it.

Even if leader changed after le.Demoted check, what is different with current behavior?

What I'm trying to say is that, you can add as many check to le.Demoted() as you want

IMO, we just need to ensure that response is valid before return.

it doesn't fix the underlying issue of badly written test.

Would you mind sharing idea how to fix it? In this patch, I force server to return retryable error to client. Thanks

@ahrtr
Copy link
Member

ahrtr commented Mar 31, 2024

  • This is a production bug: leaseTimeToLive may return an invalid TTL (e.g. 9223372036) during the leader changes.
  • The PR guarantees that leaseTimeToLive will never return such an invalid TTL, instead it
    • either returns a valid TTL
    • or return errors.ErrLeaderChanged

@serathius
Copy link
Member

serathius commented Apr 2, 2024

Ok, so the issue you are fixing is the race between reading the least TTL and leader change causing reset of TTLs. Makes sense. Thanks @ahrtr

@serathius serathius merged commit 09769c4 into etcd-io:main Apr 2, 2024
39 checks passed
@fuweid fuweid deleted the fix-17506 branch April 4, 2024 07:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Flaking TestLeaseGrantKeepAliveOnce
6 participants