*: LeaseTimeToLive returns error if leader changed #17642

fuweid · 2024-03-25T15:21:19Z

The old leader demotes lessor and all the leases' expire time will be updated. Instead of returning incorrect remaining TTL, we should return errors to force client retry.

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

Fixes: #17506

k8s-ci-robot · 2024-03-25T15:21:21Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

fuweid · 2024-03-26T09:38:41Z

cc @ahrtr @serathius @jmhbnz @siyuanfoundation

tests/integration/v3_lease_test.go

The old leader demotes lessor and all the leases' expire time will be updated. Instead of returning incorrect remaining TTL, we should return errors to force client retry. Signed-off-by: Wei Fu <[email protected]>

siyuanfoundation · 2024-03-26T17:16:38Z

/lgtm

ahrtr

/lgtm

It's also a minor bug, which we should backport to 3.4 and 3.5.

serathius · 2024-03-27T09:47:15Z

Not sure about a solution here, what are we trying to fix? The flaking TestLeaseGrantKeepAliveOnce or fact that LeaseTimeToLive is not not linearizable?

If it's the first problem, then we should just retry in the test.

If it's the second, we need to make the request linearizable in similar way we did for Lease expiration #16822, we need to do a quorum read. I think the presented solution just gives a illusion of things of problem being solved.

What are the chances that a leader is changed between checking isLeader() and returning response, pretty small. The only blocking operation is the waitAppliedIndex as it has 1 second timeout. It's long enough to consider it, however as we learned from lease expiration, there is no guarantee that leader will notice that demotion at all. In your test case you added a clean sleep that provides you a clean leader change, however in reality it doesn't happen like this. In etcd default configuration it should take exactly 1 second to members to decide that leader election is needed.

serathius

I think we need to discuss the issue more.

fuweid · 2024-03-27T10:13:56Z

Not sure about a solution here, what are we trying to fix? The flaking TestLeaseGrantKeepAliveOnce or fact that LeaseTimeToLive is not not linearizable?

I would like to say it's data race issue. The TestLeaseGrantKeepAliveOnce case shows us that if leader changed after s.isLeader(), the response is not correct. Since the raft node is running in background, the leader can be changed at any time. The TestLeaseGrantKeepAliveOnce hits it.

// from test case log
2024-02-28T11:10:09.1624683Z     logger.go:130: 2024-02-28T11:05:04.811Z	INFO	m1.raft	62d1ff821e702f1 became follower at term 3	{"member": "m1"}
2024-02-28T11:10:09.1626973Z     logger.go:130: 2024-02-28T11:05:04.811Z	INFO	m1.raft	raft.node: 62d1ff821e702f1 lost leader 62d1ff821e702f1 at term 3	{"member": "m1"}
2024-02-28T11:10:09.1628294Z     lease_test.go:180: 
2024-02-28T11:10:09.1629600Z         	Error Trace:	/home/runner/actions-runner/_work/etcd/etcd/tests/common/lease_test.go:180
2024-02-28T11:10:09.1631871Z         	            				/home/runner/actions-runner/_work/etcd/etcd/tests/framework/testutils/execute.go:38
2024-02-28T11:10:09.1634051Z         	            				/home/runner/actions-runner/_work/_tool/go/1.21.6/arm64/src/runtime/asm_arm64.s:1197
2024-02-28T11:10:09.1635263Z         	Error:      	"2" is not greater than "9223372036"
2024-02-28T11:10:09.1636251Z         	Test:       	TestLeaseGrantKeepAliveOnce/PeerAutoTLS

What are the chances that a leader is changed between checking isLeader() and returning response, pretty small.

Well, CI runner is resource-limited vm and I remember we hit it many times. It's small but it's data race issue actually.
The Demoted check is to make sure that the response is correct even if leader changed after that.

In your test case you added a clean sleep that provides you a clean leader change, however in reality it doesn't happen like this.

It's hard to write regression test case for data race issue. The sleep is just used to create the race timing condition for test purpose. There is possible in production since there are two goroutines running background. The older leader's raft node processes message and demotes all the leases.

For lease remaining TTL (Granted TTL is 10s), there are several possible responses:

404
timeout (because of no leader)
<= 10s

It should not be 9223372036. I don't think we should ignore it. It's minor bug but it doesn't make sense to rerun the failure cases.

serathius · 2024-03-28T11:22:20Z

I would like to say it's data race issue. The TestLeaseGrantKeepAliveOnce case shows us that if leader changed after s.isLeader(), the response is not correct. Since the raft node is running in background, the leader can be changed at any time. The TestLeaseGrantKeepAliveOnce hits it.

So this is a test issue, we should retry. As you said leader can change at any moment, by adding second check you just increase the window where we would detect leader change, but leader can still change after le.Demoted() is called.

What I'm trying to say is that, you can add as many check to le.Demoted() as you want, it doesn't fix the underlying issue of badly written test.

fuweid · 2024-03-28T11:38:35Z

So this is a test issue, we should retry. As you said leader can change at any moment, by adding second check you just increase the window where we would detect leader change, but leader can still change after le.Demoted() is called.

If we choose to retry, what is condition for retry? I don't think it should be 9223372036 because user can set this value.
And we don't have doc for this behavior.

Based on current design, the lease remaining TTL will be reset after leader changed.
The le.Demoted is called to make sure that current call is correct. If we detect that leader has been changed, we should return unavailable (ErrLeaderChanged) and etcd client sdk will retry it.

Even if leader changed after le.Demoted check, what is different with current behavior?

What I'm trying to say is that, you can add as many check to le.Demoted() as you want

IMO, we just need to ensure that response is valid before return.

it doesn't fix the underlying issue of badly written test.

Would you mind sharing idea how to fix it? In this patch, I force server to return retryable error to client. Thanks

ahrtr · 2024-03-31T15:40:58Z

This is a production bug: leaseTimeToLive may return an invalid TTL (e.g. 9223372036) during the leader changes.
The PR guarantees that leaseTimeToLive will never return such an invalid TTL, instead it
- either returns a valid TTL
- or return errors.ErrLeaderChanged

serathius · 2024-04-02T12:40:15Z

Ok, so the issue you are fixing is the race between reading the least TTL and leader change causing reset of TTLs. Makes sense. Thanks @ahrtr

k8s-ci-robot added the do-not-merge/work-in-progress label Mar 25, 2024

fuweid force-pushed the fix-17506 branch from 42b7609 to b976121 Compare March 25, 2024 15:29

fuweid marked this pull request as ready for review March 25, 2024 16:04

k8s-ci-robot removed the do-not-merge/work-in-progress label Mar 25, 2024

serathius reviewed Mar 26, 2024

View reviewed changes

tests/integration/v3_lease_test.go Outdated Show resolved Hide resolved

*: LeaseTimeToLive returns error if leader changed

d3bb6f6

The old leader demotes lessor and all the leases' expire time will be updated. Instead of returning incorrect remaining TTL, we should return errors to force client retry. Signed-off-by: Wei Fu <[email protected]>

fuweid force-pushed the fix-17506 branch from b976121 to d3bb6f6 Compare March 26, 2024 11:01

ahrtr approved these changes Mar 27, 2024

View reviewed changes

serathius requested changes Mar 27, 2024

View reviewed changes

serathius approved these changes Apr 2, 2024

View reviewed changes

serathius merged commit 09769c4 into etcd-io:main Apr 2, 2024
39 checks passed

jmhbnz added backport/v3.4 backport/v3.5 labels Apr 2, 2024

This was referenced Apr 4, 2024

[3.5] *: LeaseTimeToLive returns error if leader changed #17704

Merged

[3.4] *: LeaseTimeToLive returns error if leader changed #17705

Merged

fuweid deleted the fix-17506 branch April 4, 2024 07:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

*: LeaseTimeToLive returns error if leader changed #17642

*: LeaseTimeToLive returns error if leader changed #17642

fuweid commented Mar 25, 2024

k8s-ci-robot commented Mar 25, 2024

fuweid commented Mar 26, 2024

siyuanfoundation commented Mar 26, 2024

ahrtr left a comment

serathius commented Mar 27, 2024

serathius left a comment

fuweid commented Mar 27, 2024

serathius commented Mar 28, 2024

fuweid commented Mar 28, 2024 •

edited

Loading

ahrtr commented Mar 31, 2024

serathius commented Apr 2, 2024 •

edited

Loading

*: LeaseTimeToLive returns error if leader changed #17642

*: LeaseTimeToLive returns error if leader changed #17642

Conversation

fuweid commented Mar 25, 2024

k8s-ci-robot commented Mar 25, 2024

fuweid commented Mar 26, 2024

siyuanfoundation commented Mar 26, 2024

ahrtr left a comment

Choose a reason for hiding this comment

serathius commented Mar 27, 2024

serathius left a comment

Choose a reason for hiding this comment

fuweid commented Mar 27, 2024

serathius commented Mar 28, 2024

fuweid commented Mar 28, 2024 • edited Loading

ahrtr commented Mar 31, 2024

serathius commented Apr 2, 2024 • edited Loading

fuweid commented Mar 28, 2024 •

edited

Loading

serathius commented Apr 2, 2024 •

edited

Loading