fix WARNING: DATA RACE issue when multiple goroutines access the backend #13875
Conversation
Codecov Report
@@ Coverage Diff @@
## main #13875 +/- ##
==========================================
+ Coverage 72.50% 72.55% +0.04%
==========================================
Files 468 468
Lines 38307 38315 +8
==========================================
+ Hits 27776 27799 +23
+ Misses 8745 8736 -9
+ Partials 1786 1780 -6
```diff
@@ -237,7 +237,11 @@ func (s *EtcdServer) Compact(ctx context.Context, r *pb.CompactionRequest) (*pb.
 	// the hash may revert to a hash prior to compaction completing
 	// if the compaction resumes. Force the finished compaction to
 	// commit so it won't resume following a crash.
+	//
+	// `applySnapshot` sets a new backend instance, so we need to acquire the bemu lock.
+	s.bemu.RLock()
 	s.be.ForceCommit()
```
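For context, the race being fixed is a read of the `s.be` pointer (e.g. when calling `ForceCommit`) concurrent with `applySnapshot` replacing that pointer. Below is a minimal, self-contained sketch of the locking pattern, using simplified stand-in types rather than etcd's real `EtcdServer`/`backend.Backend` definitions:

```go
package main

import (
	"fmt"
	"sync"
)

// Backend stands in for etcd's backend interface; only the one method used
// below is modeled.
type Backend interface {
	ForceCommit()
}

type memBackend struct{ id int }

func (b *memBackend) ForceCommit() { fmt.Printf("backend %d: force commit\n", b.id) }

// Server mirrors the relevant fields of EtcdServer: bemu guards be because
// applySnapshot replaces the backend instance.
type Server struct {
	bemu sync.RWMutex
	be   Backend
}

// Compact touches the current backend, so it holds the read lock for the
// duration of the call, as the PR now does around ForceCommit.
func (s *Server) Compact() {
	s.bemu.RLock()
	defer s.bemu.RUnlock()
	s.be.ForceCommit()
}

// applySnapshot swaps in a fresh backend under the write lock; this is the
// write that raced with the previously unguarded ForceCommit.
func (s *Server) applySnapshot(newBe Backend) {
	s.bemu.Lock()
	defer s.bemu.Unlock()
	s.be = newBe
}

func main() {
	s := &Server{be: &memBackend{id: 1}}

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); s.Compact() }()
	go func() { defer wg.Done(); s.applySnapshot(&memBackend{id: 2}) }()
	wg.Wait()
}
```

Readers hold `bemu.RLock` only while they touch the current backend, and `applySnapshot` takes the write lock to swap the instance, so the pointer read and the swap can no longer race.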
It's interesting whether we have good test coverage of 'swapping' backends in the middle of compaction.
The scheduledCompaction could theoretically finish on the 'old' backend while the recovered backend remains uncompacted...
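To illustrate that concern, here is a hypothetical sketch (simplified names such as `scheduleCompaction` and `applySnapshot`, not etcd's actual signatures) of how work captured against the old backend can still complete after the swap, leaving the recovered backend uncompacted; the read lock removes the data race on the pointer but does not prevent this:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Backend is a stand-in; compaction is modeled as a slow operation that
// keeps using whichever backend instance it captured at the start.
type Backend struct{ name string }

func (b *Backend) Compact() { fmt.Printf("compaction ran on %s backend\n", b.name) }

type Server struct {
	bemu sync.RWMutex
	be   *Backend
}

// scheduleCompaction captures the current backend under RLock, then keeps
// working on that captured instance after the lock is released.
func (s *Server) scheduleCompaction(done chan<- struct{}) {
	s.bemu.RLock()
	be := s.be // pointer captured here
	s.bemu.RUnlock()

	go func() {
		time.Sleep(50 * time.Millisecond) // long-running compaction
		be.Compact()                      // may run on the *old* backend
		close(done)
	}()
}

// applySnapshot swaps in a recovered backend; the captured pointer above
// still refers to the old instance.
func (s *Server) applySnapshot() {
	s.bemu.Lock()
	s.be = &Backend{name: "recovered"}
	s.bemu.Unlock()
}

func main() {
	s := &Server{be: &Backend{name: "old"}}
	done := make(chan struct{})
	s.scheduleCompaction(done)
	s.applySnapshot()
	<-done // prints "compaction ran on old backend"; the recovered backend stays uncompacted
}
```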
Let's discuss & fix the race conditions between compaction, defragmentation, snapshot, and applySnapshot in separate issues/PRs. I will drive it.
Thank you for fixing this.
cc @serathius @spzala Please also take a look. Thanks. The fix in this PR should improve pipeline stability.
Fixes issues/13873.
One of the CI failures in pull/13854 was also caused by this issue.
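For anyone reproducing this class of failure: the "WARNING: DATA RACE" reports come from Go's race detector (tests run with `go test -race`). Below is a minimal, hypothetical reproducer of the same shape of race, an unsynchronized read and write of a shared backend pointer; it is not taken from etcd's test code:

```go
package raceexample

import (
	"sync"
	"testing"
)

// TestBackendSwapRace reproduces, in miniature, the kind of report the CI
// failure showed: run it with `go test -race` and the detector flags the
// unsynchronized read and write of the shared pointer.
func TestBackendSwapRace(t *testing.T) {
	type backend struct{ committed bool }
	shared := &backend{}

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // reader: stands in for ForceCommit on s.be
		defer wg.Done()
		_ = shared.committed
	}()
	go func() { // writer: stands in for applySnapshot replacing s.be
		defer wg.Done()
		shared = &backend{committed: true}
	}()
	wg.Wait()
}
```

Running this under `go test -race` produces a WARNING: DATA RACE report pointing at the read in the first goroutine and the write in the second, the same pattern the bemu lock eliminates in EtcdServer.Compact.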