fix WARNING: DATA RACE issue when multiple goroutines access the backend #13875

Merged · 1 commit · Apr 4, 2022
8 changes: 8 additions & 0 deletions server/etcdserver/adapters.go
@@ -74,6 +74,10 @@ func (s *serverVersionAdapter) GetMembersVersions() map[string]*version.Versions
}

func (s *serverVersionAdapter) GetStorageVersion() *semver.Version {
// `applySnapshot` sets a new backend instance, so we need to acquire the bemu lock.
s.bemu.RLock()
defer s.bemu.RUnlock()

tx := s.be.BatchTx()
tx.Lock()
defer tx.Unlock()
@@ -85,6 +89,10 @@ func (s *serverVersionAdapter) GetStorageVersion() *semver.Version {
}

func (s *serverVersionAdapter) UpdateStorageVersion(target semver.Version) error {
// `applySnapshot` sets a new backend instance, so we need to acquire the bemu lock.
s.bemu.RLock()
defer s.bemu.RUnlock()

tx := s.be.BatchTx()
tx.Lock()
defer tx.Unlock()
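For context, the new read locks exist because `applySnapshot` replaces the `s.be` field with a fresh backend instance while other goroutines may still be dereferencing it. A minimal, self-contained sketch of that pattern (illustrative names only, not etcd's actual code):

```go
package main

import "sync"

// Backend stands in for backend.Backend; it is only an illustration.
type Backend struct{ path string }

type server struct {
	bemu sync.RWMutex // guards be, mirroring EtcdServer.bemu
	be   *Backend
}

// read models GetStorageVersion/UpdateStorageVersion: it holds the read
// lock so the backend cannot be swapped out from under it mid-call.
func (s *server) read() string {
	s.bemu.RLock()
	defer s.bemu.RUnlock()
	return s.be.path
}

// swap models what applySnapshot conceptually does: install a new backend
// under the write lock so every reader sees either the old or the new one.
func (s *server) swap(newbe *Backend) {
	s.bemu.Lock()
	defer s.bemu.Unlock()
	s.be = newbe
}

func main() {
	s := &server{be: &Backend{path: "old.db"}}
	done := make(chan struct{})
	go func() {
		for i := 0; i < 1000; i++ {
			_ = s.read()
		}
		close(done)
	}()
	s.swap(&Backend{path: "new.db"})
	<-done
}
```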
6 changes: 3 additions & 3 deletions server/etcdserver/server.go
@@ -256,7 +256,7 @@ type EtcdServer struct {

kv mvcc.WatchableKV
lessor lease.Lessor
bemu sync.Mutex
bemu sync.RWMutex
be backend.Backend
beHooks *serverstorage.BackendHooks
authStore auth.AuthStore
@@ -2229,8 +2229,8 @@ func (s *EtcdServer) parseProposeCtxErr(err error, start time.Time) error {

func (s *EtcdServer) KV() mvcc.WatchableKV { return s.kv }
func (s *EtcdServer) Backend() backend.Backend {
s.bemu.Lock()
defer s.bemu.Unlock()
s.bemu.RLock()
defer s.bemu.RUnlock()
return s.be
}

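Switching `bemu` from `sync.Mutex` to `sync.RWMutex` lets readers such as `Backend()` proceed concurrently while still excluding the writer in `applySnapshot`. A hypothetical test (not taken from the etcd repo) that is clean under `go test -race` only because both sides use the same lock:

```go
package backendrace

import (
	"sync"
	"testing"
)

type fakeBackend struct{ id int }

type srv struct {
	bemu sync.RWMutex
	be   *fakeBackend
}

// backend mirrors EtcdServer.Backend after this PR: a read-locked accessor.
func (s *srv) backend() *fakeBackend {
	s.bemu.RLock()
	defer s.bemu.RUnlock()
	return s.be
}

// TestBackendSwap passes under `go test -race`; remove the locking on
// either side and the detector prints "WARNING: DATA RACE".
func TestBackendSwap(t *testing.T) {
	s := &srv{be: &fakeBackend{id: 1}}
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = s.backend()
		}
	}()
	// The "applySnapshot" side: replace the backend under the write lock.
	s.bemu.Lock()
	s.be = &fakeBackend{id: 2}
	s.bemu.Unlock()
	wg.Wait()
}
```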
4 changes: 4 additions & 0 deletions server/etcdserver/v3_server.go
@@ -237,7 +237,11 @@ func (s *EtcdServer) Compact(ctx context.Context, r *pb.CompactionRequest) (*pb.
// the hash may revert to a hash prior to compaction completing
// if the compaction resumes. Force the finished compaction to
// commit so it won't resume following a crash.
//
// `applySnapshot` sets a new backend instance, so we need to acquire the bemu lock.
s.bemu.RLock()
s.be.ForceCommit()
Contributor (inline review comment):
It's interesting whether we have good test coverage of swapping backends in the middle of compaction.

The scheduledCompaction might theoretically finish on the 'old' backend while the recovered one remains uncompacted...

Member (PR author) replied:

Let's discuss and fix the race conditions between compaction, defragmentation, snapshot, and applySnapshot in separate issues/PRs. I will drive it.

s.bemu.RUnlock()
trace.Step("physically apply compaction")
}
if err != nil {
2 changes: 2 additions & 0 deletions server/storage/backend/backend.go
@@ -419,6 +419,8 @@ func (b *backend) run() {
func (b *backend) Close() error {
close(b.stopc)
<-b.donec
b.mu.Lock()
defer b.mu.Unlock()
return b.db.Close()
}
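
The extra lock in `Close` follows the same principle: `b.db.Close()` must not run while another goroutine is still touching `b.db`. Assuming, as the diff suggests, that `b.mu` is the mutex guarding access to `b.db` elsewhere in the backend (for example when transactions are begun), a compact sketch of the shape of the fix (illustrative types, not the etcd backend):

```go
package main

import "sync"

// boltLike stands in for *bolt.DB; using it after Close would be unsafe.
type boltLike struct{ closed bool }

func (d *boltLike) View()  { _ = d.closed }
func (d *boltLike) Close() { d.closed = true }

type backend struct {
	mu sync.Mutex // assumed to guard every access to db, as in the diff
	db *boltLike
}

func (b *backend) read() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.db.View()
}

// Close mirrors the change above: take b.mu so no other goroutine is
// mid-access to b.db when the underlying database is closed.
func (b *backend) Close() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.db.Close()
}

func main() {
	b := &backend{db: &boltLike{}}
	done := make(chan struct{})
	go func() { b.read(); close(done) }()
	b.Close()
	<-done
}
```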
