Skip to content
This repository has been archived by the owner on Mar 4, 2024. It is now read-only.

replication: Try to reinstate entries that have been truncated away #483

Merged
merged 6 commits into from
Nov 27, 2023

Conversation

cole-miller
Copy link
Contributor

Experiment with fixing #470

@cole-miller
Copy link
Contributor Author

please test downstream

@codecov
Copy link

codecov bot commented Sep 26, 2023

Codecov Report

Attention: 8 lines in your changes are missing coverage. Please review.

Comparison is base (b68076f) 76.84% compared to head (0cd0265) 77.22%.
Report is 5 commits behind head on master.

Files Patch % Lines
src/log.c 88.57% 1 Missing and 3 partials ⚠️
src/replication.c 50.00% 1 Missing and 2 partials ⚠️
src/fixture.c 97.67% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #483      +/-   ##
==========================================
+ Coverage   76.84%   77.22%   +0.37%     
==========================================
  Files          51       51              
  Lines        9748     9792      +44     
  Branches     2501     2504       +3     
==========================================
+ Hits         7491     7562      +71     
+ Misses       1075     1052      -23     
+ Partials     1182     1178       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@cole-miller
Copy link
Contributor Author

please test downstream

@cole-miller
Copy link
Contributor Author

please test downstream

@cole-miller
Copy link
Contributor Author

Test results look encouraging!

@cole-miller
Copy link
Contributor Author

please test downstream

@cole-miller
Copy link
Contributor Author

please test downstream

@calvin2021y

This comment was marked as off-topic.

@freeekanayaka

This comment was marked as off-topic.

@cole-miller

This comment was marked as off-topic.

@calvin2021y

This comment was marked as off-topic.

@freeekanayaka

This comment was marked as off-topic.

@cole-miller
Copy link
Contributor Author

Fixes #470. To sum up: libraft's in-memory log implementation can only track a single log entry for each (index, term) pair. We also need to keep log entries around as long as they're referenced by an AppendEntries message that's waiting to be sent. This leads to a situation where a leader steps down suddenly because its disk write for an entry fails, but it's still holding on to that entry in its in-memory log. Another node that received the entry from the original leader can win the next election and send the original leader that same entry, and the original leader will encounter a collision when it tries to append the entry to its log for a second time.

The fix is to have the original leader notice in this situation that the entry it's just received is already tracked by the in-memory log, and just increment the refcount of that existing entry.

src/log.h Outdated Show resolved Hide resolved
@freeekanayaka
Copy link
Contributor

FWIW, if it could serve as inspiration I've worked around this issue with this approach cowsql/raft#101. Probably not ideal, but good enough for getting rid of the bug and the assertion failure (I'm planning a broader fix later).

@cole-miller cole-miller marked this pull request as ready for review November 22, 2023 13:51
src/fixture.c Outdated Show resolved Hide resolved
@cole-miller cole-miller merged commit d15a7a6 into canonical:master Nov 27, 2023
19 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants