Always use Lucene index in peer recovery #2077
With soft deletes no longer optional, peer recovery now always uses the Lucene index instead of replaying operations from the translog.
Signed-off-by: Nicholas Walter Knize <[email protected]>
Can one of the admins verify this patch?
  resources.add(retentionLock);
  final long startingSeqNo;
  final boolean isSequenceNumberBasedRecovery = request.startingSeqNo() != SequenceNumbers.UNASSIGNED_SEQ_NO
      && isTargetSameHistory()
-     && shard.hasCompleteHistoryOperations("peer-recovery", historySource, request.startingSeqNo())
-     && (historySource == Engine.HistorySource.TRANSLOG
+     && shard.hasCompleteHistoryOperations("peer-recovery", request.startingSeqNo())
Very difficult to follow this `if`s forest; maybe a few intermediate `boolean`s would help?
I think I'd rather do that in a follow-up, because some of this is going to get refactored away anyway with the coming segrep changes.
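For illustration, the intermediate-boolean idea from this thread could look roughly like the sketch below. It is a self-contained stand-in, not the code in `RecoverySourceHandler`: the class name, the `BooleanSupplier` parameters, and the local constant are assumptions, and only the three conditions visible in the hunk above are modeled. Any further conditions in the real expression are left out.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch: names and structure are illustrative, not taken from the PR.
class RecoveryDecisionSketch {
    // Local stand-in for SequenceNumbers.UNASSIGNED_SEQ_NO (value assumed for this sketch).
    static final long UNASSIGNED_SEQ_NO = -2L;

    static boolean isSequenceNumberBasedRecovery(
        long startingSeqNo,
        BooleanSupplier targetSameHistory,   // stands in for isTargetSameHistory()
        BooleanSupplier completeHistory      // stands in for shard.hasCompleteHistoryOperations(...)
    ) {
        // Each named step builds on the previous one, so later checks
        // are only evaluated when the earlier ones passed.
        final boolean hasStartingSeqNo = startingSeqNo != UNASSIGNED_SEQ_NO;
        final boolean sameHistory = hasStartingSeqNo && targetSameHistory.getAsBoolean();
        final boolean historyComplete = sameHistory && completeHistory.getAsBoolean();
        return historyComplete;
    }

    public static void main(String[] args) {
        System.out.println(isSequenceNumberBasedRecovery(42L, () -> true, () -> true));
        System.out.println(isSequenceNumberBasedRecovery(UNASSIGNED_SEQ_NO, () -> true, () -> true));
    }
}
```

Chaining each boolean on the previous one keeps the short-circuit behavior of the original `&&` chain, so the history check still runs only when the cheaper checks pass.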
server/src/main/java/org/opensearch/indices/recovery/RecoverySourceHandler.java
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java
-     shard.estimateNumberOfHistoryOperations("peer-recovery", historySource, startingSeqNo)
- );
- final Translog.Snapshot phase2Snapshot = shard.getHistoryOperations("peer-recovery", historySource, startingSeqNo);
+ logger.trace("snapshot translog for recovery; current size is [{}]", estimateNumberOfHistoryOperations(startingSeqNo));
This isn't directly related to the change here, but would it be better to refactor this logging statement so that `estimateNumberOfHistoryOperations()` only gets computed if trace logging is actually enabled? It looks like that method is doing a non-trivial amount of work. I don't know the context in which this is called to know if it would make any practical difference though.
💯 Good catch! Wrapped in logger.isTraceEnabled()
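A minimal sketch of the guard described here, under stated assumptions: it uses `java.util.logging` (where `FINEST` plays the role of trace) instead of the Log4j logger in the PR so that it runs without extra dependencies, and `estimateNumberOfHistoryOperations` is a hypothetical placeholder that only counts how often it is invoked.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: a stand-in for the guarded trace statement, not the PR's code.
class TraceGuardSketch {
    static final Logger logger = Logger.getLogger("recovery-sketch");
    static final AtomicInteger estimateCalls = new AtomicInteger();

    // Placeholder for an expensive computation; it records each invocation.
    static int estimateNumberOfHistoryOperations(long startingSeqNo) {
        estimateCalls.incrementAndGet();
        return 0; // dummy value, the real method would count operations
    }

    static void logSnapshotSize(long startingSeqNo) {
        // The argument is only computed when the finest (trace-like) level is enabled.
        if (logger.isLoggable(Level.FINEST)) {
            logger.log(
                Level.FINEST,
                "snapshot for recovery; current size is [{0}]",
                estimateNumberOfHistoryOperations(startingSeqNo)
            );
        }
    }

    public static void main(String[] args) {
        logger.setLevel(Level.INFO);
        logSnapshotSize(0L);
        System.out.println("calls with trace disabled: " + estimateCalls.get());
        logger.setLevel(Level.FINEST);
        logSnapshotSize(0L);
        System.out.println("calls with trace enabled: " + estimateCalls.get());
    }
}
```

Without the guard, the argument expression is evaluated before the logging call regardless of the configured level, which is the cost the review comment points at.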
@saratvemulapalli hm ... it is there but signature has changed a bit:
Correct. Just add `true` for accurate counts. Note that I almost removed this because it isn't used anywhere in core. I kept it around in case plugins (e.g., CCR) need it. We might consider moving it to the CCR plugin if that's the only one that uses it, or to the common-utils module if we need to reuse it across plugins.
Sorry, I should have been more specific. CCR was depending on
Since soft deletes are no longer optional, Translog is also no longer an option for HistorySource. We now always use the Lucene index per this PR.
So if I understand this correctly, the only way for plugins to pull history is to get the information from the Lucene index (a.k.a. seqNos).
CCR has to change to use the new signature, yes, and not depend on the "translog" file as a history source. CCR still gets operations history, but it's now from the Lucene operations index instead of the translog file.
With soft deletes no longer optional, this PR switches peer recovery to always use the
Lucene index instead of replaying operations from the translog. This reduces disk footprint
and storage costs for the end user by relying on improved stored fields compression. Note
that there is a slight CPU performance penalty, which recent releases of Lucene have reduced.
For now we accept the CPU penalty in exchange for a smaller disk footprint and lower storage
cost. If desired, we can explore offering both options, selectable by an expert setting
(similar to `index.codec`).