replication received_uuid blocker re snap to share promotion #2902 #2911
…#2902 When promoting the oldest of the 3 read-only snapshots received & retained by the replication service (btrfs send/receive wrapper), use the force flag during the ro-to-rw/snap-to-share transition. At the time of this transition, this received subvol is no longer used for comparison in any future replication (btrfs send/receive) event. It represents an older version of the sending system's associated replication source share: necessarily older by way of the constraints of the btrfs send/receive architecture, and the safeguards of the replication wrapper, i.e. a cascade of ro snapshots.
Testing
Although the indicated issue occurs only on the 4th replication event, and only on the receiver system, we have insufficient docs (dev & end-user), so the following is to assist in documenting the mechanism used by Rockstor's replication wrapper, which is currently described only in code. Please avoid commenting on this PR until this synopsis is completed, i.e. the entire replication life-cycle has been covered (5 replication events). Test setup is Leap 15.6 based (x86_64) at both ends, with a trivial (empty) data payload Share, to avoid the suspected cause of another recent replication blocker, #2901.
Initial state
Before the first send and receive there exists only the original share to be replicated: no incidental or replication-related snapshots.
Quotas:
1st Replication event
Sender
rockstor.log:
journalctl: (note: the "2" denotes the second replication task, not event, on this system)
We create the initial ro sender snapshot (subvol) .snapshots/repshare/repshare_2_replication_1
Qgroups:
which is then sent, in full, to the receiver:
Receiver
Shows two shares in the Web-UI; the one starting with .snapshot is inadvertent and a known Web-UI bug re the initial replication stages.
and
Qgroups
2nd Replication
Sender
rockstor.log
N.B. the 6 in the above is due to missed replication events, caused by disabling & re-enabling the send-side replication service to analyse interim states.
Qgroups
Source share now has 2 snapshots:
Second snapshot details:
Receiver
Web-UI still shows the same shares as in the previous stage.
Qgroups:
Newest incremental snapshot
3rd Replication
Sender
rockstor.log
Qgroups
Source share now has 3 snapshots (as before these are displayed as expected in sender Web-UI):
Third snapshot details:
Receiver
Web-UI Shares still show the original Share, and the anomalous first snapshot sent, as per the last 2 stages.
Qgroups
Newest incremental snapshot
And the view of our anomalous initial full subvol send which, currently at least, is parent host to 2nd subsequent differential send/receive subvols (receive side):
and the details of our 3rd (to date) received subvol :
With the latest having no snapshot info yet:
4th Replication
Sender
rockstor.log
Qgroups
Source share still has 3 snapshots, but they are the last 3, the first having been cleaned away.
Fourth/latest (in replication task run) snapshot details:
Receiver
rockstor.log
Web-UI Shares are now looking more as expected, with the temporary anomalous .snapshot subvol no longer showing as the share.
Which relates to the sending Appliance ID and the original share name at the sender.
Qgroups
And our newly supplanted subvol receiving share details:
All pool snapshot only volumes at this stage:
N.B. The following also indicates parent/received_uuid and path of all subvols on receiver.
5th Replication
Sender
rockstor.log
Qgroups
Source Share
latest (in replication task run) snapshot details:
Receiver
rockstor.log
Qgroup
Current status of our recently supplanted receiving share:
I.e. our latest share is now what repshare_2_replication_6 was, hence it now having repshare_2_replication_7 as its recorded original snapshot. Note however that Rockstor shows this share as follows:
Which is as intended on the target system: the replication share is represented as a Share on the target, with 3 progressively newer snapshots that are rotated as newer snapshots are sent to the receiver. N.B. there is an argument for us returning the target Share to ro after supplanting its contents, to be addressed in a dedicated issue: writes to it will be lost as it is progressively supplanted by the oldest of the 3 cascading snapshots sent via replication. This is, in part, an artifact of having insufficient documentation on replication and its intended function, i.e. unidirectional differential block-level transit with history.
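The receiver-side rotation described above can be sketched as a small model. This is a simplified illustration only, not Rockstor's replication code: the class, the function, and the event names (e1, e2, ...) are hypothetical, and the model ignores the Web-UI anomaly seen during the first three events. It assumes the receiver retains 3 snapshots and promotes the oldest to the share whenever a further snapshot arrives.

```python
from collections import deque
from typing import Deque, List, Optional

# Number of read-only snapshots retained on the receiver (per the walkthrough).
RETAINED = 3


class ReceiverModel:
    """Hypothetical model of the receiver's snapshot cascade."""

    def __init__(self) -> None:
        self.snapshots: Deque[str] = deque()  # oldest first
        self.share: Optional[str] = None      # snapshot most recently promoted

    def receive(self, snap_name: str) -> None:
        """Record a newly received snapshot and rotate if over the limit."""
        self.snapshots.append(snap_name)
        if len(self.snapshots) > RETAINED:
            # The oldest snapshot supplants the share (snap-to-share promotion).
            self.share = self.snapshots.popleft()


def run_events(names: List[str]) -> ReceiverModel:
    """Replay a sequence of replication events against a fresh model."""
    model = ReceiverModel()
    for name in names:
        model.receive(name)
    return model


if __name__ == "__main__":
    state = run_events(["e1", "e2", "e3", "e4", "e5"])
    print("share content from:", state.share)
    print("retained snapshots:", list(state.snapshots))
```

In this model no promotion happens until the 4th event, which matches where the blocker was observed, and from then on each event repeats the same promotion step.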
From the 4th/5th replication task we assume a stable state, repeating what the previous replication event (4th) first establishes: the subvol migration of the oldest of 3 ro snapshots to share. And we now also have a share with no received_uuid in evidence:
Receiver rockstor replication clone (repclone)
Our snap-to-share 'clone' facility enacts a similar process: enabling a share-from-snapshot transition.
@FroggyFlox @Hooverdan96 So we now have at least a little sketch of the replication workings clarified. I just wanted to have a newer reference to point to, and potentially use to guide a doc entry refresh. As per @Hooverdan96's finding, we very much look to have resolved this issue re the use of a
I propose we merge this as-is given it simply adds a I.e. from the following #2777 (comment)
And at that point we had the same behaviour as was reproduced here: using this PR branch (but with a Leap 15.6 receiver).
Sounds good to me.
Fixes #2902
Adds a force option (default False) to the existing btrfs property set mnt_pt property_name property_value wrapper, set_property(); and employs this force option when enacting a repclone (replication clone) snap-to-share promotion.
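As a rough illustration of the change, the following sketch shows how a force option might alter the command that such a wrapper builds. It is not the PR's actual set_property() code: the helper name build_set_property_cmd and the example path are hypothetical, and the sketch only constructs the argument list rather than running it. It relies on the -f option of btrfs property set, which btrfs-progs requires when switching a subvolume that carries a received_uuid from ro to rw.

```python
from typing import List


def build_set_property_cmd(
    mnt_pt: str, name: str, value: str, force: bool = False
) -> List[str]:
    """Build the argv for 'btrfs property set' (hypothetical helper).

    With force=True the -f option is added, which btrfs-progs needs
    before it will flip ro to false on a received subvolume.
    """
    cmd = ["btrfs", "property", "set"]
    if force:
        cmd.append("-f")
    cmd.extend([mnt_pt, name, value])
    return cmd


if __name__ == "__main__":
    # Illustrative path only; not taken from the test systems above.
    print(build_set_property_cmd("/mnt2/pool/snap", "ro", "false", force=True))
```

Defaulting force to False keeps existing callers unchanged, so only the repclone promotion path opts in to the forced transition.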