[Request] dsync --write-verify option? #581

Open
markmoe19 opened this issue Aug 20, 2024 · 4 comments

Comments

@markmoe19

Hi,

We are looking at ways to improve the performance of the dsync -c (content) compare. As you would imagine, this compare can be very time-consuming, but users want it "to be sure" that the final sync produces an exact copy of their data.

We can do a lot of syncs before the final sync and cut-over to a new file system. However, dsync -c is currently all or nothing, so the final sync for cut-over takes a long time, and the final sync is when users are "down" so that they don't make any changes.

If dsync had an option somewhere between -c (content) and -lite (like dcmp), it could be ideal when doing a series of dsync runs before the final sync.

This -lite-content option would compare size, mtime and atime (assuming --open-noatime is used as well), so that if all three match, the content is not compared. But if any one of them differs (even if just atime), the full content is compared.
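
The check described above could be sketched roughly as follows. This is only an illustration in Python, not dsync code: `needs_content_compare` and `lite_content_equal` are hypothetical names, and the sketch assumes that an earlier sync preserved mtime and atime on the target and that the source is read without updating its atime.

```python
import filecmp
import os


def needs_content_compare(src: str, dst: str) -> bool:
    """Return True if size, mtime, or atime differ between src and dst."""
    s = os.stat(src)
    d = os.stat(dst)
    return (
        s.st_size != d.st_size
        or s.st_mtime_ns != d.st_mtime_ns
        or s.st_atime_ns != d.st_atime_ns
    )


def lite_content_equal(src: str, dst: str) -> bool:
    """Skip the byte compare when all three attributes match."""
    if not needs_content_compare(src, dst):
        # Assumed unchanged since the last sync; content is not read at all.
        return True
    # shallow=False forces a byte-by-byte comparison of the two files.
    return filecmp.cmp(src, dst, shallow=False)
```

Note that when all three attributes match, the content is trusted without being read, so this is only as strong as the verification done when the file was last written.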

Thank you for these great utilities!

  • Mark
@markmoe19
Author

After a new (or changed) file is written to the target, it has to be re-read to compare content and make sure it was written correctly. This could possibly be done with checksums. In this way, each subsequent run builds on the successfully compared content of the previous run, with no need to re-compare (unless size, mtime or atime changed). Thanks

@markmoe19
Author

After discussing this internally at our company, I think what we are looking for is to replace -c (byte-compare) with a --write-verify option.

When dsync writes a new file (because the target file does not exist or because the source file has changed since it was last written), we want dsync to verify the write after it completes. This could use a byte compare or a checksum method (similar to the rsync --checksum option). Potentially, the checksum for the source side could be calculated as part of the normal read of the source file. The checksum of the target side would require an additional read of the recently written target file.
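
As a rough sketch of the checksum variant, assuming SHA-256 and a 1 MiB chunk size (both arbitrary choices for illustration): the source checksum is computed during the copy itself, and only the target needs an extra read. The function names `copy_with_checksum`, `file_checksum`, and `write_verify` are hypothetical and are not part of dsync.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # 1 MiB read size, an arbitrary choice for this sketch


def copy_with_checksum(src: str, dst: str) -> str:
    """Copy src to dst, hashing the source bytes as they are read."""
    h = hashlib.sha256()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while chunk := fin.read(CHUNK):
            h.update(chunk)
            fout.write(chunk)
        # Ask the OS to push the written data to storage before verifying.
        fout.flush()
        os.fsync(fout.fileno())
    return h.hexdigest()


def file_checksum(path: str) -> str:
    """Re-read a file and return its SHA-256 hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            h.update(chunk)
    return h.hexdigest()


def write_verify(src: str, dst: str) -> bool:
    """Copy src to dst, then verify dst by checksum (one extra read of dst)."""
    expected = copy_with_checksum(src, dst)
    return file_checksum(dst) == expected
```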

The end goal is for each subsequent dsync run on the same source and target to build with confidence on the previous run, knowing that each written file was verified. If the file time and size match, no new write of the target file needs to occur.

@markmoe19 changed the title from "[Request] dsync -lite-content option?" to "[Request] dsync --write-verify option?" on Aug 30, 2024
@adilger
Contributor

adilger commented Aug 30, 2024

One potential issue of doing the read verification immediately after write is that this may miss issues if the file is only in cache on a client node, and not saved persistently (or correctly) to storage on the server node.

At a minimum, the read verification would need to be done on another node (like IOR does).

@markmoe19
Author

Maybe the read part of the write verification could be done at the end, after issuing a drop-caches command, or, as you mention, on a node different from the one that wrote the file (that should be possible, assuming dsync is running on more than one node). Thanks
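
One simple way to pick a different node for the re-read is to shift each writer's rank by one node's worth of ranks. The sketch below is only an assumption about how that could look, not how dsync or IOR assign work: `verifier_rank` is a hypothetical helper, and it assumes ranks are placed on nodes in contiguous blocks of `ranks_per_node`.

```python
def verifier_rank(writer_rank: int, nranks: int, ranks_per_node: int = 1) -> int:
    """Pick a rank on a different node to re-read a file written by writer_rank.

    Assumes block placement: rank // ranks_per_node is the node index.
    """
    if ranks_per_node < 1 or nranks % ranks_per_node != 0:
        raise ValueError("nranks must be a positive multiple of ranks_per_node")
    if not 0 <= writer_rank < nranks:
        raise ValueError("writer_rank out of range")
    if nranks == ranks_per_node:
        # Only one node is in use, so there is no other node to verify from.
        return writer_rank
    # Shift by one node's worth of ranks, wrapping around at the end.
    return (writer_rank + ranks_per_node) % nranks
```

With a single node the function falls back to the writing rank, in which case the re-read would still depend on the cache being dropped first.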
