Dedup before upload #1114

Merged
merged 6 commits into from
Jan 11, 2024

Conversation

zatteo
Contributor

@zatteo zatteo commented Jan 8, 2024

Backup dedup = do not upload media already present in the Cozy.

To do so, we need to fetch all remote media and then compare them with all local media. This can take some time, especially if we have thousands of media files remotely and locally (up to tens of minutes). That's why we now want to run the dedup media by media before every upload, and not for all media at once before every backup.

@zatteo zatteo force-pushed the feat/dedup-before-upload branch 4 times, most recently from 0fc50c0 to c0e4f70 Compare January 10, 2024 13:44
@zatteo zatteo marked this pull request as ready for review January 10, 2024 13:45
Backup dedup = do not upload media already present in the Cozy.

To do so, we need to fetch all remote media and then compare them
with all local media. This can take some time, especially
if we have thousands of media files remotely and locally (up to 10 minutes).
That's why we now want to run the dedup media by media before every upload,
and not for all media at once before every backup.

Let's start by fetching all remote media during the preparation and
storing them in memory.
It does not change the heart of the algorithm. It will just be easier
to debug if we need to investigate date comparison issues.
Before every upload, we check whether the media is present in the
list of remote media we fetched earlier.
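
The check described above could look roughly like the following. This is only a sketch, not the code from this PR: the `MediaInfo` shape, the `mediaKey` helper and the rule "same name and same creation date" are assumptions made for illustration (the commit messages only mention that dates are compared).

```typescript
// Hypothetical shape: the real media types are not shown in this PR.
interface MediaInfo {
  name: string;
  creationDate: number; // milliseconds since epoch (assumed)
}

// Assumed matching rule: same name and same creation date.
const mediaKey = (media: MediaInfo): string =>
  `${media.name}|${media.creationDate}`;

// Built once during preparation, from the remote media kept in memory.
const buildRemoteIndex = (remoteMedias: MediaInfo[]): Set<string> =>
  new Set(remoteMedias.map(mediaKey));

// Called before every upload: true means the upload can be skipped.
const isAlreadyBackedUp = (
  media: MediaInfo,
  remoteIndex: Set<string>
): boolean => remoteIndex.has(mediaKey(media));
```

Indexing the remote list in a `Set` keeps each per-media check constant-time, so the cost of the comparison is spread over the uploads instead of being paid up front.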
…ont end

It can be strange for the user to see that the number of media to
back up is not the same as the number of media actually uploaded to their backup folder.
So we specify that x media have been deduplicated and return that count to the front end.
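
As an illustration of the count described above, here is a minimal sketch. It assumes each media ends the backup with one of two outcomes; the `MediaOutcome` and `BackupSummary` types and both function names are hypothetical and do not come from this PR.

```typescript
// Hypothetical outcome recorded for each media once it has been processed.
type MediaOutcome = 'uploaded' | 'deduplicated';

// Hypothetical payload returned to the front end.
interface BackupSummary {
  totalToBackup: number;
  uploadedCount: number;
  deduplicatedCount: number;
}

// Count how many media were really uploaded and how many were skipped.
const summarizeBackup = (outcomes: MediaOutcome[]): BackupSummary => {
  const summary: BackupSummary = {
    totalToBackup: outcomes.length,
    uploadedCount: 0,
    deduplicatedCount: 0
  };
  for (const outcome of outcomes) {
    if (outcome === 'deduplicated') {
      summary.deduplicatedCount += 1;
    } else {
      summary.uploadedCount += 1;
    }
  }
  return summary;
};

// Text explaining why fewer media were uploaded than announced.
const formatSummary = (summary: BackupSummary): string =>
  `${summary.totalToBackup} media to back up: ` +
  `${summary.uploadedCount} uploaded, ` +
  `${summary.deduplicatedCount} deduplicated`;
```

With such a summary, the two counts always add up to the number of media announced at the start of the backup, which is what makes the difference explainable to the user.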
@zatteo zatteo force-pushed the feat/dedup-before-upload branch from c0e4f70 to 4843eda Compare January 11, 2024 10:50
@zatteo zatteo merged commit 8d45978 into master Jan 11, 2024
1 check passed
@zatteo zatteo deleted the feat/dedup-before-upload branch January 11, 2024 11:03
2 participants