Two chunk instances downloading the same file #36

danielfireman · 2022-11-24T22:28:19Z

Do we care about the same file being downloaded simultaneously by two chunk instances? Asking because I believe this can a pre-release feature.

Example:

$ ./chunk HTTP://a.b/c &
$ ./chunk HTTP://a.b/c &

The result of this sequence of operations is unknown.

We could deal with it after #8, by augmenting the infrastructure in place.

cc/ @cuducos

cuducos · 2022-11-25T15:51:44Z

With the changes from #38 we started to have a more predictable scenario here.

This would still be a scenario we wouldn't be prepared for:

$ ./chunk HTTP://a.b/c &
$ ./chunk HTTP://a.b/c &

But this would be OK:

$ ./chunk HTTP://a.b/c &
$ cd another-directory && ./chunk HTTP://a.b/c &

What would happen here is that the file c would be downloaded twice since the local path would be different.

In my opinion, two processing downloading to the same local path should not be allowed (classic concurrency problem both on the downloaded file and on the progress file). I am just not sure about how to control and prevent that.

cuducos · 2022-11-25T17:47:07Z

If #38 is ok, what about we create a ~/.chunk/lock file to only allow one instance of chunk running at a time?

danielfireman · 2022-11-26T13:19:57Z

If #38 is ok, what about we create a ~/.chunk/lock file to only allow one instance of chunk running at a time?

It is possible, but I believe we don't need to make it too restrictive. Bellow, I try to do some brainstorming about a less restrictive option: consider each progress management file a lock file.

That has the same complexity as creating a single lock file but allows us to have as many chunk instances as needed, as long as they do not download the same file.

That said, there is one faulty case missing that we should treat either way: what happens when the chunk instance crashes?

There is a straightforward way to solve this problem:

keep and update the PID of the process in the progress management file
keep and update the timestamp of the last activity (chunk download)

With that, every time a download is started, chunk should do as follows:

verify if the progress manager file is there
1. if it is, check the owner PID (if it is the owner, proceed) and the last activity (if it is too long ago, become the owner and proceed)
2. otherwise, create a progress manager file as usual and proceed

Of course, we could introduce the --force-start flag in the future, which guarantees the current chunk instance rules them all.

danielfireman added the question Further information is requested label Nov 24, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Two chunk instances downloading the same file #36

Two chunk instances downloading the same file #36

danielfireman commented Nov 24, 2022 •

edited

Loading

cuducos commented Nov 25, 2022

cuducos commented Nov 25, 2022

danielfireman commented Nov 26, 2022 •

edited

Loading

Two chunk instances downloading the same file #36

Two chunk instances downloading the same file #36

Comments

danielfireman commented Nov 24, 2022 • edited Loading

cuducos commented Nov 25, 2022

cuducos commented Nov 25, 2022

danielfireman commented Nov 26, 2022 • edited Loading

danielfireman commented Nov 24, 2022 •

edited

Loading

danielfireman commented Nov 26, 2022 •

edited

Loading