Chunk

Chunk is a download tool for slow and unstable servers.

Usage

CLI

Install it with go install github.com/cuducos/chunk/cmd/chunk@latest then:

$ chunk <URLs>

Use --help for detailed instructions.

API

The Download method returns a channel of DownloadStatus values. The channel is closed once all downloads have finished, but handling errors is up to the caller.

Simplest use case

d := chunk.DefaultDownloader()
ch := d.Download(urls)

Customizing some options

d := chunk.DefaultDownloader()
d.MaxRetries = 42
ch := d.Download(urls)

Customizing everything

d := chunk.Downloader{...}
ch := d.Download(urls)
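
Since the channel is closed when all downloads finish and error handling is left to the caller, one way to consume it is to range over the channel and collect any errors. The sketch below is self-contained and does not import the library: the status type and the fakeDownload function are hypothetical stand-ins for chunk.DownloadStatus and Download, and the field names are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// status is a hypothetical stand-in for chunk.DownloadStatus. The field
// names are assumptions; check the GoDoc for the real ones.
type status struct {
	URL   string
	Error error
}

// drain reads statuses until the channel is closed and collects every error.
func drain(ch <-chan status) []error {
	var errs []error
	for s := range ch {
		if s.Error != nil {
			errs = append(errs, fmt.Errorf("%s: %w", s.URL, s.Error))
		}
	}
	return errs
}

// fakeDownload is a hypothetical replacement for Download: it reports one
// status per URL, failing every second one, then closes the channel.
func fakeDownload(urls []string) <-chan status {
	ch := make(chan status)
	go func() {
		defer close(ch)
		for i, u := range urls {
			var err error
			if i%2 == 1 {
				err = errors.New("timed out")
			}
			ch <- status{URL: u, Error: err}
		}
	}()
	return ch
}

func main() {
	urls := []string{"https://example.com/a", "https://example.com/b"}
	for _, err := range drain(fakeDownload(urls)) {
		fmt.Println("download failed:", err)
	}
}
```

Ranging over the channel ends on its own when the channel is closed, so no extra synchronization is needed in the caller.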

How?

It uses HTTP range requests, retries per HTTP request (not per file), avoids re-downloading content ranges it already has, and supports a wait time to give servers time to recover.

Download using HTTP range requests

To complete downloads from slow and unstable servers, the download should be done in “chunks” using HTTP range requests. This does not rely on long-lived HTTP connections, and it makes it predictable how long is too long to wait for a response.
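
As an illustration of the idea, the following minimal sketch splits a file of known size into inclusive byte ranges and formats each one as a Range header value. The names byteRange and splitRanges and the sizes used are hypothetical, chosen for this example; they are not taken from the chunk source code.

```go
package main

import "fmt"

// byteRange is an inclusive byte interval, as used by the HTTP Range header.
type byteRange struct{ start, end int64 }

// header formats the range as a Range header value, e.g. "bytes=0-1023".
func (r byteRange) header() string {
	return fmt.Sprintf("bytes=%d-%d", r.start, r.end)
}

// splitRanges divides a file of size bytes into ranges of at most
// chunkSize bytes each. The last range may be shorter.
func splitRanges(size, chunkSize int64) []byteRange {
	if size <= 0 || chunkSize <= 0 {
		return nil
	}
	var ranges []byteRange
	for start := int64(0); start < size; start += chunkSize {
		end := start + chunkSize - 1
		if end > size-1 {
			end = size - 1
		}
		ranges = append(ranges, byteRange{start, end})
	}
	return ranges
}

func main() {
	for _, r := range splitRanges(10, 4) {
		fmt.Println(r.header())
	}
	// Prints:
	// bytes=0-3
	// bytes=4-7
	// bytes=8-9
}
```

Each header value would be sent in its own short request, so a single stalled request only puts one range at risk.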

Retries by chunk, not by file

To be quicker and avoid rework, the primary way to handle a failure is to retry that “chunk” (content range), not the whole file.
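
A rough sketch of per-chunk retries, under simplifying assumptions: chunks are fetched one after another, and the fetcher is a plain function that may fail. The names downloadWithRetries, flakyFetcher and errConnReset are hypothetical and exist only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// errConnReset is a made-up error used to simulate a failed request.
var errConnReset = errors.New("connection reset")

// fetchFunc downloads one chunk, identified by its index.
type fetchFunc func(chunk int) error

// downloadWithRetries fetches every chunk, retrying only the chunk that
// failed. It returns the total number of fetch attempts made.
func downloadWithRetries(chunks, maxRetries int, fetch fetchFunc) (int, error) {
	attempts := 0
	for c := 0; c < chunks; c++ {
		var err error
		for try := 0; try <= maxRetries; try++ {
			attempts++
			if err = fetch(c); err == nil {
				break
			}
		}
		if err != nil {
			return attempts, fmt.Errorf("chunk %d failed after %d retries: %w", c, maxRetries, err)
		}
	}
	return attempts, nil
}

// flakyFetcher simulates a server that fails chunk 1 on its first attempt.
func flakyFetcher() fetchFunc {
	failedOnce := false
	return func(c int) error {
		if c == 1 && !failedOnce {
			failedOnce = true
			return errConnReset
		}
		return nil
	}
}

func main() {
	attempts, err := downloadWithRetries(3, 2, flakyFetcher())
	fmt.Println(attempts, err) // 4 <nil>
}
```

Three chunks cost four requests here: the failure of chunk 1 triggers one extra request for that chunk only, and chunks 0 and 2 are never fetched twice.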

Control of which chunks are already downloaded

To avoid restarting from the beginning after unhandled errors, chunk tracks which ranges of each file have already been downloaded; when restarted, it only downloads what is still missing to complete the downloads.
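
One possible way to picture this bookkeeping is a small record of completed chunks that survives a restart. The sketch below serializes that record to JSON; the progress type, its fields and the JSON format are assumptions made for illustration and say nothing about how chunk actually stores its state.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// progress records which chunks of one file have already been downloaded.
type progress struct {
	Chunks int          `json:"chunks"`
	Done   map[int]bool `json:"done"`
}

// markDone flags a chunk as fully downloaded.
func (p *progress) markDone(chunk int) {
	if p.Done == nil {
		p.Done = make(map[int]bool)
	}
	p.Done[chunk] = true
}

// pending lists, in order, the chunks that still need to be downloaded.
func (p *progress) pending() []int {
	var out []int
	for c := 0; c < p.Chunks; c++ {
		if !p.Done[c] {
			out = append(out, c)
		}
	}
	return out
}

// roundTrip simulates a restart by saving the state and loading it again.
func roundTrip(p progress) (progress, error) {
	var restored progress
	saved, err := json.Marshal(p)
	if err != nil {
		return restored, err
	}
	err = json.Unmarshal(saved, &restored)
	return restored, err
}

func main() {
	p := progress{Chunks: 4}
	p.markDone(0)
	p.markDone(2)

	restored, err := roundTrip(p)
	if err != nil {
		panic(err)
	}
	fmt.Println(restored.pending()) // [1 3]
}
```

After the simulated restart only chunks 1 and 3 are requested again.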

Detect server failures and give it a break

To avoid unnecessary stress on the server, chunk relies not only on HTTP responses but also on other signs that a connection is stale. It can recover from a stale connection and give the server some time to recover from the stress.
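
A simplified sketch of this behaviour, assuming that "stale" means "no answer within a deadline": each attempt runs under a timeout, and a failed attempt is followed by a pause before the next one. The function names and the durations are hypothetical, and the stalling server is simulated rather than contacted over the network.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Illustrative values only; real settings depend on the server.
const (
	demoTimeout = 20 * time.Millisecond
	demoWait    = 5 * time.Millisecond
)

// fetchWithPause runs fetch under a per-attempt deadline. After a failed
// attempt it sleeps for wait before trying again, giving the server a break.
// It returns how many pauses were taken.
func fetchWithPause(timeout, wait time.Duration, maxRetries int, fetch func(context.Context) error) (int, error) {
	pauses := 0
	var err error
	for try := 0; try <= maxRetries; try++ {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		err = fetch(ctx)
		cancel()
		if err == nil {
			return pauses, nil
		}
		if try < maxRetries {
			time.Sleep(wait)
			pauses++
		}
	}
	return pauses, err
}

// stallingFetcher simulates a server that never answers the first
// `stalls` requests and answers immediately afterwards.
func stallingFetcher(stalls int) func(context.Context) error {
	calls := 0
	return func(ctx context.Context) error {
		calls++
		if calls <= stalls {
			<-ctx.Done() // block until the deadline expires
			return ctx.Err()
		}
		return nil
	}
}

func main() {
	pauses, err := fetchWithPause(demoTimeout, demoWait, 3, stallingFetcher(2))
	fmt.Println(pauses, err) // 2 <nil>
}
```

The deadline turns a silent connection into an ordinary error, so it can go through the same retry path as a failed HTTP response.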

Why?

The idea for the project emerged because it was difficult for Minha Receita to handle the download of 37 files that add up to only about 5 GB. Most download solutions out there (e.g. got) seem to be designed for downloading large files, not for downloading from slow and unstable servers, which is the case at hand.