
@s3/store: add backpressure + download incomplete part first #561

Merged — 11 commits, merged into tus:main on Feb 5, 2024

Conversation

@fenos (Collaborator) commented on Jan 19, 2024:

This PR improves the handling of incomplete parts and adds backpressure by limiting the number of concurrent uploads/files on the disk that are being created.

Incomplete Parts

The incomplete-part logic did not always work as intended.

An issue I came across with the old implementation: when uploading a file in chunks from the client, and a chunk is bigger than the server's preferred part size, the returned offset was at times calculated incorrectly, causing the subsequent request to fail with 409 (Offset-Conflict). This is closely related to #501.

This PR fixes the issue by always downloading the incomplete part first and prepending it to the read stream.
With this approach the offset is calculated correctly, which also fixes #501 (see test).

Backpressure

Added support for backpressure. It works by using a semaphore on the StreamSplitter.
Only a limited number of chunks can exist on disk at any one time (default 30), and the semaphore is released when a part upload finishes, meaning that backpressure is governed by how fast S3 can upload parts.

When the limit of concurrent chunks is reached, Node's default backpressure mechanism kicks in: once the read stream's highWaterMark buffer is full, the stream is paused until data flow resumes and the semaphore has capacity again.

This addresses issue #505.
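The mechanism can be sketched with a minimal counting semaphore (the PR ultimately adopts @shopify/semaphore; the class below is illustrative only):

```typescript
// Minimal counting semaphore sketch. acquire() resolves immediately
// while permits remain; otherwise the caller is parked until release().
class Semaphore {
  private waiting: Array<() => void> = []
  constructor(private permits: number) {}

  async acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits--
      return
    }
    // No capacity: park the caller until a permit is released.
    await new Promise<void>((resolve) => this.waiting.push(resolve))
  }

  release(): void {
    const next = this.waiting.shift()
    if (next) {
      next() // hand the permit directly to a waiter
    } else {
      this.permits++
    }
  }
}
```

In the splitter, each new chunk file would be preceded by `await semaphore.acquire()` and each finished part upload followed by `semaphore.release()`; while `acquire()` is pending, the paused read stream propagates backpressure back to the client.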

@fenos (Collaborator, Author) commented on Jan 20, 2024, on this diff:

```diff
-proxy.destroy(err)
+// we end the stream gracefully here so that we can upload the remaining bytes to the store
+// as an incompletePart
+proxy.end()
```

A very small but nice change here.
When an upload request is cancelled, instead of calling proxy.destroy() we can call proxy.end() to gracefully close the passthrough stream, so that we can flush the bytes of the chunk as an incomplete part and upload it to S3.

This means the next attempt starts exactly where we left off, as opposed to starting from the last successfully uploaded part.
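The difference can be demonstrated with a standalone PassThrough (illustrative only, not the actual tus code):

```typescript
import {PassThrough} from 'node:stream'

// end() lets bytes already written to the PassThrough flush through to
// the reader, whereas destroy() would discard them. This is what allows
// the tail of a cancelled upload to be stored as an incomplete part.
async function drainAfterEnd(): Promise<string> {
  const proxy = new PassThrough()
  proxy.write('partial chunk')
  proxy.end() // graceful close: buffered data stays readable
  const chunks: Buffer[] = []
  for await (const c of proxy) chunks.push(c as Buffer)
  return Buffer.concat(chunks).toString()
}
```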

A reviewer (Member) replied:
Nice!

@Murderlon (Member) left a comment:
A great addition!

Might need to go over it one more time but some initial feedback here.

@fenos (Collaborator, Author) commented on Jan 24, 2024:

@Murderlon I've updated the README and added detailed documentation for the maxConcurrentPartUploads flag.

I've also increased the value to 60 instead of 30 (~480MB), as this aligns better with tusd's defaults:
tusd uses a 50MB partSize and 10 maxConcurrentPartUploads, totaling 500MB.
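The arithmetic behind those numbers, spelled out (values taken from this thread):

```typescript
// With the new library default of 60 concurrent part uploads and a
// typical 8 MiB part size, the in-flight footprint is:
const partSizeMiB = 8
const maxConcurrentPartUploads = 60
const maxInFlightMiB = partSizeMiB * maxConcurrentPartUploads // 480 MiB

// tusd, for comparison, defaults to 50 MiB parts with 10 concurrent
// part uploads:
const tusdInFlightMiB = 50 * 10 // 500 MiB
```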

Let me know if there is anything else; this change will be awesome to have in!

@Murderlon (Member) left a comment:

LGTM 👌

I think I would like a quick review from @Acconut on this before merging.

@Acconut (Member) left a comment:

Amazing work, @fenos!

On the README diff:

```diff
@@ -91,6 +91,22 @@ you need to provide a cache implementation that is shared between all instances

 See the exported [KV stores][kvstores] from `@tus/server` for more information.

+#### `options.maxConcurrentPartUploads`
+
+This setting determines the maximum number of simultaneous part uploads to an S3 storage service.
```
A reviewer (Member) commented:

It might be helpful to state in this document what a part actually is. Maybe a brief explanation saying that this module uses S3 multipart uploads, which splits a file into multiple equally-sized parts (independent of the PATCH requests). With this context, it's easier to understand what concurrent part uploads actually means.
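For background, a part-size calculation under S3's multipart constraints (at least 5 MiB per part except the last, at most 10,000 parts per upload) might look like this; the function below is an illustrative sketch, not the actual @tus/s3-store implementation:

```typescript
// S3 multipart upload limits (documented AWS constraints).
const MIN_PART_SIZE = 5 * 1024 * 1024 // 5 MiB minimum part size
const MAX_PART_COUNT = 10_000 // at most 10,000 parts per upload

function calcPartSize(uploadLength: number, preferredPartSize: number): number {
  // Grow the part size whenever the preferred size would need more
  // than 10,000 parts to cover the whole upload.
  const sizeForPartLimit = Math.ceil(uploadLength / MAX_PART_COUNT)
  return Math.max(preferredPartSize, sizeForPartLimit, MIN_PART_SIZE)
}
```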

On the README diff:

```markdown
#### `options.maxConcurrentPartUploads`

This setting determines the maximum number of simultaneous part uploads to an S3 storage service.
The default value is 60. This default is chosen in conjunction with the typical partSize of 8MiB, aiming for an effective transfer rate of approximately 480MiB/s.
```
A reviewer (Member) commented:

The calculation seems to assume that one part is uploaded per second; that might be worth mentioning. Also, transfer speeds are commonly given in Mbit/s, not MiB/s. That would be 3.84 Gbit/s, which seems very high; I am not sure transfer speeds to S3 reach that level.

@fenos (Collaborator, Author) replied on Feb 5, 2024:

> Traffic between Amazon EC2 and Amazon S3 uses up to 100 Gbps of bandwidth to Amazon Virtual Private Cloud (Amazon VPC) endpoints and public IP addresses in the same AWS Region.

According to Google, that is 12.5 GB/s.

@Acconut (Member) commented on Jan 26, 2024:

Regarding replacing internal code with dependencies, I would wait for a comment from @Murderlon. Maybe his position is different than mine :)

@Murderlon (Member) commented on Jan 26, 2024:

> Regarding replacing internal code with dependencies, I would wait for a comment from @Murderlon. Maybe his position is different than mine :)

I'd personally avoid adding a dependency if it's not much code, we likely rarely/never have to change it, and we don't foresee complex side effects. So I don't think we need a package to concat streams or create unique tmp files. For the semaphore I don't have a strong opinion; in this case it's a well-tested copy-paste from Shopify, so from a trust perspective I'd say it's equal (or superior) to a package. But if there is a similarly well-tested package with a small footprint, I would be okay with that too. I'm okay with either.

@Acconut (Member) commented on Jan 29, 2024:

Alright, if you prefer fewer dependencies, we can go that route as well. That's probably a topic where personal preferences differ.

@Murderlon (Member) commented:

I just saw that @shopify/semaphore is available as a package. In that case I agree with @Acconut that a dependency is better.

socket-security bot commented on Feb 5, 2024:

New dependencies detected.

| Package | New capabilities | Transitives | Size | Publisher |
| --- | --- | --- | --- | --- |
| npm/@shopify/[email protected] | None | 0 | 8.77 kB | shopify-dep |
| npm/@types/[email protected] | None | +1 | 764 kB | types |
| npm/[email protected] | Transitive: environment | +7 | 196 kB | feross |

@fenos (Collaborator, Author) commented on Feb 5, 2024:

Hello @Murderlon @Acconut

Sorry for the delay on this; it has been a very busy week.
I've made all the changes we discussed on this PR. Here is the breakdown of my latest commit:

  • Installed @shopify/semaphore as a dependency.
  • Installed the multistream dependency to handle stream concatenation. The reasoning is that my own implementation might have missed edge cases I'm not aware of; this library has been battle-tested for years, so it seems good to rely on it for concatenating streams.
  • Fixed the calculation in the documentation.

Let me know if there is anything else

@Murderlon (Member) left a comment:

Changes look good 👍 One Q

@fenos (Collaborator, Author) commented on Feb 5, 2024:

@Murderlon all ready

@Murderlon (Member) left a comment:

Thanks!

@Murderlon Murderlon merged commit a5a4cd3 into tus:main Feb 5, 2024
4 checks passed
@fenos fenos deleted the @s3-store/stream-handling-improvements branch February 5, 2024 14:55
Successfully merging this pull request may close these issues.

S3Store: Deferred length uploads fail when only 1 incomplete part is uploaded
3 participants