Estuary performance testing #8

Open
anjor opened this issue Dec 2, 2022 · 3 comments
Labels
documentation Improvements or additions to documentation

Comments

anjor commented Dec 2, 2022

Proposal: Estuary performance testing

Author Anjor
Status Draft
Revision

This is a WIP

Proposal/Overview

We should have metrics on Estuary's data onboarding performance. We should be able to answer questions such as:

  • What is the data throughput? How does it scale with increasing data size? Is there a sweet spot?
  • What is the maximum data size Estuary can handle?

The current plan is to set up datasets of increasing size, from 1 GB up to 1 TB, and measure data onboarding performance for each.

Technical Design

The performance testing will be carried out on an Equinix Metal server. We will download public datasets ranging in size from 1 GB up to 1 TB and attempt to upload them to Estuary.
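A minimal sketch of how each upload could be timed, assuming three attempts per dataset as in the results reported later in this thread. The `upload` callable, `fake_upload`, and the file name are hypothetical placeholders; the actual upload command used for these tests is not shown in this issue.

```python
import time
from statistics import mean
from typing import Callable, List, Tuple


def time_uploads(
    upload: Callable[[str], None], path: str, attempts: int = 3
) -> Tuple[List[float], float]:
    """Run upload(path) `attempts` times; return per-attempt seconds and their mean."""
    timings: List[float] = []
    for _ in range(attempts):
        start = time.perf_counter()
        upload(path)  # hypothetical callable; must block until the upload finishes
        timings.append(time.perf_counter() - start)
    return timings, mean(timings)


def fake_upload(path: str) -> None:
    # Stand-in for the real upload call, which is an assumption here.
    time.sleep(0.01)


if __name__ == "__main__":
    timings, average = time_uploads(fake_upload, "dataset-1.8GB.bin")
    print([round(t, 3) for t in timings], round(average, 3))
```

Passing the upload step in as a callable keeps the timing logic independent of whichever client or tool ends up performing the upload.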

Known problems

Files larger than 32 GB might have issues. Once the upload endpoint can no longer handle a file, we will fall back to data preparation tools such as barge and singularity.


anjor commented Dec 2, 2022

The end goal here is to have a full end-to-end data onboarding story fleshed out.


anjor commented Dec 6, 2022

Some initial results.

| Size (GB) | Attempt 1 (s) | Attempt 2 (s) | Attempt 3 (s) | Average (s) |
|-----------|---------------|---------------|---------------|-------------|
| 1.8       | 47            | 41            | 41            | 43          |
| 3.6       | 91            | 86            | 98            | 91.67       |
| 5.4       | 142           | 149           | 130           | 140.33      |
| 7.2       | 173           | 162           | 176           | 170.33      |
| 9         | 215           | 221           | 223           | 219.67      |
| 18        | 439           | 415           | 429           | 427.67      |
| 27        | 655           | 617           | 664           | 645.33      |

[Figure: Estuary performance]

The above test was carried out on a c3.small.x86 server in the Silicon Valley region of Equinix Metal. Uploads went to shuttle-4 because of its proximity (shuttle-1 had content adding disabled).
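To relate these timings back to the throughput question in the proposal, here is a small sketch that converts the shuttle-4 results into MB/s. It assumes decimal units (1 GB = 1000 MB), which the issue does not specify; the function and variable names are illustrative.

```python
# Shuttle-4 results from the table above: size in GB -> attempt times in seconds.
SHUTTLE_4 = {
    1.8: (47, 41, 41),
    3.6: (91, 86, 98),
    5.4: (142, 149, 130),
    7.2: (173, 162, 176),
    9: (215, 221, 223),
    18: (439, 415, 429),
    27: (655, 617, 664),
}


def throughput_mb_per_s(size_gb, times_s):
    """Average throughput in MB/s, assuming 1 GB = 1000 MB (an assumption)."""
    average_s = sum(times_s) / len(times_s)
    return size_gb * 1000 / average_s


for size_gb, times_s in SHUTTLE_4.items():
    print(f"{size_gb:>5} GB  {throughput_mb_per_s(size_gb, times_s):5.1f} MB/s")
```

Under that assumption the average throughput stays between roughly 38 and 42 MB/s across all seven sizes, so upload time grows approximately linearly with size over the 1.8 GB to 27 GB range tested.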

anjor added the documentation label and removed the proposals label on Dec 21, 2022

anjor commented Jan 12, 2023

Results for shuttle 7

| Size (GB) | Attempt 1 (s) | Attempt 2 (s) | Attempt 3 (s) | Average (s) |
|-----------|---------------|---------------|---------------|-------------|
| 1.8       | 104           | 115           | 113           | 110.67      |
| 3.6       | 207           | 202           | 241           | 216.67      |
| 5.4       | 314           | 335           | 320           | 323         |
| 7.2       | 443           | 417           | 494           | 451.33      |
| 9         | 550           | 522           | 480           | 517.33      |
| 18        | 1054          | 1056          | 1103          | 1071        |
| 27        | 1569          | 1441          | 1955          | 1655        |

[Figure: Estuary performance - shuttle 7]

The above test was carried out on a c3.small.x86 server in the Dallas region of Equinix Metal. Uploads went to shuttle-7 because of its proximity.
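For a side-by-side view, the sketch below compares the shuttle-7 timings with the earlier shuttle-4 timings from this thread. It again assumes 1 GB = 1000 MB, and the names are illustrative. The two runs used servers in different regions (Silicon Valley and Dallas), so this is a rough comparison rather than a controlled one.

```python
SIZES_GB = [1.8, 3.6, 5.4, 7.2, 9, 18, 27]

# Attempt times in seconds, copied from the two result tables in this thread.
SHUTTLE_4_TIMES = [(47, 41, 41), (91, 86, 98), (142, 149, 130), (173, 162, 176),
                   (215, 221, 223), (439, 415, 429), (655, 617, 664)]
SHUTTLE_7_TIMES = [(104, 115, 113), (207, 202, 241), (314, 335, 320), (443, 417, 494),
                   (550, 522, 480), (1054, 1056, 1103), (1569, 1441, 1955)]


def average(times_s):
    return sum(times_s) / len(times_s)


def compare(sizes_gb, fast_times, slow_times):
    """Return (size, slow throughput in MB/s, slow/fast time ratio) per dataset size."""
    rows = []
    for size_gb, fast, slow in zip(sizes_gb, fast_times, slow_times):
        slow_avg = average(slow)
        rows.append((size_gb, size_gb * 1000 / slow_avg, slow_avg / average(fast)))
    return rows


for size_gb, mb_per_s, ratio in compare(SIZES_GB, SHUTTLE_4_TIMES, SHUTTLE_7_TIMES):
    print(f"{size_gb:>5} GB  shuttle-7: {mb_per_s:4.1f} MB/s  {ratio:.2f}x shuttle-4 time")
```

On these figures shuttle-7 averages roughly 16 to 17 MB/s, and each upload takes about 2.3 to 2.7 times as long as the same size did against shuttle-4.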
