
feat(utils): use multiple sources to download snark artifacts #65

Closed
sripwoud opened this issue Apr 26, 2024 · 12 comments · Fixed by #66
@sripwoud
Member

sripwoud commented Apr 26, 2024

I don't really know why we see Bad Gateway errors in CI when attempting to fetch the artifacts.
https://github.com/privacy-scaling-explorations/zk-kit/actions/runs/8837164770/job/24265411154#step:9:666

As an improvement (and hopefully a fix), I am thinking about a retry mechanism: the list of sources would be randomly sorted on each download attempt (whether for a different proof, or a different artifact of the same proof), which should protect us against throttling (if that is the root cause of the Bad Gateway issues).
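A minimal sketch of such a retry mechanism (the source URLs, artifact path, and function names below are illustrative assumptions, not the actual zk-kit implementation):

```typescript
// Hypothetical list of mirrors; the real sources may differ.
const SOURCES = [
  "https://unpkg.com",
  "https://cdn.jsdelivr.net/npm",
  "https://raw.githubusercontent.com"
]

// Fisher-Yates shuffle, so every download attempt hits the sources in a fresh random order.
function shuffle<T>(items: T[]): T[] {
  const copy = [...items]
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1))
    ;[copy[i], copy[j]] = [copy[j], copy[i]]
  }
  return copy
}

// Minimal shape we need from a fetch result; the global fetch Response satisfies it.
interface FetchResult {
  ok: boolean
  status: number
}

// Try each source in random order, falling through to the next one on any failure.
async function fetchFromAnySource(
  path: string,
  doFetch: (url: string) => Promise<FetchResult> = fetch
): Promise<FetchResult> {
  let lastError: unknown
  for (const base of shuffle(SOURCES)) {
    try {
      const response = await doFetch(`${base}/${path}`)
      if (response.ok) return response
      lastError = new Error(`${response.status} from ${base}`)
    } catch (error) {
      lastError = error
    }
  }
  throw lastError
}
```

Because the shuffle happens inside `fetchFromAnySource`, no single CDN is hit first consistently, which spreads load even when many artifacts are fetched back to back.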

@sripwoud
Member Author

Benchmark results
#23

@artwyman

I'm following your work here closely, because I'm working on something similar in the Zupass repo. We're going to have some slightly different concerns and trade-offs, but I think a similar artifact-distribution scheme will work for us. I'm most focused on artifact download in-browser, while for Node servers and unit tests we're more likely to use local copies or NPM devDependencies.

A heads-up on something I learned in my own testing, in case you haven't tested your solution in-browser yet. GitHub's raw links behave differently in a browser depending on whether you go via github.com or githubusercontent.com, which serve different Access-Control-Allow-Origin headers. Below are two equivalent URLs for one of the artifacts I'm testing. My tests are for files in the repo itself, rather than release packages, but I suspect the behavior will be similar:

https://github.com/proofcarryingdata/snark-artifacts/raw/artwyman/experimental/packages/proto-pod-gpc/proto-pod-gpc_1o-1e-5md-1x10l-2x2t-vkey.json
https://raw.githubusercontent.com/proofcarryingdata/snark-artifacts/artwyman/experimental/packages/proto-pod-gpc/proto-pod-gpc_1o-1e-5md-1x10l-2x2t-vkey.json

The first URL redirects to the second, but only the second one can be successfully accessed in a browser (outside of a github.com page).

The minimal documentation about raw downloads from GitHub makes me nervous about relying on them too heavily, as they could have unpublished rate limits. They seem widely used in the community, and are mentioned (in passing) as "always available" in this doc article about repository limits, which nevertheless says nothing about download bandwidth or rate limiting:
https://docs.github.com/en/repositories/creating-and-managing-repositories/repository-limits
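One way to handle the CORS difference described above (a sketch under my own naming, not code from either repo) is to rewrite github.com "raw" links into their raw.githubusercontent.com form before fetching from a browser page, since only the latter origin sends Access-Control-Allow-Origin: *:

```typescript
// Rewrite https://github.com/<owner>/<repo>/raw/<ref>/<path> to the
// raw.githubusercontent.com equivalent; leave any other URL untouched.
// Helper name is hypothetical.
function toRawGithubUserContentUrl(url: string): string {
  const match = url.match(/^https:\/\/github\.com\/([^/]+)\/([^/]+)\/raw\/(.+)$/)
  if (!match) return url
  const [, owner, repo, rest] = match
  return `https://raw.githubusercontent.com/${owner}/${repo}/${rest}`
}
```

This skips the cross-origin redirect entirely instead of relying on the browser to follow it.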

@cedoor
Member

cedoor commented May 15, 2024

Thank you very much @artwyman 🙏🏽

@sripwoud
Member Author

sripwoud commented May 16, 2024

@artwyman I was waiting for privacy-scaling-explorations/zk-kit#275 to merge before getting back to working on this.
Thanks for sharing what you experienced with github URLs!

in case you haven't tested your solution in-browser yet

I hadn't tested downloads in browser environments at all, actually. I'll try to do some tests myself too.
I had just used a shell script to test/benchmark the downloads:
https://github.com/privacy-scaling-explorations/snark-artifacts/blob/main/scripts/bin/benchmark
#23

about raw downloads from GitHub makes me nervous about relying on it too heavily

If GitHub is one of three sources, that's probably OK in the beginning? We would consider this an initial solution.

I am starting to wonder whether we should move all this logic server-side later (so more an issue to tackle in snark-artifacts):

  • Have our own delivery service, like a Cloudflare Worker (or an equivalent such as Amazon Lambda@Edge).
  • Implement the try-several-sources logic server-side, in that worker.

The benefit: this would be a more robust long-term mitigation against the rate limiting we are facing when attempting to download these relatively large artifact files.

@artwyman

Would your theoretical server-side solution need to "download" at all, or just be deployed with artifacts pre-installed? That's likely what I'll end up doing with the Zupass server, with the artifacts being delivered to the build+deploy system as an NPM package (a devDependency). Of course, that only allows serving up a single version, which is what the Zupass web client will need.

If you need to support multiple versions and don't want to store them all on your own server, the answer might be more complicated. It's for the separately-versioned cases (3rd-party developers, and Zupass development during experimental phases) that I'm anticipating using unpkg or GitHub, so I'm hopeful the usage rate won't lead to throttling in the near term.

But I know our requirements and trade-offs are different, so I'm sharing all this as useful context, not to say it's what you need to do as well. I'll keep an eye on what I can learn from your experience going forward too.

@sripwoud
Member Author

Would your theoretical server-side solution need to "download" at all, or just be deployed with artifacts pre-installed?

I am still thinking about it ^^. There are indeed different solutions and trade-offs, I guess.

I invite you to continue the discussion there if you have further ideas/thoughts, as you've been sharing valuable input 🙏
#38

support multiple versions

Yes, that's a rather strong requirement we want to fulfill.

@sripwoud sripwoud transferred this issue from privacy-scaling-explorations/zk-kit May 26, 2024
@github-project-automation github-project-automation bot moved this from 🏗 In Progress to ✔️ Done in ZK-Kit May 27, 2024
@github-project-automation github-project-automation bot moved this to ✔️ Done in SNARK Artifacts May 27, 2024
@sripwoud
Member Author

sripwoud commented Jun 21, 2024

@artwyman Just pinging you directly to bring your attention to the latest change we've made. Instead of multiple public CDNs, we now use a single private CDN (S3 bucket + CloudFront) to serve all our artifacts.
See #91

This has improved the download speed significantly:
#96 (comment)

@artwyman

@artwyman Just pinging you directly to bring your attention on the latest change we've done. Instead of multiple public CDNs, we now use a single private CDN (s3 bucket + cloudfront) to serve all our artifacts. See #91

Which has improved the download speed significantly #96 (comment)

Thanks for the heads-up. We may have to face the same issue in Zupass, though we haven't yet. We don't have much usage of our GPC artifacts yet, and the artifacts used inside Zupass are downloaded directly from the Zupass server, so the 3rd-party CDNs will only be stressed when more 3rd-party devs build integrations.

I have two follow-up questions to understand what you did here and how we might need to follow it:

  • Was the decision to move to S3+Cloudfront purely about performance, or was it driven by the failures mentioned in this issue? Did you ever get a better understanding of the cause of those, or did the retries resolve them?
  • Is there documentation anywhere of the specifics of your approach to the hosting side? I see the upload code in the repo. I haven't used S3+Cloudfront before, though I can figure it out when the time comes.

@sripwoud
Member Author

sripwoud commented Jun 26, 2024

Was the decision to move to S3+Cloudfront purely about performance, or was it driven by the failures mentioned in this issue? Did you ever get a better understanding of the cause of those, or did the retries resolve them?

Both, because of performance and download failures.
The download failures occurred because the CDNs we were using (especially unpkg) were throttling requests.
Speed was rather bad too.

Is there documentation anywhere of the specifics of your approach to the hosting side? I see the upload code in the repo. I haven't used S3+Cloudfront before, though I can figure it out when the time comes.

Would you be interested in an IaC implementation to make that config more transparent (I am thinking Terraform or Pulumi)? We don't use IaC yet, but we might. What do you think @ntampakas?

@ntampakas
Contributor

ntampakas commented Jun 26, 2024

Hey @sripwoud,
All AWS/infra implementations are done using IaC, more specifically Terraform/Ansible. Currently the code is in CodeCommit, but the DevOps team plans to migrate it to GitHub.

@artwyman

artwyman commented Jun 26, 2024

both because of performance and download failures. Download failures were because the CDN we were using (esp unpkg) were throttling requests. Speed was rather bad too.

Do they have published limits or was it all experimentally discovered? I'll need to decide at some point whether to prioritize this for non-Zupass users of our artifacts.

All AWS/Infra implementations are done using IaC, more specific in Terraform/Ansible. Currently, the code is in CodeCommit, but the DevOps team plans to migrate/transfer the code to GitHub.

This sounds great, thanks. I'll watch for this stuff to show up in GitHub (I assume in this repo?). It may be a while before we need something similar for Zupass-related artifacts, but when that time comes it'll be helpful to see what you did here. All my personal experience with cloud-hosting technologies was in the context of internal tools in a big company, but I'm sure I can figure out the AWS tools when I have to.

@sripwoud
Member Author

Do they have published limits or was it all experimentally discovered? I'll need to decide at some point whether to prioritize this for non-Zupass users of our artifacts.

Experimentally discovered; here is the recurring error we were seeing. I haven't found any docs (I haven't searched very long) or detailed info about throttling in https://github.com/mjackson/unpkg/issues. But I assumed this was the reason, as the error only happened when performing requests in parallel in the tests (I haven't seen errors when fetching sequentially).
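If parallel bursts are indeed what trips the throttling, one mitigation (a hypothetical sketch, not the code actually used in the tests) is to cap how many downloads are in flight at once instead of firing them all with Promise.all:

```typescript
// Run task over items with at most `limit` tasks in flight at a time,
// preserving the order of results. Helper name is mine.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  task: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length)
  let next = 0
  // Each worker repeatedly claims the next unclaimed index and processes it.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++ // synchronous claim, so no two workers share an index
      results[index] = await task(items[index])
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker))
  return results
}
```

With a small limit (say 2 or 3), a test suite fetching many artifacts behaves much more like the sequential case that didn't trigger errors.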

I assume in this repo?

Yes; if not, we'll keep you informed.
