Pulsar's CI Roadmap #685

Open
7 tasks done
confused-Techie opened this issue Aug 23, 2023 · 23 comments
Labels
enhancement New feature or request

Comments

@confused-Techie
Member

confused-Techie commented Aug 23, 2023

Have you checked for existing feature requests?

  • Completed

Summary

As some of you may know, CirrusCI (the service we use to build all of our Pulsar binaries, for both rolling and regular releases) will soon be significantly cutting the features of its free tier. While we aren't totally opposed to paying for their services, the issue is that, with our current usage, it'll cost the Pulsar org ~$350USD per month once these changes go into effect at the end of the month.

This is more than it costs to run the backend per month by a pretty large margin, making it untenable for us to reasonably charge the Open Collective account for this. That means we have to find some other way to create our binaries.


Several ideas of what we can do have been floating around:

  • Move to CircleCI
  • Slim down our CirrusCI usage as much as possible (but seems that may not be by much)
  • Move to GitHub Actions
  • Use some other non-CI service, such as cloud hosted VMs and build there

While there has been significant discussion about this over on Discord, it seems a popular solution is moving as much as possible to GitHub Actions.

And as a do-ocracy, I've elected to attempt this transition over on #682

This transition is working pretty well: visual tests run with just as much success as they previously did on CirrusCI, retry capabilities are cleaner, and of course binaries build just as we would expect.

Although there are several kinks to work out still, and to track what we know of, I'll compile this issue here to ensure proper traceability in solutions, and problems:

Platform Support

GitHub Actions only supports Windows, macOS, and Ubuntu, meaning we have no native way to build Linux ARM binaries or Apple Silicon binaries. (Although it may be worth mentioning that Apple Silicon is on the GitHub Actions roadmap.)

So we need to figure out support for:

  • Apple Silicon Binary Support
  • Linux ARM Binary Support

Possible Platform Support Solutions

We could decide to still build only these binaries on CirrusCI, especially since CirrusCI supports GitHub Actions runners on all their supported platforms, which means it would integrate very nicely into our new build pipeline. But it's important to mention that, with their pricing changes, last month (a lower-usage month for us) would still have cost us ~$305USD for Apple platforms. Sure, half of that would be removed since we wouldn't be building Intel builds there, but that's still a possible cost of ~$115USD every month just for Apple Silicon builds on CirrusCI.

So very likely, another solution must be found.

One possible solution that's been discussed: considering the cost of Apple Silicon cloud environments, it may be worth purchasing a low-spec Apple Silicon machine second hand and using it as a Pulsar-dedicated GitHub self-hosted runner. While pricey at around ~$400USD, it would only take a few months for this purchase to start paying for itself. The same may be possible with a Linux ARM machine.

Regular Release

This likely won't be as much of an issue, since it's totally possible to download binaries manually from GitHub Actions and upload them to our release, which is what we have been doing this whole time. It would be worth testing, though, whether electron-builder is actually capable of automatically uploading binaries to a GitHub draft release, if one is available. That could actually save us time in the long run.

Rolling Release

This will be a big issue.

Currently, binaries produced by GitHub Actions are wrapped up in a zip file, meaning there is no easy way to provide them as downloads to users. That likely means we will have to fully rewrite the download microservice that provides downloads for the Rolling Release, as well as find a completely new system and methodology for providing and storing our rolling releases. We could take some inspiration from the way Atom handled their nightly releases, but otherwise there are still several unanswered questions there.

Summary of issues presented above:

  • Support for Apple Silicon
  • Support for Linux ARM
  • Distribution Method of Rolling Release
  • Storage Method of Rolling Release

What benefits does this feature provide?

A CI platform to build binaries for Pulsar.

Any alternatives?

Many, that is the purpose of this issue.

Other examples:

No response

@confused-Techie confused-Techie added the enhancement New feature or request label Aug 23, 2023
@DeeDeeG
Member

DeeDeeG commented Aug 25, 2023

Great summary!

macOS: The easiest solution for macOS is to just provide intel binaries, and have Apple Silicon users take whatever performance hit there may be from translating ARM to x86 compatibility via Rosetta 2. [EDIT: Not great, but free. So, we can take this option (with its inherent trade-off between cost/perf) and keep it in mind when going forward.]

ARM Linux: Since ARM Linux is fairly cheap on Cirrus, we could continue building those on Cirrus.

This could be as an "also-supported" but not primary tier of binary release support, to give fair warning that, as the only thing left on Cirrus, and with fewer end-users, we might not immediately notice if it stops working, and it will be somewhat of a chore to maintain the second CI config for few end-users. Or... maybe it will be so easy to keep doing ARM Linux on Cirrus that we just don't think twice about it and it remains "tier-1" support. Time and experience will tell...

Cirrus Pricing: Also: I need to confirm the math I did pricing out Cirrus. They said something about much-reduced effective pricing for macOS and Windows, but I suspect the graph I consulted may not have been adjusted for those pricing adjustments that were said to have just started taking effect as of August 1.

So I'll reach out to Cirrus and see if they can explain that pricing situation over the next few days and clarify this for us...

EDIT: the graph matches the pricing listed here: https://cirrus-ci.org/pricing/#compute-credits (at least I checked for macOS. I checked that it matches the math for what the graph shows for macOS CPU hours converted to --> credits). I'm still not 1000% sure the new pricing is what they're showing on that page, or if that's the old pricing and the graph is also on the old pricing, hence the consistency? But that page was edited on August 4th, apparently, so it may be the new pricing...? Still might need to contact them to clarify.

EDIT AGAIN: Okay, looks like this is indeed the updated pricing. I found a Web Archive snapshot of the pricing page from July, the month just before the change took place August 1: https://web.archive.org/web/20230726163213/https://cirrus-ci.org/pricing/#compute-credits

Cirrus CI Compute Credit pricing BEFORE (July 2023):

1 compute credit can be bought for 1 US dollar. Here is how much 1000 minutes of CPU time will cost for different platforms:

  • 1000 minutes of 1 virtual CPU for Linux platform for 5 compute credits
  • 1000 minutes of 1 virtual CPU for FreeBSD platform for 5 compute credits
  • 1000 minutes of 1 virtual CPU for Windows platform for 10 compute credits
  • 1000 minutes of 1 Apple Silicon CPU or 2 Intel virtual CPUs for macOS platform for 40 compute credits

All tasks using compute credits are charged on per-second basis. 2 CPU Linux task takes 2 minutes? Pay 2 cents.

Cirrus CI Compute Credit pricing AFTER (August 2023):

1 compute credit can be bought for 1 US dollar. Here is how much 1000 minutes of CPU time will cost for different platforms:

  • 1000 minutes of 1 virtual CPU for Linux platform for 3 compute credits
  • 1000 minutes of 1 virtual CPU for FreeBSD platform for 3 compute credits
  • 1000 minutes of 1 virtual CPU for Windows platform for 4 compute credits
  • 1000 minutes of 1 Apple Silicon CPU for 15 compute credits

All tasks using compute credits are charged on per-second basis. 2 CPU Linux task takes 5 minutes? Pay 3 cents.
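As a quick sanity check on the two "Pay N cents" examples, here is a minimal Python sketch of the per-task arithmetic. The rates are the ones quoted above; the function name `task_cost_usd` and the assumption that cost scales linearly with CPU count and run time are illustrative, not taken from Cirrus's documentation.

```python
# Credits (1 credit = 1 US dollar) per 1000 CPU-minutes, as quoted above.
# The macOS figure is per Apple Silicon CPU.
RATES_JULY_2023 = {"linux": 5, "freebsd": 5, "windows": 10, "macos": 40}
RATES_AUG_2023 = {"linux": 3, "freebsd": 3, "windows": 4, "macos": 15}


def task_cost_usd(rates, platform, cpus, minutes):
    """Cost of one task, assuming cost is linear in CPUs and run time."""
    cpu_minutes = cpus * minutes
    return rates[platform] * cpu_minutes / 1000


old = task_cost_usd(RATES_JULY_2023, "linux", cpus=2, minutes=2)
new = task_cost_usd(RATES_AUG_2023, "linux", cpus=2, minutes=5)
print(f"July pricing, 2 CPUs x 2 min:   ${old:.2f}")
print(f"August pricing, 2 CPUs x 5 min: ${new:.2f}")
```

Both results line up with the examples on the pricing page (2 cents before, 3 cents after).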

Best,
- DeeDeeG

@savetheclocktower
Contributor

macOS: The easiest solution for macOS is to just provide intel binaries, and have Apple Silicon users take whatever performance hit there may be from translating ARM to x86 compatibility via Rosetta 2.

I think we can mitigate this a bit: it's certainly too onerous to ask somebody to build a new Silicon binary with each rolling release, but it can at least be part of the release checklist to have one of our Apple Silicon–havers build Pulsar manually once a month for the regular releases.

That said, I think that if Cirrus can give us a straight answer on the SSL stuff or we can figure out a way not to have our CI need to download ripgrep, then Cirrus still seems like the best option for Apple Silicon. If we moved other platforms to GitHub Actions and just used Cirrus for sanity checks on Silicon and binaries, then we could avoid most of the pain of this change.

We could even throttle Silicon usage on Cirrus to run less often than with every PR — maybe once a day, or only when PRs land, or something. I feel like this would be less worse than most of the options we've been discussing.

@DeeDeeG
Member

DeeDeeG commented Aug 25, 2023

How many Apple Silicon macOS runs we could do in 50 credits-equivalent on Cirrus

If we use Cirrus strictly for Apple Silicon macOS builds, we can get up to 833.333[...] minutes of macOS builds per month within the free 50 credits.

Our typical Apple Silicon task run lengths (and how many of those we could do) are:

  • Shortest recorded run: 14.35 minutes --> (up to 58 task runs per month)
  • Average run length: 17.947 minutes --> (up to 46 runs)
  • Median run length: 17.292 minutes --> (up to 48 runs)
  • Longest recorded run: 26.6 minutes --> (up to 31 runs)

How many real-world Apple Silicon CI run minutes can we do per month? Showing my work:

The essential numbers this is using, with sources:

The math:

  • 1000 CPU-minutes per 15 credits, divided by 4 CPU cores = 250 real-world minutes of macOS tasks per 15 credits
  • 50 credits' worth of free usage per month divided by 15 = 3.3333[...]. So, we get 3.3333[...] times the per-15-credits allotment of minutes of macOS tasks
  • 3.3333[...] times the per-15-credit real-world minutes (i.e. 250) = 833.3333[...] real-world minutes of macOS tasks per month
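The three steps above collapse into one small function. This is only a sketch of the same arithmetic in Python; the function and parameter names are made up for illustration, and the inputs (50 free credits, 15 credits per 1000 CPU-minutes, 4 CPU cores) are the figures used in the list.

```python
def free_wall_clock_minutes(free_credits, credits_per_1000_cpu_min, cpus):
    """Real-world task minutes covered by the monthly free credits."""
    # Credits buy CPU-minutes; a task on N CPUs burns N CPU-minutes per minute.
    cpu_minutes = free_credits / credits_per_1000_cpu_min * 1000
    return cpu_minutes / cpus


macos_minutes = free_wall_clock_minutes(50, 15, cpus=4)
print(f"{macos_minutes:.2f} real-world minutes of macOS tasks per month")
```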

Table of our typical Apple Silicon macOS task durations:

| Duration | Decimal minutes | Link to CI run | Commit SHA | Date |
| --- | --- | --- | --- | --- |
| 18:47 | 18.783 | https://cirrus-ci.com/build/6187472653123584 | 11f662c | Created at 12:14:14 PM on Fri Aug 25 2023 |
| 16:51 | 16.85 | https://cirrus-ci.com/build/5543977768714240 | 54eaba3 | Created at 9:02:27 PM on Thu Aug 24 2023 |
| 26:36 | 26.6 | https://cirrus-ci.com/build/5303988216659968 | c7e2567 | Created at 8:09:20 PM on Tue Aug 22 2023 |
| 21:58 | 21.967 | https://cirrus-ci.com/build/6014699137925120 | 0c65971 | Created at 7:43:50 PM on Tue Aug 22 2023 |
| 14:21 | 14.35 | https://cirrus-ci.com/build/6329416758853632 | cd16715 | Created at 1:42:18 AM on Wed Aug 16 2023 |
| 15:13 | 15.217 | https://cirrus-ci.com/build/6204189873799168 | aabb845 | Created at 8:20:02 PM on Wed Aug 09 2023 |
| 17:36 | 17.6 | https://cirrus-ci.com/build/4910394007879680 | 76e358b | Created at 7:26:52 PM on Tue Jul 25 2023 |
| 14:59 | 14.983 | https://cirrus-ci.com/build/6175381313552384 | 099d5b8 | Created at 10:40:41 PM on Sat Jul 22 2023 |
| 16:59 | 16.983 | https://cirrus-ci.com/build/5118020444487680 | 5a2d4d9 | Created at 10:19:15 PM on Mon Jul 10 2023 |
| 18:01 | 18.017 | https://cirrus-ci.com/build/4696558575288320 | 0389eee | Created at 12:12:32 AM on Thu Jul 06 2023 |
| 18:51 | 18.85 | https://cirrus-ci.com/build/6534048663732224 | 90ec8e5 | Created at 3:53:30 PM on Tue Jul 04 2023 |
| 15:10 | 15.167 | https://cirrus-ci.com/build/6662199431659520 | fc156b3 | Created at 11:38:19 PM on Thu Jun 29 2023 |

Our typical Apple Silicon task run lengths (and how many of those we could do, given 833.33 real-world minutes of macOS runs we can do in a month for free) are:

  • Shortest recorded run: 14.35 minutes --> (up to 58 task runs per month)
  • Average run length: 17.947 minutes --> (up to 46 runs)
  • Median run length: 17.292 minutes --> (up to 48 runs)
  • Longest recorded run: 26.6 minutes --> (up to 31 runs)

So, I estimate we could do anywhere from 31-58 macOS runs per month (including re-runs for failures, wasted or mistakenly scheduled runs, etc), and that ~30 is a safer conservative estimate.
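For anyone who wants to re-check these figures, here is a small Python sketch that recomputes the shortest/average/median/longest run lengths from the twelve recorded durations and floors the number of runs that fit in the free allowance. The variable and function names are illustrative only.

```python
import math
import statistics

# Decimal-minute durations of the 12 recorded Apple Silicon runs.
durations = [18.783, 16.85, 26.6, 21.967, 14.35, 15.217,
             17.6, 14.983, 16.983, 18.017, 18.85, 15.167]

# 50 free credits at 15 credits per 1000 CPU-minutes, on 4 CPUs.
FREE_MINUTES = 50 / 15 * 1000 / 4


def runs_per_month(run_minutes, budget=FREE_MINUTES):
    """Whole runs of the given length that fit in the monthly budget."""
    return math.floor(budget / run_minutes)


summary = {
    "shortest": min(durations),
    "average": statistics.mean(durations),
    "median": statistics.median(durations),
    "longest": max(durations),
}
for label, minutes in summary.items():
    print(f"{label}: {minutes:.3f} min -> up to {runs_per_month(minutes)} runs")
```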

Conclusion:

That means we could run one just about daily, but not necessarily one literally every day, and keeping in mind that that would leave no run-time for any ARM Linux builds or separate "Regular" version number builds, or failed/mistakenly scheduled/otherwise "wasted" runs.

So, IMO we should look at ~30 and say we can't do it more than every couple days, every few days, something like that at most. And note that our ARM Linux usage could be quite a bit less, though I haven't run all the math on that yet.

P.S. (edit to add): Fun fact, the "Apple Silicon" builds take (really roughly) 1/2 the time as the intel ones. I suppose emulating x86 ain't cheap. SO. That means the Apple Silicon builds represent only about a third of our existing macOS usage on Cirrus. Which, IMO, is fantastic. Given that we can get intel macOS runs elsewhere.

@confused-Techie
Member Author

With @DeeDeeG's amazing breakdown of price, and considering how often we could hypothetically be packaging builds for both Silicon and ARM, I thought it'd be a good idea to take a look at what sort of impact limiting the number of rolling releases could have.

Since rolling releases are provided by our download microservice, I was able to take a look at the logs to determine how many downloads we are receiving for each OS. An important caveat here is that our logs only go back 30 days, so that's all I'm able to see.

Rolling Release Download count by Platform over the past 30 days

  • Apple Silicon: 860
  • ARM Linux: 109
  • Windows: 2,370
  • Intel Mac: 550
  • Linux: 1,624

This means out of a total of 5,513 rolling release downloads over the last 30 days:

  • Apple Silicon: 15.59%
  • ARM Linux: 1.97%
  • Windows: 42.98%
  • Intel Mac: 9.97%
  • Linux: 29.45%

Truncated to two decimal places
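The percentages can be reproduced from the raw counts with the short Python sketch below. It truncates (rather than rounds) to two decimal places using integer arithmetic; the helper name is hypothetical.

```python
downloads = {
    "Apple Silicon": 860,
    "ARM Linux": 109,
    "Windows": 2370,
    "Intel Mac": 550,
    "Linux": 1624,
}
total = sum(downloads.values())


def truncated_percent(count, total):
    """Percentage of total, truncated (not rounded) to two decimal places."""
    # Work in hundredths of a percent so floor division does the truncation.
    return (count * 10000 // total) / 100


print(f"Total: {total}")
for platform, count in downloads.items():
    print(f"{platform}: {truncated_percent(count, total):.2f}%")
```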


So all of this is to say, it seems that under no circumstances should we consider stopping all Apple Silicon Rolling Release builds. (I also wouldn't suggest we stop Apple Silicon Regular Releases, considering Apple Silicon accounts for 4,959 downloads of our Regular releases from GitHub, or about 9.63% of all regular release downloads.)

As for ARM Linux, while I would hate to stop those rolling releases, and it's cheaper to still produce them, I could see the argument being made to discontinue them, although I don't think we will have to.

I just thought I'd provide some insight into what kind of impact changes to the rolling release schedule might have on our users. I do think a smart move would be to not generate these binaries for every PR, instead generating them only on, say, commits to the master branch. I would also be on board with reducing usage even further, such as, like mentioned, moving ARM Linux and Apple Silicon to more of a nightly release schedule.

@DeeDeeG
Member

DeeDeeG commented Aug 26, 2023

if Cirrus can give us a straight answer on the SSL stuff or we can figure out a way not to have our CI need to download ripgrep, then Cirrus still seems like the best option for Apple Silicon.

For the record, the macOS builds have been unaffected by the SSL errors, IIRC. (That's been ARM Linux and Windows.)

The ripgrep thing could affect us on any CI, but GitHub Actions having its own dedicated API token makes it simpler to keep those requests "fed" with API access. Still, a team member with low usage against the GitHub API could make a permissionless, non-expiring token and we'd basically be done with ripgrep issues, like 98+% of the time, knock on wood.

@DeeDeeG
Member

DeeDeeG commented Aug 26, 2023

As sad as it would be to think of ARM Linux as a "less priority" platform, it is our least populous category of users. That said, if we are smart and think ahead, I think we can leave enough CPU hours of Cirrus for ARM Linux, especially since each CPU-minute of ARM Linux is a literal fraction of the cost of macOS and we can scale down the CPUs to either 1 or 2 for max efficiency, if needed.

Side note about the "greedy" config option for Cirrus:

(Also, tiny note: if using Cirrus, we should turn the "greedy" option on. They only bill you for the CPU-minutes you specify you need, but extra free CPUs in the data-center, if available, can speed up your run at no extra credit cost to you, thus making the duration of the run shorter, and maybe reducing your billable CPU-minutes, since the minimum (i.e. billable) CPUs you requested, times run duration, goes down??? This config option only applies to Linux and Windows, tho, and their allocation of ARM Linux CPUs seems to be tighter than the x86 ones, so it might make no difference... Hmm. Just a thought.)


I wanna do the math on how many credits ARM Linux takes, cuz I suspect it's peanuts. The SSL thing will make it comparatively annoying and risky to run though, so we are going to have to make sure re-runs of ARM Linux don't eat up the credits and compromise our ability to get Apple Silicon macOS binaries out there. IMO.

@confused-Techie
Member Author

So it seems as of now, the general consensus is to just use CirrusCI to generate our ARM binaries, possibly only generating them on commits to the master branch, which should only occur after a PR is merged, which really is the point of these binaries. That does mean we won't be running our visual tests on them, but that doesn't worry me too much, as I'd be surprised (but knock on wood) to find an error only affecting ARM.

As for Apple Silicon, I am still worried about running it on CirrusCI, just because the cost is so high if we do go over the free limit. Although if the math works out as stated above, do we think we could get away with a rolling release every other day for Apple Silicon on Cirrus, running it at midnight once every two days? That way, even assuming that every run is our longest recorded run, we won't go above our free usage limits, and we keep some free runs in reserve, say for manually invoking a run for our regular releases. This again would mean we no longer generate them for PRs or even on every commit to master, which does make testing slightly more difficult, but that may just have to be the price we pay for these more unique and uncommon platforms.
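A rough check of the every-two-days idea, as a Python sketch: it assumes a 31-day month, that every scheduled run takes as long as the longest recorded Apple Silicon run (26.6 minutes), and that nothing else (ARM Linux, re-runs) draws on the same free credits. Those assumptions and the function name are illustrative, not settled facts.

```python
FREE_MINUTES = 50 / 15 * 1000 / 4  # ~833.33 free macOS minutes per month
LONGEST_RUN = 26.6                 # longest recorded Apple Silicon run, in minutes


def scheduled_usage(days_in_month, every_n_days, run_minutes):
    """Number of scheduled runs in the month and the minutes they consume."""
    runs = -(-days_in_month // every_n_days)  # ceiling division
    return runs, runs * run_minutes


runs, used = scheduled_usage(31, 2, LONGEST_RUN)
spare = FREE_MINUTES - used
print(f"{runs} scheduled runs use {used:.1f} of {FREE_MINUTES:.1f} minutes")
print(f"{spare:.1f} minutes left, enough for {int(spare // LONGEST_RUN)} more worst-case runs")
```

Under those assumptions the schedule uses roughly half of the free macOS minutes, which leaves headroom for manual runs, though any ARM Linux usage would eat into the same credits.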

@DeeDeeG
Member

DeeDeeG commented Aug 26, 2023

I say we don't give Cirrus billing info, such that it just turns off the CI spigot if we go over, not charges us. (Unless we wanna support them at say, the fixed $10 a month tier or something with an upper bound, just out of appreciation and support for what they're doing, etc... EDIT: We would discuss this before doing it, not just offhand since I mentioned it, obviously.)

And yeah, I agree with the direction things seem to be going, toward running some ARM macOS and some ARM Linux on Cirrus.

TODO: I wanna know how much the typical ARM Linux + Apple Silicon macOS build would be. Then calculate how many of those we can safely get away with, and using a conservative estimate, fly safely within the limits. Keeping in mind we have to play it really safe, or we miss out on being able to build a Regular release in a timely manner.

EDIT to add: And the variability with the SSL errors means, to play it safe, we have to take the most conservative approach to (at least estimating and planning for) those re-runs. Maybe we just don't re-run most of the time for Rolling, since it's so low-priority and there'll be another in a few days?

Math for the ARM Linux runs pending after I eat something, I guess.

(Unless someone gets to it before me, I showed my work and anyone with admin for this repo can open the edit view for my comment to copy-paste the markdown table and just work from there.)

EDIT ALSO TO ADD: I suppose we need to re-write the download micro-service to return GitHub Releases binaries for those if we get that set up. Cuz they won't be fetchable from Cirrus, at this rate.

@DeeDeeG
Member

DeeDeeG commented Aug 27, 2023

How many ARM Linux runs we could do in 50 credits-equivalent on Cirrus

If we used Cirrus strictly for [ARM] Linux builds, we could get up to 8333.333[...] minutes of Linux builds per month within the free 50 credits.

(Not a typo, by the way, that's fully 10x the amount of real-world macOS minutes we could get, with the same size of 50 credit-equivalent allowance, if I'm doing the math right.)

Our typical ARM Linux task run lengths (and how many of those we could do) are:

  • Shortest recorded run: 28.367 minutes --> (up to 293 task runs per month)
  • Median run length: 30.208 minutes --> (up to 275 runs)
  • Average run length: 32.114 minutes --> (up to 259 runs)
  • Longest recorded run: 45.883 minutes --> (up to 181 runs)

How many real-world ARM Linux run minutes can we do per month? Showing my work:

The essential numbers this is using, with sources:

The math:

  • 1000 CPU-minutes (per 3 credits), divided by 2 CPU cores = 500 real-world minutes of Linux tasks per 3 credits
  • 50 credits' worth of free usage per month divided by 3 = 16.6666[...]. So, we get 16.6666[...] times the per-3-credits allotment of minutes of ARM Linux tasks
  • 16.6666[...] times the per-3-credit real-world minutes (i.e. 500) = 8333.3333[...] real-world minutes of Linux tasks per month

Table of our typical ARM Linux task durations:

| Duration | Decimal minutes | Link to CI run | Commit SHA | Date |
| --- | --- | --- | --- | --- |
| 29:04 | 29.067 | https://cirrus-ci.com/build/6187472653123584 | 11f662c | Created at 12:14:14 PM on Fri Aug 25 2023 |
| 45:53 | 45.883 | https://cirrus-ci.com/build/5543977768714240 | 54eaba3 | Created at 9:02:27 PM on Thu Aug 24 2023 |
| 28:23 | 28.383 | https://cirrus-ci.com/build/5303988216659968 | c7e2567 | Created at 8:09:20 PM on Tue Aug 22 2023 |
| 28:22 | 28.367 | https://cirrus-ci.com/build/6014699137925120 | 0c65971 | Created at 7:43:50 PM on Tue Aug 22 2023 |
| 28:32 | 28.533 | https://cirrus-ci.com/build/6329416758853632 | cd16715 | Created at 1:42:18 AM on Wed Aug 16 2023 |
| 30:59 | 30.983 | https://cirrus-ci.com/build/6204189873799168 | aabb845 | Created at 8:20:02 PM on Wed Aug 09 2023 |
| 28:33 | 28.55 | https://cirrus-ci.com/build/4910394007879680 | 76e358b | Created at 7:26:52 PM on Tue Jul 25 2023 |
| 29:26 | 29.433 | https://cirrus-ci.com/build/6175381313552384 | 099d5b8 | Created at 10:40:41 PM on Sat Jul 22 2023 |
| 35:30 | 35.5 | https://cirrus-ci.com/build/5118020444487680 | 5a2d4d9 | Created at 10:19:15 PM on Mon Jul 10 2023 |
| 33:08 | 33.133 | https://cirrus-ci.com/build/4696558575288320 | 0389eee | Created at 12:12:32 AM on Thu Jul 06 2023 |
| 34:11 | 34.183 | https://cirrus-ci.com/build/6534048663732224 | 90ec8e5 | Created at 3:53:30 PM on Tue Jul 04 2023 |
| 33:21 | 33.35 | https://cirrus-ci.com/build/6662199431659520 | fc156b3 | Created at 11:38:19 PM on Thu Jun 29 2023 |

Our typical ARM Linux task run lengths (and how many of those we could do, given 8333.33 real-world minutes of Linux runs we can do in a month for free) are:

  • Shortest recorded run: 28.367 minutes --> (up to 293 task runs per month)
  • Median run length: 30.208 minutes --> (up to 275 runs)
  • Average run length: 32.114 minutes --> (up to 259 runs)
  • Longest recorded run: 45.883 minutes --> (up to 181 runs)

Discussion (In summary: not super useful as-is, but combined with macOS data can tell us how much of each we can do, combined. That math coming soon?):

We are definitely looking into running Apple Silicon macOS builds as a first priority, and seeing about how much ARM Linux we can do in parallel. As such, these numbers aren't very useful. (We wouldn't actually consider using Cirrus only for ARM Linux builds at the moment.)

But it's part of the way there. One more round of math and I can see how much our runs cost in units of credit-equivalent cost, a decimal amount of credits for a given build consisting of the "ARM Linux task" usage plus the "Apple Silicon macOS task" usage.

MORE MATH INCOMING, as I get to it, hopefully right now I'm going to be able to work on it if no interruptions from outside stuff.

@DeeDeeG
Member

DeeDeeG commented Aug 27, 2023

Costs for "ARM Linux" + "Apple Silicon macOS" combined, on Cirrus, and How many builds of both we can do each month

Math: How many credits per minute, for ARM Linux and Apple Silicon macOS task execution, respectively? Showing my work:

Using the calculations above, we can derive how many credits-equivalent it costs per minute of ARM Linux task execution, and how many credits-equivalent it costs per minute of Apple Silicon macOS task execution.

How many credits-equivalent per minute of ARM Linux execution? (Assuming our requirement of 2 CPUs minimum): 50 credits ÷ 8333.3333[...] minutes we can run of real-world ARM Linux tasks in total per 50 credits = 0.006 credits-per-minute of ARM Linux task execution

How many credits-equivalent per minute of Apple Silicon macOS execution? (Assuming non-configurable 4 CPUs for public repos/free usage): 50 credits ÷ 833.3333[...] minutes we can run of real-world Apple Silicon macOS tasks in total per 50 credits = 0.06 credits-per-minute of Apple Silicon macOS task execution

Answer:

  • 0.006 credits per minute for ARM Linux task execution
  • 0.06 credits per minute for Apple Silicon task execution (10x as much cost-per-minute)
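The same per-minute rates also fall straight out of the published pricing, without going through the 50-credit totals. A minimal Python sketch, assuming 3 credits per 1000 CPU-minutes on 2 CPUs for ARM Linux and 15 credits per 1000 CPU-minutes on 4 CPUs for Apple Silicon (the function name is illustrative):

```python
def credits_per_minute(credits_per_1000_cpu_min, cpus):
    """Credits consumed per real-world minute of a task using `cpus` CPUs."""
    return credits_per_1000_cpu_min / 1000 * cpus


arm_linux = credits_per_minute(3, cpus=2)
apple_silicon = credits_per_minute(15, cpus=4)
print(f"ARM Linux:     {arm_linux:.3f} credits/min")
print(f"Apple Silicon: {apple_silicon:.3f} credits/min")
print(f"Ratio:         {apple_silicon / arm_linux:.1f}x")
```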

Table of credit-equivalent usage/costs for the 12 recent successful builds I collected data for (Apple Silicon macOS + ARM Linux tasks):

| Apple Silicon macOS minutes (credits) | ARM Linux minutes (credits) | Total Credits (macOS + Linux) | Link to CI run | Commit SHA | Date |
| --- | --- | --- | --- | --- | --- |
| 18.783 min (1.127 cred) | 29.067 min (0.174 cred) | 1.301 credits | https://cirrus-ci.com/build/6187472653123584 | 11f662c | 12:14 PM, Fri Aug 25 2023 |
| 16.85 m (1.011 c) | 45.883 m (0.275 c) | 1.286 credits | https://cirrus-ci.com/build/5543977768714240 | 54eaba3 | 9:02 PM, Thu Aug 24 2023 |
| 26.6 (1.596) | 28.383 (0.170) | 1.766 credits | https://cirrus-ci.com/build/5303988216659968 | c7e2567 | 8:09 PM, Tue Aug 22 2023 |
| 21.967 (1.318) | 28.367 (0.170) | 1.488 credits | https://cirrus-ci.com/build/6014699137925120 | 0c65971 | 7:43 PM, Tue Aug 22 2023 |
| 14.35 (0.861) | 28.533 (0.171) | 1.032 credits | https://cirrus-ci.com/build/6329416758853632 | cd16715 | 1:42 AM, Wed Aug 16 2023 |
| 15.217 (0.913) | 30.983 (0.186) | 1.099 credits | https://cirrus-ci.com/build/6204189873799168 | aabb845 | 8:20 PM, Wed Aug 09 2023 |
| 17.6 (1.056) | 28.55 (0.171) | 1.227 credits | https://cirrus-ci.com/build/4910394007879680 | 76e358b | 7:26 PM, Tue Jul 25 2023 |
| 14.983 (0.899) | 29.433 (0.177) | 1.076 credits | https://cirrus-ci.com/build/6175381313552384 | 099d5b8 | 10:40 PM, Sat Jul 22 2023 |
| 16.983 (1.019) | 35.5 (0.213) | 1.232 credits | https://cirrus-ci.com/build/5118020444487680 | 5a2d4d9 | 10:19 PM, Mon Jul 10 2023 |
| 18.017 (1.081) | 33.133 (0.199) | 1.28 credits | https://cirrus-ci.com/build/4696558575288320 | 0389eee | 12:12 AM, Thu Jul 06 2023 |
| 18.85 (1.131) | 34.183 (0.205) | 1.336 credits | https://cirrus-ci.com/build/6534048663732224 | 90ec8e5 | 3:53 PM, Tue Jul 04 2023 |
| 15.167 (0.91) | 33.35 (0.2) | 1.11 credits | https://cirrus-ci.com/build/6662199431659520 | fc156b3 | 11:38 PM, Thu Jun 29 2023 |

How many credits do we typically use for these builds, and how many of these builds could we do, given one Apple Silicon macOS task and one ARM Linux task (no re-runs, no wasted or unintended runs, no "do-overs")?

  • Lowest credit-cost build recorded: 1.032 credits (up to 48 builds per month)
  • Median credit-cost build: 1.256 credits (up to 39 builds)
  • Average credit-cost build: 1.269 credits (up to 39 builds)
  • Highest credit-cost build recorded: 1.766 credits (up to 28 builds)
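Here is a hedged Python sketch of how each per-build credit cost can be derived from the two task durations and the per-minute rates (0.06 for Apple Silicon, 0.006 for ARM Linux). The names are illustrative. Because it works from unrounded values, a last digit may differ slightly from figures that were rounded per row first, but the lowest and highest costs and the build counts come out the same.

```python
import statistics

MACOS_RATE = 0.06   # credits per minute, Apple Silicon macOS task
LINUX_RATE = 0.006  # credits per minute, ARM Linux task
FREE_CREDITS = 50

# (Apple Silicon minutes, ARM Linux minutes) for the 12 recorded builds.
builds = [
    (18.783, 29.067), (16.85, 45.883), (26.6, 28.383), (21.967, 28.367),
    (14.35, 28.533), (15.217, 30.983), (17.6, 28.55), (14.983, 29.433),
    (16.983, 35.5), (18.017, 33.133), (18.85, 34.183), (15.167, 33.35),
]


def build_cost(macos_minutes, linux_minutes):
    """Credits used by one build: one macOS task plus one ARM Linux task."""
    return macos_minutes * MACOS_RATE + linux_minutes * LINUX_RATE


costs = [build_cost(mac, linux) for mac, linux in builds]
typical = {
    "lowest": min(costs),
    "median": statistics.median(costs),
    "average": statistics.mean(costs),
    "highest": max(costs),
}
for label, cost in typical.items():
    print(f"{label}: {cost:.3f} credits -> up to {int(FREE_CREDITS // cost)} builds")
```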
Projecting alternate scenarios, with either more or less frequent ARM Linux builds, to see if that can help us tailor our credit budget, given the higher priority placed on macOS builds being completed:

Double the ARM Linux usage (very rough simulation of having to re-run every ARM Linux task one time, on account of SSL errors):

Table:

| Apple Silicon macOS minutes (credits) | ARM Linux minutes (doubled credits) | Total Simulated Credits (macOS + Linux*2) | Link to CI run | Commit SHA | Date |
| --- | --- | --- | --- | --- | --- |
| 18.783 min (1.127 cred) | 29.067 min * 2 (0.348 cred) | 1.475 credits | https://cirrus-ci.com/build/6187472653123584 | 11f662c | 12:14 PM, Fri Aug 25 2023 |
| 16.85 m (1.011 c) | 45.883 m * 2 (0.55 c) | 1.561 credits | https://cirrus-ci.com/build/5543977768714240 | 54eaba3 | 9:02 PM, Thu Aug 24 2023 |
| 26.6 (1.596) | 28.383 * 2 (0.34) | 1.936 credits | https://cirrus-ci.com/build/5303988216659968 | c7e2567 | 8:09 PM, Tue Aug 22 2023 |
| 21.967 (1.318) | 28.367 * 2 (0.34) | 1.658 credits | https://cirrus-ci.com/build/6014699137925120 | 0c65971 | 7:43 PM, Tue Aug 22 2023 |
| 14.35 (0.861) | 28.533 * 2 (0.342) | 1.203 credits | https://cirrus-ci.com/build/6329416758853632 | cd16715 | 1:42 AM, Wed Aug 16 2023 |
| 15.217 (0.913) | 30.983 * 2 (0.372) | 1.285 credits | https://cirrus-ci.com/build/6204189873799168 | aabb845 | 8:20 PM, Wed Aug 09 2023 |
| 17.6 (1.056) | 28.55 * 2 (0.342) | 1.398 credits | https://cirrus-ci.com/build/4910394007879680 | 76e358b | 7:26 PM, Tue Jul 25 2023 |
| 14.983 (0.899) | 29.433 * 2 (0.354) | 1.253 credits | https://cirrus-ci.com/build/6175381313552384 | 099d5b8 | 10:40 PM, Sat Jul 22 2023 |
| 16.983 (1.019) | 35.5 * 2 (0.426) | 1.445 credits | https://cirrus-ci.com/build/5118020444487680 | 5a2d4d9 | 10:19 PM, Mon Jul 10 2023 |
| 18.017 (1.081) | 33.133 * 2 (0.398) | 1.479 credits | https://cirrus-ci.com/build/4696558575288320 | 0389eee | 12:12 AM, Thu Jul 06 2023 |
| 18.85 (1.131) | 34.183 * 2 (0.41) | 1.541 credits | https://cirrus-ci.com/build/6534048663732224 | 90ec8e5 | 3:53 PM, Tue Jul 04 2023 |
| 15.167 (0.91) | 33.35 * 2 (0.4) | 1.31 credits | https://cirrus-ci.com/build/6662199431659520 | fc156b3 | 11:38 PM, Thu Jun 29 2023 |

So, when doubling ARM Linux cost (rough simulation of re-running every ARM Linux task once for SSL errors, worst-case scenario-ish):

  • Lowest simulated credit-cost build: 1.203 credits (up to 41 builds per month)
  • Median simulated credit-cost build: 1.46 credits (up to 34 builds)
  • Average simulated credit-cost build: 1.462 credits (up to 34 builds)
  • Highest simulated credit-cost build: 1.936 credits (up to 25 builds)

You could say this knocks three to seven builds off the number we can comfortably predict we can run each month. I feel it's good to lean conservative here, especially for the first month -- once you're out of free credits, you're out of them. No more free builds. Would have to add some credits and start paying. (Although if we are close to the free 50-credit-equivalent usage, it wouldn't be a ton of cost.)

Half the ARM Linux usage (very rough simulation of just running relatively fewer ARM Linux builds, so as to save credits):

Table:

| Apple Silicon macOS minutes (credits) | ARM Linux minutes (halved credits) | Total Simulated Credits (macOS + Linux/2) | Link to CI run | Commit SHA | Date |
| --- | --- | --- | --- | --- | --- |
| 18.783 min (1.127 cred) | 29.067 min / 2 (0.087 cred) | 1.214 credits | https://cirrus-ci.com/build/6187472653123584 | 11f662c | 12:14 PM, Fri Aug 25 2023 |
| 16.85 m (1.011 c) | 45.883 m / 2 (0.138 c) | 1.149 credits | https://cirrus-ci.com/build/5543977768714240 | 54eaba3 | 9:02 PM, Thu Aug 24 2023 |
| 26.6 (1.596) | 28.383 / 2 (0.085) | 1.681 credits | https://cirrus-ci.com/build/5303988216659968 | c7e2567 | 8:09 PM, Tue Aug 22 2023 |
| 21.967 (1.318) | 28.367 / 2 (0.085) | 1.403 credits | https://cirrus-ci.com/build/6014699137925120 | 0c65971 | 7:43 PM, Tue Aug 22 2023 |
| 14.35 (0.861) | 28.533 / 2 (0.086) | 0.947 credits | https://cirrus-ci.com/build/6329416758853632 | cd16715 | 1:42 AM, Wed Aug 16 2023 |
| 15.217 (0.913) | 30.983 / 2 (0.093) | 1.006 credits | https://cirrus-ci.com/build/6204189873799168 | aabb845 | 8:20 PM, Wed Aug 09 2023 |
| 17.6 (1.056) | 28.55 / 2 (0.086) | 1.142 credits | https://cirrus-ci.com/build/4910394007879680 | 76e358b | 7:26 PM, Tue Jul 25 2023 |
| 14.983 (0.899) | 29.433 / 2 (0.089) | 0.988 credits | https://cirrus-ci.com/build/6175381313552384 | 099d5b8 | 10:40 PM, Sat Jul 22 2023 |
| 16.983 (1.019) | 35.5 / 2 (0.107) | 1.126 credits | https://cirrus-ci.com/build/5118020444487680 | 5a2d4d9 | 10:19 PM, Mon Jul 10 2023 |
| 18.017 (1.081) | 33.133 / 2 (0.1) | 1.181 credits | https://cirrus-ci.com/build/4696558575288320 | 0389eee | 12:12 AM, Thu Jul 06 2023 |
| 18.85 (1.131) | 34.183 / 2 (0.103) | 1.234 credits | https://cirrus-ci.com/build/6534048663732224 | 90ec8e5 | 3:53 PM, Tue Jul 04 2023 |
| 15.167 (0.91) | 33.35 / 2 (0.1) | 1.01 credits | https://cirrus-ci.com/build/6662199431659520 | fc156b3 | 11:38 PM, Thu Jun 29 2023 |

So, when halving ARM Linux cost (rough simulation of just running relatively fewer ARM Linux builds, so as to save credits):

  • Lowest simulated credit-cost build: 0.947 credits (up to 52 builds per month)
  • Median simulated credit-cost build: 1.146 credits (up to 43 builds)
  • Average simulated credit-cost build: 1.173 credits (up to 42 builds)
  • Highest simulated credit-cost build: 1.681 credits (up to 29 builds)
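For anyone wanting to re-check these summary numbers, here is a rough sketch that recomputes them from the "Total Simulated Credits" column of the halved table. The 50-credit monthly budget is taken from the discussion above; the variable names are only for illustration.

```python
import math
import statistics

FREE_CREDITS_PER_MONTH = 50  # free monthly allowance discussed above

# "Total Simulated Credits" column from the halved-ARM-Linux table, in table order.
halved_totals = [
    1.214, 1.149, 1.681, 1.403, 0.947, 1.006,
    1.142, 0.988, 1.126, 1.181, 1.234, 1.01,
]

summary = {
    "lowest": min(halved_totals),
    "median": statistics.median(halved_totals),
    "average": statistics.fmean(halved_totals),
    "highest": max(halved_totals),
}

# Whole builds that fit in the free allowance at each per-build cost.
builds = {
    label: math.floor(FREE_CREDITS_PER_MONTH / cost)
    for label, cost in summary.items()
}

for label, cost in summary.items():
    print(f"{label}: ~{cost:.3f} credits per build, up to {builds[label]} builds")
```

With only twelve data points, the median and average are close enough together that either one gives roughly the same monthly build count.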

You could say if we run ARM Linux tasks 1/2 as often as macOS tasks, that could save us... enough credits for one to four more macOS builds. Not really worth it, you could argue. But I suppose it can save us... a rather small amount of credits in a pinch.

If we go further than "half as often", and instead really run almost zero ARM Linux builds... Such as only offering ARM Linux builds for the Regular releases, it could save us enough credits for (asymptotically approaching -- that is to say almost) three to ten more macOS builds per month, depending on how optimistically you project it out, based on my data analyses above. (Which is all pretty rough and approximate, you can't predict the future, only try to define trends from the past.)

@confused-Techie
Member Author

confused-Techie commented Aug 27, 2023

@DeeDeeG Amazing write up on all of this, and it's really good to be able to land on a final number, based on our current trends.

Which, from the above, seems to be that we should plan for a maximum of 28 builds on Cirrus per month, assuming each build consists of a single run of ARM Linux and Apple Silicon. Although of course, considering retries (especially with CirrusCI's issues with SSL right now), we must aim much lower than that.

Which seems to still work for the above idea of running once every other day. That should give us about 15 builds per month, which leaves a very healthy buffer for the retries needed for something like a Regular Release, or some other circumstance that'd cause us to run these any more than once every other day.

So seriously thanks for all this work!

I suppose this leaves one thing left to do: act on these plans.


As it looks now, here's the full list of everything we must do (borrowed from DeeDeeG on Discord)

This is quite a bit to tackle in the few short days we have. But luckily we already have a head start.

If anybody would like to take ownership of any of the following tasks, please feel free to comment here, and let's see if we can accomplish this all in time to ensure the end users of rolling releases for these platforms never notice anything changed.

@DeeDeeG
Member

DeeDeeG commented Aug 27, 2023

I found the docs for scheduled (cron-style) Cirrus runs: https://cirrus-ci.org/guide/writing-tasks/#cron-builds

It is possible to configure invocations of re-occurring builds via the well-known Cron expressions. Cron builds can be configured on a repository's settings page (not in .cirrus.yml).

So I'll work toward that, it should be one of the easier parts of all this.

Also, disabling the non-ARM tasks in the .cirrus.yml should be easy as well: https://cirrus-ci.org/guide/writing-tasks/#conditional-task-execution. Likewise, we should disable building on push to branches, since we'll only be doing the cron (scheduled) builds. And if possible, restrict building on push to branches on the Cirrus settings dashboard so no-one can "resource exhaustion attack" our free credits allocation 👀.

@confused-Techie
Member Author

Ahh, thanks for finding this. Some parts of those docs led me to believe that having to check for certain items, such as PR labels, would still count towards our usage, but you seem to be right that it doesn't.

So yes, working out the cron job we want to run for these builds is a must, and we should probably also build in the functionality for publishing these artifacts.

Speaking of publishing rolling release artifacts, I know we've discussed this before, and I've looked at Atom's old nightly repo, but it seems the nightly repo just consisted of tags with source code tars.

Would you think uploading the binaries to a tag of the rolling release build number is the best method to do this?

Or, do we just add them to a repository itself, and each new binary for each platform replaces the old one, so that way there's only ever a single binary for each platform?

(The latter option above would make the download microservice trivial to change, which would be nice)

@DeeDeeG
Member

DeeDeeG commented Aug 27, 2023

A proposed Cron schedule for Cirrus

working out the cron job we want to run for these builds is a must, [ . . . ]

I was looking into this. I propose "15:00 UTC, Mon/Wed/Fri every week".

As a cron expression: 0 0 15 ? * MON,WED,FRI *

(This handy thing linked from the Cirrus docs helped pick/format the expression: https://www.freeformatter.com/cron-expression-generator-quartz.html)

Thoughts on/rationale as to why this cron expression:

The constraints I had in mind were, "should run twice or three times a week" (meaning at most 14 times a month, should be within our budget per above cost estimates), and "should happen on weekdays, at a time when most people in the team are likely to be awake". So that if a build goes out, and there's any issue, we can act on it rather than us all being asleep/busy when it goes out. (Also, I wanted something predictable, not something that acts drastically different for a shorter month like February, or that skips around days of the week like "every three days" would.)

Monday/Wednesday/Friday takes care of "three times a week" and "always on weekdays". (While being quite predictable and orderly).

Then there's still time of day... So... I looked at time zones for the Americas plus the UK. Plugged those time zones into an international meeting planner web app: https://www.timeanddate.com/worldclock/meetingtime.html?iso=20230828&p1=137&p2=179&p3=233&p4=136

15:00 UTC is during "business hours" for all four time zones I looked at, and if I figured this correctly, it is the only time of day that's business hours during Daylight Time and Standard time for all those time zones, maybe. So it should remain "business hours" all year, not just in Daylight Time like it is in three out of those four time zones right now, IIUC.
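As a quick sanity check on the "at most 14 times a month" figure, here is a small sketch that counts how many Mondays, Wednesdays and Fridays fall in each calendar month. It only models the proposed schedule with Python's standard library; it does not parse the Quartz expression or talk to Cirrus, and the year range is an arbitrary choice.

```python
import calendar
from datetime import date

# date.weekday() numbering: Monday is 0, Wednesday is 2, Friday is 4.
SCHEDULED_WEEKDAYS = {0, 2, 4}


def scheduled_runs(year, month):
    """Count the Mon/Wed/Fri days in the given month (one cron run per day)."""
    days_in_month = calendar.monthrange(year, month)[1]
    return sum(
        1
        for day in range(1, days_in_month + 1)
        if date(year, month, day).weekday() in SCHEDULED_WEEKDAYS
    )


# Arbitrary range of years, just to see the spread across month lengths.
counts = {
    (year, month): scheduled_runs(year, month)
    for year in range(2023, 2031)
    for month in range(1, 13)
}

print("fewest runs in a month:", min(counts.values()))
print("most runs in a month:", max(counts.values()))
```

The count varies only slightly from month to month, which fits the goal of a predictable schedule that does not behave very differently in a short month like February.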

@confused-Techie
Member Author

I've now gone ahead and updated my GitHub Actions PR for the following:

  • GitHub Actions supported platforms will build binaries on every PR and push to master, these binaries are available in the GitHub Actions Web UI (If logged in)
  • If a push is what triggered the build, then a script will be run that uploads any binaries it has locally to the repository pulsar-rolling-releases (Which doesn't exist yet), these binaries will create a new release on that repo, adding the assets as needed
  • Cirrus.yml has all tasks commented out other than Silicon and ARM

@DeeDeeG
Member

DeeDeeG commented Aug 27, 2023

Posting this checklist for locking down Cirrus from the Discord and then really going to sleep (at least I had a nap already):

Something like:

  • ARM ONLY (disable the Windows, intel macOS, and x86 Linux tasks in .cirrus.yml) (done in @confused-Techie's PR)
  • NO PR BUILDS (Delete "on push to master branch" trigger from .cirrus.yml. Make sure tasks are conditional on a check that the job is a cron job, or (extra credit) make it also work if triggered by a Regular release tag being pushed.)
  • CRON BUILDS (ONLY) (The cron schedule is enabled now ✅ )
    • (except if we can decide on how to automatically build for Regular release version tags being pushed?)
  • No builds if you don't have write access to the repo (that's what GitHub Actions is for now)
    • Waiting to flip this switch so we don't disable testing for current PRs over the next couple of days before GitHub Actions is up and running for those tests.
    • We should warn or communicate with PR authors about the need to be on the latest commits from master in order to have working CI on their PR's. Particularly, we need to make sure folks with write access to pulsar-edit/pulsar don't have any stale PR's open, since if those get updated, they may likely trigger a bunch of credit usage at once. Even after our best efforts to lock down stuff.
      • This possible credit catastrophe can be mitigated if we set Cirrus to build using the latest config from the default branch (master). Or if we clean up all the old open PRs by folks with write access to pulsar-edit/pulsar to have the latest Cirrus config file updates from master, basically merge master into those PR's...
  • Consider making builds use the default branch (master)'s copy of the CI config (.cirrus.yml), instead of their own branch's .cirrus.yml, so folks can't mess with the config in various ways (downside: Makes it harder to intentionally test actual, proper changes to the Cirrus config?)

And ideally we have GitHub Actions in place to replace all the stuff Cirrus was doing for most PRs and branch pushes, which will make the transition of most things off of Cirrus feel complete and whole, not just disabled.

@confused-Techie
Member Author

@DeeDeeG As for point 4 here, that may not be necessary?

Since if we only allow runs of people who have write permission already, then I'd argue they should be allowed to easily test their changes, and would hope they are aware of the effects changing the config will have.

If we really want to though, I believe we can disallow Cirrus from running if a certain file or path has been changed, including its own config.

@DeeDeeG
Member

DeeDeeG commented Aug 27, 2023

@confused-Techie I suppose it was to save us from ourselves, but you do have a point it would really be team members using credits. We have to ensure every team member with write access is as serious about preserving those credits as we are, having been neck deep in the pricing data like this, otherwise I think that pitfall should have a lid installed over it that we can temporarily remove if we need to tweak CI. Anyone with write access should be able to log into Cirrus to remove the restriction... Better safe than out of credits, IMO?? Open to more feedback on it, but that's my leaning.

EDIT to add: there would already need to be some intervention to get CI tweaks to run, since PR's won't be running on Cirrus at all anymore. So, hmm. Just layers of defense-in-depth against exhausting our credits. Maybe it's one too many layers and just making it cumbersome. I'll let this one percolate in my head a bit... This is a niche and mostly hypothetical concern, though, given the plans to disable PR runs in our Cirrus setup soon.

@confused-Techie
Member Author

confused-Techie commented Aug 28, 2023

Alright, to put an update out on progress here, since at this point there's quite a few moving parts that seem to be about to converge (which is a good thing).

What is needed to actually switch:

Let's first lay out our plan of action:

With all that said, it seems we are nearly complete. The last things to do would be to limit Cirrus runs, get Apple Signing secrets into GitHub, and finally test.

Please let me know if any further clarification is needed, but to further highlight the last tasks that need to be done:

@meadowsys
Member

I have added the codesigning secrets to GHA now >w<

@confused-Techie
Member Author

Seems the last thing to do is limit our Cirrus runs; then we can finally test how things look.

@confused-Techie
Member Author

I've also gone ahead and added limits to when the cirrus script will run, using:

only_if: $CIRRUS_CRON != "" || $CIRRUS_TAG == "regular_release"

This should mean that the Cirrus scripts will only be run if triggered by a cron job (which, again, will be triggered every two days), or if the tag of a PR that's triggered the run has the label regular_release.
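To make the boolean logic of that only_if line easier to reason about, here is a tiny Python model of it. This is only a sketch of the expression's truth table, not how Cirrus evaluates it; treating an unset variable as an empty string is an assumption, and the `should_run` helper and the example values ("rolling", "v1.108.0") are hypothetical.

```python
def should_run(env):
    """Model of: $CIRRUS_CRON != "" || $CIRRUS_TAG == "regular_release".

    `env` is a plain dict standing in for the build's environment variables.
    Assumption: an unset variable behaves like an empty string.
    """
    cirrus_cron = env.get("CIRRUS_CRON", "")
    cirrus_tag = env.get("CIRRUS_TAG", "")
    return cirrus_cron != "" or cirrus_tag == "regular_release"


# Hypothetical example environments.
print(should_run({"CIRRUS_CRON": "rolling"}))        # cron-triggered build
print(should_run({"CIRRUS_TAG": "regular_release"})) # matching tag value
print(should_run({"CIRRUS_TAG": "v1.108.0"}))        # some other tag value
print(should_run({}))                                # ordinary push or PR build
```

Either condition alone is enough to allow a run, so the tag check is an escape hatch on top of the cron schedule rather than an extra requirement.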

We will just need to create this label and begin using it for all the regular release tagged PRs. Only members with write access can change the labels of their PR, so we don't have to worry about drive-by contributors using it. But we must all understand that using this label will trigger Cirrus and, if we aren't careful, will use up our very limited Cirrus credits. Misuse could leave us without enough credits, holding back rolling releases or even regular releases of ARM and Silicon binaries.

@DeeDeeG
Member

DeeDeeG commented Aug 29, 2023

One more to do:
