-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite deploy to minimize API requests #38
Conversation
…has to clobber top-level bin/)
I was testing a minor memory-usage improvement to `Set`, but it turns out it doesn't actually matter (and this helped me determine that, so I might as well keep it).
… manifests vs blobs
…rect HTTP error handling
…e text / comment text)
This is real big. For now, I've left it in "many WIP commits" mode, which might make it easier or more interesting to review, but I don't think we should merge all these commits (and I was planning to squash them). |
For purposes of comparison, the Go coverage percentage on |
rateLimitBuilds([ | ||
count: 1, | ||
durationName: 'hour', | ||
userBoost: true, | ||
]), | ||
pipelineTriggers([ | ||
// TODO https://github.com/docker-library/meta-scripts/issues/22 | ||
//upstream(threshold: 'UNSTABLE', upstreamProjects: 'meta'), | ||
cron('H H/2 * * *'), | ||
// (we've dropped to only running this periodically to avoid it clogging the whole queue for a no-op, which also gives build+meta more time to cycle and get deps so they have a higher chance to all go out at once -- see the above linked issue) | ||
upstream('meta'), | ||
cron('H H/6 * * *'), // run every few hours whether we "need" it or not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This effectively sets us to run deploy
every time meta
succeeds, but at most once per hour (unless we run manually). I don't personally think this is right, but it's maybe close enough for first pass (triggers relatively frequently, but not so frequently that we cause problems for anyone else).
I'd also add the reminder that this job running is only a blocker for users to see new builds, not for dependent/child builds, so it's pretty reasonable for them to have a 1-2 hour delay (and we can always run manually if we know the builds are done and need to get a deploy out faster than that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, a couple questions
// panic instead of error because this should've already been handled/normalized above (so this is a coding error, not a runtime error) | ||
} | ||
|
||
return normal, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is quite a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeahhhhhhh -- I actually wrote this as part of main()
itself, and when I stepped back from the implementation I almost died and thus forced myself to re-write it in such a way that I could load it up with unit tests so I could have higher confidence that all the edge cases in every branch of this behemoth are actually covered properly.
This replaces our$2*(N+1)$ API requests to no-op, this new code should be only $1$ total per tag in the no-op case, even if we later add a pre-flight
crane cp
/crane index append
based deploy workflow with more custom code that can generate the full index/manifest list for each tag in a fully deterministic and offline way (based on the data we've already got locally inbuilds.json
), and code that can take that result and push it directly with minimal API requests, especially in the no-op case (wherecrane index append
would make something likeHEAD
request which this code currently skips in favor of directly doingPUT
and handling errors as a miss).This also includes new integration tests and even some new
jq
tests to cover the new code. As noted in a comment in the code, in theory we can useocimem
in the future to add more unit tests to this code, but it's harder to successfully/correctly test the edge cases that way (since we'd then have to synthesize content in-process instead of relying on pre-synthesized content pre-existing), so I felt pretty good about leaving that as aTODO
for now.(I have been using this successfully for deploying my own personal images, FWIW. 👍)
Closes #22