-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: setup AWS kola job & move aws replication to release job #128
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, though I think templating is going to be very important for making it easy to hack on the pipeline, as well as for supporting branches.
Jenkinsfile
Outdated
stage('Kola Runs') { | ||
container('jnlp') { | ||
utils.shwrap(""" | ||
oc start-build --wait fedora-coreos-pipeline-kola-aws \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's template the name of the buildconfig to aid forking and branching?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The job itself (similar to the release
job) is only meant to have one of it existing and developer pipelines can just re-use it. Did you have something else in mind?
Jenkinsfile.kola.aws
Outdated
|
||
stage('AWS Kola Run') { | ||
utils.shwrap(""" | ||
kola run -p aws --aws-ami ${ami} --aws-region ${ami_region} -b fcos || : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also an interesting topic here - should we e.g. store the kola results in the builddir too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might be something that could become useful in the future but for now I don't think it hurts to just store them in jenkins.
s3_stream_dir = "${s3_bucket}/prod/streams/${params.STREAM}" | ||
utils.shwrap(""" | ||
coreos-assembler buildprep s3://${s3_stream_dir}/builds | ||
coreos-assembler aws-replicate --build=${params.VERSION} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is OK, though it's making the assumption that the VERSION
is the latest one. Which to be fair is likely almost always the case, though it'd be nice to have something a bit stronger here. Perhaps something like buildprep --build=${params.VERSION}
and then buildupload --build=${params.VERSION}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jlebon: buildprep
doesn't seem to accept a specific build as a parameter just the URL; does upstream work there need to happen first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, yeah. Was just spitballing a new UX in those cosa
commands. Definitely doesn't need to block this.
coreos-assembler buildprep s3://${s3_stream_dir}/builds | ||
coreos-assembler aws-replicate --build=${params.VERSION} | ||
git clone https://github.com/coreos/fedora-coreos-releng-automation /var/tmp/fcos-releng | ||
/var/tmp/fcos-releng/coreos-meta-translator/trans.py --workdir . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh yeah, hmm we should probably just bake that into the cosa container, or have a sidecar container in the pod with utilities we need?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it doesn't fit in cosa directly so I didn't have a good idea for where to fit it. Maybe just abstracting it into a generic function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can punt on this for now. Once scripts from that repo are used some more, we can think about how to avoid git clone
ing all over the place.
Jenkinsfile.kola.aws
Outdated
properties([ | ||
pipelineTriggers([]), | ||
parameters([ | ||
string(name: 'BUILDID', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's call this VERSION
for consistency? (Even though they're not exactly the same thing, but for our use case they are).
description: 'Fedora CoreOS Build ID to test', | ||
defaultValue: '', | ||
trim: true), | ||
string(name: 'S3_STREAM_DIR', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not choice(STREAM)
to be consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this is a generic job it doesn't have a developer split so it cannot construct the S3_STREAM_DIR
, as such my intention was to be able to still support running AWS tests in developer pipelines.
Jenkinsfile.kola.aws
Outdated
podTemplate(cloud: 'openshift', label: 'coreos-assembler', yaml: pod, defaultContainer: 'jnlp') { | ||
node('coreos-assembler') { container('coreos-assembler') { | ||
def ami, ami_region | ||
stage('Fetching Build Details') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about just "Fetch Metadata" ?
if (meta.amis.size() > 0) { | ||
ami = meta['amis'][0]['hvm'] | ||
ami_region = meta['amis'][0]['name'] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be an early exit or maybe better error out if there are no AMIs in the metadata?
Jenkinsfile.kola.aws
Outdated
|
||
stage('AWS Kola Run') { | ||
utils.shwrap(""" | ||
kola run -p aws --aws-ami ${ami} --aws-region ${ami_region} -b fcos || : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we want some --parallel
value here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably but the value itself highly depends on the account it's being ran in. I don't remember the exact numbers on the default resource limits so we probably want to keep it low, maybe 5 tests at a time which shouldn't ever hit more than 15 concurrent instances?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh gotcha. Hmm, I guess this gets us back to https://github.com/coreos/mantle/issues/1037#issuecomment-517299593.
Yeah, let's go with 10
for now and go from there? The Fedora testing account at least doesn't barf on that, and e.g. my personal account doesn't either. :)
Jenkinsfile
Outdated
stage('Kola Runs') { | ||
container('jnlp') { | ||
utils.shwrap(""" | ||
oc start-build --wait fedora-coreos-pipeline-kola-aws \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: let's not --wait
for this? Essentially, consider kola AWS testing as a follow-up job, not something that's necessarily part of the build process itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm interesting - i would have figured we'd wanted it that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the reasons why I suggested this is that kola AWS tests can take a while, and there's no reason to delay building other things while that's ongoing (though related: #68). I also think errorring the build job if the AWS test fails isn't practical.
@@ -30,6 +30,23 @@ pod = pod.replace("COREOS_ASSEMBLER_IMAGE", "coreos-assembler:master") | |||
|
|||
podTemplate(cloud: 'openshift', label: 'coreos-assembler', yaml: pod, defaultContainer: 'jnlp') { | |||
node('coreos-assembler') { container('coreos-assembler') { | |||
// Replicate the newly uploaded AMI to other regions. Intentionally |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should then remove the aws-replicate
bit from the Jenkinsfile
, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, I haven't refreshed this PR in a while so I think that commit got lost.
26ad4a3
to
bfb040f
Compare
Rebased & updated to support comments. Note that this shouldn't be merged until the upstream account has the credentials needed for replication added. |
👍 - i'm looking at permissions now for the aws replication issue, so can consider this too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I tested out #130 extensively today, which builds on top of this PR. We should be able to merge them both around the same time and update creds in prod to finally have aws replication and aws kola tests enabled. |
Jenkinsfile
Outdated
@@ -298,6 +286,19 @@ podTemplate(cloud: 'openshift', label: 'coreos-assembler', yaml: pod, defaultCon | |||
} | |||
} | |||
|
|||
|
|||
if (!params.MINIMAL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also be conditional on s3_stream_dir
Jenkinsfile.kola.aws
Outdated
ami_region = meta['amis'][0]['name'] | ||
} else { | ||
// no AMI found | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC return
inside a stage doesn't actually do anything. Also, thinking more on this now, I think we should just error out? It'd be pretty easy to mistake a job ending successfully because of tests passing vs actually a no-op.
Creates a new job specifically for running the kola tests on AWS. As a result moves the AWS replication out of the main `Jenkinsfile` into the `Jenkinsfile.release`. We'll likely have to move the AWS replication to it's own job down the line if we want to start signing the `meta.json` / `release.json` files.
bfb040f
to
2ab8fac
Compare
Update pushed addressing comments ⬆️ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Will let you or @dustymabe time the final merge.
Creates a new job specifically for running the kola tests on AWS. As a
result moves the AWS replication out of the main
Jenkinsfile
into theJenkinsfile.release
.We'll likely have to move the AWS replication to it's own job down the
line if we want to start signing the
meta.json
/release.json
files.