Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

05core: remove automatic reboot after timeout #1665

Merged
merged 1 commit into from
Apr 22, 2022

Conversation

prestist
Copy link
Contributor

@prestist prestist commented Apr 6, 2022

Removed the timeout for reboot on ftb if in error state.

coreos/fedora-coreos-tracker#928

@jlebon
Copy link
Member

jlebon commented Apr 7, 2022

So for coreos/fedora-coreos-tracker#928, the idea is that we only remove the timeout. But we still want e.g. the Ignition virtio journal dump bits (this is used by cosa to make it easier to surface initrd failures) and printing of the failed units.

@prestist prestist force-pushed the remove-emergency-timeout branch 2 times, most recently from bbfa105 to 9311142 Compare April 11, 2022 14:29
@prestist
Copy link
Contributor Author

prestist commented Apr 11, 2022

@jlebon, sorry I misunderstood what the goal was. I went ahead and made the correction so the journal dump still happens.

I was wondering about the naming convention of the folder in modules.d.

Where does the '99' come from, and would it make sense to just drop the '-timeout'?

@bgilbert
Copy link
Contributor

This is still removing the printing of failed units. You'll need to go through the code and delete only the relevant parts, not just delete the whole file.

It would make sense to drop the -timeout. 99 is a Dracut module priority, and AIUI only matters for resolving conflicts where two Dracut modules ship different versions of the same file. I think the value is mostly arbitrary here.

@prestist prestist closed this Apr 12, 2022
@prestist
Copy link
Contributor Author

Closing this really quick, I am going through a workflow that updates pr, and I dont want to cause un-nessary reviews.

@prestist
Copy link
Contributor Author

This is still removing the printing of failed units. You'll need to go through the code and delete only the relevant parts, not just delete the whole file.

Ah, yes @jlebon talked to me about that yesterday. I haven't yet gotten back to fix that.

It would make sense to drop the -timeout. 99 is a Dracut module priority, and AIUI only matters for resolving conflicts where two Dracut modules ship different versions of the same file. I think the value is mostly arbitrary here.

Ah thank you! I was wondering about that.

@bgilbert
Copy link
Contributor

Closing this really quick, I am going through a workflow that updates pr, and I dont want to cause un-nessary reviews.

We usually just leave them open, and post a comment when the PR is ready for re-review.

@prestist
Copy link
Contributor Author

Closing this really quick, I am going through a workflow that updates pr, and I dont want to cause un-nessary reviews.

We usually just leave them open, and post a comment when the PR is ready for re-review.

Ah okay, sorry, since I am going to be doing cosa - inits I was worried that I would be bugging everyone.

@prestist prestist reopened this Apr 12, 2022
@prestist prestist force-pushed the remove-emergency-timeout branch 9 times, most recently from e676369 to 592cc70 Compare April 12, 2022 20:52
@prestist prestist force-pushed the remove-emergency-timeout branch from 592cc70 to 53f307e Compare April 20, 2022 19:13
Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a rebase but it LGTM overall! For the commit message, how about something like 05core: stop automatic reboot after timeout? And then in the commit message, you can include a short rationale and a link to the tracker issue.

Also the comment at the top of the emergency-shell.sh needs to be updated.

@bgilbert
Copy link
Contributor

bgilbert commented Apr 21, 2022

LGTM generally. 👍 The module should be renamed from 99emergency-timeout.

@prestist
Copy link
Contributor Author

This needs a rebase but it LGTM overall! For the commit message, how about something like 05core: stop automatic reboot after timeout? And then in the commit message, you can include a short rationale and a link to the tracker issue.

Also the comment at the top of the emergency-shell.sh needs to be updated.

This needs a rebase but it LGTM overall! For the commit message, how about something like 05core: stop automatic reboot after timeout? And then in the commit message, you can include a short rationale and a link to the tracker issue.

Also the comment at the top of the emergency-shell.sh needs to be updated.

First, thank you for the feedback! , small question about your pr comment.
"... and then in the commit message, you can include a short rationale and a link to the tracker issue."
I was thinking something more like "05core: enter emergency shell on error during ftb"
In the past I have had smaller commit messages, do I want to add the bit about the rationale and the tracker issue to the message like this?
The result would be something like this =>
"05core: enter emergency shell on error during ftb. The reboot and consequently the timeout where removed to prevent the suppression of errors during the first time boot. fix:coreos/fedora-coreos-tracker#928"

@prestist prestist changed the title WIP:remove emergency timeout 05core: enter emergency shell on error during ftb Apr 21, 2022
@prestist prestist force-pushed the remove-emergency-timeout branch from 53f307e to 8a70923 Compare April 21, 2022 21:47
@prestist prestist marked this pull request as ready for review April 21, 2022 21:48
@prestist prestist force-pushed the remove-emergency-timeout branch from 8a70923 to 8359a1a Compare April 21, 2022 21:52
@prestist
Copy link
Contributor Author

LGTM generally. 👍 The module should be renamed from 99emergency-timeout.

Thank you for the feedback!

I took a swing at the folder name, let me know if it makes sense.

@prestist prestist force-pushed the remove-emergency-timeout branch from 8359a1a to 445a676 Compare April 22, 2022 13:07
@prestist prestist changed the title 05core: enter emergency shell on error during ftb 05core: enter emergency shell on error during FTB Apr 22, 2022
jlebon
jlebon previously approved these changes Apr 22, 2022
Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

For the commit message,

05core: enter emergency shell on error during FTB

(OOC, what does FTB stand for?)

I think it'd be better to focus the message more on what changed. In this case, it's the fact that we're no longer automatically rebooting after a timeout.

For the rationale and link, you would include it in the message body, not as part of the title (check out https://cbea.ms/git-commit/#separate).

The reboot and consequently the timeout masked valuable debug
information. The reboot also caused some cascading errors due to the
fact that the system would try and run as if all required dependencies
were satisfied during the first boot.

Closes coreos/fedora-coreos-tracker#928.
@prestist prestist force-pushed the remove-emergency-timeout branch from ff71493 to 279292a Compare April 22, 2022 15:27
@prestist prestist changed the title 05core: enter emergency shell on error during FTB 05core: remove automatic reboot after timeout Apr 22, 2022
Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great commit message, thanks!

@jlebon jlebon enabled auto-merge (rebase) April 22, 2022 15:31
@jlebon jlebon merged commit c3f5185 into coreos:testing-devel Apr 22, 2022
yussufsh added a commit to yussufsh/ocp4-upi-powervm that referenced this pull request Oct 15, 2022
Reason for removal is that rhcos does not automatically reboot after timeout now ref: coreos/fedora-coreos-config#1665

Signed-off-by: Yussuf Shaikh <[email protected]>
yussufsh added a commit to yussufsh/ocp4-upi-powervs that referenced this pull request Jan 29, 2023
Reason for removal is that rhcos does not automatically reboot after timeout now ref: coreos/fedora-coreos-config#1665

Signed-off-by: Yussuf Shaikh <[email protected]>
yussufsh added a commit to yussufsh/ocp4-upi-powervs that referenced this pull request Jan 30, 2023
Reason for removal is that rhcos does not automatically reboot after timeout now ref: coreos/fedora-coreos-config#1665

Signed-off-by: Yussuf Shaikh <[email protected]>
Prajyot-Parab pushed a commit to Prajyot-Parab/ocp4-upi-powervs that referenced this pull request Nov 11, 2023
Reason for removal is that rhcos does not automatically reboot after timeout now ref: coreos/fedora-coreos-config#1665

Signed-off-by: Yussuf Shaikh <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants