grub: extend boot prompt timeout to 5 seconds #1263

dustymabe · 2020-03-19T17:08:17Z

At 1 second it's almost impossible to catch the boot prompt if you need
to change the kernel command line parameters. Let's extend it to 5
seconds so users have a fighting chance to catch the prompt.

This follows from a similar change made to the Live ISO:
coreos/fedora-coreos-config#281

openshift-ci-robot · 2020-03-19T17:08:45Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dustymabe

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [dustymabe]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

cgwalters · 2020-03-19T17:56:53Z

Kind of torn... see e.g. this thread where @crawford was wary about another 5 seconds in node bootup.

We could make this platform specific? E.g. only do it on metal and vmware to start?

cgwalters · 2020-03-19T17:57:12Z

(And then on AWS make it zero)

cgwalters · 2020-03-19T17:59:12Z

I use e.g. cosa run and kola spawn a lot to get a quick shell, and our bootup is already really slow compared to other distributions; we do a lot in the initramfs and hook in some nontrivial things to multi-user.target including starting zincati/rpm-ostree. This would be a noticeable hit to boot time for that scenario too.

dustymabe · 2020-03-19T19:09:32Z

> Kind of torn... see e.g. this thread where @crawford was wary about another 5 seconds in node bootup.

yeah I agree I'm torn too. Extending boot time isn't great.

> We could make this platform specific? E.g. only do it on metal and vmware to start?

I'd like to have it for all platforms where people can reasonably get to the console of the machine. This happens to be one of the ways (and maybe the only way) to roll back a machine when a failed upgrade has left your system borked (where "borked" could mean a lot of different things).

Maybe I can make it platform specific, but it would be worth listing the platforms on which users can reasonably get to a console:

- qemu
- openstack
- vmware
- bare metal
- which clouds??

bgilbert · 2020-03-19T19:24:45Z

> which clouds??

At least GCP, Azure, DO, and Packet. Not AWS.

jlebon · 2020-03-19T19:33:35Z

Do all those clouds have a CLI for accessing the console though? I think Packet at least does, but otherwise if it's through the cloud's web UI, being able to access the console right at the start within a 5s window wouldn't be trivial. (My argument being let's not do this if it's actually not usable anyway.)

I'm guessing this is mostly about network args, right? (And I guess in the bare metal case, installer kargs, though we should be emphasizing the CLI path now: coreos/fedora-coreos-docs#26). In which case, can we limit this to just where that's relevant? Most clouds shouldn't really need network karg tweaks on first boot.

dustymabe · 2020-03-19T19:35:45Z

> I'm guessing this is mostly about network args, right?

I don't think so. I think it's about any karg you'd want to add ephemerally, either out of need (debugging problems) or for testing (exploring features). It's also about choosing an older boot entry when you need to.

bgilbert · 2020-03-19T19:38:40Z

> if it's through the cloud's web UI, being able to access the console right at the start within a 5s window wouldn't be trivial.

The consoles don't generally close when the instance restarts, so it's still possible to catch the GRUB prompt on reboot.

cgwalters · 2020-03-19T19:40:50Z

How much would we want this if we had Ignition support for kernel arguments?

dustymabe · 2020-03-19T20:52:11Z

> How much would we want this if we had Ignition support for kernel arguments?

Ignition support for kargs would set them persistently, right? In #1263 (comment) I argue more that the value of this is for ephemeral arguments.

cgwalters · 2020-03-19T20:55:33Z

> Ignition support for kargs would set them persistently, right? In #1263 (comment) I argue more that the value of this is for ephemeral arguments.

Can you describe a more specific use case for ephemeral? I can imagine it but it helps to have it written down.

That said we could invent ephemeral arguments too though there's some interesting twists there.
(It might be most easily done via ostreedev/ostree#435 )

dustymabe · 2020-03-19T21:12:50Z

> Can you describe a more specific use case for ephemeral? I can imagine it but it helps to have it written down.

init=/bin/sh
- need to set/recover root password
- need to somehow fix selinux labels that got messed up
single
- need to recover root password (once we fix it to work)
systemd.mask=
- debugging a weird failure with systemd
systemd.log_level=debug systemd.log_target=console systemd.journald.forward_to_console=1
- for a lot of logs during bootup
rolling back a failed deployment
enforcing=0

I'm sure we could name many more

cgwalters · 2020-03-19T21:30:48Z

Right, though if this is for local development/testing, particularly with e.g. cosa run, it would be pretty easy to add cosa run --kargs="systemd.log_level=debug systemd.log_target=console" similarly to #1219

cgwalters · 2020-03-19T21:32:00Z

In fact I'm just going to do that now

cgwalters · 2020-03-19T21:57:28Z

Done in #1265

(with the usual requisite duplication between cosa and mantle...it's like some sort of strange recurring pattern)

cgwalters · 2020-03-19T22:05:29Z

Now, using "easy to do w/qemu" tricks that don't work in other places is a tradeoff, because it does make debugging an issue elsewhere harder. But, IMO we should be making an OS that works absolutely perfectly in qemu - there's no excuse not to do so, particularly for early bootup stuff.

cgwalters · 2020-03-20T12:25:48Z

> The consoles don't generally close when the instance restarts, so it's still possible to catch the GRUB prompt on reboot.

If the use case includes debugging on reboot, then Ignition support for kargs would work right?

Another topic here - on bare metal at least...is there some sort of standard way one can ask the bootloader to display a menu via the UEFI firmware? Because it is a bit silly to have a timeout "press f12 to enter setup" or whatever, then another timeout "press a key to edit the bootloader".

dustymabe · 2020-03-20T13:27:24Z

> Right, though if this is for local development/testing,

I'm really not too concerned about that case with this proposed change. As developers we can hack and slash images to do whatever we want. I'm concerned about the user experience, and improving it will make our lives easier too. If I'm helping someone debug and need to ask them to add a kernel command line parameter, then telling them they have only 1 second to catch the prompt, and that whatever interface they're using might not give them a chance to hit it in that time at all, is bad for them and for us. As a user, I might consider moving to another offering.

cgwalters · 2020-03-20T13:56:53Z

> As developers we can hack and slash images to do whatever we want.

Well yes, but I can say for sure I've often done cosa run + "catch grub prompt" for things because...it's easier than rebuilding a new image or hacking the image manually. And that would be unnecessary after #1265

> If I'm helping someone debug and I need to ask them to add a kernel command line parameter, then explaining to them they only have 1 second to catch the prompt

Right. OK first, this github issue isn't the first time the bootloader timer has been discussed 😉
In fact https://fedoraproject.org/wiki/Changes/HiddenGrubMenu is highly relevant here.

Second, I get the use case but I'm arguing for platform specifics and more sophistication/thought.

In particular a bottom line for me is: We should set the bootloader timeout to zero in AWS, because anything else makes no sense at all.

jlebon · 2020-03-20T14:01:56Z

One idea for ephemeral args is to have a file in /boot where you can write them down. On reboot, the GRUB config reads that file in and appends it to the list. We then delete the file during boot. Heck, we could even make this part of rpm-ostree kargs --once or something. (But note even now, although cumbersome, it should be totally fine to rpm-ostree kargs --append foobar and then delete it after you're done with it.)
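A minimal sketch of the one-shot kargs file idea, written in Python purely for illustration. The file name `kargs-once` and the function `consume_once_kargs` are hypothetical and do not exist in GRUB, rpm-ostree, or coreos-assembler. In the proposal GRUB would read the file and a later boot step would delete it; here both steps are collapsed into one function to show the intended "apply once, then forget" behavior.

```python
from __future__ import annotations

from pathlib import Path


def consume_once_kargs(
    boot_dir: Path,
    base_kargs: list[str],
    filename: str = "kargs-once",  # hypothetical file name, not a real convention
) -> list[str]:
    """Return base_kargs plus any one-shot kargs found in boot_dir/filename.

    The file is deleted after it is read, so the extra arguments
    apply to a single boot only.
    """
    once_file = boot_dir / filename
    if not once_file.exists():
        # No one-shot arguments requested: boot with the persistent set.
        return list(base_kargs)

    # Kernel arguments are whitespace-separated, so split() is enough here.
    extra = once_file.read_text().split()
    once_file.unlink()
    return list(base_kargs) + extra
```

Calling it twice against the same directory yields the extra arguments on the first call only, which is the property that distinguishes this from `rpm-ostree kargs --append` followed by a manual delete.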

For the "select an older boot entry", what's the scenario you're thinking of? If the boot still works enough to SSH in, then you can rpm-ostree rollback. If the boot is completely broken, then I think that's where we want coreos/fedora-coreos-tracker#47 (which yeah, we need to push forward on...).

jlebon · 2020-03-20T14:03:52Z

Even simpler is a one-boot stamp file e.g. like /boot/grub-sleep which makes GRUB sleep a little longer.

dustymabe · 2020-03-20T14:05:54Z

> As developers we can hack and slash images to do whatever we want.

> Well yes, but I can say for sure I've often done cosa run + "catch grub prompt" for things, and that would be unnecessary after #1265

I support #1265 - thanks for that.

> In particular a bottom line for me is: We should set the bootloader timeout to zero in AWS, because anything else makes no sense at all.

I'm with you. I support having a 0 timeout on platforms where it's practically impossible to catch the grub prompt. If we can agree on the platforms where it's impossible to get to a grub prompt (no console access), then I can try to rework this to take that into account and have 0 for those platforms and 5 for the ones where you can access it.

dustymabe · 2020-03-20T14:11:02Z

> One idea for ephemeral args is to have a file in /boot where you can write them down. On reboot, the GRUB config reads that file in and appends it to the list. We then delete the file during boot. Heck, we could even make this part of rpm-ostree kargs --once or something. (But note even now, although cumbersome, it should be totally fine to rpm-ostree kargs --append foobar and then delete it after you're done with it.)

I appreciate the ideas here, but I don't think there is anything we're going to do to cover all cases where someone is going to need to access the grub prompt/kernel command line. @jlebon you even wrote this: https://docs.fedoraproject.org/en-US/fedora-coreos/access-recovery/

It feels like we are trying to over-engineer this.

> For the "select an older boot entry", what's the scenario you're thinking of? If the boot still works enough to SSH in, then you can rpm-ostree rollback. If the boot is completely broken, then I think that's where we want coreos/fedora-coreos-tracker#47 (which yeah, we need to push forward on...).

if the boot still works enough to SSH in - yeah that's great if it does, but who says that it will in every case?

Even if we had automatic rollback working I'd still want to be able to get the grub prompt just in case.

jlebon · 2020-03-20T16:01:07Z

> I appreciate the ideas here, but I don't think there is anything we're going to do to cover all cases where someone is going to need to access the grub prompt/kernel command line.

I think what bothers me is that (1) we're slowing down every boot across almost all platforms, and (2) the boot menu is not a very nice interface. So covering the major use cases via a nicer UX is a double win. And a sleep stamp file is a catch all for anything else.

Anyway, would it be unreasonable to start with just bare metal and VMware as suggested higher up (since that's where you expect bootup to take longer anyway)?

dustymabe · 2020-03-20T19:25:44Z

Yeah I'm cool with only implementing it on a subset of platforms and then we can extend that mechanism based on further conversations we have.

So I'll try to make this generic so one can specify the timeout per platform.
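A rough sketch of what a per-platform timeout table could look like, in Python for illustration only. The platform IDs (`metal`, `vmware`, `aws`), the table itself, and both function names are assumptions made for this example, not the mechanism that was eventually implemented. The values mirror positions voiced in this thread: 5 seconds on bare metal and VMware, 0 on AWS, and the existing 1 second everywhere else. The rendered line uses the standard GRUB `timeout` variable.

```python
# Hypothetical per-platform GRUB timeouts, in seconds.
# Platform IDs and values are illustrative, taken from this discussion.
DEFAULT_TIMEOUT = 1  # the timeout this PR describes as the status quo

PLATFORM_TIMEOUTS = {
    "metal": 5,   # console is reachable, boot is slow anyway
    "vmware": 5,  # console is reachable
    "aws": 0,     # no interactive console, so waiting only adds boot time
}


def grub_timeout_for(platform: str) -> int:
    """Return the timeout for a platform, falling back to the default."""
    return PLATFORM_TIMEOUTS.get(platform, DEFAULT_TIMEOUT)


def render_timeout_line(platform: str) -> str:
    """Render the GRUB config line that sets the menu timeout."""
    return f"set timeout={grub_timeout_for(platform)}"
```

Keeping the table separate from the rendering step means further platforms can be added as the list of "console reachable" platforms gets agreed on, without touching the template logic.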

dustymabe · 2020-03-20T19:25:53Z

/hold

jlebon · 2021-08-23T18:04:34Z

This should be more tractable once we have coreos/fedora-coreos-tracker#110.

jlebon · 2022-06-20T16:52:30Z

@dustymabe I'd suggest transforming this into a tracker issue RFE and closing this since an implementation of it would likely extend the work we did for platform-specific console instead.

dustymabe · 2022-06-22T02:02:20Z

> @dustymabe I'd suggest transforming this into a tracker issue RFE and closing this since an implementation of it would likely extend the work we did for platform-specific console instead.

Broke out into coreos/fedora-coreos-tracker#1236

openshift-ci-robot requested review from arithx and jlebon March 19, 2020 17:08

openshift-ci-robot added the approved label Mar 19, 2020

cgwalters added a commit to cgwalters/coreos-assembler that referenced this pull request Mar 19, 2020

Support cosa run --kargs

1da0d51

For the use case of enabling systemd debugging, etc. coreos#1263 (comment)

cgwalters mentioned this pull request Mar 19, 2020

Support cosa run --kargs #1265

Merged

cgwalters added a commit to cgwalters/coreos-assembler that referenced this pull request Mar 19, 2020

Support cosa run --kargs

42de1f9

For the use case of enabling systemd debugging, etc. coreos#1263 (comment)

cgwalters removed the approved label Mar 20, 2020

openshift-ci-robot added the do-not-merge/hold label Mar 20, 2020

openshift-merge-robot pushed a commit that referenced this pull request Mar 23, 2020

Support cosa run --kargs

e8d2825

For the use case of enabling systemd debugging, etc. #1263 (comment)

cgwalters mentioned this pull request Jun 1, 2020

bootimages: Downloading and updating bootimages via release image openshift/enhancements#201

Closed

miabbott mentioned this pull request Sep 14, 2021

kargs handling regressions? #2430

Open

dustymabe mentioned this pull request Jun 22, 2022

extend grub boot prompt timeout on platforms with full console access coreos/fedora-coreos-tracker#1236

Closed

dustymabe closed this Jun 22, 2022
