Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MCO-708: Extract and merge kernel arguments from /proc/cmdline #3856

Merged
merged 1 commit into from
Sep 18, 2023

Conversation

ori-amizur
Copy link
Contributor

In firstboot MCO checks if reboot can be skipped. In order for reboot to be skipped, the kernel arguments of the current (booted) system and the expected system need to match.
Currently, in firstboot the list of the current kargs is assumed to be empty. To reflect the actual list of arguments the system was booted with, this change extracts the set of booted kargs from /proc/cmdline to be used for comparison.
Only kargs that appear both in the requested kargs and /proc/cmdline are used for comparison.

- What I did
Extracted the kernel arguments from /proc/cmdline to be used for comparison with the expected kernel arguments
- How to verify it

  1. regression
  2. With assisted installed with other changes in the epic, verify that reboot is skipped and installation is successful.
    - Description for the changelog

Extract and merge kernel arguments from /proc/cmdline

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Aug 15, 2023
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Aug 15, 2023

@ori-amizur: This pull request references MCO-708 which is a valid jira issue.

In response to this:

In firstboot MCO checks if reboot can be skipped. In order for reboot to be skipped, the kernel arguments of the current (booted) system and the expected system need to match.
Currently, in firstboot the list of the current kargs is assumed to be empty. To reflect the actual list of arguments the system was booted with, this change extracts the set of booted kargs from /proc/cmdline to be used for comparison.
Only kargs that appear both in the requested kargs and /proc/cmdline are used for comparison.

- What I did
Extracted the kernel arguments from /proc/cmdline to be used for comparison with the expected kernel arguments
- How to verify it

  1. regression
  2. With assisted installed with other changes in the epic, verify that reboot is skipped and installation is successful.
    - Description for the changelog

Extract and merge kernel arguments from /proc/cmdline

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ori-amizur
Copy link
Contributor Author

/assign @sinnykumari

@ori-amizur
Copy link
Contributor Author

/uncc @djoshy

@openshift-ci openshift-ci bot removed the request for review from djoshy August 15, 2023 10:10
@ori-amizur
Copy link
Contributor Author

/retest

Copy link
Contributor

@sinnykumari sinnykumari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic won't be reliable and can create undesired behavior in certain cases. For example, consider someone applied kargs hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=6 through MCO (very common through PAO/NTO). For such kargs ordering matters, what it means for our example is we are requesting 4 hugepages of size 1G size and 6 hugepages of size 2M. Now, suppose /proc/cmdline contains hugepagesz=1G hugepages=6 . With the proposed logic, it will consider that we already have two kargs and will skip them and apply remaining.
For context, see past bug https://bugzilla.redhat.com/show_bug.cgi?id=1866546#c10 .

From past experience, handling kargs reliably is tricky and we would want to stick with what user have provided with respecting the order.
What I am proposing is instead of merging un-applied kargs to oldconfig what we can do here is in compareMachineConfig() , when we see that diff is only in kargs. Add an extra comparison where we merge all kargs supplied in newConfig separated by space and see if this complete set of merged kargs string is present in /proc/cmdline. If yes, compareMachineConfig would return true. If not, then it means that something didn't match and we fallback to performing update with reboot in order for MCO to apply MCO managed kargs present in newwConfig.

After having offline conversation with Ori, realized that we always perform update on firstboot from nil MC err = dn.update(nil, &mc, false). So, any changes made in oldConfig is mainly used for comparison in compareMachineConfig(). So, it it should be fine.

@@ -122,6 +122,24 @@ func (dn *Daemon) performPostConfigChangeAction(postConfigChangeActions []string
return dn.triggerUpdateWithMachineConfig(state.currentConfig, state.desiredConfig, true)
}

func mergeRunningKargs(config *mcfgv1.MachineConfig, requestedKargs []string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment on what and why we are doing this would be helpful.

@sinnykumari
Copy link
Contributor

Will be good to have a unit test for this.

@ori-amizur ori-amizur force-pushed the MCO-708 branch 2 times, most recently from 7fa9f39 to 279be2d Compare August 31, 2023 10:07
@ori-amizur
Copy link
Contributor Author

/retest

1 similar comment
@ori-amizur
Copy link
Contributor Author

/retest

}

func setRunningKargs(config *mcfgv1.MachineConfig, requestedKargs []string) error {
b, err := os.ReadFile("/proc/cmdline")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's very similar code in validateKernelArguments, which parses rpm-ostree kargs (which is basically the same thing as /proc/cmdline in practice).

But, we also already have a parseKernelArguments function here.

(BTW, I do think longer term we need to drop this kernel argument handling out of the MCO; I think containers/bootc#22 (comment) may be the best path to do it)

Copy link
Contributor Author

@ori-amizur ori-amizur Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseKernelArguments parses the kernel arguments from machine config, not from cmdline.
Although it may be used for splitting the cmdline assuming the first and only element of the provided slice is the cmdline.
I am not sure how validateKernelArguments can help for comparing the existing kernel arguments with the desired kernel arguments, since it is used for validation that it matches kargs from rpm-ostree not from cmdline.
In addition, the function doesn't preserve the order as requested by @sinnykumari

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems there is attempt in parseKernelArguments to parse each of the kernel arguments provided in the machine config. On the other hand KernelArguments that were not parsed in machine-configs are compared in the function compareMachineConfig.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cgwalters what are you suggesting here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically I think we shouldn't have one part of the MCO code reading /proc/cmdline and another running rpm-ostree kargs. They're not the same thing, but...what we're doing is aiming to be close, and the quotation bug below shows why it's good to only have one path.

pkg/daemon/update.go Outdated Show resolved Hide resolved
config.Spec.KernelArguments = nil
for _, split := range splits {
for _, reqKarg := range requestedKargs {
if strings.ReplaceAll(reqKarg, "\"", "") == strings.ReplaceAll(split, "\"", "") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this line, what's the ReplaceAll doing here?

I would say we should try to align with validateKernelArguments

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In /proc/cmdline the arguments appear without quotes even if the quotes were provided to ostree as kernel arguments. To align them, the quotes are removed here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even if the quotes were provided to ostree as kernel arguments

What quotes given to ostree? The ostree side just passes through these things. Where exactly are the quotes coming from?

Copy link
Contributor Author

@ori-amizur ori-amizur Sep 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 4.14 if we take the kernelArguments in the file /etc/ignition-machine-config-encapsulated.json we get:

[
  "systemd.unified_cgroup_hierarchy=1",
  "cgroup_no_v1=\"all\"",
  "psi=1"
]

But after they are applied, in the cmdline file, it looks like (partial):

systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all psi=1

So the second one appears without quotes in the cmdline file.

Copy link
Member

@cgwalters cgwalters Sep 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK let's be clear then, we're talking about cgroup_no_v1="all" as the weird outlier here. Looks like that comes from this code:

kernelArgsv2 = []string{"systemd.unified_cgroup_hierarchy=1", "cgroup_no_v1=\"all\"", "psi=1"}

Which...what the heck is that? (a bit of time passes) OK, I found it in https://docs.kernel.org/admin-guide/cgroup-v2.html

But I think whoever wrote that code was likely confused...it certainly seems to me like the Linux kernel is intending to parse cgroup_no_v1=all, not cgroup_no_v1="all". (Though I bet I see why the code author made that mistake because the kernel docs seem to include the quotes in one case but not another).

Compounding the fun here, also AFAICS there are zero warnings for unknown/unhandled parameters here. So specifying cgroup_no_v1="all" just silently does nothing (edit: Nope, see below).

Now the next large problem here is that something in the stack is "eating" these quotes.

Playing with this using e.g.:

$ rpm-ostree kargs --append='cgroup_no_v1="all"'
$ reboot
...
$ cat /proc/cmdline
... cgroup_no_v1=all

However, notice the bootloader entry does have those quotes:

$ grep options /boot/loader/entries/ostree-2-rhcos.conf 
options ignition.platform.id=qemu ... cgroup_no_v1="all" 
$

If I was a betting person here (and I am) then my money would be on grub doing the wrong thing.

Wait, wait no perhaps not: I think it's pretty clear, check out this kernel code:
https://github.com/torvalds/linux/blob/7ba2090ca64ea1aa435744884124387db1fac70f/lib/cmdline.c#L224

It's clearly implementing a minimal subset of "shell like" parsing...

but can't escape "

Oh man, what an ad-hoc ill-specified mess!

Copy link
Member

@cgwalters cgwalters Sep 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Offhand I think this logic dates back to the Linux kernel pre-history torvalds/linux@1da177e

So here's what I'd say:

  • Factor out a helper function named e.g. parseKernelArgumentsLikeLinux that performs the same steps as the Linux kernel in "parsing" command line arguments (and I'm now realizing that we probably need to do this in ostree side too)
  • Cross-reference the Linux kernel code
  • Consider changing the kubelet controller code to drop the unnecessary quotes because we've just painted ourselves into an unfortunate corner case with this

Alternatively - notice that rpm-ostree kargs outputs what it wrote into the bootloader config (i.e. it includes quotes). So instead of parsing /proc/cmdline, we could parse that (which again is what the other MCO code is doing)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using rpm-ostree kargs sounds like a good solution.

@ori-amizur ori-amizur force-pushed the MCO-708 branch 2 times, most recently from dc7301f to 67d9448 Compare August 31, 2023 16:50
@ori-amizur
Copy link
Contributor Author

/retest

@@ -122,6 +122,28 @@ func (dn *Daemon) performPostConfigChangeAction(postConfigChangeActions []string
return dn.triggerUpdateWithMachineConfig(state.currentConfig, state.desiredConfig, true)
}

func setRunningKargsWithCmdline(config *mcfgv1.MachineConfig, requestedKargs []string, cmdline []byte) error {
splits := splitKernelArguments(strings.TrimSpace(string(cmdline)))
config.Spec.KernelArguments = nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this shouldn't be necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanted to make sure that calling twice wouldn't cause it to behave badly.

Copy link
Contributor

@sinnykumari sinnykumari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for adding unit test and comment.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 4, 2023
In firstboot MCO checks if reboot can be skipped.  In order for reboot
to be skipped, the kernel arguments of the current (booted) system and
the expected system need to match.
Currently, in firstboot the list of the current kargs is assumed to be
empty.  To reflect the actual list of arguments the system was booted
with, this change extracts the set of booted kargs from /proc/cmdline
to be used for comparison.
Only kargs that appear both in the requested kargs and /proc/cmdline are used for comparison.
@ori-amizur ori-amizur marked this pull request as draft September 8, 2023 04:53
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 8, 2023
@ori-amizur ori-amizur force-pushed the MCO-708 branch 3 times, most recently from c8e1191 to f2877d6 Compare September 8, 2023 10:14
@ori-amizur ori-amizur marked this pull request as ready for review September 8, 2023 10:19
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 8, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 8, 2023

@ori-amizur: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/okd-scos-e2e-aws-ovn f2877d6 link false /test okd-scos-e2e-aws-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@cgwalters
Copy link
Member

Thanks!
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 8, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 8, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, ori-amizur, sinnykumari

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [cgwalters,sinnykumari]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD f7333c4 and 2 for PR HEAD f2877d6 in total

@openshift-merge-robot openshift-merge-robot merged commit 7aacc8a into openshift:master Sep 18, 2023
}

func setRunningKargs(config *mcfgv1.MachineConfig, requestedKargs []string) error {
rpmostreeKargsBytes, err := runGetOut("rpm-ostree", "kargs")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very neat and consistent recommendation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants