troubleshooting: Add instructions for enabling kdump #198

Merged
merged 1 commit into from
Dec 7, 2020

kelvinfan001
Member

@kelvinfan001 kelvinfan001 commented Oct 19, 2020

Since kexec-tools is not yet in the base image, I included a step to layer it. Should I hold this PR until kexec-tools is included in the base image?

@cgwalters
Member

It is probably simplest to ship it now and then hold the PR until it's reached stable, but you could add a note in the meantime to layer if it's not installed?
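
The "layer if it's not installed" note suggested here could be sketched as a conditional check. This is only an illustration: the helper name `needs_layering` is hypothetical, and it assumes the `kexec` binary is what `kexec-tools` provides on the host.

```shell
# Hypothetical helper: succeeds when the given command is NOT on PATH,
# i.e. when the package providing it would still need to be layered.
needs_layering() {
    ! command -v "$1" >/dev/null 2>&1
}

# Example use on a host (not executed by this snippet):
#   needs_layering kexec && sudo rpm-ostree install kexec-tools
```

Checking for the binary rather than the package keeps the check independent of how the package ended up in the image.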

+
[source, bash]
----
sed -i "s/^path.*/path \/sysroot\/ostree\/deploy\/fedora-coreos\/var\/crash/" /etc/kdump.conf
Member

That...looks like a bug?

modules/ROOT/pages/debugging-kernel-crashes.adoc (outdated, resolved)
@kelvinfan001 kelvinfan001 force-pushed the kfan-kdump-doc branch 3 times, most recently from 9005a29 to 7504fda, October 20, 2020 13:36
@jlebon
Member

jlebon commented Oct 23, 2020

kexec-tools is added in coreos/fedora-coreos-config#708. Maybe let's also get #199 in and then this document can link to that?

@dustymabe
Member

#199 is in now

+
[source, bash]
----
sudo rpm-ostree kargs --append='crashkernel=256M'

The doc suggests 128M; maybe we should keep that default, unless there is a reason FCOS needs more memory?

https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes

Member Author

Based on a local QEMU build of the current testing-devel, 128M does not seem to work.
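
One way to confirm which reservation is actually in effect after the reboot is to read the `crashkernel=` argument back from the kernel command line. A minimal sketch, assuming a POSIX shell; the function name `crashkernel_arg` is made up for illustration, and the optional file argument exists only so the parsing can be tried on sample input.

```shell
# Hypothetical helper: print the value of crashkernel= from a kernel
# command line, and fail if the argument is absent.
# Reads /proc/cmdline unless another file is given as $1.
crashkernel_arg() {
    cmdline_file="${1:-/proc/cmdline}"
    # One argument per line, keep only the crashkernel= value,
    # and let "grep ." turn empty output into a non-zero exit status.
    tr ' ' '\n' < "$cmdline_file" \
        | sed -n 's/^crashkernel=//p' \
        | head -n 1 \
        | grep .
}
```

On a running system, calling `crashkernel_arg` with no argument reads `/proc/cmdline`; the kernel also reports the reserved size in bytes in `/sys/kernel/kexec_crash_size`.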

@kelvinfan001
Member Author

@dustymabe Looks like kexec-tools is in the stable stream now; this should be OK to merge?

Member

@jlebon jlebon left a comment

LGTM overall, just some minor things.

@@ -0,0 +1,32 @@
= Enabling kdump to debug kernel crashes
Member

@jlebon jlebon Dec 7, 2020

WDYT about rewording this as "Debugging kernel crashes using kdump" so that the title matches the menu item more closely?

----
xref:kernel-args.adoc[More information] on how to modify kargs via `rpm-ostree`.

. Make sure the path in which the vmcore will be saved is somewhere in the `/var` directory of the FCOS stateroot. It is also possible to write the dump over the network; for additional information, consult the mkdumprd man page and the comments in /etc/kdump.conf.
Member

Suggested change
- . Make sure the path in which the vmcore will be saved is somewhere in the `/var` directory of the FCOS stateroot. It is also possible to write the dump over the network; for additional information, consult the mkdumprd man page and the comments in /etc/kdump.conf.
+ . Make sure the path in which the vmcore will be saved is somewhere in the `/var` directory of the FCOS stateroot. It is also possible to write the dump over the network; for additional information, see `mkdumprd(8)` and the comments in `/etc/kdump.conf`.

+
[source, bash]
----
sed -i "s/^path.*/path \/sysroot\/ostree\/deploy\/fedora-coreos\/var\/crash/" /etc/kdump.conf
Member

Why can't we just use /var/crash?

Member Author

Hmm, you're right. I just tested, and the default `/var/crash` seems to work without problems. I don't remember why I thought we had to modify it.
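
For readers who do change the `path` directive, the requirement discussed in this thread (the vmcore must land somewhere under `/var`) can be checked with a short script. This is a sketch under stated assumptions: the helper names `kdump_path` and `path_is_under_var` are hypothetical, the config is assumed to use the `path <dir>` form seen in the `sed` line above, and `/var/crash` is used as the fallback because the thread identifies it as the default.

```shell
# Hypothetical helper: print the effective dump path from a kdump.conf-style
# file (default /etc/kdump.conf), falling back to /var/crash if unset.
kdump_path() {
    conf="${1:-/etc/kdump.conf}"
    # Use the last uncommented "path" directive; commented lines start
    # with "#", so their first field never equals "path".
    p=$(awk '$1 == "path" { v = $2 } END { print v }' "$conf")
    echo "${p:-/var/crash}"
}

# Hypothetical helper: succeed if the path is under /var, either directly
# or through the stateroot under /sysroot/ostree/deploy/<name>/var.
path_is_under_var() {
    case "$1" in
        /var/*|/sysroot/ostree/deploy/*/var/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A host-side check would then be `path_is_under_var "$(kdump_path)"`, which exits non-zero if the configured path points outside `/var`.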

sudo systemctl reboot
----

TIP: For additional information on how to test that kdump is properly armed and how to analyze the dump, refer to the https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes[kdump documentation for Fedora].
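
As a quick complement to the linked documentation, the kernel exposes whether a crash kernel is currently loaded in `/sys/kernel/kexec_crash_loaded` (`1` when loaded). A minimal sketch of such a check; the helper name `kdump_armed` is hypothetical, and the optional file argument is there only so the logic can be exercised without a real sysfs.

```shell
# Hypothetical helper: succeed if the kernel reports a loaded crash kernel.
# Reads /sys/kernel/kexec_crash_loaded unless another file is given as $1.
kdump_armed() {
    state_file="${1:-/sys/kernel/kexec_crash_loaded}"
    # A missing or unreadable file is treated the same as "not armed".
    [ "$(cat "$state_file" 2>/dev/null)" = "1" ]
}
```

On a host, `kdump_armed && echo "kdump is armed"` gives a yes/no answer after the reboot; it does not replace the crash test described in the Fedora documentation.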
@kelvinfan001 kelvinfan001 merged commit dee15ef into coreos:master Dec 7, 2020
7 participants