osqueryd repeatedly faulting on Linux (EFAULT) #1773

Closed
ankon opened this issue Jul 9, 2024 · 14 comments

@ankon

ankon commented Jul 9, 2024

From dmesg:

[Tue Jul  9 12:40:09 2024] show_signal: 68 callbacks suppressed
[Tue Jul  9 12:40:09 2024] traps: SchedulerRunner[2848] general protection fault ip:55f3cffa1892 sp:7f4734bf9fd0 error:0 in osqueryd[55f3cfa02000+2eb2000]
[Tue Jul  9 12:43:14 2024] SchedulerRunner[4480]: segfault at 18 ip 000055d8f49b0892 sp 00007fb3c0bfb250 error 4 in osqueryd[55d8f4411000+2eb2000] likely on CPU 2 (core 2, socket 0)
[Tue Jul  9 12:43:14 2024] Code: df 48 89 c6 e8 1d 93 90 02 48 8d 35 46 30 5d fe 49 89 e7 4c 89 ff 48 89 da e8 8b bc 90 02 49 8b 06 6a 01 5e 4c 89 f7 4c 89 fa <ff> 50 18 48 89 e7 e8 0d 8b 90 02 64 48 8b 04 25 28 00 00 00 48 3b
[Tue Jul  9 13:09:57 2024] traps: SchedulerRunner[6507] general protection fault ip:5577165b1892 sp:7fa485ffa1b0 error:0 in osqueryd[557716012000+2eb2000]
[Tue Jul  9 13:40:41 2024] SchedulerRunner[7375]: segfault at 0 ip 0000000000000000 sp 00007f80fdffa818 error 14 in osqueryd[55b939ad2000+1f44000] likely on CPU 1 (core 1, socket 0)
[Tue Jul  9 13:40:41 2024] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[Tue Jul  9 14:08:42 2024] SchedulerRunner[7985]: segfault at 1 ip 00007f86db38fd1c sp 00007f86d0bf9dd8 error 4 in libc.so.6[7f86db238000+16d000] likely on CPU 5 (core 5, socket 0)
[Tue Jul  9 14:08:42 2024] Code: ff ff 48 89 f8 0f bc c9 f3 a4 c3 0f 1f 00 f3 0f 1e fa 89 f8 62 a1 fd 00 ef c0 25 ff 0f 00 00 3d e0 0f 00 00 0f 87 24 01 00 00 <62> f1 7d 20 74 07 c5 fb 93 c0 85 c0 74 16 0f bc c0 c3 66 90 0f bc

launcher system journal:

Jul 09 12:36:51 minerva osqueryd[2588]: osqueryd started [version=5.12.2]
Jul 09 12:40:09 minerva osqueryd[4225]: osqueryd started [version=5.12.2]
Jul 09 12:43:14 minerva osqueryd[6247]: osqueryd started [version=5.12.2]
Jul 09 13:09:58 minerva osqueryd[7116]: osqueryd started [version=5.12.2]
Jul 09 13:40:42 minerva osqueryd[7729]: osqueryd started [version=5.12.2]
Jul 09 14:08:42 minerva osqueryd[8373]: osqueryd started [version=5.12.2]

The system is up to date, and restarting did not fix these crashes.

Linux minerva 6.9.7-200.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jun 27 18:11:45 UTC 2024 x86_64 GNU/Linux

I don't see any symbols in osqueryd, so unfortunately I don't see a quick way to narrow down where it crashes.
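
Even without symbols, the two address fields in the dmesg lines above allow a little narrowing: subtracting the mapping base shown in brackets from the faulting ip gives an offset into the mapped image. The following is a minimal sketch of that arithmetic, not something launcher or osquery provides. `FAULT_RE` and `fault_offset` are hypothetical names, and the regular expression assumes only the two line formats shown above.

```python
import re

# Matches both kernel formats seen in the dmesg output above:
#   traps: NAME[PID] general protection fault ip:HEX sp:HEX error:N in IMAGE[BASE+SIZE]
#   NAME[PID]: segfault at ADDR ip HEX sp HEX error N in IMAGE[BASE+SIZE]
FAULT_RE = re.compile(
    r"ip[: ](?P<ip>[0-9a-f]+) sp[: ][0-9a-f]+ error[: ]\d+ "
    r"in (?P<image>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (image, offset into mapping) for a fault line, else None."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    if not base <= ip < base + size:
        # The ip lies outside the reported mapping (e.g. a jump to address 0),
        # so no meaningful offset can be derived.
        return None
    return m["image"], ip - base


if __name__ == "__main__":
    import sys

    # Usage (assumed): sudo dmesg -T | python3 fault_offset.py
    for raw in sys.stdin:
        found = fault_offset(raw)
        if found is not None:
            image, offset = found
            print(f"{image} +{offset:#x}")
```

For the three osqueryd faults above that have an in-range ip (12:40:09, 12:43:14, and 13:09:57), the offset works out to 0x59f892 each time, which suggests the same instruction is faulting. This is an offset into the mapping only; resolving it to a function would still need the mapping's file offset and a binary with symbols.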

@ankon
Author

ankon commented Jul 9, 2024

Forgot the version:

$ rpm -q launcher-kolide-k2
launcher-kolide-k2-1.4.2-1.x86_64

@directionless
Contributor

If you're running launcher, could you please run sudo /usr/local/kolide-k2/bin/launcher flare and send us the output? It will create a debugging tarball, upload it to our cloud, and print out the remote file name.

@ankon
Author

ankon commented Jul 10, 2024

{"caller":"main.go:36","msg":"Launcher starting up","revision":"db7106fe13683c7b1c656de23672cbe5f50e7b59","severity":"info","ts":"2024-07-10T09:41:57.520941275Z","version":"1.4.2"}
{"caller":"library_lookup.go:195","component":"tuf_library_lookup","msg":"found executable matching current release","path":"/var/kolide-k2/k2device.kolide.com/updates/launcher/1.8.1/launcher","severity":"info","ts":"2024-07-10T09:41:57.541273749Z","version":"1.8.1"}
{"caller":"main.go:242","msg":"got new version of launcher to run","new_binary_path":"/var/kolide-k2/k2device.kolide.com/updates/launcher/1.8.1/launcher","new_binary_version":"1.8.1","old_version":"1.4.2","severity":"info","ts":"2024-07-10T09:41:57.541327692Z"}
{"caller":"main.go:219","msg":"preparing to exec new binary","new_binary":"/var/kolide-k2/k2device.kolide.com/updates/launcher/1.8.1/launcher","old_version":"1.4.2","severity":"info","ts":"2024-07-10T09:41:57.541343773Z"}
{"time":"2024-07-10T09:41:57.551326109Z","level":"INFO","msg":"launcher starting up","launcher_run_id":"01J2E0AS5FWV2W2NB5DPJD0T8V","version":"1.8.1","revision":"22bf14babfa22ff5ddf7e744f42a825dd638ff7c"}
{"time":"2024-07-10T09:41:57.560509096Z","level":"INFO","msg":"found executable matching current release or pinned version","launcher_run_id":"01J2E0AS5FWV2W2NB5DPJD0T8V","component":"tuf_library_lookup","binary":"launcher","update_channel":"stable","pinned_version":"","executable_path":"/var/kolide-k2/k2device.kolide.com/updates/launcher/1.8.1/launcher","executable_version":"1.8.1","span_id":"0000000000000000","trace_id":"00000000000000000000000000000000","trace_sampled":false}
{"time":"2024-07-10T09:41:57.560528272Z","level":"INFO","msg":"nothing newer","launcher_run_id":"01J2E0AS5FWV2W2NB5DPJD0T8V"}
{"time":"2024-07-10T09:42:01.077718273Z","level":"INFO","msg":"flare creation complete","launcher_run_id":"01J2E0AS5FWV2W2NB5DPJD0T8V","status":"flare uploaded successfully","file":"2024/07/10/01J2E0ASGYW4G7M8Q3Z5AHW6T8.zip"}
time=2024-07-10T09:42:01.077Z level=INFO source=/home/runner/work/launcher/launcher/cmd/launcher/flare.go:103 msg="flare creation complete" launcher_run_id=01J2E0AS5FWV2W2NB5DPJD0T8V status="flare uploaded successfully" file=2024/07/10/01J2E0ASGYW4G7M8Q3Z5AHW6T8.zip
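
The remote file name requested above sits in the `file` field of the "flare creation complete" record. As a rough sketch, it could be pulled out of the output like this. `flare_file` is a hypothetical helper, and the `msg` and `file` field names are taken from the output shown here; whether they stay stable across launcher versions is an assumption.

```python
import json


def flare_file(output_lines):
    """Return the uploaded flare's remote file name, or None if absent."""
    for line in output_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # The last line of the output above is key=value text, not JSON.
            continue
        if isinstance(record, dict) and record.get("msg") == "flare creation complete":
            return record.get("file")
    return None


if __name__ == "__main__":
    import sys

    # Usage (assumed): save the flare command's output to a file and pipe it in.
    print(flare_file(sys.stdin))
```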

@directionless
Contributor

Thank you so much for sending that in. We'll dig in.

@directionless
Contributor

directionless commented Jul 12, 2024

@RebeccaMahany
Contributor

Also followed up in osquery slack: https://osquery.slack.com/archives/C08V7KTJB/p1720792822595459

@ankon
Author

ankon commented Jul 13, 2024

As far as I can see, I cannot access these, so let me know if I can be of any help.

Current rough counters:

$ sudo dmesg -T | grep osqueryd  | sed -re 's,  , ,' | cut -f 1,2,3 -d ' '  | sort | uniq -c
     27 [Fri Jul 12
     17 [Sat Jul 13
     29 [Thu Jul 11
     17 [Tue Jul 9
     28 [Wed Jul 10
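
The pipeline above sorts the days alphabetically, so Thursday lands before Tuesday. A small sketch that produces the same per-day counts in chronological order might look like the following. `faults_per_day` is a hypothetical helper; it assumes `dmesg -T` style timestamps as shown in this thread and an English (C) locale for the day and month names.

```python
from collections import Counter
from datetime import datetime


def faults_per_day(lines, needle="osqueryd"):
    """Count lines mentioning `needle` per calendar day, oldest day first."""
    counts = Counter()
    for line in lines:
        if needle not in line or not line.startswith("["):
            continue
        stamp, closing, _rest = line[1:].partition("]")
        if not closing:
            continue
        try:
            # dmesg -T pads single-digit days with an extra space;
            # split() and join() collapse that before parsing.
            when = datetime.strptime(" ".join(stamp.split()), "%a %b %d %H:%M:%S %Y")
        except ValueError:
            continue  # not a human-readable timestamp
        counts[when.date()] += 1
    return sorted(counts.items())


if __name__ == "__main__":
    import sys

    # Usage (assumed): sudo dmesg -T | python3 faults_per_day.py
    for day, count in faults_per_day(sys.stdin):
        print(f"{count:7d} {day.isoformat()}")
```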

@RebeccaMahany
Contributor

Sorry, closed by accident

@RebeccaMahany
Contributor

RebeccaMahany commented Jul 30, 2024

An issue was opened in osquery here: osquery/osquery#8384

@ankon
Author

ankon commented Aug 2, 2024

I see things moving in the right direction, but I am having a bit of trouble understanding how many steps there are between "osquery has a (resolved/fixed) issue" and "osquery getting updated for kolide".

Is there a way for me to do the update manually, for instance?

@RebeccaMahany
Contributor

@ankon It looks like that fix has not made it into an osquery release yet -- I don't see it in 5.13.0, anyway -- so I don't think there's anything you can do manually at the moment. Will tag in @directionless for a better explanation of the osquery release process than I can give. 🙂

@directionless
Contributor

In general, Kolide uses the official osquery releases. This means our process is something like:

  1. Work on fixing osquery
  2. Osquery releases (roughly every 2 months)
  3. Kolide deploys osquery as a beta (roughly 1-2 weeks)
  4. Kolide deploys to stable

In this case, I know that osquery is talking about cutting a 5.13.2 release with this fix, though I'm delaying that slightly because there is another Linux crash. I would expect to see an osquery release deployed in the next 2 weeks.

@RebeccaMahany
Contributor

RebeccaMahany commented Aug 19, 2024

@ankon -- we just released osquery 5.13.1 to stable for Kolide. You should hopefully see this autoupdate in about an hour, and it should resolve the segfault issue. Let us know how it works for you!

@ankon
Author

ankon commented Sep 2, 2024

I can confirm that there are indeed no more segfaults in my dmesg. Thanks a lot!
