Add Breakpad for crash dump generation #56014
base: master
Oh yeah, thanks. I misremembered where that button was supposed to be, and couldn't see it there...
I have approved CI/CD.
There are some more static check failures. Please also try to collapse the commits by squashing. We prefer a single commit unless you have a logical reason for more (a few commits are allowed).
I have run clang-format and renamed the one place where a file was incorrectly referred to in a copyright header. I haven't pushed those changes yet, as I wanted to check the Windows compile and determine if some of the common files are even needed for any platform. I'll do a manual squash as the last thing once I don't need to tweak the code anymore, as that'll get rid of some intermediate files that would otherwise be in git history.
a7e507a to b5675f3
All issues detected by CI should now hopefully be fixed. I also removed the commented-out parts in the build config file and did some last tweaks. I also made sure it compiles on Windows. I'm not super confident about how the string conversion should be used in 4.0; in 3.4 I think I was able to figure it out pretty well. I think this now has all the changes on the engine side that are needed for crash dump generation, though I haven't yet been able to test the entire flow of creating a crash dump and decoding it. As an additional difficulty, Windows MinGW builds seem a bit problematic regarding extracting symbols, but this Breakpad fork: https://github.com/DaemonEngine/breakpad seems to have a working tool for that.
I've just run into a pretty big issue: even though I found a tool that can dump the symbols of a MinGW-created exe on Linux, the output doesn't seem to contain any symbols besides the standard library. It doesn't even detect that any of the Godot source files are included.
Let me get back to you, but I think I posted an llvm-mingw workflow.
Here's the exact command I use. PATH=/opt/llvm-mingw/bin:$PATH scons werror=no platform=windows target=release_debug -j`nproc` use_lto=no deprecated=no use_mingw=yes use_llvm=yes use_thinlto=yes LINKFLAGS=-Wl,-pdb= CCFLAGS='-g -gcodeview' debug_symbols=no" The important part is llvm-mingw and |
Kind of expected, but the Godot podman build image scripts don't install llvm, so I get this error with that:
It shouldn't be too difficult to modify my local copy to also install llvm in the build image to test out that way of making the build. First I'll try the thinlto option with the MinGW gcc to see if that has any effect. Edit: turning off LTO and turning on ThinLTO seems to have increased the symbols file size to 100 MB, and the file now seems to contain quite a lot of Godot symbols.
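One way to check whether a dumped symbol file actually covers Godot code, rather than judging by file size, is to count its records. The sketch below assumes the text symbol format that Breakpad's dump_syms tools emit (records starting with MODULE, FILE, FUNC or PUBLIC). The function name, the sample data and the "/godot/" path marker are all made up for illustration and are not part of this PR.

```python
from collections import Counter

# Hypothetical sample in the Breakpad text symbol format (invented data).
SAMPLE_SYM = """MODULE windows x86_64 0123456789ABCDEF0123456789ABCDEF1 godot.exe
FILE 0 /usr/lib/gcc/include/c++/bits/basic_string.h
FILE 1 /src/godot/core/object/object.cpp
FUNC 1000 20 0 Object::notification(int, bool)
1000 20 873 1
PUBLIC 2000 0 _start
"""


def summarize_symbols(sym_text, source_marker):
    """Count record kinds and the FILE records whose path contains source_marker."""
    counts = Counter()
    matching_files = 0
    for line in sym_text.splitlines():
        parts = line.split(" ", 2)
        kind = parts[0]
        if kind not in ("MODULE", "FILE", "FUNC", "PUBLIC"):
            continue  # line records and anything else are ignored here
        counts[kind] += 1
        # FILE records look like: FILE <number> <path>
        if kind == "FILE" and len(parts) == 3 and source_marker in parts[2]:
            matching_files += 1
    return counts, matching_files


counts, godot_files = summarize_symbols(SAMPLE_SYM, "/godot/")
print(dict(counts), godot_files)
```

If the count of matching FILE records is zero for a real symbol file, the build most likely lost the engine's debug info before the symbols were dumped, which matches the symptom described above.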
SConstruct
Outdated
@@ -135,6 +135,7 @@ opts.Add(BoolVariable("opengl3", "Enable the OpenGL/GLES3 video driver", True))
 opts.Add("custom_modules", "A list of comma-separated directory paths containing custom modules to build.", "")
 opts.Add(BoolVariable("custom_modules_recursive", "Detect custom modules recursively for each specified path.", True))
 opts.Add(BoolVariable("use_volk", "Use the volk library to load the Vulkan loader dynamically", True))
+opts.Add(BoolVariable("breakpad_enabled", "Enable Breakpad crash dump creation.", False))
The common pattern is "use_breakpad". Thoughts?
I can change it easily enough.
Can you change this to "use_breakpad", unless you have a reason not to?
Yes, I can. I got stuck yesterday trying to fix that problem I commented about (#56014 (comment)), and I didn't want to switch over to changing this as I have a bunch of experimental local changes.
I renamed the scons option and the defines used in C++ as volk seemed to also use that format of preprocessor defines.
I ran into another issue: in release mode on Linux when Godot doesn't register the signal handlers, but instead I leave that to Breakpad, it doesn't work for some reason. I might need to do a workaround where the Godot handlers are always installed, or perhaps the breakpad version I got has a bug, but that would be a pretty serious slip by Google... |
How do I recreate your bug case? Maybe posting some info here can reveal the answer as I'm not in this area. |
After debugging with gdb I think I figured it out. The mono runtime actually messes with the signals. I noticed when putting breakpoints on Here's where mono messes with the signal handlers:
This means that using a workaround of always registering the Godot crash handler signal handlers and passing that onto breakpad, should work. I don't have enough time to test that today, though. |
Seems like not even adding code like:
works, even though that's the same as in debug mode. Note that the editor properly runs the crash handlers. So there's definitely something in the Linux Mono runtime when running in release mode exported game that makes it override Godot's and also Breakpad's signal handlers. I doubt that even Godot's own crash handler was able to work in this mode, but then again it's always disabled in the release mode, so I guess no one has encountered this before. It doesn't seem there's an easily tweakable place in gd_mono.cpp where this happens. So I might just have to put in a workaround in gd_mono.cpp (or some place higher up in the callstack), where I re-initialize Breakpad (on Linux) just after Mono has been loaded to fix the signals it messed up. |
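The override described above comes down to the fact that a signal has only one active handler per process, so whichever component registers last wins. The following is a minimal Python analogy of that behaviour and of the proposed workaround (re-registering after the runtime has loaded). It is not the C++ code from this PR, and both handler functions are hypothetical stand-ins for Breakpad's and Mono's handlers.

```python
import signal


def crash_reporter_handler(signum, frame):
    """Hypothetical stand-in for the crash reporter's signal handler."""


def runtime_handler(signum, frame):
    """Hypothetical stand-in for the handler a runtime installs while loading."""


def demo_handler_override(signum=signal.SIGTERM):
    """Show that a later registration replaces an earlier one, then undo it."""
    original = signal.signal(signum, crash_reporter_handler)
    # The runtime loads afterwards and silently replaces the crash reporter.
    replaced = signal.signal(signum, runtime_handler)
    active_after_runtime = signal.getsignal(signum)
    # Workaround: register the crash reporter again once the runtime is loaded.
    signal.signal(signum, crash_reporter_handler)
    active_after_fix = signal.getsignal(signum)
    # Restore whatever was installed before so the demo has no side effects.
    signal.signal(signum, original if original is not None else signal.SIG_DFL)
    return replaced, active_after_runtime, active_after_fix


replaced, after_runtime, after_fix = demo_handler_override()
print(after_runtime is runtime_handler, after_fix is crash_reporter_handler)
```

In the engine the equivalent step would be re-initializing Breakpad right after Mono has been loaded, as suggested above. Where exactly that call belongs is still an open question in this thread.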
8131d5b to 4018d18
b12cb05 to 8ebf879
Thanks to @Chaosed0 this PR has been updated to be rebased on master and should be working in Godot 4. I haven't personally done a full end to end test as my game currently always crashes on shutdown (due to: #89941) so this wouldn't be helpful to me currently in any case. Also I didn't yet squash the commits into one as this probably has some tweaks left to do and I was unsure how I should squash things in the case of having multiple authors on a PR. |
void report_user_data_dir_usable();

// Due to Mono runtime initialization in release mode overriding signal handlers, Breakpad needs re-initialization after loading it
void report_mono_loaded_to_breakpad();
Looks like there are some leftover Mono bits in the code. I can remove them once someone is able to do a PR review, or once there's a chance that this PR could be getting any closer to being merged.
Please could you squash those commits? |
What's the policy on squashing commits from multiple authors? Adding co-authored by lines to the squashed commit is required to not lose attribution for the multiple people's work. Also I didn't squash this yet as it seems unlikely that this would be any more likely to be merged sooner even if I did that. And I kind of suspect that this is going to break at least once more before this can be merged... I can squash this up if it makes it any more likely that this PR would move forward (and I'd need to know how to handle squashing commits from multiple authors). When is the next possible merge window opening up? |
Not sure, but 4.3 is nearly out, so this is probably the best time to get it ready so that it can make it into 4.4. Considering how many issues there are with completely worthless crash messages, this PR should be considered top priority, at least I think so. I've been told by @akien-mga that testing PRs makes them more likely to get merged. So this can be the pilot PR to prove it.
Checkout and Build
Rebased onto master (383a6e4) without conflicts.
Test
I placed a bug into a known script-callable function.
After building I removed the pdb to ensure a worthless log.
Result
In addition to the usual log, a .dmp file was created in "%appdata%\Godot\app_userdata\testproj\crashes"
Appreciate your time testing the PR. In case it helps speed things along, I really don't care about credit for this - hhyyrylainen did most of the work anyway, and really I just want crash dumps :) so squash my commits as necessary. |
@hhyyrylainen |
Well as far as I know this is still good to go after being fixed in June. So if anyone with merge permissions would like to review this, that'd be pretty great. Or can someone tell me where I'd need to go to ask someone to review this? This isn't yet super time critical for me as for my game it seems only Godot 4.4 will stop it from crashing 100% of the time, so currently for me these changes are not useful yet. |
First it needs to be squashed, then sometime next week I'll start campaigning for this to be merged. |
It's not a requirement to squash before it has been reviewed. Once approved, it can be squashed, but there will likely be more work needed once someone does a deep dive to test and review. |
Tested locally, it works as expected.
Example crash dumps: crashes.zip
Example output (I added CRASH_NOW_MSG("Crash message example.") to project_manager.cpp for testing):
ERROR: Crash message example.
at: _notification (./editor/project_manager.cpp:103)
Crash dump created at: /home/hugo/.local/share/godot/app_userdata/[unnamed project]/crashes/bff06af4-d567-4e06-600635bf-35b8f3ae.dmp
================================================================
handle_crash: Program crashed with signal 4
Engine version: Godot Engine v4.4.dev.custom_build (c1f12813bf9d07cff283fae793aea385272640f9)
Dumping the backtrace. Please include this when reporting the bug to the project developer.
[1] /lib64/libc.so.6(+0x40d00) [0x7ff362c4fd00] (??:0)
[2] bin/godot.linuxbsd.editor.x86_64() [0x4cc0130] (/home/hugo/Documents/Git/godotengine/godot/./editor/project_manager.cpp:103 (discriminator 2))
[3] bin/godot.linuxbsd.editor.x86_64() [0x4cefd9f] (/home/hugo/Documents/Git/godotengine/godot/./editor/project_manager.h:56 (discriminator 14))
[4] bin/godot.linuxbsd.editor.x86_64() [0x9b46534] (/home/hugo/Documents/Git/godotengine/godot/./core/object/object.cpp:873)
[5] bin/godot.linuxbsd.editor.x86_64() [0x5c52cd5] (/home/hugo/Documents/Git/godotengine/godot/./scene/main/node.cpp:279)
[6] bin/godot.linuxbsd.editor.x86_64() [0x5c52c45] (/home/hugo/Documents/Git/godotengine/godot/./core/templates/hash_map.h:455)
[7] bin/godot.linuxbsd.editor.x86_64() [0x5c6b713] (/home/hugo/Documents/Git/godotengine/godot/./scene/main/node.cpp:3204)
[8] bin/godot.linuxbsd.editor.x86_64() [0x5d24c41] (/home/hugo/Documents/Git/godotengine/godot/./scene/main/scene_tree.cpp:473)
[9] bin/godot.linuxbsd.editor.x86_64() [0x2b1c28b] (/home/hugo/Documents/Git/godotengine/godot/./servers/display_server.h:60)
[10] bin/godot.linuxbsd.editor.x86_64() [0x2b11771] (/home/hugo/Documents/Git/godotengine/godot/platform/linuxbsd/godot_linuxbsd.cpp:85)
[11] /lib64/libc.so.6(+0x2a088) [0x7ff362c39088] (??:0)
[12] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7ff362c3914b] (??:0)
[13] bin/godot.linuxbsd.editor.x86_64() [0x2b11565] (??:?)
-- END OF BACKTRACE --
================================================================
[1] 101724 IOT instruction (core dumped) bin/godot.linuxbsd.editor.x86_64
Some comments:
- When the crash happens within the editor or project manager, the dump is saved in a project-specific folder. It should go to the editor data folder instead (EditorPaths::get_data_dir()).
- Could the dump filename be prefixed with an ISO 8601 timestamp? This would allow it to be sorted by date even if filesystem timestamps get overwritten for some reason (or when sharing the file with others).
  - For reference, we use YYYY-MM-DDTHHMMSS for editor screenshots (where T is a literal "T"). I suggest using the same format for consistency.
- I suggest rewording the message to make it easier to understand for people not familiar with Breakpad:
  Breakpad crash dump created at: /path/to/crash.dmp
  Please attach this file when reporting issues to the project developer.
  - Interestingly, GitHub doesn't allow uploading .dmp files in issue comments, even though the error message says it's one of the allowed extensions:
    We don’t support that file type.
    Try again with GIF, JPEG, JPG, MOV, MP4, PNG, SVG, WEBM, CPUPROFILE, CSV, DMP, DOCX, FODG, FODP, FODS, FODT, GZ, JSON, JSONC, LOG, MD, ODF, ODG, ODP, ODS, ODT, PATCH, PDF, PPTX, TGZ, TXT, XLS, XLSX or ZIP.
- use_breakpad is False by default, which means this feature is disabled by default. This might surprise people compiling from source, so maybe it should be enabled by default? I'm not sure, since official builds won't be able to benefit from it until debugging symbols are distributed or at least preserved somewhere.
- Is the use_breakpad SCons option needed when there's module_breakpad_enabled, which could default to False? We do something similar for C#, where module_mono_enabled is False by default.
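To make the timestamp suggestion concrete, here is a small sketch of how a dump name could be prefixed with the YYYY-MM-DDTHHMMSS format mentioned above. It only computes the new path. The function name, the underscore separator and the idea of renaming the file after it has been written are assumptions for illustration, not something this PR or Breakpad provides.

```python
from datetime import datetime
from pathlib import PurePosixPath


def timestamped_dump_name(dump_path, when):
    """Return dump_path with a YYYY-MM-DDTHHMMSS prefix added to the file name."""
    path = PurePosixPath(dump_path)
    prefix = when.strftime("%Y-%m-%dT%H%M%S")
    # The "_" separator between timestamp and GUID is an arbitrary choice.
    return str(path.with_name(f"{prefix}_{path.name}"))


print(timestamped_dump_name("/tmp/crashes/bff06af4.dmp", datetime(2024, 9, 1, 13, 5, 9)))
```

Because the prefix sorts lexicographically in chronological order, a plain directory listing stays ordered by crash time even after the files have been copied elsewhere.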
# TODO: find out when these are needed (if at all)
dwarf_module = False
stabs_module = False
Did anyone find that these were needed in the end? If not, I'd remove the relevant code (and perhaps keep a second branch on your fork with this code present).
If I remember right, no one has tried this out on mac yet. So there is a chance that these would be required on mac.
If it is determined necessary to get this PR merged, I now have a mac I can use for development so I can check things there to make sure the mac version compiles and can create crash dumps.
@Calinou thanks for the review. Most of that sounds doable.
Sounds easily doable. I'll make the changes soon (I'm preparing a new release of my game, which I have to get done this week, so that takes priority).
I think this is going to be pretty difficult: I just tried looking, and it doesn't seem like Breakpad allows overriding the file name. If I found the right place in the Breakpad code, a crash dump is always generated with a random GUID as the name, so the only way to change that part may be to patch the library itself. If I remember right, minidumps have the crash time in the headers, so the actual crash time can be verified by reading the minidump file headers. I also took a quick peek at the Breakpad code, and it looks like the minidump writer class has some header update logic that updates the time.
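For anyone who wants to verify the crash time without a timestamp in the file name, the minidump header can be read directly. The sketch below assumes the standard MINIDUMP_HEADER layout (a 32-byte little-endian structure starting with the "MDMP" signature, with TimeDateStamp as the sixth field, stored as seconds since the Unix epoch). The function name is made up, and the sample header is synthetic rather than taken from a real dump.

```python
import struct
from datetime import datetime, timezone

# Signature, Version, NumberOfStreams, StreamDirectoryRva, CheckSum,
# TimeDateStamp, Flags (little-endian, 32 bytes in total).
MINIDUMP_HEADER = struct.Struct("<4sIIIIIQ")


def minidump_timestamp(data):
    """Return the TimeDateStamp of a minidump header as an aware UTC datetime."""
    if len(data) < MINIDUMP_HEADER.size:
        raise ValueError("data is too short to contain a minidump header")
    signature, _version, _streams, _rva, _checksum, stamp, _flags = (
        MINIDUMP_HEADER.unpack_from(data)
    )
    if signature != b"MDMP":
        raise ValueError("missing MDMP signature, not a minidump")
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


# Synthetic header for demonstration only.
sample_header = MINIDUMP_HEADER.pack(b"MDMP", 0xA793, 0, 32, 0, 1700000000, 0)
print(minidump_timestamp(sample_header).isoformat())
```

For a real file, passing the first 32 bytes read in binary mode would be enough. Whether Breakpad always fills this field on every platform is something I have not verified.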
I'll do this. Also, if it isn't a bad idea, I'll add a tiny bit of delay, like half a second (and try to flush output), before quitting after printing that message. I'm hoping it helps with my use case, where a parent process reading the output for some reason never sees the crash dump info, though it does always appear in the terminal. It could also slightly help with command prompt usage: if the window closes with a slight delay, there's a chance the user can see that something was printed.
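The flush-then-wait idea can be sketched as follows. This is only an illustration of the ordering (write, flush, short delay, then exit), using the message wording proposed in the review. The function name and parameters are hypothetical, and a real crash handler in C++ would additionally have to stay within what is safe to call from a signal handler.

```python
import io
import sys
import time


def announce_dump(dump_path, stream=sys.stdout, delay_seconds=0.5):
    """Print the dump location, flush, then wait briefly before the caller exits."""
    stream.write(f"Breakpad crash dump created at: {dump_path}\n")
    stream.write("Please attach this file when reporting issues to the project developer.\n")
    # Flush first so a parent process reading a pipe receives the text
    # even if the process is terminated right after the delay.
    stream.flush()
    time.sleep(delay_seconds)


buffer = io.StringIO()
announce_dump("/path/to/crash.dmp", stream=buffer, delay_seconds=0)
print(buffer.getvalue(), end="")
```

The flush is the part most likely to help the parent-process case. The delay mainly helps a human looking at a console window that closes on exit.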
I guess that would be fine, but the official builds should not be compiled with Breakpad enabled until symbols can be distributed, because otherwise exported games will create crash dumps that no one will be able to decode. So I think those would cause more confusion than help with anything. If there's some way to default to on only when compiling outside the official release builds, I could do that.
I didn't actually know that there was a separate mechanism already to enable / disable parts of the engine. I think I saw some other feature also having a similar SCons option so I followed that, but I can switch to just having a |
I started working on a simple Breakpad crash reporting integration to Godot related to this proposal: godotengine/godot-proposals#1896
Some current caveats:
- I have not tested this with Godot 4, I originally made this against the 3.4 stable label and cherry picked to this branch (that other branch is here: https://github.com/Revolutionary-Games/godot/tree/crash_dumper_3.x). I only did a very quick compile fix and checked that Godot starts on this branch. There didn't seem to be a working sample project available in the asset library(?) so I couldn't test with a project.
- There's some cleanup left to do regarding which files to include and which need to be compiled.
- My local install of clang-format doesn't work with the options in this repo so I couldn't format the files with it (clang-format failed with code 1: /home/hhyyrylainen/Projects/godot/.clang-format:151:1: error: unknown key 'SpacesInLineCommentPrefix'). Update to Fedora 35 fixed this by having a newer clang version.
- I noticed that Windows crash reporter on destruction doesn't disable itself. Seems like nothing else either disables it on shutdown. Is that intended? It's not consistent with Linux. For my own testing I added the line to disable it in the destructor, but I didn't commit that here. I added this code.
- I haven't yet tested whether the created crashdumps can be decoded with stackwalk (I'm going to work on my build scripts next to make this work, I'm not confident enough with the Godot build process yet to make a PR to the build scripts repo). Also Windows is less tested than Linux.
- Windows crash dumps only work with the special mingw supporting stackwalk, but no special version is required in the Godot repo when building Godot, so this is just a minor inconvenience.
- I don't have a mac to develop on, so support for that needs to be done later.