Do not hard fail on a failed config parse in a manner which prevents the user from fixing it #342

Open
C0rn3j opened this issue Jan 26, 2024 · 2 comments


C0rn3j commented Jan 26, 2024

I apologize for the quality of this issue. I had a great one 95% completed until I had the bright idea to restart dbus-broker on my main system instead of the VM I was testing on, and everything was lost, including my will to rewrite everything properly.

This issue exists so that a failure like #337 at least leaves the user with a usable terminal from which to fix things.

To reproduce the state, simply run touch /usr/share/dbus-1/system.d/broken.conf and make sure a display manager (DM) is enabled before rebooting.

Currently, DBUS fails when loading a broken config, which makes systemd-logind fail, which in turn means getty is not allocated on VT2-VT5 as usual.
When a user has a DM enabled, which is the usual state, the DM additionally eats the only remaining VT, VT1, preventing getty from starting there. At that point the only way forward is to reboot and hope that the bootloader is editable so one can do the booting into /bin/bash trick, or to boot a different operating system and fix it from there.


Solutions to this, of which it likely makes sense to implement multiple, if not all:

  1. Make the DM launch depend on DBUS launching correctly, so that it doesn't eat the only available terminal. This may or may not be reasonable/possible; at that point it is a distro packaging issue, I suppose.

  2. On failure, find out which VT is free and launch a rescue getty there (the equivalent of systemctl start getty@ttyN.service, with N being the free VT), and save which one it was so that DBUS doesn't launch 5 of them on retries.

  3. You can see the systemd-logind launch error written out in the image from the original post of the issue that I linked above. I don't see why DBUS couldn't log its own error on tty0 too, along with advice to switch to the rescue VT from 2) to check the full log.

  4. Implement a config check tool and advise distros to use it in packaging, so that the user is informed of a future failure BEFORE their system is rendered useless.
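
To make solution 4 more concrete, here is a minimal sketch of what such a check could look like. It is not dbus-broker's parser and it does not apply the real validation rules; it only assumes that a usable bus config must be well-formed XML with a busconfig root element. The function names check_config and check_directory are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_config(path):
    """Return None if the file passes the basic checks, else an error string."""
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError as exc:
        # An empty file, such as one created by `touch`, ends up here.
        return f"{path}: not well-formed XML ({exc})"
    except OSError as exc:
        return f"{path}: cannot be read ({exc})"
    # Assumption: only the root element name is checked, nothing deeper.
    if root.tag != "busconfig":
        return f"{path}: root element is <{root.tag}>, expected <busconfig>"
    return None


def check_directory(directory):
    """Check every *.conf file in a directory and return the list of errors."""
    errors = []
    for path in sorted(Path(directory).glob("*.conf")):
        error = check_config(path)
        if error is not None:
            errors.append(error)
    return errors
```

A packaging hook could call check_directory on /usr/share/dbus-1/system.d after installing files and refuse to finish, or at least warn loudly, if the returned list is not empty. The empty broken.conf from the repro steps above would be reported as not well-formed.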

C0rn3j changed the title from "Do not hard fail on DBUS failure" to "Do not hard fail on a failed config parse" on Jan 26, 2024

rgudwin commented Jan 26, 2024

I still believe that it is much simpler to, instead of failing, just print a warning, freeze for 30 seconds, and then ignore the failing configuration file and run the service without failure. Just paralyzing the system for 30 seconds during boot will be enough for someone to try to discover what is wrong and fix the failing file. There is no need to make life difficult and freeze the whole system by not allowing a TTY. I understand that a simple warning will probably go unnoticed, but if you print a warning and lock the system for 30 seconds, that would be enough of an annoyance for someone to notice that something is wrong and start looking to fix the non-compliant XML files.
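
As a rough illustration of this warn, pause, and skip behaviour (a sketch only, not how dbus-broker loads its configuration), the loop below parses each file, skips the ones that fail with a warning, and pauses once if anything was skipped. The name load_configs and its parameters are hypothetical; the 30 second default comes from the suggestion above.

```python
import sys
import time
import xml.etree.ElementTree as ET
from pathlib import Path


def load_configs(directory, delay_seconds=30.0, log=sys.stderr, sleep=time.sleep):
    """Parse every *.conf file, skipping broken ones instead of failing.

    Returns (parsed, skipped): parsed maps file name to its XML root element,
    skipped lists the names of the files that were ignored.
    """
    parsed = {}
    skipped = []
    for path in sorted(Path(directory).glob("*.conf")):
        try:
            parsed[path.name] = ET.parse(path).getroot()
        except (ET.ParseError, OSError) as exc:
            # Warn about the broken file and carry on with the rest.
            print(f"WARNING: ignoring {path}: {exc}", file=log)
            skipped.append(path.name)
    if skipped:
        # Pause once for the whole batch so the warnings stay visible.
        print(f"WARNING: {len(skipped)} config file(s) ignored, "
              f"continuing in {delay_seconds:g}s", file=log)
        sleep(delay_seconds)
    return parsed, skipped
```

The sleep function is passed in as a parameter so the delay can be replaced when trying the sketch out, for example with a function that only records the requested duration.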

Author

C0rn3j commented Jan 26, 2024

> Just paralyzing the system for 30 seconds during boot, will be enough for someone to try to discover what is wrong and fix the failing file.

Whether a server I am rebooting takes 30 seconds more or less is completely invisible to me. The boot time is in minutes, and I don't even care about it in the first place.
Even my desktop takes two minutes due to the amount of storage and inherent DDR5 slowdowns made worse by poor firmware.
Also, you are greatly overestimating how unwilling people are to put up with a 30s slowdown rather than spend real time on fixing the issue.

I understand why it's failing, but it is rendering a regular desktop system irrecoverable, which is not good.

C0rn3j changed the title from "Do not hard fail on a failed config parse" to "Do not hard fail on a failed config parse in a manner which prevents the user from fixing it" on Jan 29, 2024