Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Erratic service file behavior with systemd >= 246 #62

Open
sarnold opened this issue Aug 31, 2020 · 2 comments
Open

Erratic service file behavior with systemd >= 246 #62

sarnold opened this issue Aug 31, 2020 · 2 comments
Assignees
Labels
distro OS or distro runtime environment watch item hard to reproduce or rarely seen

Comments

@sarnold
Copy link
Contributor

sarnold commented Aug 31, 2020

The systemd service file log shows repeated complaints about KillMode=none (with nothing useful in the 246 man page) and the service "stop" fails miserably. Tweaking the service file does change the behavior, ie, without changes, there is no more signal to tell the daemon to shutdown and run cleanup handlers, and tweaking has so far only resulted in even more erratic shutdown behavior (eg, one cleanup func gets run 5 times in a row but others never run). This means that fpnd can no longer send the offline msg or run the down scripts to cleanup routes and iptables rules.

@sarnold sarnold added bug Something isn't working blocking This is blocked by another issue or something upstream distro OS or distro runtime environment labels Aug 31, 2020
@sarnold sarnold self-assigned this Aug 31, 2020
@sarnold sarnold changed the title Horribly erratic service file behavior with systemd >= 246 Erratic service file behavior with systemd >= 246 Sep 1, 2020
@sarnold
Copy link
Contributor Author

sarnold commented Sep 1, 2020

Following the systemd man pages and updating the service file to "force" the default behavior (again, following the documentation) results in different but still erratic and incorrect behavior, as shown in the log file where only the cleanup function should be running, yet somehow systemd invokes pre-start functions and then runs the very last part of cleanup() 5 times in a row:

2020-09-01 18:54:31 UTC fpnd.<module> +172: DEBUG [29127] fpnd.ini set startup mode: peer and role: None
2020-09-01 18:54:31 UTC fpnd.show_scheduled_jobs +55: DEBUG [29127] JOBS: Job(interval=10, unit=seconds, do=update_runner, args=(), kwargs={})
2020-09-01 18:54:31 UTC fpnd.show_scheduled_jobs +57: DEBUG [29127] TAGS: {'get-updates', 'base-tasks'}
2020-09-01 18:54:31 UTC fpnd.setup_scheduling +90: DEBUG [29127] Leaving setup_scheduling
2020-09-01 18:54:31 UTC daemon.stop +231: INFO [29127] Stopping...
2020-09-01 18:54:31 UTC node_tools.node_funcs.run_moon_cmd +353: DEBUG [28942] deorbit on ab37e685af result: 200 deorbit OK
2020-09-01 18:54:31 UTC root.call +289: DEBUG [28942] * Client will send payload: [{'ses': None, 'tok': None, 'ver': 1, 'met': 'offline', 'arg': ('d6931ede19',), 'ref': 'dcd27a94-bcc1-4b05-84d2-01dae2dce594'}]
2020-09-01 18:54:31 UTC node_tools.node_funcs.run_moon_cmd +353: DEBUG [28942] deorbit on ab37e685af result: 200 deorbit OK
2020-09-01 18:54:31 UTC root.call +289: DEBUG [28942] * Client will send payload: [{'ses': None, 'tok': None, 'ver': 1, 'met': 'offline', 'arg': ('d6931ede19',), 'ref': '70df7721-cacb-4cf5-83e4-064c092ba843'}]
2020-09-01 18:54:31 UTC node_tools.node_funcs.run_moon_cmd +353: DEBUG [28942] deorbit on ab37e685af result: 200 deorbit OK
2020-09-01 18:54:31 UTC root.call +289: DEBUG [28942] * Client will send payload: [{'ses': None, 'tok': None, 'ver': 1, 'met': 'offline', 'arg': ('d6931ede19',), 'ref': '78e864f4-1246-4def-9e42-b7c7f33d911d'}]
2020-09-01 18:54:31 UTC node_tools.node_funcs.run_moon_cmd +353: DEBUG [28942] deorbit on ab37e685af result: 200 deorbit OK
2020-09-01 18:54:31 UTC root.call +289: DEBUG [28942] * Client will send payload: [{'ses': None, 'tok': None, 'ver': 1, 'met': 'offline', 'arg': ('d6931ede19',), 'ref': 'f79a58d7-bf45-4d90-90f4-6ec21fd36ad4'}]
2020-09-01 18:54:31 UTC node_tools.node_funcs.run_moon_cmd +353: DEBUG [28942] deorbit on ab37e685af result: 200 deorbit OK
2020-09-01 18:54:31 UTC root.call +289: DEBUG [28942] * Client will send payload: [{'ses': None, 'tok': None, 'ver': 1, 'met': 'offline', 'arg': ('d6931ede19',), 'ref': '5e98ad6e-9d50-46a8-aa3b-a5280ce022b8'}]
2020-09-01 18:54:31 UTC node_tools.node_funcs.run_moon_cmd +353: DEBUG [28942] deorbit on ab37e685af result: 200 deorbit OK
2020-09-01 18:54:31 UTC root.call +289: DEBUG [28942] * Client will send payload: [{'ses': None, 'tok': None, 'ver': 1, 'met': 'offline', 'arg': ('d6931ede19',), 'ref': '7f7b536f-5f7d-4df0-8010-a4a11b3e41f1'}]
2020-09-01 18:54:32 UTC node_tools.node_funcs.run_moon_cmd +353: DEBUG [28942] deorbit on ab37e685af result: 200 deorbit OK
2020-09-01 18:54:32 UTC root.call +289: DEBUG [28942] * Client will send payload: [{'ses': None, 'tok': None, 'ver': 1, 'met': 'offline', 'arg': ('d6931ede19',), 'ref': 'b37bd944-dcb7-4e3e-884c-46becaa5b76d'}]
2020-09-01 18:54:32 UTC node_tools.node_funcs.run_moon_cmd +353: DEBUG [28942] deorbit on ab37e685af result: 200 deorbit OK
2020-09-01 18:54:32 UTC root.call +289: DEBUG [28942] * Client will send payload: [{'ses': None, 'tok': None, 'ver': 1, 'met': 'offline', 'arg': ('d6931ede19',), 'ref': '9db9e9b9-dd05-44ac-a32c-f18c4244f6fd'}]
2020-09-01 18:54:32 UTC node_tools.node_funcs.run_moon_cmd +353: DEBUG [28942] deorbit on ab37e685af result: 200 deorbit OK
2020-09-01 18:54:32 UTC root.call +289: DEBUG [28942] * Client will send payload: [{'ses': None, 'tok': None, 'ver': 1, 'met': 'offline', 'arg': ('d6931ede19',), 'ref': '50432585-5136-452e-96eb-8ff4b8a2a054'}]
2020-09-01 18:54:32 UTC node_tools.node_funcs.run_moon_cmd +353: DEBUG [28942] deorbit on ab37e685af result: 200 deorbit OK
2020-09-01 18:54:32 UTC root.call +289: DEBUG [28942] * Client will send payload: [{'ses': None, 'tok': None, 'ver': 1, 'met': 'offline', 'arg': ('d6931ede19',), 'ref': '2096cbcc-83d0-408d-8b90-fa19f7b8a8fe'}]

where the correct stop log is, including verbose:

2020-09-01 07:05:17 UTC node_tools.data_funcs.wrapper +101: INFO [11476] Get data result: OK
2020-09-01 07:05:17 UTC node_tools.data_funcs.wrapper +108: DEBUG [11476] New cache time is: 2020-09-01 07:05:15 UTC
2020-09-01 07:05:20 UTC node_tools.node_funcs.do_cleanup +130: INFO [11476] CLEANUP: shutting down fpn0
2020-09-01 07:05:20 UTC node_tools.network_funcs.do_net_cmd +301: DEBUG [11476] ENV: net script settings are dict_items([('VERBOSE', 'yes'), ('DROP_DNS_53', '')])
2020-09-01 07:05:21 UTC node_tools.network_funcs.do_net_cmd +320: INFO [11476] net cmd fpn0-down.sh result: Checking iptables binary...
Checking kernel rp_filter setting...
  RP remove garlic filter...
Checking network...
  Found edf70dc89ae19299
Using FPN0 ID: edf70dc89ae19299
Checking for FPN routing table...
  Flushing route cache...
  Removing route rule...
  Deleting route...
  Cleaning up...
Checking interfaces...
  Found interface enx84e714006ef7

Found these devices and parameters:
  FPN interface: ztevw6ywqd
  FPN address: 172.16.19.234
  FPN gateway: 172.16.19.233
  FPN network id: edf70dc89ae19299

  INET interface: enx84e714006ef7
  INET address: 192.168.0.172

Deleting nat and mangle rules...

Success
2020-09-01 07:05:21 UTC node_tools.node_funcs.do_cleanup +133: INFO [11476] CLEANUP: fpn0 shutdown returned (True, b'Checking iptables binary...\nChecking kernel rp_filter setting...\n  RP remove garlic filter...\nChec
king network...\n  Found edf70dc89ae19299\nUsing FPN0 ID: edf70dc89ae19299\nChecking for FPN routing table...\n  Flushing route cache...\n  Removing route rule...\n  Deleting route...\n  Cleaning up...\nChecking interf
aces...\n  Found interface enx84e714006ef7\n\nFound these devices and parameters:\n  FPN interface: ztevw6ywqd\n  FPN address: 172.16.19.234\n  FPN gateway: 172.16.19.233\n  FPN network id: edf70dc89ae19299\n\n  INET i
nterface: enx84e714006ef7\n  INET address: 192.168.0.172\n\nDeleting nat and mangle rules...\n\nSuccess\n', 0)
2020-09-01 07:05:21 UTC node_tools.node_funcs.run_ztcli_cmd +281: DEBUG [11476] got data: 200 leave OK
2020-09-01 07:05:21 UTC node_tools.node_funcs.do_cleanup +136: DEBUG [11476] CLEANUP: action leave returned: 200 leave OK
2020-09-01 07:05:21 UTC node_tools.node_funcs.do_cleanup +130: INFO [11476] CLEANUP: shutting down fpn1
2020-09-01 07:05:21 UTC node_tools.network_funcs.do_net_cmd +301: DEBUG [11476] ENV: net script settings are dict_items([('VERBOSE', 'yes'), ('DROP_DNS_53', '')])
2020-09-01 07:05:21 UTC node_tools.network_funcs.do_net_cmd +320: INFO [11476] net cmd fpn1-down.sh result: Checking iptables binary...
Checking network...
  Found edf70dc89a63197c
Using FPN1 ID: edf70dc89a63197c
Checking interfaces...
  Found interface enx84e714006ef7
Found these devices and parameters:
  FPN SRC interface: ztevww5upg
  FPN SRC address: 172.16.20.17
  FPN SRC network: 172.16.20.16/30
  FPN SRC network id: edf70dc89a63197c

  INET interface: enx84e714006ef7
  INET address: 192.168.0.172
  INET gateway: 192.168.0.254

Reset forwarding for FPN source traffic
net.ipv4.ip_forward = 0
Deleting nat and forwarding rules...

Success
2020-09-01 07:05:21 UTC node_tools.node_funcs.do_cleanup +133: INFO [11476] CLEANUP: fpn1 shutdown returned (True, b'Checking iptables binary...\nChecking network...\n  Found edf70dc89a63197c\nUsing FPN1 ID: edf70dc89a63197c\nChecking interfaces...\n  Found interface enx84e714006ef7\nFound these devices and parameters:\n  FPN SRC interface: ztevww5upg\n  FPN SRC address: 172.16.20.17\n  FPN SRC network: 172.16.20.16/30\n  FPN SRC network id: edf70dc89a63197c\n\n  INET interface: enx84e714006ef7\n  INET address: 192.168.0.172\n  INET gateway: 192.168.0.254\n\nReset forwarding for FPN source traffic\nnet.ipv4.ip_forward = 0\nDeleting nat and forwarding rules...\n\nSuccess\n', 0)
2020-09-01 07:05:21 UTC node_tools.node_funcs.run_ztcli_cmd +281: DEBUG [11476] got data: 200 leave OK
2020-09-01 07:05:21 UTC node_tools.node_funcs.do_cleanup +136: DEBUG [11476] CLEANUP: action leave returned: 200 leave OK
2020-09-01 07:05:21 UTC node_tools.node_funcs.run_moon_cmd +353: DEBUG [11476] deorbit on 9f5aa7b693 result: 200 deorbit OK
2020-09-01 07:05:21 UTC root.call +289: DEBUG [11476] * Client will send payload: [{'ses': None, 'tok': None, 'ver': 1, 'met': 'offline', 'arg': ('4c66fda0f4',), 'ref': 'd272c123-3ff1-4230-8b3b-d3c577bcc27b'}]
2020-09-01 07:05:21 UTC node_tools.node_funcs.do_cleanup +141: DEBUG [11476] CLEANUP: offline reply: [{'ref': 'd272c123-3ff1-4230-8b3b-d3c577bcc27b', 'result': '4c66fda0f4'}]

@sarnold
Copy link
Contributor Author

sarnold commented Sep 1, 2020

Log above made with these changes to service file:

ExecStop=/usr/lib/fpnd/fpnd.py stop
Type=forking
KillSignal=SIGTERM
SendSIGHUP=yes

@sarnold sarnold added watch item hard to reproduce or rarely seen and removed blocking This is blocked by another issue or something upstream bug Something isn't working labels Nov 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
distro OS or distro runtime environment watch item hard to reproduce or rarely seen
Projects
None yet
Development

No branches or pull requests

1 participant