Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate configuration handling to the confuse lib #363

Closed
wants to merge 170 commits into from

Conversation

jinnatar
Copy link
Contributor

@jinnatar jinnatar commented Feb 3, 2023

This is very much a work in progress, just wanted to be transparent and make the current status visible. The current state is able to run the example config and known minimal configs but needs more testing.

This change touches all integrations and has minor potential for subtle breakage. The following have been thoroughly tested:

  • Telegram
  • Pushover

Groundwork for fixing #361

martomi and others added 30 commits April 3, 2021 22:13
Check that the full node doesn't skip any signage points. If it does,
send a normal priority notification.
The 30 second threshold is violated multiple times per day in the
current network. It's not exactly clear to me why but further analysis
of the logs shows that no signage points are skipped which indicates
that the farmer can still participate in all challenges.

Chiadog is now monitoring signage points so we can safely increase
the threshold here and rely more on other checks.
Logs indicating what checks and services are active to help provide
confidence that everything is running.

Also add warning log in case no notification service was enabled in the
config and info log for every keep-alive check.
There's a race condition and it can happen that the log consumer tries
to call the log handler before it is fully initialized.
This will enable monitoring remote harvester(s). Pre-requisite is having
setup key-based SSH authentication with your harvester machine.
This will help to distinguish notifications when running multiple
instances of chiadog to monitor more than 1 remote harvester.

Just create multiple configs for each harvester with unique prefix and
correct IP configuration and execute:

python3 main.py --config config-1.yaml
python3 main.py --config config-2.yaml
python3 main.py --config config-3.yaml
* Downgrade keep-alive logs to DEBUG
* Make notification texts more consistent with <Problem> <Reason> format
This is scenario observed on actual network. Seems unrelated to local
node because it's observable from multiple nodes at the same time.

Add some handling to ignore these type of events and reduce the
resulting false alarms.
Might be a good idea to have these visible at least as INFO logs to
understand how the network behaves over time.
Export SHOWCASE_NOTIFICATIONS=1 on top of API token and user key and run
tests to generate presentable notifications:

python3 -m unittest
tests.notifier.test_pushover_notifier.TestPushoverNotifier.testShowcaseGoodNotifications

python3 -m unittest
tests.notifier.test_pushover_notifier.TestPushoverNotifier.testShowcaseBadNotifications
Implements notifications through the Telegram API:
https://core.telegram.org/bots/api
Run notifier integration tests only if environment variables are
provided. This makes it easier for anyone to run the full-suite of tests
without needing to register for tokens for all integrations.
src/default_config.yaml Outdated Show resolved Hide resolved
@jinnatar
Copy link
Contributor Author

jinnatar commented Feb 4, 2023

That all makes sense, thanks for the thorough review!

@jinnatar
Copy link
Contributor Author

jinnatar commented Feb 4, 2023

I think this might now be in a sane state, but was only able to test a couple of the notifiers myself. @martomi if you have some test farm for all the notifiers, could you give it a whirl and make sure I didn't miss anything?

@jinnatar jinnatar changed the title WIP: Transform log parsing and handling to use the confuse lib Migrate configuration handling to the confuse lib Feb 4, 2023
@jinnatar jinnatar marked this pull request as ready for review February 5, 2023 20:14
Copy link
Owner

@martomi martomi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good on a high-level. Left a bunch of small comments.

Agree with some of the default value changes but would like to keep that separated for another potential PR in the future.

Can you please also document in the PR description which integrations you managed to test? Since this change is touching all integrations, I’d like us to test at least 3 of them (feel free to pick which ones). The guide for testing is here. Thanks!

requirements.txt Outdated Show resolved Hide resolved
src/chia_log/log_consumer.py Outdated Show resolved Hide resolved
@@ -30,9 +33,9 @@ def _set_config(self, config: dict):
:param config: The YAML config for this notifier
:returns: None
"""
self._topic = config["topic"]
self._qos: int = config.get("qos", 0)
self._retain: bool = config.get("retain", False)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not making an exhaustive list of mismatches but here I also noticed qos used to be 0, and it’s 1 now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, so what we have here is that I took the default values from config-example.yaml instead of from the code, with the assumption that anyone actually using it was likely always defaulting on the example instead of the default in the code.

So I think what we have here is a problem of having had two defaults and we need to collapse on only one. You wana do that one by one or should I just flip over from example config values to code defaults?

src/notifier/slack_notifier.py Outdated Show resolved Hide resolved
tests/chia_log/handlers/test_wallet_added_coin_handler.py Outdated Show resolved Hide resolved
events = self.handler.handle("".join(logs))
# Default mojo filter excludes the event we will expect
self.handler_config["min_mojos_amount"].set(0)
handler = WalletAddedCoinHandler(self.handler_config)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the default used to be 0, maybe you want to do it the other way around? Keep it 0 and set it to 5 only for the test that requires to test that filter’s functionality?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, but I guess we need to first agree if the default should be 5 or 0 since it's a conflict between the code default and the example config.

@@ -46,15 +53,15 @@ def testShowcaseGoodNotifications(self):
notifiers = [
IftttNotifier(
title_prefix="Harvester 1",
config={"enable": True, "api_token": self.api_token, "webhook_name": self.webhook_name},
config=self.config,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This config used to be the wrong format before, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, or perhaps not. The config structure on this one is really weird with the webhook_name under credentials. I'll go whip up an IFTTT account and test this one for real :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can confirm that on branch main, if SHOWCASE_NOTIFICATIONS=true the test fails with a bunch of:

ERROR:root:Invalid config.yaml. Missing key: 'credentials'

And eventually

Traceback (most recent call last):
File "/home/artanicus/src/3rd-party-forks/chiadog.git/tests/notifier/test_ifttt_notifier.py", line 67, in testShowcaseGoodNotificat
ions
success = notifier.send_events_to_user(events=[found_proof_event])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/artanicus/src/3rd-party-forks/chiadog.git/src/notifier/ifttt_notifier.py", line 30, in send_events_to_user
f"/trigger/{self.webhook_name}/json/with/key/{self.token}",
^^^^^^^^^^^^^^^^^
AttributeError: 'IftttNotifier' object has no attribute 'webhook_name'

Ergo, the config indeed has not been valid. It was probably never caught since these lines are not triggered unless the showcase is enabled. With the new code the showcases work as expected.

@jinnatar
Copy link
Contributor Author

jinnatar commented Feb 5, 2023

I think I've fixed the ones where there was a clear fix. Happy to tidy up the default values if we have a consensus on which source of truth should be the new defaults. My feeling is that it's unlikely many folks have relied on the code defaults and instead have used the example values instead of explicitly deleting those fields from their config. But, alas I have no way of proving that feeling!

I've got the keepalive structural changes and a keepalive emitter for the wallet ready for review once we nail down this bit, just want to rebase them first on top of this one so you get a clean review slate.

@martomi
Copy link
Owner

martomi commented Feb 5, 2023

Cool - thanks for the quick turnaround!

Note that there are a few more inline comments I left in the code above, but they’re collapsed in the middle of the thread by the github UI. Just wanna make sure you saw them too? In one of them I had provided more context and example for default values.

My take is that we should use the code defaults for the following reasons:

  • The example config is still going to be taken by new users, so the default YAML has little effect beside keeping the code functionality being identical as how it used to be before this change
  • There were a substantial number of users that installed chiadog before we introduced most of the configs in the example config. In the process of installing it, they would have created a copy of the very old config example. We never explicitly required them to change their original configs with the new values from the extended example config. Thus they’ll be having a config that is substantially more minimal than the current example. By keeping the defaults same as in the code, we’ll not introduce any unexpected changes for these users.

This ensures the backwards compatibility is still carried forward of
enabling misconfigured handlers.
@jinnatar
Copy link
Contributor Author

jinnatar commented Feb 7, 2023

Oops, not sure why "close" is on by default in review mode. But did not intend to close and now can't reopen :D

@martomi
Copy link
Owner

martomi commented Feb 8, 2023

Hmm not sure, the reopen button is also grayed out for me. But more importantly, the code-diff is missing, so maybe that’s the main reason? Maybe you rebased on the wrong base? Feel free to re-open a new PR.

@jinnatar
Copy link
Contributor Author

Yeah, looks like there's no sane way to re-open after a rebase & close. I'll open a new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.