Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: node and protocols health #2080

Merged
merged 14 commits into from
Jul 27, 2024
Merged

feat: node and protocols health #2080

merged 14 commits into from
Jul 27, 2024

Conversation

danisharora099
Copy link
Collaborator

Problem

Based on the Reliability RFC,

Node health is a metric meant to determine the connectivity state of a light node and its present ability to reliably send and receive messages from the network. We consider this reliability to be dependent on amount of simultaneous connections to responsive service nodes. Unfortunately the more connections light node establishes - the more bandwidth is consumed. To address this we RECOMMEND following states:

unhealthy - no connections to service nodes are available regardless of protocol;
minimally healthy:
Filter has one service node connection;
LightPush protocol has one service node connection;
sufficiently healthy:
Filter has at least 2 connections available to service nodes;
LightPush has at least 2 connections available to service nodes;

Solution

Introduce node health as a metric on the Waku object: waku.health

  • per protocol
  • overall

Also includes comprehensive tests:

Notes

Contribution checklist:

  • covered by unit tests;
  • covered by e2e test;
  • add ! in title if breaks public API;

@danisharora099 danisharora099 requested a review from a team as a code owner July 23, 2024 13:56
Copy link

github-actions bot commented Jul 23, 2024

size-limit report 📦

Path Size Loading time (3g) Running time (snapdragon) Total time
Waku node 119.25 KB (+0.25% 🔺) 2.4 s (+0.25% 🔺) 17.1 s (-24.9% 🔽) 19.5 s
Waku Simple Light Node 184.19 KB (+0.19% 🔺) 3.7 s (+0.19% 🔺) 19.8 s (-28.45% 🔽) 23.5 s
ECIES encryption 23.15 KB (+0.12% 🔺) 463 ms (+0.12% 🔺) 3.4 s (+1.47% 🔺) 3.9 s
Symmetric encryption 22.6 KB (0%) 452 ms (0%) 4.3 s (-48.64% 🔽) 4.8 s
DNS discovery 72.51 KB (+0.03% 🔺) 1.5 s (+0.03% 🔺) 13.2 s (+5.59% 🔺) 14.7 s
Peer Exchange discovery 74.51 KB (+0.2% 🔺) 1.5 s (+0.2% 🔺) 17.8 s (+143.17% 🔺) 19.3 s
Local Peer Cache Discovery 67.77 KB (+0.14% 🔺) 1.4 s (+0.14% 🔺) 15 s (-12.28% 🔽) 16.3 s
Privacy preserving protocols 38.88 KB (-0.06% 🔽) 778 ms (-0.06% 🔽) 11 s (+35.83% 🔺) 11.8 s
Waku Filter 113.51 KB (+0.18% 🔺) 2.3 s (+0.18% 🔺) 19.8 s (+15.92% 🔺) 22.1 s
Waku LightPush 111.65 KB (+0.31% 🔺) 2.3 s (+0.31% 🔺) 21.1 s (+27.77% 🔺) 23.3 s
History retrieval protocols 112.24 KB (+0.39% 🔺) 2.3 s (+0.39% 🔺) 18 s (-9.21% 🔽) 20.2 s
Deterministic Message Hashing 7.34 KB (+0.72% 🔺) 147 ms (+0.72% 🔺) 1.9 s (-18.98% 🔽) 2 s

}

private updateOverallHealth(): void {
//TODO: blocked for Store by https://github.com/waku-org/js-waku/pull/2019
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps this isn't to be seen as blocked but simply that health as a metric is for LightPush and Filter only

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ Agreeing with this more. Reliability RFC is strictly LightPush + Filter

@danisharora099 danisharora099 force-pushed the feat/health-status branch 5 times, most recently from 24c0840 to ba7b44f Compare July 24, 2024 15:28
Copy link
Member

@adklempner adklempner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -232,6 +258,11 @@ export class BaseProtocolSDK implements IBaseProtocolSDK {
throw error;
}
}

private updatePeers(peers: Peer[]): void {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: not to change everything else - we can introduce private set peers and do update inside

packages/sdk/src/waku.ts Outdated Show resolved Hide resolved
Copy link
Collaborator

@weboko weboko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comments

@@ -14,6 +18,7 @@ const DEFAULT_NUM_PEERS_TO_USE = 3;
const DEFAULT_MAINTAIN_PEERS_INTERVAL = 30_000;

export class BaseProtocolSDK implements IBaseProtocolSDK {
private healthManager: IHealthManager;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should understand if we want to support waku.lightpush.health as well as waku.health API

I don't have any preferences here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having it on the Waku object, where a consumer can fetch health for both node and protocol is a good start. We can definitely expand into exposing health on the protocol SDK as well: waku.lightpush.health

Copy link
Collaborator

@weboko weboko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some comments and read CI, all good otherwise

@danisharora099 danisharora099 force-pushed the feat/health-status branch 2 times, most recently from f137c2c to 9f12048 Compare July 26, 2024 11:12
@danisharora099 danisharora099 merged commit d464af3 into master Jul 27, 2024
10 of 11 checks passed
@danisharora099 danisharora099 deleted the feat/health-status branch July 27, 2024 12:57
@weboko weboko mentioned this pull request Jul 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: introduce node health as a metric
3 participants