feat(validators): trickle in validators #1182

Merged

ksrichard
Collaborator

Description

Allow only a configurable number of new validator nodes to become active in an epoch. If the limit is reached, the remaining new validator nodes are simply added to a later epoch.

Example:
Let's say the maximum number of VNs that can be registered in an epoch is 2, and 10 new validator nodes try to join at the same time.

Validator nodes table on other nodes (epochs are relative):

  1. epoch: new node 1, start epoch: 1, end epoch: 101
  2. epoch: new node 2, start epoch: 1, end epoch: 101
  3. epoch: new node 3, start epoch: 2, end epoch: 102
  4. epoch: new node 4, start epoch: 2, end epoch: 102
  5. epoch: new node 5, start epoch: 3, end epoch: 103
    etc...
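
The assignment in the example above can be sketched as follows. This is only an illustration, not the code from this PR: the names `Registration`, `trickle_in`, `max_per_epoch` and `validity_epochs` are hypothetical, the 100-epoch validity is inferred from the start/end epochs listed above, and the sketch ignores any registrations that already occupy slots in an epoch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    node: int         # 1-based arrival order of the new validator node
    start_epoch: int  # first epoch in which the node is active
    end_epoch: int    # epoch in which the registration ends


def trickle_in(num_new_nodes, max_per_epoch, first_epoch=1, validity_epochs=100):
    """Spread new validator nodes over epochs, at most `max_per_epoch` per epoch.

    Hypothetical helper: `validity_epochs=100` is an assumption taken from the
    example (start epoch 1 -> end epoch 101), not a value read from the code.
    """
    if max_per_epoch < 1:
        raise ValueError("max_per_epoch must be at least 1")
    registrations = []
    for i in range(num_new_nodes):
        # Every full batch of `max_per_epoch` nodes pushes the next batch one epoch later.
        start = first_epoch + i // max_per_epoch
        registrations.append(Registration(i + 1, start, start + validity_epochs))
    return registrations
```

With `trickle_in(10, 2)`, nodes 1 and 2 start in epoch 1, nodes 3 and 4 in epoch 2, and so on until nodes 9 and 10 start in epoch 5.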

Motivation and Context

Previously, validators were registered all at once: once a registration from the base layer had been processed, the new validator became available from the next epoch. The problem is that if thousands upon thousands of validators join at the same time, consensus would get stuck for a longer period of time until it is running healthily again.

How Has This Been Tested?

The current limit for developer networks is 50, so the number of validator nodes must be set higher than 50. Then start a fresh swarm with this new configuration (delete the processes dir). After the first 50 nodes are up and running, open the local database data/swarm/processes/validator-node-00/localnet/data/validator_node/global_storage.sqlite and check the contents of the validator_nodes table.
It shows that the first 50 nodes are assigned to the next epoch, and the rest to the epoch after that.
Then start periodic mining to reach a new epoch and check that everything still runs well with the new validator nodes.
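
For reviewers who prefer a script over a SQLite GUI, a minimal sketch of the table check might look like this. Only the database path and the `validator_nodes` table name come from the steps above; the helper names `dump_table` and `count_by_column` are hypothetical, and because the table's schema is not shown here, the column to group by (for example a start-epoch column) has to be picked from the column list the script returns.

```python
import sqlite3
from collections import Counter


def dump_table(db_path, table="validator_nodes"):
    """Return (column_names, rows) for every row in `table`."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(f'SELECT * FROM "{table}"')
        columns = [description[0] for description in cursor.description]
        return columns, cursor.fetchall()
    finally:
        conn.close()


def count_by_column(columns, rows, column):
    """Count rows per distinct value of `column` (e.g. nodes per start epoch)."""
    index = columns.index(column)  # raises ValueError if the column is missing
    return Counter(row[index] for row in rows)


# Example usage (the column name below is an assumption; check `columns` first):
# columns, rows = dump_table(
#     "data/swarm/processes/validator-node-00/localnet/data/"
#     "validator_node/global_storage.sqlite"
# )
# print(columns)
# print(count_by_column(columns, rows, "start_epoch"))
```

If the limit works as described, no single start epoch should have more than 50 newly registered nodes.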

What process can a PR reviewer use to test or verify this change?

Breaking Changes

  • None
  • Requires data directory to be deleted
  • Other - Please specify

github-actions bot commented Oct 21, 2024

Test Results (CI)

571 tests  ±0   570 ✅  - 1   3h 36m 17s ⏱️ +18s
 64 suites ±0     0 💤 ±0 
  2 files   ±0     1 ❌ +1 

For more details on these failures, see this check.

Results for commit 09b57b8. ± Comparison against base commit 50e7b8c.

♻️ This comment has been updated with latest results.

@ksrichard ksrichard self-assigned this Oct 21, 2024
Member

@sdbondi sdbondi left a comment

ACK. Didn't test with > 50 nodes, but tested with 20 VNs and it works as before.

@sdbondi sdbondi added this pull request to the merge queue Oct 21, 2024
Merged via the queue into tari-project:development with commit c423408 Oct 21, 2024
11 of 12 checks passed
sdbondi added a commit to sdbondi/tari-dan that referenced this pull request Oct 24, 2024
* development:
  feat(validators): trickle in validators (tari-project#1182)

3 participants