SIMD-0165: Async Vote Execution #165

Open
wants to merge 24 commits into
base: main
Changes from 3 commits
Commits
8bce07a
Initial draft of Async Vote Execution SIMD.
wen-coding Aug 12, 2024
74f3a33
Change SIMD number.
wen-coding Aug 12, 2024
4b50153
Make linter happy.
wen-coding Aug 12, 2024
a8aad6c
Make linter happy.
wen-coding Aug 16, 2024
55fae1d
Update the plan for clock calculation.
wen-coding Aug 26, 2024
ca682d7
Update the proposal to reflect optimistic vote execution plan we disc…
wen-coding Sep 19, 2024
93b2506
Update dash to asterisk.
wen-coding Sep 19, 2024
a78f256
Rename VED and UED hash to Ephemeral and Final hash.
wen-coding Sep 20, 2024
67a1f45
Change title to reflect that we calculate Ephemeral hash.
wen-coding Sep 20, 2024
e38b645
Explain the checks we will perform during ephemeral hash computation.
wen-coding Sep 27, 2024
7dd9801
Add new fields in TowerSync.
wen-coding Sep 27, 2024
60a66af
Update that which block (slot, hash) on the vote transaction is.
wen-coding Sep 27, 2024
5432353
Clarify that new votes should be sent out when either hash changes.
wen-coding Sep 27, 2024
40d4be6
Propose to halt and exit if >1/3 disagrees on final bankhash.
wen-coding Oct 25, 2024
1b3ed81
Change status to Idea.
wen-coding Oct 25, 2024
4e83392
Update calculation of vote only hash and others.
wen-coding Nov 28, 2024
35ec617
Fix some small problems.
wen-coding Dec 11, 2024
93372ea
Address review comments.
wen-coding Dec 11, 2024
7daaf7a
Specify the hash function used.
wen-coding Dec 12, 2024
b7fc403
Clarify the set root error.
wen-coding Dec 12, 2024
f4278f4
Set root will not cause the block to be marked dead.
wen-coding Dec 12, 2024
7b3de13
Add the user visible changes section.
wen-coding Dec 12, 2024
8ac7180
Clarify we only use vote only hash in fork selection.
wen-coding Dec 12, 2024
437d66d
EpochSlots should be updated when banks are vote only frozen.
wen-coding Dec 12, 2024
120 changes: 120 additions & 0 deletions proposals/0165-async-vote-execution.md
@@ -0,0 +1,120 @@
---
simd: '0165'
title: Async Vote Execution
authors:
- Wen Xu
category: Standard
type: Core
status: Draft
created: 2024-08-11T00:00:00.000Z
feature: null
supersedes: null
superseded-by: null
extends: null
---

## Summary

Separate the execution of vote and non-vote transactions in each block. The
vote transactions will be verified and executed first, then the non-vote
transactions will be executed later asynchronously to finalize the block.

## Motivation

Currently, vote and non-vote transactions are mixed together in a block, and
the vote transactions are only processed by consensus once the whole block has
been frozen and all transactions in the block have been verified and
Contributor

nit: votes are processed per entry batch, doesn't have to wait for the bank to be frozen. as a result we still partially process votes even if the block later ends up dead.
e.g. agave's impl https://github.com/anza-xyz/agave/blob/1f06fbdbe3b72f330f50ad93a15c1116f5021392/ledger/src/blockstore_processor.rs#L177

Author

Yeah, now I think about it, it's less about processing votes as fast as possible, more about a lot of things don't happen until the block is frozen, changes this part.

executed. This is a problem because slow-running non-vote transactions can
delay how fast the votes are processed, which in turn affects the ability of
consensus to pick the correct fork.

Because validators run on different hardware and in different environments,
there will always be some difference in transaction execution speed between
them. Generally speaking, because vote transactions are so simple, the
variation in vote execution time should be smaller than the variation in
non-vote execution time. Therefore, if validators execute only the vote
transactions in a block before voting on it, they are more likely to reach
consensus faster.

Even with async vote execution, forks can still happen because of
various other situations, like network partitions or misconfigured validators.
This work only reduces the chance of forks caused by variance in non-vote
transaction execution.

The non-vote transactions do need to be executed eventually. Even though it is
hard to make sure everyone executes every block within 400ms, on average the
majority of the cluster should be able to keep up.

## New Terminology

- `VED`: Vote Execution Domain. Vote transactions and all their dependencies
(e.g. fee payers for votes).
- `VED Bankhash`: The hash calculated after executing only vote transactions in
a block. If there are no votes, use the default hash.
- `UED`: User Execution Domain. Currently everything other than votes. We may
have more domains in the future.
- `UED Bankhash`: The hash calculated after executing only non-vote transactions
in a block. If there are no non-vote transactions, use the default hash.
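
As a rough illustration of the terminology above, the sketch below splits a
block's transactions by domain and derives one hash per domain, falling back to
a default hash when a domain is empty. Everything in it is an assumption made
for illustration: `Tx`, `domain_of`, `domain_bankhash`, the `u64` hash type and
the value `0` for the default hash are hypothetical names, the `is_vote` flag
ignores vote dependencies such as fee payers, and the sketch hashes the
transactions themselves with `DefaultHasher` as a stand-in, whereas a real
bankhash would commit to the state produced by executing them.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// The two execution domains named in the terminology above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Domain {
    Ved,
    Ued,
}

/// Hypothetical, heavily simplified transaction.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Tx {
    pub is_vote: bool,
    pub payload: Vec<u8>,
}

/// Assumption: a transaction's domain is decided by a single flag.
pub fn domain_of(tx: &Tx) -> Domain {
    if tx.is_vote {
        Domain::Ved
    } else {
        Domain::Ued
    }
}

/// Stand-in hash type; the proposal does not fix one in this section.
pub type BankHash = u64;
/// Stand-in for "the default hash".
pub const DEFAULT_HASH: BankHash = 0;

/// Hash only the transactions of `domain`, in block order.
/// Returns `DEFAULT_HASH` when the block has no transaction in that domain.
pub fn domain_bankhash(txs: &[Tx], domain: Domain) -> BankHash {
    let mut hasher = DefaultHasher::new();
    let mut seen_any = false;
    for tx in txs.iter().filter(|tx| domain_of(*tx) == domain) {
        tx.hash(&mut hasher);
        seen_any = true;
    }
    if seen_any {
        hasher.finish()
    } else {
        DEFAULT_HASH
    }
}
```

With this shape, adding or removing non-vote transactions leaves the value
returned for `Domain::Ved` unchanged, which is the property the two-hash split
relies on.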

## Detailed Design

### Allow leader to skip execution of transactions (Bankless Leader)

There is already an ongoing effort to completely skip execution of all
transactions when the leader packs new blocks. See SIMD 82, SIMD 83, and
related trackers:
https://github.com/anza-xyz/agave/issues/2502

In theory we could reap some benefit without Bankless Leader: the leader
packs blocks as normal, while other validators replay only the votes first,
then execute the other transactions later and compare the result with the
leader's bankhash. Such a setup yields a smaller speedup, but it is a possible
route during rollout.

### Separating vote transactions and dependencies into a different domain

To make sure vote transactions can be executed independently, we need to
isolate their dependencies.

#### Remove clock program's dependency on votes

Introduce a new transaction, `ClockBump`, to remove the clock program's
current dependency on votes.

A `ClockBump` transaction is sent by a leader with at least 0.5% stake
every 12 slots to correct clock drift. A small script can be used to
refund well-behaving leaders the cost of these transactions.
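
The eligibility rule above can be sketched as a small predicate. This is one
possible reading, not the proposal's definition: it assumes "every 12 slots"
means slots whose number is divisible by 12, and the names
`should_send_clock_bump`, `CLOCK_BUMP_INTERVAL_SLOTS` and
`MIN_LEADER_STAKE_BPS` are hypothetical.

```rust
/// Interval between `ClockBump` transactions, in slots (from the text above).
pub const CLOCK_BUMP_INTERVAL_SLOTS: u64 = 12;
/// Minimum leader stake in basis points of total stake (0.5% = 50 bps).
pub const MIN_LEADER_STAKE_BPS: u128 = 50;

/// Returns true if the leader of `slot` would be expected to send `ClockBump`.
/// Assumption: the schedule is aligned to slots divisible by the interval.
pub fn should_send_clock_bump(slot: u64, leader_stake: u64, total_stake: u64) -> bool {
    if total_stake == 0 {
        return false;
    }
    let on_schedule = slot % CLOCK_BUMP_INTERVAL_SLOTS == 0;
    // leader_stake / total_stake >= 50 / 10_000, kept in integer arithmetic
    // (u128 so the multiplication cannot overflow for u64 inputs).
    let enough_stake =
        (leader_stake as u128) * 10_000 >= (total_stake as u128) * MIN_LEADER_STAKE_BPS;
    on_schedule && enough_stake
}
```

Integer arithmetic is used here so that every validator evaluates the stake
threshold identically, without floating-point rounding differences.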

#### Split vote accounts into two accounts in VED and UED respectively

We need to allow users to move money in and out of the vote accounts, but
we also need the vote accounts to vote in VED. So there will be two accounts:

- `VoteTowerAccount`: tracks tower state and vote authority. It will be
in `VED`. It is updated by vote transactions and tracks vote credits.
Contributor

A flow diagram/description of how users will pay for vote fees would be useful here. A step by step from vote account creation, to funding the account, to which domains it's created/moved into, etc.

- `VoteAccount`: everything else currently in vote accounts. It will be
in `UED`. Users can move funds in and out of `VoteAccount` freely.

The two accounts are synced every Epoch when the rewards are calculated.

### Separate the VED and UED Domains

- Only the Vote or System program can read and write accounts in `VED`.
- Other programs can only read accounts in `VED`.
- Users can't directly access accounts in `VED`, but they can move accounts
from `VED` to `UED` and vice versa. Moving an account from one domain to
another takes 1 Epoch, and the migration happens at the Epoch boundary.
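
The rules above can be sketched as an access check plus a deferred domain
move. The types and names (`Program`, `Access`, `Account`, `is_allowed`,
`request_move`, `on_epoch_boundary`) are hypothetical. Two assumptions are
baked in: accounts in `UED` are left unrestricted because the rules only
constrain `VED`, and "takes 1 Epoch" is modelled as "takes effect at the next
Epoch boundary".

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Domain {
    Ved,
    Ued,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Program {
    Vote,
    System,
    Other,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Access {
    Read,
    Write,
}

/// Returns true if `program` may perform `access` on an account in `domain`.
pub fn is_allowed(program: Program, domain: Domain, access: Access) -> bool {
    match (domain, access) {
        // Only the Vote or System program may write VED accounts.
        (Domain::Ved, Access::Write) => matches!(program, Program::Vote | Program::System),
        // Every program may read VED accounts.
        (Domain::Ved, Access::Read) => true,
        // Assumption: the rules above place no restriction on UED accounts.
        (Domain::Ued, _) => true,
    }
}

/// Hypothetical account record that only tracks its domain.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Account {
    pub domain: Domain,
    pub pending_move: Option<Domain>,
}

impl Account {
    /// Record a requested move; the domain itself does not change yet.
    pub fn request_move(&mut self, target: Domain) {
        if target != self.domain {
            self.pending_move = Some(target);
        }
    }

    /// Apply any pending move; intended to be called at the Epoch boundary.
    pub fn on_epoch_boundary(&mut self) {
        if let Some(target) = self.pending_move.take() {
            self.domain = target;
        }
    }
}
```

Deferring the move to the boundary means that within an Epoch every account
belongs to exactly one domain, so vote execution never has to wait on a
user-domain transaction to learn where an account lives.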

### Enable Async Vote Executions

1. The leader will no longer execute any transactions before broadcasting
the block it packed. The block-id (Merkle tree root) still ensures that
everyone receives the same block.
2. Upon receiving a new block, the validator computes the `VED bankhash`,
then votes on this block and also reports its latest `UED bankhash` on the
same fork. The `UED bankhash` will most likely be the hash of an ancestor of
the received block.
3. A block is not considered Optimistically Confirmed or Finalized until
some percentage of the validators agree on the `UED bankhash`.
4. Add an assertion that the confirmed `UED bankhash` is not too far behind
the confirmed `VED bankhash` (currently proposed at 1/2 of an Epoch).
5. Add alerts if the `UED bankhash` differs when the `VED bankhash` is the
same. This is potentially an event worthy of a cluster restart.
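
Steps 2, 4 and 5 above can be sketched as a vote shape and two checks. All
names here are hypothetical (`Vote`, `SlotHashes`, `ued_lag_ok`,
`ued_divergences`), hashes are `u64` stand-ins, and the sketch assumes the
distance in step 4 is measured in slots and that step 5 compares reports made
for the same slot.

```rust
use std::collections::HashMap;

pub type Slot = u64;
/// Stand-in hash type for illustration only.
pub type BankHash = u64;

/// Step 2: a vote names the block voted on (with its VED bankhash) and the
/// latest block on the same fork for which the UED bankhash is known.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Vote {
    pub slot: Slot,
    pub ved_bankhash: BankHash,
    pub ued_slot: Slot,
    pub ued_bankhash: BankHash,
}

/// Step 4: the confirmed UED slot must not trail the confirmed VED slot by
/// more than half an Epoch.
pub fn ued_lag_ok(confirmed_ved_slot: Slot, confirmed_ued_slot: Slot, slots_per_epoch: u64) -> bool {
    confirmed_ved_slot.saturating_sub(confirmed_ued_slot) <= slots_per_epoch / 2
}

/// One validator's reported hashes for a single slot.
#[derive(Debug, Clone, Copy)]
pub struct SlotHashes {
    pub slot: Slot,
    pub ved_bankhash: BankHash,
    pub ued_bankhash: BankHash,
}

/// Step 5: return the slots where two reports agree on the VED bankhash but
/// disagree on the UED bankhash, in order of first detection.
pub fn ued_divergences(reports: &[SlotHashes]) -> Vec<Slot> {
    // First UED bankhash seen for each (slot, VED bankhash) pair.
    let mut first_ued: HashMap<(Slot, BankHash), BankHash> = HashMap::new();
    let mut diverged: Vec<Slot> = Vec::new();
    for r in reports {
        let first = *first_ued
            .entry((r.slot, r.ved_bankhash))
            .or_insert(r.ued_bankhash);
        if first != r.ued_bankhash && !diverged.contains(&r.slot) {
            diverged.push(r.slot);
        }
    }
    diverged
}
```

A non-empty result from `ued_divergences` corresponds to the alert condition
in step 5, and a `false` from `ued_lag_ok` corresponds to the assertion in
step 4 failing. How either event is handled is left to the proposal.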