SIMD-0165: Async Vote Execution #165
Open: wen-coding wants to merge 24 commits into solana-foundation:main from wen-coding:wen-async-vote-execution
Changes from 3 commits
8bce07a Initial draft of Async Vote Execution SIMD. (wen-coding)
74f3a33 Change SIMD number. (wen-coding)
4b50153 Make linter happy. (wen-coding)
a8aad6c Make linter happy. (wen-coding)
55fae1d Update the plan for clock calculation. (wen-coding)
ca682d7 Update the proposal to reflect optimistic vote execution plan we disc… (wen-coding)
93b2506 Update dash to asterisk. (wen-coding)
a78f256 Rename VED and UED hash to Ephemeral and Final hash. (wen-coding)
67a1f45 Change title to reflect that we calculate Ephemeral hash. (wen-coding)
e38b645 Explain the checks we will perform during ephemeral hash computation. (wen-coding)
7dd9801 Add new fields in TowerSync. (wen-coding)
60a66af Update that which block (slot, hash) on the vote transaction is. (wen-coding)
5432353 Clarify that new votes should be sent out when either hash changes. (wen-coding)
40d4be6 Propose to halt and exit if >1/3 disagrees on final bankhash. (wen-coding)
1b3ed81 Change status to Idea. (wen-coding)
4e83392 Update calculation of vote only hash and others. (wen-coding)
35ec617 Fix some small problems. (wen-coding)
93372ea Address review comments. (wen-coding)
7daaf7a Specify the hash function used. (wen-coding)
b7fc403 Clarify the set root error. (wen-coding)
f4278f4 Set root will not cause the block to be marked dead. (wen-coding)
7b3de13 Add the user visible changes section. (wen-coding)
8ac7180 Clarify we only use vote only hash in fork selection. (wen-coding)
437d66d EpochSlots should be updated when banks are vote only frozen. (wen-coding)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,120 @@ | ||
--- | ||
simd: '0165' | ||
title: Async Vote Execution | ||
authors: | ||
- Wen Xu | ||
category: Standard | ||
type: Core | ||
status: Draft | ||
created: 2024-08-11T00:00:00.000Z | ||
feature: null | ||
supersedes: null | ||
superseded-by: null | ||
extends: null | ||
--- | ||
|
||
## Summary | ||
|
||
Separate the execution of vote and non-vote transactions in each block. The | ||
vote transactions will be verified and executed first, then the non-vote | ||
transactions will be executed later asynchronously to finalize the block. | ||
|
||
## Motivation | ||
|
||
Currently, vote and non-vote transactions are mixed together in a block, and | ||
the vote transactions are only processed in consensus once the whole block | ||
has been frozen and all transactions in the block have been verified and | ||
executed. This is a problem because slow-running non-vote transactions may | ||
delay vote processing and, in turn, impair the ability of consensus to pick | ||
the correct fork. | ||
|
||
With different hardware and runtime environments, there will always be some | ||
difference in transaction execution speed between validators. Generally | ||
speaking, because vote transactions are so simple, the variation in vote | ||
execution time should be smaller than that in non-vote execution time. | ||
Therefore, if we only execute vote transactions in a block before voting on | ||
the block, validators are more likely to reach consensus faster. | ||
|
||
Even with async vote execution, forks can still happen because of | ||
various other situations, like network partitions or misconfigured validators. | ||
This work just reduces the chance of forks caused by variance in non-vote | ||
transaction execution. | ||
|
||
The non-vote transactions do need to be executed eventually. Even though it's | ||
hard to make sure everyone executes every block within 400ms, on average the | ||
majority of the cluster should be able to keep up. | ||
|
||
## New Terminology | ||
|
||
- `VED`: Vote Execution Domain, vote transactions and all their dependencies | ||
(e.g. fee payers for votes). | ||
- `VED Bankhash`: The hash calculated after executing only vote transactions in | ||
a block. If there are no votes, use the default hash. | ||
- `UED`: User Execution Domain, currently everything other than votes. We may | ||
have more domains in the future. | ||
- `UED Bankhash`: The hash calculated after executing only non-vote transactions | ||
in a block. If there are no non-vote transactions, use the default hash. | ||
|
||
## Detailed Design | ||
|
||
### Allow leader to skip execution of transactions (Bankless Leader) | ||
|
||
There is already an ongoing effort to skip execution of all transactions | ||
entirely when the leader packs new blocks. See SIMD 82, SIMD 83, and related trackers: | ||
https://github.com/anza-xyz/agave/issues/2502 | ||
|
||
Theoretically we could reap some benefit without Bankless Leader: the leader | ||
packs as normal, while other validators replay only votes first, then later | ||
execute the other transactions and compare the result with the leader's | ||
bankhash. Such a setup yields a smaller speedup for little benefit, though it | ||
is a possible route during rollout. | ||
|
||
### Separating vote transactions and dependencies into a different domain | ||
|
||
To make sure vote transactions can be executed independently, we need to | ||
isolate their dependencies. | ||
|
||
#### Remove clock program's dependency on votes | ||
|
||
Introduce a new transaction, `ClockBump`, to remove the current clock program's | ||
dependency on votes. | ||
|
||
The transaction `ClockBump` is sent by a leader with at least 0.5% stake | ||
every 12 slots to correct the clock drift. A small script can be used to | ||
refund well-behaving leaders the cost of the transactions. | ||
|
||
#### Split vote accounts into two accounts in VED and UED respectively | ||
|
||
We need to allow users to move money in and out of the vote accounts, but | ||
we also need the vote accounts to vote in VED. So there will be two accounts: | ||
|
||
- `VoteTowerAccount`: tracks tower state and vote authority. It will be | ||
in `VED`, is updated by vote transactions, and tracks vote credits. | ||
Review comment: A flow diagram/description of how users will pay for vote fees would be useful here. A step by step from vote account creation, to funding the account, to which domains it's created/moved into, etc. |
||
- `VoteAccount`: everything else currently in vote accounts. It will be | ||
in `UED`, and users can move funds in and out of `VoteAccount` freely. | ||
|
||
The two accounts are synced every Epoch when the rewards are calculated. | ||
|
||
### Separate the VED and UED Domains | ||
|
||
- Only the Vote or System program can read and write accounts in `VED`. | ||
- Other programs can only read accounts in `VED`. | ||
- Users can't directly access accounts in `VED`, but they can move accounts | ||
from `VED` to `UED` and vice versa. Moving accounts from one domain to | ||
another takes 1 Epoch, and the migration happens at the Epoch boundary. | ||
|
||
### Enable Async Vote Executions | ||
|
||
1. The leader will no longer execute any transactions before broadcasting | ||
the block it packed. We do have block-id (Merkle tree root) to ensure | ||
everyone receives the same block. | ||
2. Upon receiving a new block, the validator computes the `VED bankhash`, | ||
then votes on this block and also reports its latest `UED bankhash` on the | ||
same fork. The `UED bankhash` will most likely be the hash of an ancestor of | ||
the received block. | ||
3. A block is not considered Optimistically Confirmed or Finalized until | ||
some percentage of the validators agree on the `UED bankhash`. | ||
4. Add an assertion that the confirmed `UED bankhash` is not too far from the | ||
confirmed `VED bankhash` (currently proposed at 1/2 of the Epoch). | ||
5. Add alerts if the `UED bankhash` differs when the `VED bankhash` is the same. | ||
This is potentially an event worthy of a cluster restart. |
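
Steps 2, 4, and 5 above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the proposed implementation: the `Vote` fields, the function names, and the use of plain strings for hashes are all hypothetical, and `SLOTS_PER_EPOCH = 432_000` assumes the mainnet default epoch length.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumption: mainnet default epoch length; the SIMD only says "1/2 of the Epoch".
SLOTS_PER_EPOCH = 432_000
MAX_UED_LAG_SLOTS = SLOTS_PER_EPOCH // 2


@dataclass(frozen=True)
class Vote:
    """Hypothetical vote payload; field names are illustrative only."""
    slot: int          # slot of the block being voted on
    ved_bankhash: str  # hash after executing only the vote transactions in `slot`
    ued_slot: int      # latest slot on this fork with non-vote transactions executed
    ued_bankhash: str  # hash after executing non-vote transactions in `ued_slot`


def make_vote(slot, ved_bankhash, ued_slot, ued_bankhash):
    """Step 2: vote on `slot` and report the latest UED bankhash on the same fork."""
    if ued_slot > slot:
        raise ValueError("UED bankhash must come from the voted block or an ancestor")
    return Vote(slot, ved_bankhash, ued_slot, ued_bankhash)


def ued_lag_ok(confirmed_ved_slot, confirmed_ued_slot, max_lag=MAX_UED_LAG_SLOTS):
    """Step 4: the confirmed UED slot must not trail the confirmed VED slot too far."""
    return confirmed_ved_slot - confirmed_ued_slot <= max_lag


def ued_mismatch_slots(reports):
    """Step 5: slots where validators agree on the VED hash but not the UED hash.

    `reports` is an iterable of (slot, ved_bankhash, ued_bankhash) tuples,
    one per validator per slot.
    """
    ued_hashes = defaultdict(set)
    for slot, ved, ued in reports:
        ued_hashes[(slot, ved)].add(ued)
    return sorted({slot for (slot, _), ueds in ued_hashes.items() if len(ueds) > 1})
```

A caller might raise an alert for every slot returned by `ued_mismatch_slots` and halt progress when `ued_lag_ok` returns `False`; how the cluster should actually react is left to the proposal.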
nit: votes are processed per entry batch, doesn't have to wait for the bank to be frozen. as a result we still partially process votes even if the block later ends up dead.
e.g. agave's impl https://github.com/anza-xyz/agave/blob/1f06fbdbe3b72f330f50ad93a15c1116f5021392/ledger/src/blockstore_processor.rs#L177
Yeah, now I think about it, it's less about processing votes as fast as possible, more about a lot of things don't happen until the block is frozen, changes this part.