SIMD 0054: Sysvar for active stake #56

Closed

0xNineteen

We propose to add a new sysvar that contains vote account pubkeys and their corresponding total active stake. This will enable on-chain programs to verify a validator's total active stake.

Contributor

@mvines mvines left a comment


Looks good to me.

@ripatel-jump - do you guys want to comment on this new sysvar idea?

@0xNineteen - are you also interested in attempting an implementation for this?

@0xNineteen
Author

yeah im down to attempt an impl

Contributor

@t-nelson t-nelson left a comment


as proposed, this is going to be brittle around epoch boundaries. consider a transaction that was constructed from data queried prior to an epoch boundary, but broadcast and executed immediately afterwards. we probably need either the current and previous epoch in this sysvar, or use a pda to address (with epoch as a seed) the "sysvar" and just accumulate them in accountsdb indefinitely. latter may actually be better as something like an onchain vote can be ended in one epoch, then referenced to approve some future actions
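
A minimal sketch of the first option mentioned above (keeping both the current and the previous epoch in the sysvar), so that a transaction built just before a boundary can still name the epoch it was built against. Every name here (`EpochStakeTable`, `ActiveStakeSysvar`, `table_for`, `advance`) is hypothetical and not part of the proposal, and `Pubkey` is a plain byte-array stand-in rather than the SDK type.

```rust
/// Stand-in for a 32-byte public key (assumption: no SDK dependency in this sketch).
type Pubkey = [u8; 32];

/// Hypothetical per-epoch snapshot of (vote account, active stake in lamports).
#[derive(Debug, Clone, PartialEq)]
struct EpochStakeTable {
    epoch: u64,
    entries: Vec<(Pubkey, u64)>,
}

/// Hypothetical layout that retains the previous epoch next to the current one.
struct ActiveStakeSysvar {
    current: EpochStakeTable,
    previous: Option<EpochStakeTable>,
}

impl ActiveStakeSysvar {
    /// Returns the table for `epoch` if it is still retained, otherwise `None`.
    fn table_for(&self, epoch: u64) -> Option<&EpochStakeTable> {
        if self.current.epoch == epoch {
            Some(&self.current)
        } else {
            self.previous.as_ref().filter(|t| t.epoch == epoch)
        }
    }

    /// Rolls forward at an epoch boundary: the old current table becomes the previous one.
    fn advance(&mut self, next: EpochStakeTable) {
        let old = std::mem::replace(&mut self.current, next);
        self.previous = Some(old);
    }
}
```

A program would pass the epoch its data was queried in and reject the instruction when `table_for` returns `None`. The PDA-per-epoch alternative would keep every epoch addressable instead of only the last two.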


### Ordering

The sysvar structure would be sorted by `vote_account` in ascending order
Contributor

might pay to add an additional Vec<u16> of indexes into the vote-account-address-sorted vector, which is instead sorted by stake to ease "first N% stake" lookups
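
A rough sketch of the suggested secondary index, assuming the main table is the address-sorted `Vec` of `(vote_account, stake)` pairs. The helper names `build_stake_index` and `top_stake_prefix` are made up for illustration, and `Pubkey` is a byte-array stand-in for the SDK type.

```rust
/// Stand-in for a 32-byte public key (assumption: no SDK dependency in this sketch).
type Pubkey = [u8; 32];

/// Builds indexes into `entries`, ordered by descending stake.
/// `u16` indexes are only sufficient while the table has at most 65,536 entries.
fn build_stake_index(entries: &[(Pubkey, u64)]) -> Vec<u16> {
    assert!(entries.len() <= usize::from(u16::MAX) + 1);
    let mut index: Vec<u16> = (0..entries.len()).map(|i| i as u16).collect();
    // Stable sort: entries with equal stake keep their address order.
    index.sort_by(|&a, &b| entries[b as usize].1.cmp(&entries[a as usize].1));
    index
}

/// Returns the shortest run of highest-staked vote accounts whose combined
/// stake reaches `percent` percent of the total stake in the table.
fn top_stake_prefix(entries: &[(Pubkey, u64)], index: &[u16], percent: u64) -> Vec<Pubkey> {
    let total: u128 = entries.iter().map(|e| u128::from(e.1)).sum();
    let target = total * u128::from(percent); // compared against acc * 100
    let mut acc: u128 = 0;
    let mut out = Vec::new();
    for &i in index {
        if acc * 100 >= target {
            break;
        }
        let (key, stake) = entries[i as usize];
        acc += u128::from(stake);
        out.push(key);
    }
    out
}
```

With the index precomputed once per epoch, a "first N% stake" query walks only as many entries as it needs rather than sorting the table on-chain.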


## Detailed Design

- sysvar structure: `Vec<(vote_account: Pubkey, active_stake_in_lamports: u64)>`
Contributor

i would suggest an additional field

  • epoch_stake: u64 -- total active stake for the epoch

Comment on lines +79 to +81
Implementing the proposed sysvar will enable new types of programs which are
not possible now,
improving Solana's ecosystem.
Contributor

not a particularly valuable statement. elaborate with examples

modified on a per-epoch basis, validators will only need to update this
account on epoch boundaries.

We would also need a new feature gate to activate this sysvar.
Contributor

move to security section. broken consensus -> loss of availability -> security

Comment on lines +21 to +22
it needs to
pass in all of the stake accounts which have delegated to it. This is
Contributor

nit: there are a few truncated line breaks like this throughout... would be good to clean them up

Contributor

@ripatel-fd ripatel-fd left a comment


The motivation section of this proposal misses some important use cases.

A more complete list of use cases for the active stake table is as follows:

  1. Querying the stake of individual validators from an on-chain program (this is mentioned by the protocol).
  2. Exporting epoch stake weights to an RPC client
  3. Proving epoch stake weights to an off-chain user (light clients)
  4. Deriving the leader schedule using this information

A sysvar-based mechanism is not an efficient or robust way to enable either of these 3.

Problem 1: Length restriction

The proposal limits the number of records stored in the new sysvar to 4096.
It is currently trivial to allocate more than 4096 vote accounts, and in the future is within the grasp of a well-funded attacker.

This means that this mechanism might start to return incorrect data in the future, by omitting any vote accounts past the first 4096. This is mildly annoying in an on-chain context (1), or even dangerous in consensus-critical applications like bridges (such as 3).

Problem 2: Iterating all vote accounts on-chain is impractical

The computational capability of on-chain programs is quite limited. As the Solana network grows, and the number of validators increases past a thousand, it becomes impractical to iterate through all vote accounts in a single transaction execution.

It doesn't seem like there is any practical reason for doing so anyways. Therefore, there is also no practical reason for why we should have to load a list of all vote accounts into virtual machine memory.

Problem 3: Memory copies

In order to make this account accessible from the runtime, the sysvar will have to be copied into the virtual machine input segment, which is currently a bottleneck for transaction processing throughput. The Solana Labs virtual machine implementation is working on various mechanisms to allow host memory to be mapped directly into virtual machine memory. However, these mechanisms still require expensive operating system syscalls (mmap()), or exotic use of hardware hypervisor functionality, and put pressure on the TLB.

Instead of relying on future VM optimization, it is better to just map the subset of vote account records into VM memory that is needed (refer to problem 2).

Problem 4: Query complexity

Storing a flat array does not allow efficient queries of stake by pubkey. Finding an entry requires a linear walk. As established in problem 2, it is not safe to assume that such linear walks are practical.

Therefore, the only way to query a specific pubkey is to locate the index of such a record off-chain, and then provide this index via transaction instruction data.

Considering instruction data is unauthenticated, this requires a sanity check whether the requested index actually holds the vote account address that the program is looking for.

This is more complex off-chain and on-chain than alternative solutions.
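
To make the sanity check described in problem 4 concrete, here is a minimal sketch. It assumes the table is a slice of `(vote_account, stake)` pairs and that the index arrives via untrusted instruction data. `stake_at_claimed_index` and `LookupError` are hypothetical names, and `Pubkey` is a byte-array stand-in for the SDK type.

```rust
/// Stand-in for a 32-byte public key (assumption: no SDK dependency in this sketch).
type Pubkey = [u8; 32];

#[derive(Debug, PartialEq)]
enum LookupError {
    /// The claimed index points past the end of the table.
    IndexOutOfBounds,
    /// The entry at the claimed index belongs to a different vote account.
    KeyMismatch,
}

/// Reads the stake at an index supplied by the caller, and only trusts it
/// after confirming the entry really is for `expected`.
fn stake_at_claimed_index(
    entries: &[(Pubkey, u64)],
    claimed_index: usize,
    expected: &Pubkey,
) -> Result<u64, LookupError> {
    let (key, stake) = entries
        .get(claimed_index)
        .ok_or(LookupError::IndexOutOfBounds)?;
    if key != expected {
        return Err(LookupError::KeyMismatch);
    }
    Ok(*stake)
}
```

The on-chain cost is constant, but the client has to find the index off-chain first, which is the extra complexity the comment refers to.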

Problem 5: Deriving the leader schedule is impractical

Deriving the leader schedule requires a mapping of (node identity) => (active stake). However, this sysvar only mentions vote account addresses, not node identities. Resolving those would require lookups in the respective vote accounts.

For RPC clients, it would be impractical to fetch the node identity of every vote account.
The same is true for light clients, especially when running in resource-constrained environments like ZKP VMs.
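
As an illustration of the mapping problem 5 refers to, the sketch below aggregates stake per node identity. It assumes each record already carries the node identity, which the proposed sysvar does not provide, so in practice each one would have to be read from the corresponding vote account. `VoteRecord` and `stake_by_node_identity` are hypothetical names, and `Pubkey` is a byte-array stand-in for the SDK type.

```rust
use std::collections::BTreeMap;

/// Stand-in for a 32-byte public key (assumption: no SDK dependency in this sketch).
type Pubkey = [u8; 32];

/// Hypothetical record joining a vote account with the node identity stored in it.
struct VoteRecord {
    vote_account: Pubkey,
    node_identity: Pubkey,
    active_stake: u64,
}

/// Sums active stake per node identity. Several vote accounts may point at
/// the same identity, so their stake is accumulated under one key.
fn stake_by_node_identity(records: &[VoteRecord]) -> BTreeMap<Pubkey, u64> {
    let mut out = BTreeMap::new();
    for r in records {
        let slot = out.entry(r.node_identity).or_insert(0u64);
        *slot = slot.saturating_add(r.active_stake);
    }
    out
}
```

Only this aggregation step is shown; turning the resulting map into an actual leader schedule is outside the scope of the sketch.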

Problem 6: Unnecessary complexity in epoch boundary

The Solana runtime would have to implicitly update the new sysvar(s) as part of the epoch boundary. The epoch boundary already adds EpochStakes as implicit state to the bank, so doing it twice seems like unnecessary complexity. Why not offer an interface that provides access to the bank's existing EpochStakes data structure?


Have you considered adding RPC calls and syscalls that permit querying specific EpochStakes instead? Improving light client accessibility of implicit runtime state is best done via deterministic content-addressable data structures.

Comment on lines +59 to +65
We also need to consider a maximum data size for the sysvar.
Currently, there are 3422 vote accounts on mainnet (1818 active and 1604 delinquent),
so we can use a maximum limit of 4096 entries and still include
all the vote accounts for now.
Using 4096 as the max number of entries the size would be (8 + 40 * 4096) =
163,848 bytes. Once the number of entries exceeds the max allowed,
vote accounts with the least amount of stake will be removed from the sysvar.
Member

im very suspicious of this coming back to bite us in the future. there arent 4096 validators today, but there could be someday, and it puts us in the unfortunate position of needing to decide whether to keep increasing the size of the sysvar, or make smaller validators second-class citizens. i especially dont like that this creates a disincentive to decentralization. this isnt a theoretical concern because (at least according to our marketing materials...) we already have over 2000 active voting nodes

Member

im very suspicious of this coming back to bite us in the future. there arent 4096 validators today, but there could be someday, and it puts us in the unfortunate position of needing to decide whether to keep increasing the size of the sysvar, or make smaller validators second-class citizens. i especially dont like that this creates a disincentive to decentralization. this isnt a theoretical concern because (at least according to our marketing materials...) we already have over 2000 active voting nodes

I agree, I think the size of the sysvar should be much higher (or have no limit, if that doesn't introduce other problems) so that it can store all the vote accounts, for the following reasons:

  • If we truncate the sysvar to the top stakers, in the case that one of the voters becomes delinquent for a certain slot, we won't be able to verify consensus since the sysvar would update every epoch.
  • Account size isn't a concern here because before we reach that limit there would be other bottlenecks in the system as a consequence of having so many nodes in the network.

@2501babe
Member

2501babe commented Jul 22, 2023

im a fan of the idea in spirit but i wonder about the best way to implement it... because the stakes cache already has this information computed in a form thats easy to look up, i wonder if it could be made queryable? an rpc call would (presumably, i havent worked in this part of the code much) be straightforward, but maybe theres a way to have a pseudo-program to retrieve the stake amount for a specific vote account in the manner of get_minimum_delegation? not sure how complicated this would be

@michaelh-laine

The motivation section of this proposal misses some important use cases.

A more complete list of use cases for the active stake table is as follows:

  1. Querying the stake of individual validators from an on-chain program (this is mentioned by the protocol).
  2. Exporting epoch stake weights to an RPC client
  3. Proving epoch stake weights to an off-chain user (light clients)
  4. Deriving the leader schedule using this information

Another important use case that a large section of the validator community is interested in is the ability to enable stake-weighted on-chain voting for a DAO for governance. Mentioning it for completeness' sake.

@buffalu

buffalu commented Jul 31, 2023

we would also like this information available on-chain for something

@mvines
Contributor

mvines commented Jul 31, 2023

Yep this SIMD is about making that information available on chain. Any RPC endpoint discussion can be secondary, or moved elsewhere

@jacobcreech added the standard and core labels on Aug 16, 2023

### Changes Required

Stake weight information should already be available on full node clients
Contributor

Does this sysvar contain stakes before or after reward calculation?

@michaelh-laine

This SIMD seems to have gone somewhat stale, what do we need to push this through? This is important to being able to conduct on-chain stake weighted governance voting by validators

@anoushk1234
Member

This SIMD seems to have gone somewhat stale, what do we need to push this through? This is important to being able to conduct on-chain stake weighted governance voting by validators

@t-nelson any thoughts on this? Since we spoke at the core dev mixer I think the concerns about space were addressed.

@ripatel-fd
Contributor

@michaelh-laine The problem with this proposal is that it's quite inefficient (it adds a large account). It also loses correctness if the number of vote accounts increases in the future. I suggest using account compression here (Solana's silly name for "hash tree"), which is a common solution for making unbounded data accessible on-chain.

@anoushk1234
Member

@michaelh-laine The problem with this proposal is that it's quite inefficient (it adds a large account). It also loses correctness if the number of vote accounts increases in the future. I suggest using account compression here (Solana's silly name for "hash tree"), which is a common solution for making unbounded data accessible on-chain.

@ripatel-fd even with 10k vote accounts the size should be ~400 kB.
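
The size figures quoted in this thread can be checked with a small helper. This assumes the `(8 + 40 * N)` formula from the proposal, reading the 8 bytes as a length prefix and the 40 bytes as a 32-byte pubkey plus a `u64` stake. That reading is an assumption, and the constant and function names are illustrative only.

```rust
/// Assumed length prefix of the serialized vector, in bytes.
const LEN_PREFIX_BYTES: usize = 8;
/// Assumed size of one entry: 32-byte vote account address + 8-byte stake.
const ENTRY_BYTES: usize = 32 + 8;

/// Serialized size of the sysvar for a given number of entries,
/// following the proposal's `8 + 40 * N` formula.
const fn sysvar_size_bytes(entries: usize) -> usize {
    LEN_PREFIX_BYTES + ENTRY_BYTES * entries
}
```

Under this formula, 4096 entries come to 163,848 bytes and 10,000 entries to 400,008 bytes, consistent with the ~400 kB estimate above.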

@0xSol

0xSol commented Mar 28, 2024

@riptl Given account compression being deprioritized short term, and per @anoushk1234, "with 10k vote accounts the size should be ~400 kB", does it make sense to move this SIMD forward to enable on-chain stake weighted governance voting as @michaelh-laine mentioned above?

@0xNineteen still interested in attempting an implementation?

@buffalojoec
Contributor

@riptl Given account compression being deprioritized short term, and per @anoushk1234, "with 10k vote accounts the size should be ~400 kB", does it make sense to move this SIMD forward to enable on-chain stake weighted governance voting as @michaelh-laine mentioned above?

@0xNineteen still interested in attempting an implementation?

Maybe consider #133 as an alternative?

@michaelh-laine

@riptl Given account compression being deprioritized short term, and per @anoushk1234, "with 10k vote accounts the size should be ~400 kB", does it make sense to move this SIMD forward to enable on-chain stake weighted governance voting as @michaelh-laine mentioned above?
@0xNineteen still interested in attempting an implementation?

Maybe consider #133 as an alternative?

What’s the difference?

@buffalojoec
Contributor

What’s the difference?

@michaelh-laine Same query is available to BPF programs via syscall, but the data is not on-chain via sysvar account.

@0xNineteen
Author

to comment my thoughts: this simd was initially thought to be a simple one, but there was a lot of feedback that the sysvar approach wasnt the right way to go - i havent been too involved in the discussions lately but it seems like @buffalojoec's SIMD #133 will be the better way forward

@0xSol

0xSol commented Mar 29, 2024

Thanks @0xNineteen for the update.

@buffalojoec SIMD-0133 could supersede this for the on-chain stake weights use case, unless hearing otherwise from the community.

@jacobcreech
Contributor

@0xSol I don't think this SIMD needs to be superseded as it never was accepted. We should be good to just close this SIMD.

@0xSol

0xSol commented Mar 29, 2024

Close per above comments.
