SIMD 0072: Feature Gate Threshold Automation #72

Open
wants to merge 30 commits into
base: main
013d0e7
feature-gate-threshold-automation init
buffalojoec Oct 12, 2023
9b601c1
bump to 0072
buffalojoec Oct 19, 2023
24bc2c5
address minor corrections
buffalojoec Oct 19, 2023
fa1f218
revise based on related SIMDs
buffalojoec Dec 15, 2023
4a65303
add garbage collection process
buffalojoec Dec 15, 2023
a7d6a49
add PDA details
buffalojoec Dec 15, 2023
843c83e
revise signal cadence
buffalojoec Dec 15, 2023
e513e82
add instruction layout example
buffalojoec Dec 15, 2023
d705e54
add state layout examples
buffalojoec Dec 15, 2023
a63abbb
add multi-sig activation gate
buffalojoec Dec 15, 2023
7a6c3fc
add additional elaboration
buffalojoec Jan 25, 2024
23f1b24
init new version of 0072
buffalojoec Feb 5, 2024
f830338
add reusable support signal PDA
buffalojoec Mar 14, 2024
2fb901b
update to use validator epoch stake syscall
buffalojoec Mar 14, 2024
ab95740
add updates from SIMD 0133
buffalojoec May 9, 2024
1a8f21a
add some context and wording
buffalojoec May 13, 2024
ea129ba
add c struct for PDA layout
buffalojoec May 13, 2024
bb1c797
hard-coded threshold
buffalojoec May 16, 2024
b85755d
change deadline to 4500 slots
buffalojoec May 16, 2024
66e6371
clarity suggestions
buffalojoec May 22, 2024
58cb436
add more program specification
buffalojoec May 22, 2024
da132f7
one PDA per epoch
buffalojoec Jun 3, 2024
4de5abd
remove slots remaining constraint
buffalojoec Jun 3, 2024
072344f
Update proposals/0072-feature-gate-threshold-automation.md
buffalojoec Jun 3, 2024
9296061
instruction clarity
buffalojoec Jun 3, 2024
458b3ac
update `StageFeatureForActivation` instruction
buffalojoec Aug 14, 2024
d9f399d
update `SignalSupportForStagedFeatures` instruction
buffalojoec Aug 14, 2024
b87cadb
update runtime step
buffalojoec Aug 14, 2024
6e74a91
update state init requirement
buffalojoec Dec 9, 2024
4591f55
mark as idea
buffalojoec Dec 9, 2024
335 changes: 335 additions & 0 deletions proposals/0072-feature-gate-threshold-automation.md
@@ -0,0 +1,335 @@
---
simd: '0072'
title: Feature Gate Threshold Automation
authors:
- Tyera Eulberg
- Joe Caulfield
category: Standard
type: Core
status: Idea
created: 2024-01-25
feature: (fill in with feature tracking issues once accepted)
---

## Summary

This SIMD outlines a proposal for automating the feature activation process
based on a stake-weighted support threshold, rather than manual human action.

With this new process, contributors no longer have to assess stake support for a
feature before activation. Instead, the assessment is done by the runtime.

## Motivation

Feature gates wrap new cluster functionality, and typically change the rules of
consensus. As such, a feature gate needs to be supported by a strong majority
of cluster stake when it is activated, or else it risks partitioning the
network. The current feature-gate process involves two steps:

1. An individual key-holder stages a feature gate for activation
2. The runtime automatically activates the feature on the next epoch boundary

The key-holder is the one who *manually* (with the help of the solana-cli)
assesses the amount of stake that recognizes the feature and decides whether
it is safe to activate. This is obviously brittle and subject to human error.

If the runtime were instead to assess stake support for activating a
feature, this would eliminate the key-holder's responsibility to assess stake
support, reducing the risk of human error.

In a world where multiple clients will aim to push features and seek to agree on
their activation, a more automated and secure process will be extremely
beneficial.

## New Terminology

- **Feature Gate program:** The Core BPF program introduced in
[SIMD 0089](./0089-programify-feature-gate-program.md)
that will own all feature accounts.
- **Staged Features PDA:** A PDA under the Feature Gate program used to track
features submitted for activation per epoch.
- **Validator Support Signal PDA:** A PDA under the Feature Gate program used to
track a validator's support signal bitmask, which signals which features they
support.
- **Get Epoch Stake Syscall:** The new syscall introduced in
[SIMD 0133](./0133-syscall-get-epoch-stake.md)
that returns the current epoch stake for a given vote account address.

## Detailed Design

This proposal outlines a new feature activation process. The new process
includes changes to the runtime's feature activation process as well as the
Core BPF Feature Gate program proposed in
[SIMD 0089](./0089-programify-feature-gate-program.md).

The new process will utilize the Feature Gate program to enable the runtime to
activate staged features that meet the necessary stake support while preventing
the activation of those that do not.

Two new instructions and two new types of PDA will be added to the Feature Gate
program. They are detailed in this proposal.

The new process consists of the following steps:

1. **Feature Creation:** Contributors create feature accounts as they do now.
2. **Staging Features for Activation:** In some epoch `N-1`, a multi-signature
authority stages for activation some or all of the features created in step
1, to be activated at the end of the *next epoch* (epoch `N`).
3. **Signaling Support for Staged Features:** During the next epoch (epoch `N`),
validators signal which of the staged feature-gates they support in their
software.
4. **Feature Activation:** At the end of epoch `N`, the runtime activates the
feature-gates that have the required stake support.

### Step 1: Feature Creation

The first step is creation of a feature account, done by submitting a
transaction containing System instructions to fund, allocate, and assign the
feature account to `Feature111111111111111111111111111111111111`.

This step is unchanged from its original procedure.

### Step 2: Staging Features for Activation

A multi-signature authority, comprised of key-holders from Anza and
possibly other validator client teams in the future, will have the authority to
stage created features for activation.
In the future, this authority could be replaced by validator governance.

The multi-signature authority stages a feature for activation by invoking a new
Feature Gate program instruction: `StageFeatureForActivation`. This instruction
expects the Staged Features PDA (defined below) to either exist or be allocated
with sufficient space and owned by the Feature Gate program, in order to
initialize state.

The `StageFeatureForActivation` instruction is structured as follows:

- Data: None
- Accounts:
  - Feature account
  - Staged Features PDA: writable
  - Multi-signature authority: signer

Note that features can only be staged in the epoch prior to the target
activation epoch. This means a feature staged by invoking
`StageFeatureForActivation` in epoch `N-1` is scheduled for activation (through
the process below) at the end of epoch `N`. The program enforces this by
reading the current epoch from the `Clock` sysvar.

When a feature is staged for activation, initial stake support is zero.

#### Staged Features PDA State

A Staged Features PDA will be created for each epoch in which features are
staged to be activated. If no features are staged for a given epoch, that
epoch's corresponding Staged Features PDA will never be initialized and thus
will not exist.

These PDAs will not be garbage collected and can be referenced for historical
purposes.

The address of the Staged Features PDA for a given epoch is derived as follows,
where `epoch` is a `u64` serialized to eight little-endian bytes:

```
"staged_features" + <epoch>
```
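
As a rough illustration of the seed layout above, the sketch below builds the
seed bytes for a given epoch. The helper name `staged_features_seed` and the
choice to concatenate the prefix and epoch into one buffer are assumptions made
for this example; the actual PDA derivation (hashing with the program ID and a
bump seed) is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STAGED_FEATURES_PREFIX "staged_features"
#define STAGED_FEATURES_PREFIX_LEN (sizeof(STAGED_FEATURES_PREFIX) - 1)
#define STAGED_FEATURES_SEED_LEN (STAGED_FEATURES_PREFIX_LEN + 8)

/*
 * Hypothetical helper: writes "staged_features" followed by `epoch` as eight
 * little-endian bytes into `out`. Returns the number of bytes written.
 */
static size_t staged_features_seed(uint64_t epoch,
                                   uint8_t out[STAGED_FEATURES_SEED_LEN]) {
    assert(out != NULL);
    memcpy(out, STAGED_FEATURES_PREFIX, STAGED_FEATURES_PREFIX_LEN);
    for (size_t i = 0; i < 8; i++) {
        /* Least significant byte first. */
        out[STAGED_FEATURES_PREFIX_LEN + i] = (uint8_t)(epoch >> (8 * i));
    }
    return STAGED_FEATURES_SEED_LEN;
}
```

A client and the on-chain program would both need to produce identical seed
bytes for the same epoch, which is why the byte order is fixed.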

The data for the Staged Features PDA will be structured as follows:

```c
#include <stdint.h>

#define FEATURE_ID_SIZE 32
#define MAX_FEATURES 8

/**
 * A Feature ID and its corresponding stake support, as signalled by validators.
 */
typedef struct {
    /** Feature identifier (32 bytes for public key). */
    uint8_t feature_id[FEATURE_ID_SIZE];
    /** Stake support (u64 serialized to little-endian). */
    uint8_t stake_support[8];
} FeatureStake;

/**
 * Staged features for activation.
 */
typedef struct {
    /**
     * Features staged for activation at the end of the current epoch, with
     * their corresponding signalled stake support.
     */
    FeatureStake features[MAX_FEATURES];
} StagedFeatures;
```

As depicted in the above layout, a maximum of 8 features can be staged for a
given epoch.
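
The layout above implies a fixed account size of 320 bytes (8 entries of 40
bytes each). The following sketch checks that at compile time and shows one way
to read and write the little-endian stake field. The helpers `read_u64_le`,
`write_u64_le`, and `first_free_slot` are hypothetical, and treating an all-zero
`feature_id` as an empty slot is an assumption; the proposal does not say how
unused slots are represented.

```c
#include <assert.h>
#include <stdint.h>

#define FEATURE_ID_SIZE 32
#define MAX_FEATURES 8

typedef struct {
    uint8_t feature_id[FEATURE_ID_SIZE];
    uint8_t stake_support[8];
} FeatureStake;

typedef struct {
    FeatureStake features[MAX_FEATURES];
} StagedFeatures;

/* Byte arrays have alignment 1, so no padding is expected. */
static_assert(sizeof(FeatureStake) == 40, "32-byte ID + 8-byte stake");
static_assert(sizeof(StagedFeatures) == 320, "8 entries of 40 bytes");

/* Decodes a u64 stored as eight little-endian bytes. */
static uint64_t read_u64_le(const uint8_t b[8]) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) v = (v << 8) | b[i];
    return v;
}

/* Encodes a u64 as eight little-endian bytes. */
static void write_u64_le(uint8_t b[8], uint64_t v) {
    for (int i = 0; i < 8; i++) b[i] = (uint8_t)(v >> (8 * i));
}

/*
 * ASSUMPTION: an all-zero feature_id marks an unused slot.
 * Returns the index of the first unused slot, or -1 if all 8 are taken.
 */
static int first_free_slot(const StagedFeatures *s) {
    for (int i = 0; i < MAX_FEATURES; i++) {
        int used = 0;
        for (int j = 0; j < FEATURE_ID_SIZE; j++) {
            used |= s->features[i].feature_id[j];
        }
        if (!used) return i;
    }
    return -1;
}
```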

### Step 3: Signaling Support for Staged Features

With an on-chain reference point to determine the features staged for activation
for a particular epoch, nodes will signal their support for the staged features
supported by their software.

A node signals its support for staged features by invoking another new Feature
Gate program instruction: `SignalSupportForStagedFeatures`. This instruction
expects the Validator Support Signal PDA (defined below) to either exist or be
allocated with sufficient space and owned by the Feature Gate program, in order
to initialize state.

The `SignalSupportForStagedFeatures` instruction is structured as follows:

- Data: A `u8` bit mask of the staged features.

> Review comment: Does this also need to include the epoch? As the list of staged features changes epoch-to-epoch, we need a way of ensuring that the bit mask in each validator support signal transaction is interpreted for the correct epoch. If the transaction is delayed or is executed in a different epoch than the one it was sent, for whatever reason, then the support signal will be misinterpreted.
>
> Review comment: I would guess the Staged Features PDA supplied should indicate the intended epoch?
>
> Contributor Author: It's in the PDA for the validator's support signal. When they send a bitmask, it gets recorded in their signal PDA, and if they send again, it's overwritten. The current epoch is checked via Clock sysvar when the program is invoked with this instruction.

- Accounts:
  - Staged Features PDA: writable
  - Validator Support Signal PDA: writable
  - Vote account
  - Authorized voter: signer

The authorized voter signer must match the authorized voter stored in the vote
account's state.

The `SignalSupportForStagedFeatures` instruction processor will provide the
vote account's address to the `GetEpochStake` syscall to retrieve the stake
delegated to that vote account for the epoch. Then, using the `1` values
provided in the bitmask (defined below), the processor will add this stake value
to each corresponding feature ID's stake support in the Staged Features PDA.

A node's stake support for features is always accounted for using their most
recently sent bitmask. Each time a node sends a bitmask, if a previous bitmask
was already sent for that node, their previous bitmask is first used to deduct
stake support before adding stake support for the features signalled by the new
bitmask.
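
The deduct-then-add accounting described above can be sketched as follows. This
is a simplified model, not the program's implementation: stake support is held
in a plain `uint64_t` array rather than the serialized PDA, the names
`signals_feature` and `apply_signal` are hypothetical, and the sketch assumes
the validator's stake is the same for the previous and the new signal within an
epoch. Bit order follows the example in the Signal Bitmask section (index 0 is
the most significant bit).

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FEATURES 8

/* Returns 1 if `mask` signals support for the feature at `index`. */
static int signals_feature(uint8_t mask, int index) {
    return (mask >> (MAX_FEATURES - 1 - index)) & 1;
}

/*
 * Replaces a validator's previous signal with a new one.
 * Pass prev_mask = 0 when the validator has not signalled yet this epoch.
 */
static void apply_signal(uint64_t support[MAX_FEATURES], uint64_t stake,
                         uint8_t prev_mask, uint8_t new_mask) {
    for (int i = 0; i < MAX_FEATURES; i++) {
        if (signals_feature(prev_mask, i)) {
            /* The earlier signal must have been counted already. */
            assert(support[i] >= stake);
            support[i] -= stake;
        }
        if (signals_feature(new_mask, i)) {
            support[i] += stake;
        }
    }
}
```

Because the previous mask is always removed first, resending the same bitmask
leaves the tallies unchanged.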

Similar to the `StageFeatureForActivation` instruction, the Clock sysvar will be
used to ensure the Staged Features PDA corresponding to the *current* epoch `N` was
provided.

If a node does not send this instruction successfully during the current epoch,
their stake is not tallied. This is analogous to a node signalling support for
zero features.

#### Signal Bitmask

A bit mask is used as a compressed ordered list of indices. This has two main
benefits:

- Minimizes transaction size for nodes, requiring only one byte to describe
256 bytes (8 * 32) of data. The alternative would be sending `n` number of
32-byte addresses in each instruction.
- Reduces compute required to search the list of staged features for a matching
address. The bitmask's format provides the Feature Gate program with the
indices it needs to store supported stake without searching through the staged
features for a match.

A `1` bit represents support for a feature. For example, for staged features
`[A, B, C, D, E, F, G, H]`, if a node wishes to signal support for all features
except `E` and `H`, their `u8` value would be 246, or `11110110`.
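
The example above reads the mask left to right, so the first staged feature
maps to the most significant bit. A minimal sketch of building such a mask
follows; `build_signal_mask` is a hypothetical helper name, and the bit order
is inferred from the `11110110` example rather than stated separately in the
proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_FEATURES 8

static_assert(MAX_FEATURES == 8, "a u8 mask holds exactly eight feature bits");

/*
 * Builds the u8 signal mask from per-feature support flags.
 * supported[0] corresponds to the first staged feature and maps to the most
 * significant bit.
 */
static uint8_t build_signal_mask(const bool supported[MAX_FEATURES]) {
    uint8_t mask = 0;
    for (int i = 0; i < MAX_FEATURES; i++) {
        if (supported[i]) {
            mask |= (uint8_t)(0x80u >> i);
        }
    }
    return mask;
}
```

With support flags set for every feature except `E` (index 4) and `H`
(index 7), this produces the value 246 from the example.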

#### Validator Support Signal PDA State

A Validator Support Signal PDA will be created for each vote account. It will
store the node's submitted bitmasks.

As mentioned previously, a node's most recently submitted bitmask is considered
their signal for the epoch. Validator Support Signal state allows one bitmask
per epoch, but can store multiple epochs of historical bitmasks. This is useful
for querying stake support post-epoch. When a new bitmask is submitted for an
epoch with an existing entry in the account state, it is overwritten.

These accounts are never garbage collected since they are reused every epoch.
This means nodes are only required to pay for rent-exemption once.

The address of the Validator Support Signal PDA is derived as follows, where
`vote_address` is the 32 bytes of the validator's vote address:

```
"support_signal" + <vote_address>
```

The data for the Validator Support Signal PDA will be structured as follows:

```c
#include <stdint.h>

#define MAX_SIGNALS 4

/**
 * A validator's support signal bitmask along with the epoch the signal
 * corresponds to.
 */
typedef struct {
    /**
     * The epoch the support signal corresponds to (u64 serialized to
     * little-endian).
     */
    uint8_t epoch[8];
    /** The support signal bitmask. */
    uint8_t signal;
    /** Padding for 8-byte alignment. */
    uint8_t _padding[7];
} SupportSignalWithEpoch;

/**
 * A validator's support signal bitmasks with their corresponding epochs.
 */
typedef struct {
    /**
     * The support signal bitmasks with their corresponding epochs.
     */
    SupportSignalWithEpoch signals[MAX_SIGNALS];
} ValidatorSupportSignal;
```

As depicted in the above layout, a validator's signal can be stored (and
queried) for up to 4 epochs.
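
One possible way to maintain this state is sketched below. The overwrite
behavior for an existing epoch comes from the text above, but the eviction
policy is an assumption: the proposal does not say which entry is replaced once
all 4 slots are in use, so this sketch replaces the entry with the lowest
epoch. The function name `record_signal` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SIGNALS 4

typedef struct {
    uint8_t epoch[8];
    uint8_t signal;
    uint8_t _padding[7];
} SupportSignalWithEpoch;

typedef struct {
    SupportSignalWithEpoch signals[MAX_SIGNALS];
} ValidatorSupportSignal;

static_assert(sizeof(ValidatorSupportSignal) == 64, "4 entries of 16 bytes");

static uint64_t read_u64_le(const uint8_t b[8]) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) v = (v << 8) | b[i];
    return v;
}

static void write_u64_le(uint8_t b[8], uint64_t v) {
    for (int i = 0; i < 8; i++) b[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Stores `mask` as the signal for `epoch`. Returns the mask previously stored
 * for that epoch (0 if there was none) so the caller can deduct its stake.
 * ASSUMPTION: when no entry matches, the entry with the lowest epoch is
 * replaced.
 */
static uint8_t record_signal(ValidatorSupportSignal *state, uint64_t epoch,
                             uint8_t mask) {
    int target = 0;
    for (int i = 0; i < MAX_SIGNALS; i++) {
        uint64_t e = read_u64_le(state->signals[i].epoch);
        if (e == epoch) {
            uint8_t prev = state->signals[i].signal;
            state->signals[i].signal = mask;
            return prev;
        }
        if (e < read_u64_le(state->signals[target].epoch)) target = i;
    }
    write_u64_le(state->signals[target].epoch, epoch);
    state->signals[target].signal = mask;
    return 0;
}
```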

### Step 4: Feature Activation

At the end of the epoch, the runtime loads the Staged Features PDA for the
current epoch and calculates the stake - as a percentage of the total epoch
stake - in support of each feature to determine which staged features to
activate.

Every feature whose stake support meets the required threshold must be
activated. This threshold will be hard-coded in the runtime to 95% initially,
but future iterations on the process could make this threshold configurable.
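
The threshold comparison can be done in integer arithmetic, as in the sketch
below. It assumes that "meets the required threshold" means support of at least
95% of total epoch stake (greater than or equal), which the proposal does not
spell out. The names `required_stake` and `meets_threshold` are hypothetical.
Splitting the total into quotient and remainder avoids overflowing a `u64` when
multiplying large stake values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ACTIVATION_THRESHOLD_PERCENT 95

static_assert(ACTIVATION_THRESHOLD_PERCENT <= 100, "threshold is a percentage");

/*
 * Smallest stake that is at least 95% of total_stake, i.e.
 * ceil(total_stake * 95 / 100), computed without overflowing u64.
 */
static uint64_t required_stake(uint64_t total_stake) {
    uint64_t q = total_stake / 100;
    uint64_t r = total_stake % 100;
    return q * ACTIVATION_THRESHOLD_PERCENT
        + (r * ACTIVATION_THRESHOLD_PERCENT + 99) / 100;
}

/* ASSUMPTION: the threshold is inclusive (support >= 95% activates). */
static bool meets_threshold(uint64_t support, uint64_t total_stake) {
    return total_stake > 0 && support >= required_stake(total_stake);
}
```

For example, with a total epoch stake of 1000, a feature with 950 stake in
support would be activated under this reading, while one with 949 would not.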

Features can be revoked at any point up until this step (staged or unstaged). If
a feature is revoked, nodes may still have signalled support for it, but the
runtime will not activate the feature since the account will not exist on-chain.
For more information, see the
[Feature Gate Program's](https://github.com/solana-program/feature-gate)
`RevokePendingActivation` instruction, as proposed in
[SIMD 0089](./0089-programify-feature-gate-program.md).

If a feature is not activated, it must be resubmitted according to Step 2. If it
is revoked, it must be resubmitted according to Step 1.

## Alternatives Considered

## Impact

This new process for activating features directly impacts core contributors and
validators.

Core contributors will no longer bear the responsibility of ensuring the proper
stake supports their feature activation. However, this proposal does not include
a mechanism for overriding or customizing the stake requirement. This capability
should be proposed separately.

Validators will be responsible for signaling their support by sending a
transaction, a step that is not part of their current process. Validators that
neglect to upgrade their software will also have a more direct impact on
feature activations, since their stake will not count toward the support
threshold.

## Security Considerations

This proposal increases security for feature activations by removing the human
element from ensuring the proper stake supports a feature.

This proposal could also potentially extend the length of time required for
integrating feature-gated changes, which may include security fixes. However,
the feature-gate process is relatively slow in its current state, and neither
the current process nor this proposed process would have any implications for
critical, time-sensitive issues.
