Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IF: Implement correct compute_finality_digest #2080

Closed
Tracked by #2110
linh2931 opened this issue Jan 12, 2024 · 1 comment · Fixed by #2282
Closed
Tracked by #2110

IF: Implement correct compute_finality_digest #2080

linh2931 opened this issue Jan 12, 2024 · 1 comment · Fixed by #2282
Assignees

Comments

@linh2931
Copy link
Member

linh2931 commented Jan 12, 2024

compute_finality_digest needs to be implemented in block_header_state.

This should include the proper commitment structure that meets the needs of IBC and finality violations proofs.

Concepts and notation

Two fork databases exist: the legacy fork database and the new fork database. Prior to the start of the IF transition, only the legacy fork database is active. After the IF transition has been completed, only the new fork database is active. During the IF transition, both the legacy and new fork databases are simultaneously active. Some blocks will exist in both fork databases during the transition, but there will also be blocks that only exit in one and not the other.

Four classifications of blocks exist:

  1. Pre-IF Legacy Block: A Pre-IF Legacy Block is a block in which set_finalizers was never called in it or any of its ancestor blocks. It does not have a finality extension in its block header. A block_state_legacy (with a base of block_header_state_legacy) must be used to represent the state just after applying the Pre-IF Legacy Block. A block classified as a Pre-IF Legacy Block cannot be classified as anything else. A reversible Pre-IF Legacy Block can only be found in the legacy fork database.
  2. Transition Legacy Block: A Transition Legacy Block is a block which either has set_finalizers called within it or called within an ancestor block but also is not yet a Proper IF block (discussed below). It must have a finality extension in its block header. Any block classified as a Transition Legacy Block can also be re-classified as a Transition IF Block (discussed below); therefore the block itself can also be referred to as a Transition Block. A Transition Block referenced in the context of the legacy fork database can only sensibly be classified as a Transition Legacy Block, A Transition Block referenced in the context of the new fork database can only sensibly be classified as a Transition IF Block. A block_state_legacy (with a base of block_header_state_legacy) must be used to represent the state just after applying the Transition Legacy Block, but only in the context of the legacy fork database. This block_state_legacy should have some optional structures (some within block_header_state_legacy and some within block_state_legacy itself) used to service the additional requirements for these transition blocks. The block_state_legacy associated to a Transition Legacy Block can be converted into the block_state that would be associated to that same block when classified as a Transition IF Block. A conversion in the reverse direction is not possible due to information loss. If set_finalizers was called within a Transaction Block but not called in any of its ancestor blocks, then that block is also called an IF Genesis Block. When generating a snapshot for a Transition Block, the block_state_legacy associated to that block (classified as a Transition Legacy Block) must be used in the snapshot. It would be redundant to also include data from the block_state associated to that block (classified as a Transition IF Block) within the snapshot since it should be possible to convert from block_state_legacy to block_state, but it is also acceptable to include both in the snapshot. The new optional state in block_header_state_legacy helps indicate whether a descendant of the block must also be a Transition Legacy Block or must instead be a Proper IF block, e.g. by storing the IF Genesis Block number found in that branch and comparing it to the dpos_irreversible_block_num. A Transaction Block that requires its descendants to be Proper IF blocks is also called an IF Critical Block.
  3. Transition IF Block: A Transition IF Block is another classification for a Transition Block (the other one is Transition Legacy Block). A block_state (with a base of block_header_state) must be used to represent the state just after applying the Transition IF Block, but only in the context of the new fork database. A Transition Block can exist in the legacy fork database (classified as a Transition Legacy Block) without it also existing in the new fork database; this could be the case when the last irreversible block (according to the pre-IF algorithm) is still a block that is an ancestor of the IF Genesis Block, but it could also be the case when the Transition Legacy Block does not have a IF Critical Block as a descendant block. But the block that (according to the pre-IF algorithm) causes irreversibly to advance from a block that is an ancestor of the IF Genesis Block to either the IF Genesis Block or a block descendant from it is considered to be a IF Critical Block; note that the pre-IF algorithm is modified to disallow LIB or the root of the legacy fork database to advance past the IF Genesis Block. The first time an IF Critical Block is identified, the new fork database must start to be used with the IF Genesis Block as its root. All reversible IF Critical Blocks and their reversible ancestor blocks must be represented in the new fork database; no other reversible Transition blocks are allowed in the new fork database.
  4. Proper IF Block: A Proper IF Block is a block that is a descendant of a Proper IF Block or a descendant of an IF Critical Block. It must have a finality extension in its block header but it must also have a schedule_version field equal to 2^31 as a way of indicating it is a Proper IF Block and distinguishing it from a Transition Block. A block_state (with a base of block_header_state) must be used to represent the state just after applying the Proper IF Block. A block classified as a Proper IF Block cannot be classified as anything else. A Proper IF Block can only be found in the new fork database. In Pre-IF Legacy Blocks and Transition Blocks, the meaning of the action Merkle root field in the block header is the root of the block's action tree calculated using the old method. In Proper IF Blocks, the meaning of the action Merkle root field in the block header changes to something else that will be discussed in greater detail later, but in short it is either the zero digest or a root of a particular Finality Merkle Tree. Additionally, in Proper IF Blocks, the confirmed count field in the block header must be zero, the schedule_version must be 2^31, and the way the proposer signature on the block header is calculated changes.

When a reference is made to an IF Block without qualification, it can refer to either a Proper IF Block or a Transition IF Block. Both are considered valid IF Blocks.

Finality Merkle Trees:

The Finality Merkle Tree is a Merkle tree over a sequence of Finality Leaf Nodes, one for each block starting from the IF Genesis Block and ending at some specified descendant block. A Finality Leaf Node exists for the IF Genesis Block and every block that is a descendant of the IF Genesis Block. The Finality Leaf Node is a independently versioned structure (following same light header protocol version of the overall consensus protocol) that at least consists of the block number, the finality digest for the block, and also the root of the action Merkle tree of the block (calculated using the new method).

A particular Finality Merkle Tree can always be associated to any IF Block. This particular Finality Merkle Tree is referred as the Validation Tree for that target IF Block. It is defined as the Finality Merkle Tree over Finality Leaf Nodes starting with the one for the IF Genesis Block and ending with the one for the target IF Block. The root of all trees is calculated using the new method of computing a Merkle root over the sequence of Finality Leaf Nodes of the Finality Merkle Tree.

The above discussion of Finality Merkle Trees is to understand what they are conceptually. But to actually compute the roots a different data structure called an incremental Merkle tree is used. Rather than keeping the entire sequence of Finality Leaf Nodes, just the relevant digests are kept in an incremental Merkle tree which can be updated by adding the digest of a new Finality Leaf Node that is to be considered to be conceptually appended to the full sequence. This incremental Merkle tree structure allows such append-only updates to always be possible and it also serves the purpose of calculate the correct root digest for the conceptual Finality Merkle Tree. Furthermore, it even allows for generating a Merkle branch proof of inclusion for the last appended Finality Leaf Node (but only the last appended one).

Let a block_handle be a reference to either the block_state within the new fork database or the block_state_legacy within the legacy fork database (only if the corresponding block_state does not exist in the new fork database) associated with a particular reversible IF block (or the last irreversible IF block). The block_handle acts as a convenient way to uniquely reference a (reversible) block but with the added context to determine its ancestor blocks up until the last irreversible block and with the ability to get access to data associated with the referenced block that is common to both the block_state and block_state_legacy types. It also enables getting access to the block data itself but only for reversible blocks. And after the IF transition is complete, the block_handle also enables getting access to all data in the associated block_state.

Within a fork database (legacy or new), it is possible to find an ancestor of a starting block that has a specified block number (assuming that block number is less than that of the starting block but also greater than or equal to that of the root block of the fork database). For the purposes of this discussion, forkdb will be a reference to the new fork database. Assume a function exists in the new fork database called find_ancestor that takes a block_handle of a starting block and a target block number and optionally returns a block_handle to the block_state which is an ancestor of the starting block and has a block number equal to the provided target block number. So if start is the block_handle to the start block and target_block_num is the target block number, then the found block_handle (assuming it exists) is returned by forkdb.find_ancestor(start, target_block_num).

Let validation_tree(block_handle) be notation for a function that takes a valid block_handle and returns a reference to something representing the Validation Tree for the target IF Block referenced by the block_handle. The data structure it references would typically be stored within an optional validation structure stored within block_state or block_state_legacy. The optional validation structure within block_state (call it valid) must be present for blocks that have been validated; in fact it is how it how it indicates that Transition IF Blocks and Proper IF Blocks are valid. The optional validation structure stored within block_state_legacy would remain std::nullopt for Pre-IF Legacy Blocks, but for Transition Legacy Blocks it would be present iff the block was validated. To ensure the validation_tree function is well-defined, we require that the block_handle only references either the last irreversible block (which is of course validated) or a validated reversible block.

Let validation_tree_root(block_handle) be notation for a function that takes a valid block_handle (valid as discussed above) and returns the root digest for the Validation Tree referenced by validation_tree(block_handle) where the same block_handle passed into validation_tree_root is used for the argument into validation_tree.

When referencing core, it is referring to the core member variable in the block_state (actually the block_header_state) associated of some IF Block (which one in particular should be determined based on the context of the discussion). As a matter of convenience of notation and to be less ambiguous, the core associated with the block referenced by a block_handle b may be denoted b->core.

Given a core, it is always possible to tell whether a candidate_block_num within the interval [core.last_final_block_num(), core.current_block_block_num()] is the genesis block number or not. It would be useful to add a member function bool is_genesis_block_num(block_num_type candidate_block_num) const to finality_core to conveniently determine this (it should have the precondition that core.last_final_block_num() <= candidate_block_num <= core.current_block_block_num()). If core.links.front().source_block_num == core.links.front().target_block_num and core.links.front().source_block_num == candidate_block_num, then it must be the case that candidate_block_num is equal to the genesis block number. Otherwise, candidate_block_num is not equal to the genesis block number.

There is another Finality Merkle Tree that can be associated to some Proper IF Blocks. Consider a Proper IF Block referenced by block_handle b. If b->core.final_on_strong_qc_block_num is not equal to the genesis block number, then a Finality Merkle Tree can be associated with this block b that is referred as the Finality Tree for that block and is defined as the Validation Tree referenced by validation_tree(forkdb.find_ancestor(b, b->core.final_on_strong_qc_block_num).value()). For the sake of this conceptual definition, assume that the optional block_handle returned by find_ancestor function call is always present. To have that assumption hold for this definition, assume that the fork database always has an old enough root that has a block number less than or equal to b->core.final_on_strong_qc_block_num.

In practice, that assumption about the root of the fork database does not always hold, e.g. when starting from a snapshot. So actually retrieving the Finality Tree associated to a reversible Proper IF Block (or the last irreversible Proper IF Block), assuming it is defined, may not always be possible. Fortunately, the implementation does not actually need access to all defined Finality Trees associated to all currently reversible Proper IF Blocks (or the last irreversible Proper IF block). However, it does need to know the root of all such Finality Trees. And so this leads to the introduction of a new concept called the "tail".

Note that a Finality Tree cannot be associated to a Transition Block. It also cannot be associated to a Proper IF block that has a core.final_on_strong_qc_block_num equal to the genesis block number. So it is not a well defined concept to get the root of the Finality Tree associated to such blocks.

However, a Finality Tree Root can be defined for all IF Blocks (Transition IF Blocks and Proper IF Blocks). The Finality Tree Root is defined as the root of the Finality Tree associated with the block when such an association is well-defined, and otherwise is defined as the zero digest. The Finality Tree Root of an IF block is returned by a new finality_mroot function in block_state.

The tail:

Every IF Block b (using the block_handle to reference it) has an associated tail. The tail of b is the sequence of root digests of the Validation Trees associated to a unbroken sequence of blocks which consist of the ancestors of the IF Block starting with the one that has a block number equal to b->core.last_final_block_num().

This tail can be stored in the same optional validated structure within the block_state that contains the incremental Merkle tree for the Validation Tree. A convenience function within any IF Block's block_state called get_validation_mroot can take a target block number target_block_num which is within the valid range [b->core.last_final_block_num(), b->core.current_block_num()] and return the digest of the root of the Validation Tree associated with an IF Block that is either b (if target_block_num == b->core.current_block_num()) or is an ancestor of b that has block number equal to target_block_num. So b->get_validation_mroot(target_block_num) would return this root digest. Note that if b->valid.has_value() == false, then b->get_validation_mroot(target_block_num) would return std::nullopt.

The tail must be stored in a snapshot alongside the incremental Merkle tree for the Validation Tree, and they must be restored when loading from a snapshot into the valid structure within the root block_state.

As the valid is emplaced for a block_state when the corresponding block is validated, it must set the new incremental Merkle tree by setting it to the copy of the one from the parent block_state that is mutated by adding the new Finality Leaf Node corresponding to the newly validated block. Furthermore, it must set the new tail by copying over the old tail from the valid structure of its parent block_state, appending the root of the newly constructed Validation Tree for the block_state, and remove any roots from the front end of the queue based on the new last_final_block_num() calculated in the new core.

Finality Digest

Any Proper IF Block or Transition Block has an associated finality digest that can be calculated. The finality digest should be the SHA256 digest of a serialization of a structure that is versioned according to the light header protocol version (this is a new versioning scheme consisting of major and minor version numbers that is separate from protocol features that only gets updated if a protocol feature causes a breaking change to light block header validation). For now the only light header protocol version supported is 1.0. The serialization should include that major version number 1 (as a 32-bit number), the minor version number 0 (as a 32-bit number), and the rest of the serialization of the versioned structure.

That version structure for the version 1.0 protocol consists of the active finalizer policy generation number followed by two digests. The first digest is the Finality Tree Root of the block which is returned by the new finality_mroot() function added to block_header_state. The second is a digest computed as the SHA256 hash of the serialization of two other digests: the active_finalizer_policy_digest and the base_digest.

The active_finalizer_policy_digest is the SHA256 hash of some serialization of the finalizer policy that is pointed to by the active_finalizer_policy member in block_header_state.

The base_digest is the SHA256 hash of some serialization of the remaining (non-cached) fields of the block_header_state. So this includes: header, core, finalizer_policies, active_proposer_policy, proposer_policies, and activated_protocol_features.

Changes required

  1. Make changes to core to always have non-optional last_final_block_num(), final_on_strong_qc_block_num, and latest_qc_claim().block_num.
  2. Remove unnecessary proposal_mtree from block_header_state.
  3. Remove the member variable finality_mtree from block_header_state. The finality digest only needs to commit to the Finality Tree Root of the block. This Finality Tree Root does not need to be added as a new field to block_header_state. Instead, a member function finality_mroot() will be added to return the Finality Tree Root digest value to use within the finality digest computation.
  4. Add a member function digest_type finality_mroot() const to block_header_state. This member function should return header.action_mroot which now has dual semantics. For Pre-IF Legacy Blocks and Transition Blocks, the action_mroot is the legacy action Merkle root for the block. But for Proper IF blocks, that same field now should hold the Finality Tree Root digest of the block. The comment in block_header should be updated to reflect this new dual semantic of the action_mroot field.
  5. block_state should have an optional validated structure called valid that can be mutated after the block_state is initially constructed (it is initialized to std::nullopt). Later when the block is actually validated, this optional field will be replaced by the actual data relevant to the validation. That data consists of the incremental Merkle tree for the Validation Tree associated with the block and also the tail for that block. This also means that if the validated structure is present, it should be possible to retrieve the digest of the Finality Tree associated with the Proper IF Block, assuming defined, using b->get_validation_mroot(b->core.final_on_strong_qc_block_num).
  6. Consider a block_handle b referencing a Proper IF Block or an IF Critical Block. At some point a new block must be built with block b as the parent. When the new block is being assembled by the producer, block_header_state::next(block_header_state_input&) must be called. (Note that block_header_state::next(block_header_state_input&) should not be called on when this is referencing the block_header_state of a Transition Block that is not an IF Critical Block.) The block_header_state_input of block_header_state::next(block_header_state_input&) must also now include a new digest_type member called finality_mroot_claim which is Finality Tree Root of the block. This change is needed to set the action_mroot field of the new generated block header correctly. For Proper IF Blocks that have an associated Finality Tree defined, the Finality Tree Root is b->get_validation_mroot(next_core_metadata.final_on_strong_qc_block_num). But for Proper IF Blocks that do not have an associated Finality Tree defined (because their next_core_metadata.final_on_strong_qc_block_num is the genesis block number, which can be checked with core.is_genesis_block_num(next_core_metadata.final_on_strong_qc_block_num)), the Finality Tree Root is the zero digest. The next_core_metadata must be constructed prior to calling block_header_state::next(block_header_state_input&) because next_core_metadata.final_on_strong_qc_block_num is required to determine how to set finality_mroot_claim within the block_header_state_input. next_core_metadata can be constructed from the core of the existing block_header_state by calling the block_header_state_core::next_metadata function with the qc_claim_t returned by get_best_qc.
  7. Similarly, when a validating node constructs a block_header_state from a block header, it needs to know that it is using the correct value for finality_mroot() since that determines the finality digest. This will be done in two stages. At the first stage when constructing the block_header_state for the block (referenced with block_handle b) using its other next overload it can just assume the finality mroot provided through the block header's action_mroot field is the correct one, with the exception of the case of a Proper IF Block that has no Finality Tree associated with it. In the case where a Proper IF Block has no Finality Tree associated with it (which it can determine simply from the core), the validating node must pre-validate that action_mroot in the block header is the zero digest. Then the later second stage occurs when we actually validate the block; this only applies to Proper IF Blocks that have an associated Finality Tree. At the point of block validation, the valid structure should be present to provide a source for the expected Finality Tree Root. As part of completing validation, the implementation should look up the actual Finality Tree Root (using b->get_validation_mroot(b->core.final_on_strong_qc_block_num)) and ensure it is the same as the one returned by the finality_mroot() of the existing block_header_state. If the digests do not match, then the block should be rejected as invalid.
  8. Update compute_finality_digest in block_header_state to implement the design described in the "Finality Digest" section above.
  9. When generating a snapshot for a Pre-IF Legacy Block or a Transition Block, the implementation should continue to include the block_header_state_legacy in the generated snapshot which may now include additional optional information specific to the IF transition. It is also possible that there is additional optional information specific to the IF transaction that is tracked as part of the block_state_legacy instead; if so, this may need to be added to the snapshot as well. The details of this are left unspecified since implementing proper IF transition is not in scope for this issue and is instead left for IF: Implement proper IF transition #2057.
  10. When generating a snapshot for a Proper IF Block, instead of serializing a block_header_state_legacy the implementation should serialize some other structure that is a subset of the information in the block_state. This subset (which should be deterministic) includes the entire block_header_state but it also includes the incremental Merkle tree for the Validation Tree and the tail in the validated structure of the block_state (that should always be present by the point a snapshot is being generated).
@enf-ci-bot enf-ci-bot moved this to Todo in Team Backlog Jan 12, 2024
@linh2931 linh2931 added this to the Leap v6.0.0-rc1 milestone Jan 12, 2024
@arhag arhag changed the title IF: Unification: implement compute_finalizer_digest IF: Implement correct compute_finalizer_digest Jan 12, 2024
@arhag arhag added discussion and removed triage labels Jan 12, 2024
@arhag arhag added 👍 lgtm and removed discussion labels Feb 9, 2024
@linh2931 linh2931 moved this from Todo to In Progress in Team Backlog Feb 13, 2024
@BenjaminGormanPMP BenjaminGormanPMP moved this from In Progress to Blocked in Team Backlog Feb 20, 2024
@arhag arhag changed the title IF: Implement correct compute_finalizer_digest IF: Implement correct compute_finality_digest Feb 26, 2024
@BenjaminGormanPMP BenjaminGormanPMP moved this from Blocked to In Progress in Team Backlog Feb 27, 2024
@linh2931 linh2931 moved this from In Progress to Awaiting Review in Team Backlog Mar 5, 2024
@arhag arhag linked a pull request Mar 5, 2024 that will close this issue
@greg7mdp
Copy link
Contributor

greg7mdp commented Mar 7, 2024

suggested edits:

A Transaction Block that requires its descendants to be Proper IF blocks is also called an IF Critical Block.

A Transition Block that requires its descendants to be Transition IF Blocks is also called an IF Critical Block.

All reversible IF Critical Blocks and their reversible ancestor blocks must be represented in the new fork database;

All reversible descendants of IF Critical Blocks must be represented in the new fork database;

A Proper IF Block is a block that is a descendant of a Proper IF Block or a descendant of an IF Critical Block.

Aren't all Proper IF blocks descendants of an IF Critical Block?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants