
Top down discussion #181

Closed
wants to merge 7 commits into from

Conversation

cryptoAtwill
Contributor

Migrate the top-down message execution to the new design with Fendermint.

@cryptoAtwill cryptoAtwill marked this pull request as draft August 8, 2023 13:26
Contributor

@adlrocha adlrocha left a comment

I just did a first pass to see if it helps with the implementation. I think we should decouple the ParentView, i.e. the perceived finality of the parent as seen by the peer, from the actual proof of finality committed and agreed upon by all validators. These concepts are not decoupled in the code, which makes it quite confusing to follow.

My suggestion:

  • ParentSyncer is the piece of code responsible for syncing with the parent and keeping an up-to-date view of the peer's perceived parent finality. Any module that needs to know the current finality we see in the parent queries this module.
  • IPCParentFinality is the piece responsible for orchestrating the creation of PoFs and for keeping track of the latest finality all the validators have agreed upon so far.

Let me know if this makes sense (a rough sketch of the split is below); we can also discuss it synchronously if you think that's easier.
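A rough sketch of the split; trait and method names here are illustrative rather than code from this PR, and IPCParentFinality is the type the PR introduces:

/// The peer's own, possibly not-yet-agreed view of the parent chain.
trait ParentViewProvider {
    /// Latest parent height this peer has seen.
    fn latest_height(&self) -> Option<u64>;
    /// Block hash the peer has seen at a given parent height.
    fn block_hash(&self, height: u64) -> Option<Vec<u8>>;
}

/// The finality that all validators have actually agreed upon on-chain.
trait FinalityProvider {
    /// The last finality committed on-chain.
    fn last_committed_finality(&self) -> Option<IPCParentFinality>;
    /// Record that a proposed finality has been committed.
    fn on_finality_committed(&mut self, finality: IPCParentFinality);
}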

@@ -125,6 +129,8 @@ where
///
/// Zero means unlimited.
state_hist_size: u64,
/// The top down parent finality.
parent_finality: ProofOfFinality,
Contributor

We could maybe convert this into a ParentViewManager responsible for acting as the peer's interface to the parent. As discussed in Slack:

  • It keeps track of the latest heights seen as final in the parent.
  • It exposes an interface to query from the parent.

Contributor

I see you have a ParentSyncer abstraction below. I think what we need here is something like this with the aforementioned functionality.

}

/// Verifies the list of transactions to see if they are valid. Returns true if all the txns
/// are valid and false if any one of the txns is invalid.
Contributor

Like we discussed, let's add a reminder here to explore an alternative serialization that would allow us to inspect the first few bytes of a transaction to determine its type without having to deserialize the whole thing. No need to address it now, but definitely something worth exploring.
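For illustration only (not part of this PR), one shape such a scheme could take is a one-byte tag prefix that can be checked before deserializing the payload:

// Hypothetical tag-prefix framing: the first byte identifies the message kind,
// so a proposal check can classify a transaction without deserializing it.
const TAG_SIGNED: u8 = 0x01;
const TAG_IPC_TOP_DOWN: u8 = 0x02;

fn encode_with_tag(tag: u8, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + payload.len());
    out.push(tag);
    out.extend_from_slice(payload);
    out
}

fn message_tag(tx: &[u8]) -> Option<u8> {
    tx.first().copied()
}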

let mut size: usize = 0;
let mut txs = Vec::new();
for tx in request.txs {
if size.saturating_add(tx.len()) > max_tx_bytes {
Contributor

I have a question here (probably for @aakoshh): when a transaction is disregarded for a proposal, is it still kept in the mempool for a future proposal?

Contributor

I would think so.


#[derive(Debug, Clone)]
pub struct Config {
/// The number of blocks to delay reporting when creating the pof
Contributor

👍


impl From<IPCMessage> for Vec<u8> {
fn from(value: IPCMessage) -> Self {
serde_json::to_vec(&value).expect("should not happen")
Contributor

Are we sure this should not happen? What if someone by mistake does this over a Vec<u8> that is not an IPCMessage? I don't think panicking is the smartest approach. What if we implement try_from instead?
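A sketch of the fallible direction being suggested, relying on the PR's serde derives (illustrative only; as discussed below, the serialization format itself is also up for debate):

// Sketch: a fallible conversion instead of a panicking From impl.
impl TryFrom<Vec<u8>> for IPCMessage {
    type Error = serde_json::Error;

    fn try_from(bytes: Vec<u8>) -> Result<Self, Self::Error> {
        serde_json::from_slice(&bytes)
    }
}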

Contributor

Ah, this is super confusing actually. I read it as parsing Vec<u8> like 3 times already.

So yes, this will (probably) not fail because we can always convert to JSON, and this isn't the case where we're running the risk of a malicious validator passing us some untrustworthy byte content. However, I'd argue this implementation should not exist: it is not an inherent property of IPCMessage that it's transformed to binary via JSON. Case in point: we actually use IPLD. Let's not decide for everyone what they should use, and leave it up to the application implementer.

}

/// Constantly syncing with parent through polling
pub struct PollingParentSyncer {}
Contributor

It would be great if we could have an implementation of ParentSyncer that periodically (and in parallel) populates a cache with the latest final information it has from the parent, and that includes an option to explicitly query the IPC agent if there's a cache miss, for when a peer is syncing from scratch.

Maybe we could include a flag to determine whether a call to the ParentSyncer should be done with_fallback (triggering a query to the IPC agent if there is a cache miss) or without it.
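A minimal sketch of that idea (standalone and illustrative; the real implementation would live behind the ParentSyncer abstraction and call the IPC agent):

use std::collections::HashMap;

/// Whether a cache miss should fall back to querying the IPC agent.
pub enum QueryMode {
    CacheOnly,
    WithFallback,
}

pub struct ParentViewCache {
    block_hashes: HashMap<u64, Vec<u8>>,
}

impl ParentViewCache {
    /// `fetch_from_agent` stands in for a query to the IPC agent (hypothetical).
    pub fn block_hash<F>(&self, height: u64, mode: QueryMode, fetch_from_agent: F) -> Option<Vec<u8>>
    where
        F: FnOnce(u64) -> Option<Vec<u8>>,
    {
        if let Some(hash) = self.block_hashes.get(&height) {
            return Some(hash.clone());
        }
        match mode {
            QueryMode::CacheOnly => None,
            QueryMode::WithFallback => fetch_from_agent(height),
        }
    }
}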

parent_syncer: Arc<T>,
lock: LockedProof,
) -> anyhow::Result<()> {
let mut interval = tokio::time::interval(Duration::from_secs(config.polling_interval));
Contributor

Instead of polling, we could consider in the future opening a web-socket with the IPC agent to get notified of every new block, in case we want to use a push instead of a pull approach (maybe add a note?).

Contributor

I think this functionality shouldn't be here but should rather be part of the ParentSyncer implementation, right? (Maybe I missed something.)

Contributor Author

The websocket version can be another impl of ParentSyncer.
Maybe I should rename ParentSyncer to ParentView so that it's less confusing.

pub struct ProofOfFinality<T> {
config: Config,
started: Arc<AtomicBool>,
latest_proof: LockedProof,
Contributor

Is this the latest proof committed on-chain? If so, we should only set this value after we know a block with the PoF has been committed.

Contributor Author

maybe I should rename this to latest_parent_view


impl<T: ParentSyncer + Send + Sync + 'static> ProofOfFinality<T> {
pub fn get_finality(&self) -> IPCParentFinality {
let finality = self.latest_proof.read().unwrap();
Contributor

Should we consider querying the latest finality on-chain instead of keeping the value in-memory?

Contributor

Wait, is this finality the one currently seen in the parent, or the latest PoF committed on-chain?

txs.push(tx);
}

let proof = self.parent_finality.get_finality();
Contributor

I think this is not correct. The view of the parent finality currently seen should be kept, IMO, by the ParentSyncer, and in this PoF data structure we should only keep the latest finality committed (I think we are merging both concepts here, and it is a bit confusing).

@@ -291,6 +299,37 @@ where
let state = self.committed_state()?;
Ok((state.state_params, state.block_height))
}

fn verify_ipc_message(&self, ipc_message: IPCMessage) -> bool {
Contributor

This can now be moved into the ChainMessageInterpreter in the implementation of the ProposalInterpreter added in #185

My goal with app.rs was to keep its size and scope to a minimum, it's arguably too big already. Ideally it should only be an implementation of the Application trait and dispatch to other components. Everything it does is in support of this, mainly state management and conversions to Tendermint format.

It is for this reason that the top-level interpreter which deals with transactions and genesis receives binary input: to make app.rs oblivious to the actual payloads we are working with. It has to know the output format in order to be able to translate it to Tendermint types.

I'd like us to maintain this separation of concerns and move the finality checks into the interpreter. This can be achieved in multiple ways:

The parent view or parent_finality can be added to the ChainMessageInterpreter during application construction, instead of the App struct, but ideally only if it's a stateless component. If parent finality checking needs some state, it should be lifted into the App for safekeeping and passed to the interpreter during invocation. I would try using STM for this, so the interpreter can read the state of the parent chain from data structures maintained by another background component, which can be started from run.rs. This same STM-enabled component can also be shared with the transaction processing, where we can add things like CIDs to look up, for another background process to handle. A rough sketch of the shape is below.
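A rough sketch of that shared-state shape, using Arc<RwLock<...>> as a stand-in for an STM cell; the struct and field names are illustrative rather than actual #185 code:

use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

// A background syncer (started from run.rs) writes the parent view; the
// interpreter only reads it when preparing or checking proposals. With real
// STM, several reads could be composed into one atomic unit.
#[derive(Default)]
pub struct ParentViewState {
    pub latest_height: Option<u64>,
    pub block_hashes: BTreeMap<u64, Vec<u8>>,
}

pub type SharedParentView = Arc<RwLock<ParentViewState>>;

// Stand-in for the interpreter that would receive the shared view from the App.
pub struct FinalityProposer {
    parent_view: SharedParentView,
}

impl FinalityProposer {
    /// Pick the height `delay` blocks below the latest seen, if we have its hash.
    pub fn finality_candidate(&self, delay: u64) -> Option<(u64, Vec<u8>)> {
        let view = self.parent_view.read().unwrap();
        let height = view.latest_height?.checked_sub(delay)?;
        view.block_hashes.get(&height).map(|h| (height, h.clone()))
    }
}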

/// are valid and false if any one of the txns is invalid.
fn verify_messages(&self, txns: &Vec<Bytes>) -> bool {
for tx in txns {
let is_ok = match serde_json::from_slice::<ChainMessage>(tx) {
Contributor

If you take a look at the BytesMessageInterpreter you will see that we use the IPLD binary format for messages, not JSON. The Cosmos SDK uses JSON, which may be why you thought Tendermint does as well, but it's actually agnostic, and I thought we should stick to the Filecoin convention of IPLD (not that JSON wouldn't be more convenient).

So this check should not be here, or exist in this format, because checking this is exactly the purpose of BytesMessageInterpreter. I added this check in #185

Comment on lines +545 to +547
size += tx.len();
txs.push(tx);
}
Contributor

I added this as a utility function to #185 so there's no need to repeat it in app.rs.

txs.push(tx);
}

if let Some(finality_proposal) = self
Contributor

Let's move this logic into the ChainMessageInterpreter.

if let Some(finality_proposal) = self
.parent_finality
.next_finality_proposal()
.map_err(|e| anyhow!("cannot get next finality to propose due to: {e}"))?
Contributor

How can this method fail? It's only reading data from memory, not doing any IO, right?

Be very careful when you use ? in app.rs because a failure will crash Fendermint and Tendermint will stop as well.

Comment on lines +554 to +555
let bytes =
serde_json::to_vec(&ChainMessage::IPC(IPCMessage::TopDown(finality_proposal)))?;
Contributor

Once moved into the ChainMessageInterpreter you won't have to convert to binary (not JSON), because that is taken care of by the BytesMessageInterpreter.

In any case, it was wrong to put this after the stuff above that summed up all the bytes, because this could have brought the total above the allowed limit. This will not be a problem if you rebase onto #185 because there the check is done after the interpreters have finished.

For this reason (the possibility of this check removing a message we wanted to include in the proposal) it is crucial that for example next_finality_proposal() actually does things in a read-only fashion, which is why I indicated in the ProposalInterpreter that it does not return a modified state (it's a Reader monad, not a State monad 🤓 ), because our proposal might not be voted on, might time out, or may be removed. We have to be prepared to re-propose it. Only the transaction execution should cause changes, e.g. when we see that our proposal went through, then we can move on to the next candidate.

If we used STM, we might actually want to maintain two regions there: a "transient" region modified during deliver_tx, and a "permanent" region where it's merged during commit.

&self,
request: request::ProcessProposal,
) -> AbciResult<response::ProcessProposal> {
if !self.verify_messages(&request.txs) {
Contributor

Move to ChainMessageInterpreter.

type Bounds<T> = (T, T);

/// The key value cache. The key must be numeric and falls within a range/bound.
pub struct RangeKeyCache<Key, Value> {
Contributor

Suggested change
pub struct RangeKeyCache<Key, Value> {
pub struct RangeKeyCache<K, V> {

I think these won't be harder to read, and are well established?

/// The key value cache. The key must be numeric and falls within a range/bound.
pub struct RangeKeyCache<Key, Value> {
/// Stores the data in a hashmap.
data: HashMap<Key, Value>,
Contributor

You could use a BTreeMap to store data in-order, since it looks like you will be iterating over it.

Contributor

TBH even a VecDeque would be better than a HashMap, as most likely you want your data to be contiguous as well, without inserting into gaps, and you only pop from the beginning and push/pop at the end. You can calculate the index based on the boundaries and the offset.

Contributor Author

yeah, it was initially a VecDeque, but we changed it. I think a BTreeMap would work too

.flatten()
}

pub fn values_within_range(&self, start: Key, end: Option<Key>) -> Vec<&Value> {
}

/// Removes the block hashes stored till the specified height, exclusive.
pub fn remove_key_till(&mut self, key: Key) {
}

/// Insert the block hash at the next height
pub fn insert_after_lower_bound(&mut self, key: Key, val: Value) {
Contributor

I reckon the implementation of this cache can be simplified to a struct which has a BTreeMap and a minimum established height. As I understand it, those are the operations you want: to clear out data up to a lower bound, and never to re-insert it. The upper bound seems like accidental complexity.
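A minimal sketch of that simplification, keeping roughly the operations the PR's RangeKeyCache exposes (names are illustrative):

use std::collections::BTreeMap;

/// A cache that only remembers a lower bound and never re-inserts below it.
pub struct SequentialCache<V> {
    minimum_height: u64,
    data: BTreeMap<u64, V>,
}

impl<V> SequentialCache<V> {
    pub fn new(minimum_height: u64) -> Self {
        Self { minimum_height, data: BTreeMap::new() }
    }

    /// Insert only at or above the established minimum.
    pub fn insert(&mut self, height: u64, value: V) {
        if height >= self.minimum_height {
            self.data.insert(height, value);
        }
    }

    /// Values in a height range, in order (BTreeMap keeps keys sorted).
    pub fn values_between(&self, start: u64, end: u64) -> impl Iterator<Item = &V> {
        self.data.range(start..=end).map(|(_, v)| v)
    }

    /// Clear everything strictly below `height` and raise the minimum.
    pub fn remove_below(&mut self, height: u64) {
        self.data = self.data.split_off(&height);
        self.minimum_height = self.minimum_height.max(height);
    }
}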

Comment on lines +316 to +317
// FIXME: maybe we can reject the proposal when the transaction cannot be deserialized
tracing::info!("cannot deserialized transaction: {tx:?}");
Contributor

This was also my first intuition, but then I thought we can also let the proposal pass and punish the validator during execution.

Comment on lines +53 to +54
/// Get the top down messages from the nonce of a height
async fn top_down_msgs(&self, height: BlockHeight, nonce: u64)
Contributor

Curious: how does one learn the nonce of a height? All I see in this interface is that I can get the latest height, and I can ask about block hashes (I assume tipset CIDs) up to that height. Do I have to maintain the nonce on this side and probe height-by-height to check if there are top-down messages for the next nonce?

Comment on lines +56 to +57
/// Get the latest membership information
async fn membership(&self) -> anyhow::Result<ValidatorSet>;
Contributor

What membership is this? Is it the configuration that the parent thinks the child subnet should adopt? What does "latest" mean here, is it final?

use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub enum IPCMessage {
Contributor

Let's move the IPC-related messages (and messages only) into the fendermint_vm_message crate, so that it doesn't have to include references to crates with implementations such as the IPC Agent proxy. It should be a crate that anyone can add as a dependency to be able to inspect chain messages, until we figure out a way to make them more generic.

Contributor

On a side note: I always struggle to decide between using acronyms in PascalCasing names, but I usually go with only capitalising the first letter, e.g. ProductId, XmlHttpRequest, JsonRpcServer, rather than ProductID, XMLHTTPRequest, JSONRPCServer. I see there are all sorts of variations out there, but I found this to be a perhaps less aesthetically pleasing, but at least consistent way.

Have you guys established the IPC* prefix?

Contributor

I added the equivalent of this enum in the crate I mentioned above in #187 with a placeholder for TopDown.

}
}

pub fn upper_bound(&self) -> Option<Key> {
Contributor

I saw later that if you started with an empty cache, inserted something, then removed it, this would still remember what the upper bound used to be. If this is an intended behaviour, it would be great to capture it as an invariant, although personally I'm not sure how intuitive it is that the cache knows about a height which you cannot retrieve from it.


fn top_down_msgs(&self, _height: BlockHeight, nonce: Nonce) -> Vec<CrossMsg> {
self.read_cache(|cache| {
let v = cache.top_down_message.values_within_range(nonce, None);
Contributor

I see there is no upper bound here. How are we supposed to know by how much the nonce was increased, so as not to get overlaps?

Contributor

Is it in the CrossMsg itself? Must be. Sorry I'm not familiar with the SDK yet.


fn on_finality_committed(&self, finality: &IPCParentFinality) {
let mut cache = self.cache.write().unwrap();
cache.block_hash.remove_key_till(finality.height);
Contributor

What about the other caches, when do they get cleared?

Comment on lines +71 to +81
impl Clone for PollingParentSyncer {
fn clone(&self) -> Self {
Self {
config: self.config.clone(),
started: Arc::new(AtomicBool::new(false)),
cache: self.cache.clone(),
ipc_agent_proxy: self.ipc_agent_proxy.clone(),
handle: None,
}
}
}
Contributor

Does this have to exist? Why would somebody call it?

Comment on lines +132 to +139
let hashes = fetch_block_hashes(&config, &agent_proxy, &lock).await?;
update_top_down_msgs(&config, &agent_proxy, &lock).await?;
update_membership(&config, &agent_proxy, &lock).await?;

let mut cache = lock.write().unwrap();
for r in hashes {
cache.block_hash.insert_after_lower_bound(r.0, r.1);
}
Contributor

Is there a reason why hashes are inserted after all the other updates? Why not have update_block_hashes instead of fetch_block_hashes?

Comment on lines +132 to +134
let hashes = fetch_block_hashes(&config, &agent_proxy, &lock).await?;
update_top_down_msgs(&config, &agent_proxy, &lock).await?;
update_membership(&config, &agent_proxy, &lock).await?;
Contributor

I recommend using .context(...) liberally to attach some message about which of these failed.
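For example, with anyhow's Context trait applied to the calls quoted above:

use anyhow::Context;

let hashes = fetch_block_hashes(&config, &agent_proxy, &lock)
    .await
    .context("failed to fetch block hashes from the parent")?;
update_top_down_msgs(&config, &agent_proxy, &lock)
    .await
    .context("failed to update top-down messages")?;
update_membership(&config, &agent_proxy, &lock)
    .await
    .context("failed to update membership")?;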

Comment on lines +114 to +116
let handle =
tokio::spawn(async move { sync_with_parent(config, parent_syncer, cache).await });
self.handle = Some(handle);
Contributor

It looks like if the syncer fails, the handle will contain the error, but there will be no feedback anywhere in the logs about the error, I assume until the application is shut down and potentially the handle is awaited upon.

I'd say if it is allowed to fail, it should at least log the error here, but even better would be not allowing it to fail and retrying later.

Comment on lines +181 to +186
// if cache.latest_height() is None, it means we have not started fetching any heights
// we just use the latest height minus a lower bound as the parent view
let starting_height = cache.block_hash.upper_bound().unwrap_or(max(
1,
latest_height.saturating_sub(config.chain_head_lower_bound),
));
Contributor

It's great that you thought about where to start from. I do think though that this behaviour could be improved.

After a restart, syncing with the parent should resume from where it was left off, that is, before the syncer is started, we should look up the state of our contract that ultimately tracks the application of top-down messages and retrieve the last known final block height and nonce, and go from there. Otherwise you might stop your node for maintenance, bring it back later, and miss some of the messages simply because more than the chain_head_lower_bound amount of time has elapsed.

Another adverse scenario is when you stop your node, restart quickly, and then it replays messages which you already had by effectively resetting the lower bound to an earlier value!

Comment on lines +21 to +22
/// The lower bound for the chain head height in parent view
chain_head_lower_bound: BlockHeight,
Contributor

It would be helpful to explain here whether this is an absolute value or a relative one compared to the latest height.

Contributor

https://github.com/consensus-shipyard/fendermint/pull/181/files#diff-088fbb2a5e5eeef38c5648a53f47e659eecd568fe77bb93e6b262cfa08801b42R185

suggests it's a delta, but then what's the difference between this and the "delay" field? 🤔

Comment on lines +167 to +169
tracing::warn!("cannot fetch parent chain head due to {e}");

// not throw errors, caller will retry
Contributor

I see, then maybe it would be better if this method wasn't allowed to return an error at all, to remove all ambiguity about whether it can fail or not.

// FIXME: make the fetching batching and concurrent
let mut results = vec![];
for h in starting_height..=latest_height {
let block_hash = agent_proxy.get_block_hash(h).await?;
Contributor

Why is this allowed to cause failure, when errors from .get_chain_head_height were swallowed?

Comment on lines +22 to +23
/// latest configuration information from the parent.
pub config: ValidatorSet,
Contributor

@aakoshh aakoshh Aug 14, 2023

Suggested change
/// latest configuration information from the parent.
pub config: ValidatorSet,
/// latest membership information from the parent.
pub validator_set: ValidatorSet,

You already call it membership in the interface, so that's a good name too, but config is such an overloaded term, and most of the time in this codebase it's about static settings of a component, not something as specific as the dynamic validator powers. I understand it's called configuration in the theory papers, but in this context it's not helpful IMO.

Contributor Author

I renamed it to validator_set, which should be better.

pub struct IPCParentFinality {
/// The latest chain height
pub height: u64,
/// The block hash. For FVM, it is a Cid. For Evm, it is bytes32.
Contributor

I'm not sure why we mention EVM here. Fendermint will have to deal with Tendermint parents, but not geth.

Contributor Author

It is possible to have a purely EVM subnet, so I'm just being general here.

/// parent. Also it perform validation on incoming finality.
#[derive(Clone)]
pub struct ParentFinalityProvider<T> {
config: Config,
Contributor

Case in point: here's another config field, less than 10 lines away, with a completely different meaning.

Some(h) => h,
None => return Ok(None),
};
if latest_height < self.config.chain_head_delay {
Contributor

I thought chain_head_delay said it was a "number of blocks". How can it be compared with an absolute height?

Contributor

Ah, this is for when the parent chain is lower than the minimum. A comment would help here.


// FIXME: make the fetching batching and concurrent
let mut results = vec![];
for h in starting_height..=latest_height {
Contributor

So basically this is re-reading all of the last 900 block heights? There must be a way to handle re-orgs in a more relaxed way: push notifications, or maybe comparing the last height read with what you have on record to see if there was a change?

}

let confident_height = latest_height - self.config.chain_head_delay;
self.finality_proposal_at_height(confident_height)
Contributor

It looks like we never access the data in anything above latest_height - chain_head_delay, so I wonder why the syncer even fetches those? It could always just ask the parent chain about the level which it thinks is final, and save itself the trouble of querying the IPC Agent at every poll interval for the last 900 or so heights, only to look at the 900th and nothing else.

Contributor Author

It's for preparing the data in advance. This could be queried during process_proposal: someone else may send a height that is greater than latest_height - chain_head_delay, and we still need to have fetched the data in advance so that we can validate it.

let this_finality = match self.finality_proposal_at_height(other_finality.height) {
Ok(Some(finality)) => finality,
_ => {
tracing::info!("cannot create next finality, check return false");
Contributor

Suggested change
tracing::info!("cannot create next finality, check return false");
tracing::debug!("cannot create next finality, check return false");

I doubt you will want to see this often in the production log.

}

pub fn check_finality(&self, other_finality: &IPCParentFinality) -> bool {
let this_finality = match self.finality_proposal_at_height(other_finality.height) {
Contributor

I may be wrong but I think this isn't entirely what we wanted.

As I understand it, check_finality is used to ascertain that a top-down checkpoint proposed by some validator is indeed final according to our view. (It would be great if the method contained docstrings.)

When I look at next_finality_proposal, it contains the logic to subtract the delay from the latest height, and then it simply calls finality_proposal_at_height, which itself just constructs the data but doesn't check whether the height it's asked about is indeed final.

By calling finality_proposal_at_height in check_finality, we only check the contents of the proposal, but not its actual finality, i.e. not that it's been sufficiently embedded.

Contributor Author

"but not its actual finality": do you mean the height is not the current latest height?


pub fn check_finality(&self, other_finality: &IPCParentFinality) -> bool {
let this_finality = match self.finality_proposal_at_height(other_finality.height) {
Ok(Some(finality)) => finality,
Contributor

Suggested change
Ok(Some(finality)) => finality,
Ok(Some(finality)) => finality == *other_finality,

And then you don't have to use return.
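For context, the whole check could then read roughly like this (a sketch of the suggestion, not the committed code):

pub fn check_finality(&self, other_finality: &IPCParentFinality) -> bool {
    match self.finality_proposal_at_height(other_finality.height) {
        Ok(Some(finality)) => finality == *other_finality,
        _ => {
            tracing::debug!("cannot create next finality, check returns false");
            false
        }
    }
}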

agent_proxy: &Arc<AgentProxy>,
lock: &LockedCache,
) -> anyhow::Result<Vec<(BlockHeight, Vec<u8>)>> {
let latest_height = match agent_proxy.get_chain_head_height().await {
Contributor

Would it be technically possible for get_chain_head_height to return a lower height than what came before?

Say the parent was a longest-chain protocol and it was undergoing a re-org, that is, it was switching from a shorter to a longer fork: first it rolled back the shorter fork to the greatest common ancestor, then moved forward on the new fork. If we queried during the rollback, it might appear to go backwards, and then it would leave some invalid block hashes in the cache.

For example if we had this:

B0 - B1 - B2 - B3 - B4
   \ 
     B1' - B2' - B3' - B4' - B5'
  1. Our client is at B4, latest height is 4
  2. The parent decides to roll back to B0 and then forward to B5'
  3. We query it during the rollback when it's at B3', we fetch B1', B2', B3'
  4. We insert these into the cache, which now contains B0, B1', B2', B3', B4

B4 is not a descendant of B3', but it was left there because the fetched range doesn't include it.

If you included clearing out the cache above the newly observed chain head, rather than just doing inserts, this problem would go away.
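A minimal sketch of that idea, assuming the cache were keyed by height in a BTreeMap (names are illustrative):

use std::collections::BTreeMap;

// When refreshing the view, drop any cached entries above the newly observed
// chain head before inserting, so stale fork blocks (like B4 above) go away.
fn refresh_block_hashes(
    cache: &mut BTreeMap<u64, Vec<u8>>,
    observed_head: u64,
    fetched: Vec<(u64, Vec<u8>)>,
) {
    cache.retain(|height, _| *height <= observed_head);
    for (height, hash) in fetched {
        cache.insert(height, hash);
    }
}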

Contributor Author

Good point raised, will take note of this one.

@@ -30,6 +30,7 @@ rand = { workspace = true, optional = true }
fendermint_vm_encoding = { path = "../encoding" }
fendermint_vm_actor_interface = { path = "../actor_interface" }
fendermint_testing = { path = "../../testing", optional = true }
fendermint_ipc = { path = "../../ipc" }
Contributor

Please don't add a reference to something as "heavy" as the IPC crate. The fendermint_vm_message crate is a dependency of fendermint_rpc, so it should be a thin DTO layer concerned only with messages and their basic validation (like signatures), and not drag in the entire tokio stack.

Contributor

@aakoshh aakoshh left a comment

@cryptoAtwill this is a great effort at tackling the top-down story, which is probably big enough to be called an epic. There is clearly a discovery process going on here while we figure out the best way to add this to Fendermint.

I think it would be a good idea to make a conscious effort to greatly limit the scope of these PRs. Let's think about how we could break up an issue if it seems too big, and discuss early and often.

For example, a good first PR for this would have been one that contains just the top-down finality message proposal and nothing else. Then we can discuss whether we all agree that it's in the right place and has the right content. Next we can add the parent view, without hooking it into the processing, again so we can agree about its implementation, add tests, etc.

height: BlockHeight,
) -> anyhow::Result<Option<IPCParentFinality>> {
let block_hash = match self.parent_view_provider.block_hash(height) {
None => return Err(anyhow!("block hash cannot be fetched at {height}")),
Contributor

Probably this method should not be public, and then you could change it so that it cannot return an Err.

An Option already seems to capture the fact that the height in the other_finality might be unknown to us, and in the other direction, when we construct our own proposal, it should be there, although I would say it might not be if the cache is undergoing a rollback and there is unfortunate timing between latest_height() and finality_proposal_at_height().

And this is why STM would be a boon: you can do both operations in one atomic unit, without having to take out and release the RwLock you have twice.

return Ok(None);
}

let confident_height = latest_height - self.config.chain_head_delay;
Contributor

We must ensure this cannot go backwards, which could be the case if the parent rolls back, we catch it while it is doing a reorg, and we subtract a fixed delay. I commented in parent.rs that I'm not sure whether this could happen; it depends on how the agent is implemented. And then there is your cache, which remembers its upper bound even if data was removed, but that seems a bit accidental rather than intentional. It would be nice to have some quickcheck tests to ensure invariants are maintained.

Another scenario is what I mentioned about the startup: it should pick up the state from the actor, so even if Fendermint was pointed at a different parent node, one that might be re-syncing from genesis, it won't propose an earlier finality.

self.read_cache(|cache| cache.membership.clone())
}

fn on_finality_committed(&self, finality: &IPCParentFinality) {
Contributor

@aakoshh aakoshh Aug 14, 2023

So here we remove all entries from the cache up to the finality height, which is calculated elsewhere as latest - delay. Yet the parent fetcher will fill data in each loop between latest - lower_bound and latest. Doesn't that mean that there is, or should be, no difference between delay and lower_bound? Otherwise what one deletes, the other puts back. Or actually it can't put it back because of the cache cutoff, but it still keeps fetching it.

Contributor

This dissonance would disappear if you only kept one value, the finality-delay, and in the fetcher you:

  1. probed the last height you fetched before
  2. if it's the same as before, fetch the next region up to the latest height
  3. if it's not the same, find the greatest common ancestor (walk backwards either linearly, or skipping), and then re-fetch up to the latest height, and replace

That way you only fetch what you need, and never anything beyond the finality height (if it was indeed final). Under these conditions the cache cutoffs wouldn't be required, because if the configuration is right, you should never fetch lower than the lowest data. A sketch of that fetch loop is below.
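A sketch of that fetch step (standalone and illustrative; `parent_hash_at` stands in for an IPC agent query, and the cache is a plain BTreeMap rather than the PR's types):

use std::collections::BTreeMap;

fn fetch_step(
    cache: &mut BTreeMap<u64, Vec<u8>>,
    latest: u64,
    parent_hash_at: impl Fn(u64) -> Vec<u8>,
) {
    let last_fetched = cache.keys().next_back().copied().unwrap_or(0);
    if cache.get(&last_fetched) == Some(&parent_hash_at(last_fetched)) {
        // No re-org: extend the cache with the region (last_fetched, latest].
        for h in (last_fetched + 1)..=latest {
            cache.insert(h, parent_hash_at(h));
        }
    } else {
        // Re-org: walk back to the greatest common ancestor, then replace.
        let mut ancestor = last_fetched;
        while ancestor > 0 && cache.get(&ancestor) != Some(&parent_hash_at(ancestor)) {
            ancestor -= 1;
        }
        cache.split_off(&(ancestor + 1)); // drop everything above the ancestor
        for h in (ancestor + 1)..=latest {
            cache.insert(h, parent_hash_at(h));
        }
    }
}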

Contributor

Similarly upon startup, you can initialize the component with the last known finalized block, then check if it's still the same on the parent: if it doesn't exist, the parent is syncing; if it's different, it's a fatal error because the subnet state diverged from the parent. Then, go fetch until the latest height is reached, but be careful that we only start doing this once our own node is caught up with the network, or you might be fetching the entire parent chain (if our own node is going from genesis).

Once you have fetched all the region between the last known finalized block and the latest, then you can use the finality-delay to choose the next proposal.

@aakoshh
Contributor

aakoshh commented Aug 14, 2023

I would propose the following breakdown of PRs:

  1. Top-down IPC message
  2. An STM-enabled component to accumulate the data coming from the parent, one which captures the invariants of the possible transitions (atomic reorgs) in tests, and exposes the relevant queries (and nothing more).
  3. A parent syncer that feeds this STM component with data, including the initialization requirements (never starting from an empty state; at least we should be aware of genesis), with a hook for starting once self-chain-sync has finished.
  4. The smart contract that keeps the data, including the hooks in the actor_interface to query it.
  5. Hook it into the interpreters, disabled by default because there are no parents in end-to-end tests at the moment, and a Fendermint root should be possible as well.
  6. How do we add this to the end-to-end tests? Maybe do Fendermint-only setups in the current CI?

Open to any other recommendations.


#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub enum IPCMessage {
TopDown(IPCParentFinality),
Contributor

Ah, yes, almost forgot: because every validator has the actual content of this from its own parent, there is no need to propose it in its entirety.

The IPCParentFinality (isn't there a better name? IpcTopDownBundle?) can have an unknown number of messages - imagine that the subnet nodes are restarting after a relatively long hiatus, and now the proposal covers multiple heights finalized on the parent, hours' worth or something, and we are pushing thousands of cross messages to each other while timing out and screaming "I know!" because we already have them in memory locally too!

Instead of including everything in the proposal, we can include just the height/nonce of the messages up to which point we want to execute, and the CID of the resulting data structure that has all those messages in it, including the membership set up to that point (not everyone's different idea of latest).

Then everyone can construct their view of what that finality cutoff should contain and what its CID should be, and decide whether to vote on it or not.

This will also require a way to actually retrieve that CID from the others in case you don't have it during execution, so it's a good idea to save it into the blockstore, once it's agreed.
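A sketch of what such a compact proposal might carry (field names are illustrative, not an agreed format):

use cid::Cid;

// Propose only the cutoff point and a commitment to the content, instead of
// carrying every cross message and the full membership in the proposal itself.
pub struct CompactTopDownProposal {
    /// Parent height up to which we want to execute.
    pub height: u64,
    /// Top-down message nonce up to which we want to execute.
    pub nonce: u64,
    /// CID of the data structure holding those messages and the membership
    /// up to that point; peers rebuild it locally and compare.
    pub content: Cid,
}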

Contributor Author

I think using a Cid to replace all the top-down messages will be a speed boost. But the issue, like you mentioned, is that if someone does not have this Cid stored, there needs to be a way to retrieve it from other peers. Currently this mechanism is missing. Or maybe I misunderstood.

@cryptoAtwill cryptoAtwill changed the title Top down Top down discussion Aug 18, 2023
@cryptoAtwill cryptoAtwill deleted the top-down branch September 4, 2023 08:40