Sei remote sign hotfix: aggregate tcp chunks before unmarshal proto #1

Open · wants to merge 3 commits into base: main · Changes from 1 commit
41 changes: 31 additions & 10 deletions src/rpc.rs
@@ -7,12 +7,9 @@ use crate::privval::SignableMsg;
 use prost::Message as _;
 use std::io::Read;
 use tendermint::{chain, Proposal, Vote};
+use tendermint_p2p::secret_connection::DATA_MAX_SIZE;
 use tendermint_proto as proto;
 
-// TODO(tarcieri): use `tendermint_p2p::secret_connection::DATA_MAX_SIZE`
-// See informalsystems/tendermint-rs#1356
-const DATA_MAX_SIZE: usize = 262144;
-
Comment on lines +10 to -15
We need to confirm that this change doesn't affect other things once Sei works.

Author:
Yes, sure, that makes sense! Should I revert it for now, then?

The reasoning behind this change is: I might not see the full picture, but the p2p library reads chunks of this size internally, so it doesn't actually make sense to have a larger buffer here. That's the same thing the comment says.
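For reference, a minimal sketch of that buffer-sizing point, assuming `read_msg` in `src/rpc.rs` is shaped roughly like this (the real implementation may differ):

```rust
// Sketch only (assumed shape of read_msg): a single read from the secret
// connection returns at most DATA_MAX_SIZE bytes, so a larger local buffer
// (such as the removed 262144-byte constant) would never be filled in one call.
use std::io::Read;
use tendermint_p2p::secret_connection::DATA_MAX_SIZE;

fn read_msg(conn: &mut impl Read) -> std::io::Result<Vec<u8>> {
    let mut buf = vec![0u8; DATA_MAX_SIZE];
    let n = conn.read(&mut buf)?;
    buf.truncate(n);
    Ok(buf)
}
```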

 use crate::{
     error::{Error, ErrorKind},
     prelude::*,
@@ -31,12 +28,36 @@ pub enum Request {
 impl Request {
     /// Read a request from the given readable.
     pub fn read(conn: &mut impl Read, expected_chain_id: &chain::Id) -> Result<Self, Error> {
-        let msg_bytes = read_msg(conn)?;
-
-        // Parse Protobuf-encoded request message
-        let msg = proto::privval::Message::decode_length_delimited(msg_bytes.as_ref())
-            .map_err(|e| format_err!(ErrorKind::ProtocolError, "malformed message packet: {}", e))?
-            .sum;
+        let mut msg_bytes: Vec<u8> = vec![];
+        let msg;
+
+        // fix for Sei: collect incoming bytes of Protobuf from incoming msg
+        loop {

I wonder if this logic should go into tendermint_proto, e.g. proto::privval::Message::decode().

Author:

Maybe not to proto, since proto shouldn't care about transport; rather to tendermint_p2p, as the tmkms author mentioned.

+            let mut msg_chunk = read_msg(conn)?;
+            let chunk_len = msg_chunk.len();
+            msg_bytes.append(&mut msg_chunk);
+
+            // if we can decode it, great, break the loop
+            match proto::privval::Message::decode_length_delimited(msg_bytes.as_ref()) {
+                Ok(m) => {
+                    msg = m.sum;
+                    break;
+                }
+                Err(e) => {
+                    // if chunk_len < DATA_MAX_SIZE (1024) we assume it was the end of the message and it is malformed
For a cleaned-up version, I would remove any hardcoded numbers, i.e. the 1024; DATA_MAX_SIZE is enough.

+                    if chunk_len < DATA_MAX_SIZE {
+                        return Err(format_err!(
+                            ErrorKind::ProtocolError,
+                            "malformed message packet: {}",
+                            e
+                        )
+                        .into());
+                    }
+                    // otherwise, we go to start of the loop assuming next chunk(s)
+                    // will fill the message
+                }
+            }
Comment on lines +41 to +59
So... if decode_length_delimited() fails, we add the new chunk to the data, and try again, correct?
Though if the chunk len is < DATA_MAX_SIZE, we fail?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. The idea is: when we get a chunk, if it is < 1024 bytes, we assume it is the end of the message. Or, if it is exactly 1024 bytes and we can parse the aggregated buffer, then it is the end of the message as well.

This is far from elegant; however, I don't know what else we can do, given that tmkms can't know the full length of an incoming chunked message (see also the explanation in the description).

https://github.com/sei-protocol/sei-tendermint/blob/main/privval/secret_connection.go#L210-L219
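To make the termination rule concrete, here is a self-contained sketch of the same aggregate-then-decode idea. It is not the tmkms code itself: `read_aggregated` and the generic `decode` parameter are illustrative names, and `DATA_MAX_SIZE` is assumed to be the 1024-byte secret-connection chunk size.

```rust
use std::io::Read;

const DATA_MAX_SIZE: usize = 1024; // mirrors tendermint_p2p::secret_connection::DATA_MAX_SIZE

/// Illustrative helper: keep appending chunks until the aggregate decodes, or
/// a short chunk signals the end of the message while decoding still fails.
fn read_aggregated<T, E: std::fmt::Display>(
    conn: &mut impl Read,
    decode: impl Fn(&[u8]) -> Result<T, E>,
) -> Result<T, String> {
    let mut msg_bytes: Vec<u8> = Vec::new();
    loop {
        // Each read from the (secret) connection yields at most DATA_MAX_SIZE bytes.
        let mut chunk = vec![0u8; DATA_MAX_SIZE];
        let chunk_len = conn.read(&mut chunk).map_err(|e| e.to_string())?;
        msg_bytes.extend_from_slice(&chunk[..chunk_len]);

        match decode(&msg_bytes) {
            // The aggregated buffer decodes: the message is complete.
            Ok(msg) => return Ok(msg),
            // A short chunk means the sender had nothing more to send, so a
            // decode failure at this point is a genuine protocol error.
            Err(e) if chunk_len < DATA_MAX_SIZE => {
                return Err(format!("malformed message packet: {}", e))
            }
            // A full-sized chunk: assume more chunks follow and keep reading.
            Err(_) => (),
        }
    }
}
```

In the PR itself, `decode` corresponds to `proto::privval::Message::decode_length_delimited` and the reads go through `read_msg`.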


I would add the GH URL as a comment to explain where this assumption came from. Those few lines explain everything:
https://github.com/cometbft/cometbft/blob/ffd2d3f9475b6f101cc1d4c5ff94a4d928db6bb4/p2p/conn/secret_connection.go#L201-L203

(also please use the cometbft url I pasted, rather than sei-tendermint :) )
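For intuition, the framing the linked cometbft writer performs looks roughly like the sketch below (illustrative Rust, not the cometbft Go code; `frames` is a made-up name). It also shows the edge case where a message is an exact multiple of DATA_MAX_SIZE, which is why the reader above also exits as soon as the aggregated bytes decode.

```rust
/// Illustrative only: a sender splits each message into frames of at most
/// `data_max_size` bytes, so only the final frame can be short.
fn frames(msg: &[u8], data_max_size: usize) -> Vec<&[u8]> {
    msg.chunks(data_max_size).collect()
}

fn main() {
    // A 2500-byte message arrives as 1024 + 1024 + 452.
    let sizes: Vec<usize> = frames(&vec![0u8; 2500], 1024).iter().map(|f| f.len()).collect();
    assert_eq!(sizes, vec![1024, 1024, 452]);

    // A 2048-byte message arrives as two full frames and no short trailer,
    // so the reader can only stop once the aggregated bytes decode.
    let sizes: Vec<usize> = frames(&vec![0u8; 2048], 1024).iter().map(|f| f.len()).collect();
    assert_eq!(sizes, vec![1024, 1024]);
}
```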


(Also) there is a gRPC endpoint that tmkms will use some day; Tony (the main maintainer of tmkms) has a PR somewhere to make it work, but it's not there yet.

So this code path will be deprecated one day, but for now it is the main logic that needs to be fixed.

+        }
 
         let (req, chain_id) = match msg {
             Some(proto::privval::message::Sum::SignVoteRequest(