
Migrate from protoc to quick-protobuf #3066

Closed
wants to merge 15 commits into from

Conversation


@kckeiks commented Oct 26, 2022

Description

According to perftest, quick-protobuf is faster than protobuf. Since Protobuf is fundamental to libp2p, the performance of the library we choose matters. The API that quick-protobuf generates is also much simpler than protobuf's.

Open Questions

  • Pros and cons of quick-protobuf.
  • List of unsupported features.
  • Is it likely that we will need any of those unsupported features?
  • Any other concerns?

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

@@ -174,42 +173,37 @@ impl Keypair {
}
};

-Ok(pk.encode_to_vec())
+Ok(quick_protobuf::serialize_into_vec(&pk).expect("Encoding to succeed."))
Contributor Author

Looking at the code, I believe this is infallible. Unfortunately, I have not found a concrete answer in the docs or issues.

Contributor

I believe so too. One error I could find is https://docs.rs/quick-protobuf/latest/src/quick_protobuf/writer.rs.html#379 which can't happen if the Vec is allocated correctly.

The other error path comes from lines like https://github.com/tafia/quick-protobuf/blob/d977371e05170a016f03a80512b2a925468e3a1a/quick-protobuf/examples/pb_rs/data_types.rs#L199 which end up being the same error case as above so I think we are good.

I believe this function could actually be made infallible in quick-protobuf but I am not sure.
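
For reference, a minimal usage sketch of the two helpers under discussion; the encode/decode wrappers are illustrative and not code from this PR:

    use quick_protobuf::{deserialize_from_slice, serialize_into_vec, MessageRead, MessageWrite};

    // Writing into a freshly allocated Vec should not fail, hence the `expect`
    // mirroring the diff above.
    fn encode<M: MessageWrite>(msg: &M) -> Vec<u8> {
        serialize_into_vec(msg).expect("Encoding to succeed.")
    }

    // Decoding arbitrary input can legitimately fail, so the error is propagated.
    fn decode<'a, M: MessageRead<'a>>(bytes: &'a [u8]) -> Result<M, quick_protobuf::Error> {
        deserialize_from_slice(bytes)
    }

Note that both helpers prefix the payload with a varint length, which becomes relevant further down this conversation.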

@kckeiks commented Oct 26, 2022

If we want to use quick-protobuf, I can start pushing patches.

@thomaseizinger left a comment

Thank you for making this patch!

> Is it likely that we will need any of those unsupported features?

I don't think so. We only use protobufs to have a language-agnostic message description language, so we are unlikely to use services, for example.

I am not quite sure what option refers to. Is that talking about optional fields from proto3?

> Any other concerns?

  • The verbosity of the build script is annoying but that shouldn't block a transition.
  • This is a bit concerning: Fix owned deref tafia/quick-protobuf#153. I am not liking the fact that there is unsafe in (some of) the generated code.

Overall, I like that we can use Cow in a few places now, and interacting with the generated code seems a lot simpler!

It also seems that @snproj has recently been active on the repository as a maintainer? I hope you don't mind being tagged! :)

I'd like to hear @mxinden's opinion before we move forward here. Personally, I am in favor, assuming the above unsafe problem is fixed.

core/build.rs Outdated
Comment on lines 25 to 48
let out_dir = std::env::var("OUT_DIR").unwrap();
let out_dir = Path::new(&out_dir).join("protos");

let in_dir = PathBuf::from(::std::env::var("CARGO_MANIFEST_DIR").unwrap()).join("src");

// Find all *.proto files in the `in_dir` and add them to the list of files
let mut protos = Vec::new();
let proto_ext = Some(Path::new("proto").as_os_str());

for entry in std::fs::read_dir(&in_dir).unwrap() {
    let path = entry.unwrap().path();
    if path.extension() == proto_ext {
        protos.push(path);
    }
}

// Delete all old generated files before re-generating new ones
if out_dir.exists() {
    std::fs::remove_dir_all(&out_dir).unwrap();
}

std::fs::DirBuilder::new().create(&out_dir).unwrap();
let config_builder = ConfigBuilder::new(&protos, None, Some(&out_dir), &[in_dir]).unwrap();
FileDescriptor::run(&config_builder.build()).unwrap()
Contributor

This is kind of annoying boilerplate but I could live with that. Worst case we could always make a little crate that abstracts things away for us.

Comment on lines 181 to 170
let mut private_key: zeroize::Zeroizing<keys_proto::PrivateKey> =
quick_protobuf::deserialize_from_slice(bytes)
Contributor

Nit: I'd prefer using turbo-fish here.

Suggested change
-let mut private_key: zeroize::Zeroizing<keys_proto::PrivateKey> =
-    quick_protobuf::deserialize_from_slice(bytes)
+let mut private_key =
+    quick_protobuf::deserialize_from_slice::<keys_proto::PrivateKey>(bytes)

-r#type: keys_proto::KeyType::Ed25519 as i32,
-data: key.encode().to_vec(),
+Type: keys_proto::KeyType::Ed25519,
+Data: Cow::from(key.encode().to_vec()),
Contributor

I think the usage of Cow here would actually allow us to remove an allocation. An Ed25519 key in particular can also be viewed as a byte slice, so we could use Cow::Borrowed here.

Contributor Author

Seems like encode() returns an array, which creates a temporary that goes out of scope, so we can't remove the allocation.

Contributor

You should be able to add pub(crate) fn as_bytes(&self) -> &[u8] to ed25519::PublicKey which then allows the use of Cow::Borrowed.
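
A hedged sketch of that suggestion; the `as_bytes` accessor and the `to_proto` helper are illustrative and assume ed25519::PublicKey is a thin wrapper around an ed25519_dalek key:

    use std::borrow::Cow;

    // Assumption: PublicKey wraps ed25519_dalek::PublicKey, whose bytes can be
    // viewed as a slice without copying.
    impl PublicKey {
        pub(crate) fn as_bytes(&self) -> &[u8] {
            self.0.as_bytes()
        }
    }

    // The generated message can then borrow the key bytes instead of allocating.
    fn to_proto(key: &PublicKey) -> keys_proto::PublicKey<'_> {
        keys_proto::PublicKey {
            Type: keys_proto::KeyType::Ed25519,
            Data: Cow::Borrowed(key.as_bytes()),
        }
    }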

Comment on lines 80 to 82
payload_type: Cow::from(self.payload_type),
payload: Cow::from(self.payload),
signature: Cow::from(self.signature),
Contributor

These could also be borrowed :)

-.map(|m| peer_record_proto::peer_record::AddressInfo {
-    multiaddr: m.to_vec(),
+.map(|m| peer_record_proto::mod_PeerRecord::AddressInfo {
+    multiaddr: Cow::from(m.to_vec()),
Contributor

This one could be borrowed too I think.

@dignifiedquire
Member

> This is a bit concerning: tafia/quick-protobuf#153. I am not liking the fact that there is unsafe in (some of) the generated code.

Looks like there was some good activity: this PR and a follow-up got merged and released in the last few days.

@snproj commented Nov 24, 2022

> It also seems that @snproj has recently been active on the repository as a maintainer? I hope you don't mind being tagged! :)

Hi! Didn't see this earlier 😅 Just popping in to say yep, my org intends to maintain quick-protobuf. Let us know if you've got any concerns if you choose to migrate!

Since you mentioned the generated API, could I use that as an excuse to ask if you've got any opinions on this?

> I am not quite sure what option refers to. Is that talking about optional fields from proto3?

I think it refers to this, which is currently being ignored by pb-rs/src/parser.rs > option_ignore(). The proto3 distinction between optional and default (no keyword) fields isn't really supported yet, though; we have yet to implement hazzers.

@kckeiks commented Dec 4, 2022

I'll update this PR very soon.

@kckeiks commented Dec 5, 2022

This test is failing because quick_protobuf::deserialize_from_slice is returning a VarInt error when calling from_protobuf_encoding. I noticed that outputs from KeyPair::to_protobuf_encoding have the value 68 at the start, followed by 8. As an experiment, I prepended 68 to encoded and deserialize_from_slice did not fail (but the peer IDs did not match).

Does anyone know what the issue might be? Will investigate later today or tomorrow....

    #[test]
    fn keypair_from_protobuf_encoding() {
        // E.g. retrieved from an IPFS config file.
        let base_64_encoded = "CAESQL6vdKQuznQosTrW7FWI9At+XX7EBf0BnZLhb6w+N+XSQSdfInl6c7U4NuxXJlhKcRBlBw9d0tj2dfBIVf6mcPA=";
        let expected_peer_id =
            PeerId::from_str("12D3KooWEChVMMMzV8acJ53mJHrw1pQ27UAGkCxWXLJutbeUMvVu").unwrap();
        let encoded = base64::decode(base_64_encoded).unwrap();
        let keypair = Keypair::from_protobuf_encoding(&encoded).unwrap();
        let peer_id = keypair.public().to_peer_id();
        assert_eq!(expected_peer_id, peer_id);
    }

@kckeiks commented Dec 5, 2022

> This test is failing because quick_protobuf::deserialize_from_slice is returning a VarInt error when calling from_protobuf_encoding. […]
tafia/quick-protobuf#202

Comment on lines +17 to +59
#[allow(clippy::derive_partial_eq_without_eq)]
#[derive(Debug, Default, PartialEq, Clone)]
pub struct Envelope<'a> {
    pub public_key: Option<keys_proto::PublicKey<'a>>,
    pub payload_type: Cow<'a, [u8]>,
    pub payload: Cow<'a, [u8]>,
    pub signature: Cow<'a, [u8]>,
}

impl<'a> MessageRead<'a> for Envelope<'a> {
    fn from_reader(r: &mut BytesReader, bytes: &'a [u8]) -> Result<Self> {
        let mut msg = Self::default();
        while !r.is_eof() {
            match r.next_tag(bytes) {
                Ok(10) => msg.public_key = Some(r.read_message::<keys_proto::PublicKey>(bytes)?),
                Ok(18) => msg.payload_type = r.read_bytes(bytes).map(Cow::Borrowed)?,
                Ok(26) => msg.payload = r.read_bytes(bytes).map(Cow::Borrowed)?,
                Ok(42) => msg.signature = r.read_bytes(bytes).map(Cow::Borrowed)?,
                Ok(t) => { r.read_unknown(bytes, t)?; }
                Err(e) => return Err(e),
            }
        }
        Ok(msg)
    }
}

impl<'a> MessageWrite for Envelope<'a> {
    fn get_size(&self) -> usize {
        0
        + self.public_key.as_ref().map_or(0, |m| 1 + sizeof_len((m).get_size()))
        + if self.payload_type == Cow::Borrowed(b"") { 0 } else { 1 + sizeof_len((&self.payload_type).len()) }
        + if self.payload == Cow::Borrowed(b"") { 0 } else { 1 + sizeof_len((&self.payload).len()) }
        + if self.signature == Cow::Borrowed(b"") { 0 } else { 1 + sizeof_len((&self.signature).len()) }
    }

    fn write_message<W: WriterBackend>(&self, w: &mut Writer<W>) -> Result<()> {
        if let Some(ref s) = self.public_key { w.write_with_tag(10, |w| w.write_message(s))?; }
        if self.payload_type != Cow::Borrowed(b"") { w.write_with_tag(18, |w| w.write_bytes(&**&self.payload_type))?; }
        if self.payload != Cow::Borrowed(b"") { w.write_with_tag(26, |w| w.write_bytes(&**&self.payload))?; }
        if self.signature != Cow::Borrowed(b"") { w.write_with_tag(42, |w| w.write_bytes(&**&self.signature))?; }
        Ok(())
    }
}
@thomaseizinger commented Dec 6, 2022

Wow this is actually so decent to look at!

Contributor Author

haha yup

use pb_rs::{types::FileDescriptor, ConfigBuilder};
use std::path::{Path, PathBuf};

fn main() {
Contributor

We will need a CI check that ensures that the generated code is not modified!

Ideally, the CI check can use glob patterns so we auto-detect when new .proto files are added to the codebase.

Contributor Author

I'm not too familiar with GitHub Actions so I have some questions. Do we want the build to fail when these files are modified? Is there a way to turn this off when we legitimately need to update these files? Also, I think the build should not fail when someone adds a new generated file along with its .proto file?

Another option is to make a warning comment on the PR when proto/* files are added or modified. A simpler approach, which I've seen some projects do (mitmproxy, for example), is to add these proto/ directories to .gitignore.

Thoughts?

Possible solutions:

Contributor

I think we should never modify these files. I was thinking something along the lines of:

  • Run protobuf compiler with some glob pattern capturing all .proto files
  • Run git diff --exit-code

If anyone modifies a generated file, git diff will fail the workflow.

This way, CI will only pass if there is a generated code file for each .proto file that is completely untouched, resulting in the exact same behaviour as if we were to generate it at compile-time.
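
A hedged sketch of that check, written here as a small Rust helper rather than a workflow file; the `glob` dependency and the exact pb-rs invocation are assumptions on top of what is described above:

    use std::process::Command;

    fn main() {
        // Collect every .proto file in the repository.
        let protos: Vec<_> = glob::glob("**/*.proto")
            .expect("valid glob pattern")
            .filter_map(Result::ok)
            .collect();

        // Regenerate the Rust code for all of them (roughly `pb-rs **/*.proto`).
        let generated = Command::new("pb-rs")
            .args(&protos)
            .status()
            .expect("failed to run pb-rs");
        assert!(generated.success(), "code generation failed");

        // Any difference between the regenerated files and what is checked in
        // fails the check, exactly like `git diff --exit-code` in CI.
        let diff = Command::new("git")
            .args(["diff", "--exit-code"])
            .status()
            .expect("failed to run git diff");
        assert!(diff.success(), "generated protobuf code was modified or is out of date");
    }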

@thomaseizinger
Contributor

> This test is failing because quick_protobuf::deserialize_from_slice is returning a VarInt error when calling from_protobuf_encoding. […]
>
> tafia/quick-protobuf#202

Well that is a bit of a weird one. Does the workaround mentioned in the issue solve things or are we blocked here?

@kckeiks commented Dec 6, 2022

> Well that is a bit of a weird one. Does the workaround mentioned in the issue solve things or are we blocked here?

Not a blocker. We're good as long as we only use deserialize_from_slice and serialize_into_vec.

@mxinden linked an issue Dec 7, 2022 that may be closed by this pull request
mergify bot commented Dec 13, 2022

This pull request has merge conflicts. Could you please resolve them @kckeiks? 🙏

@thomaseizinger left a comment

Let me know when this is ready for another full review!

@@ -1,3 +1,9 @@
# 0.39.0 [unreleased]
Contributor

Suggested change
-# 0.39.0 [unreleased]
+# 0.38.1 [unreleased]

I'd hope that we can ship this as a backwards-compatible change.

@@ -57,12 +57,15 @@

mod io;
mod protocol;
#[allow(clippy::derive_partial_eq_without_eq)]
Contributor

This is also applied within the file, isn't it?

@@ -1,3 +1,9 @@
# 0.42.0 [unreleased]
Contributor

Suggested change
-# 0.42.0 [unreleased]
+# 0.41.1 [unreleased]

Same here, it should be a backwards-compatible change.

core/Cargo.toml Outdated
@@ -54,7 +54,7 @@ rmp-serde = "1.0"
serde_json = "1.0"

[build-dependencies]
prost-build = "0.11"
pb-rs = "0.10.0"
Contributor

We should no longer need this, right?

@thomaseizinger
Contributor

Please merge master instead of force-pushing. We squash-merge anyway, and it makes reviews easier :)

@kckeiks commented Dec 15, 2022

It's ready for another review. Just pushed df4d1c1 and afe818e.

@thomaseizinger left a comment

Nice, this is definitely heading in the right direction! Thank you!

Things to look out for:

  • Before we convert more crates, I'd like to verify that we can reasonably easily write a CI script that checks the generated code for modifications. I've just tried it locally and pb-rs **/*.proto seems to do the trick! I'd like to see this sooner rather than later in the progress of this PR. I think a good next step would be to move all proto files into the directories we want, generate the corresponding Rust code and add the CI job that verifies it all. From there, we can continue to convert more crates to actually use the newly generated code.
  • We currently rely on prost-codec in various crates. Now that we are no longer using prost, that will have to be renamed (and re-released) to quick-protobuf-codec. I'd like to verify that we can actually build the same abstraction, i.e. an implementation of asynchronous-codec::Encoder and asynchronous-codec::Decoder for structs generated using quick-protobuf.
  • In addition to the CI script, we can remove the installation of protoc from all workflows.

Thanks again for tackling this, I am very excited to land this improvement. It is a big plus for our CI pipeline and removes an annoying step for people getting started hacking on rust-libp2p.

-r#type: keys_proto::KeyType::Ed25519 as i32,
-data: key.encode().to_vec(),
+Type: keys_proto::KeyType::Ed25519,
+Data: Cow::from(key.encode().to_vec()),
Contributor

You should be able to add pub(crate) fn as_bytes(&self) -> &[u8] to ed25519::PublicKey which then allows the use of Cow::Borrowed.

Comment on lines -352 to -354
-let base_64_encoded = "CAESQL6vdKQuznQosTrW7FWI9At+XX7EBf0BnZLhb6w+N+XSQSdfInl6c7U4NuxXJlhKcRBlBw9d0tj2dfBIVf6mcPA=";
+let base_64_encoded = "RAgBEkC+r3SkLs50KLE61uxViPQLfl1+xAX9AZ2S4W+sPjfl0kEnXyJ5enO1ODbsVyZYSnEQZQcPXdLY9nXwSFX+pnDw";
 let expected_peer_id =
     PeerId::from_str("12D3KooWEChVMMMzV8acJ53mJHrw1pQ27UAGkCxWXLJutbeUMvVu").unwrap();
Contributor

Why does this test change? We should not need to change this test.

Contributor Author

This changes because of this #3066 (comment). I've been thinking about this and feel uneasy about it. Should we instead use this pattern tafia/quick-protobuf#202 (comment)? That way we're consistent with the official encoding.

Contributor

> Should we instead use this pattern tafia/quick-protobuf#202 (comment)? That way we're consistent with the official encoding.

I think so yes. We need to be compatible here so this test should not need changing!
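
For reference, a hedged sketch of that pattern: driving the reader and writer directly so no varint length prefix is written or expected. The helper names are illustrative, and `Writer::new(&mut Vec<u8>)` is assumed to work the same way `serialize_into_vec` uses it internally:

    use quick_protobuf::{BytesReader, MessageRead, MessageWrite, Writer};

    // Encode without the length prefix that `serialize_into_vec` would add.
    fn encode_unprefixed<M: MessageWrite>(msg: &M) -> Result<Vec<u8>, quick_protobuf::Error> {
        let mut buf = Vec::with_capacity(msg.get_size());
        let mut writer = Writer::new(&mut buf);
        msg.write_message(&mut writer)?;
        Ok(buf)
    }

    // Decode a message that is *not* length-prefixed, e.g. the IPFS config
    // fixture in the test above.
    fn decode_unprefixed<'a, M: MessageRead<'a>>(bytes: &'a [u8]) -> Result<M, quick_protobuf::Error> {
        let mut reader = BytesReader::from_bytes(bytes);
        M::from_reader(&mut reader, bytes)
    }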

Comment on lines -71 to -76
pub(crate) fn unknown_key_type(key_type: i32) -> Self {
    Self {
        msg: format!("unknown key-type {key_type}"),
        source: None,
    }
}
Contributor

🎉

@@ -18,12 +18,12 @@
// FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
// DEALINGS IN THE SOFTWARE.

-use crate::structs_proto;
+use crate::protos::structs as structs_proto;
Contributor

I am not a fan of renaming modules on import. It would be cool if we could just access all structs under a proto namespace.

To achieve that, would it make sense if we:

  1. Rename the protos module to generated
  2. Create a proto module in lib.rs as mod proto {}
  3. Re-export all types that we need from there to flatten out the module hierarchy.
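
A hedged sketch of that layout; the module and type names are illustrative, not taken from this PR:

    // lib.rs: keep the pb-rs output private and expose a flat `proto` namespace.
    mod generated; // untouched pb-rs output, e.g. generated::structs

    pub(crate) mod proto {
        // Re-export only what the crate actually uses, flattening the hierarchy.
        pub(crate) use super::generated::structs::Identify;
    }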

mergify bot commented Dec 19, 2022

This pull request has merge conflicts. Could you please resolve them @kckeiks? 🙏

@kckeiks commented Dec 21, 2022

I will have some bandwidth at the end of this week to work on this.

@thomaseizinger
Contributor

> I will have some bandwidth at the end of this week to work on this.

As a heads-up: Response times on our end might be slow over the next 2-3 weeks. Personally, I'll be pretty much off the grid for about two weeks from next Monday.

@kckeiks commented Jan 7, 2023

Moving to #3312

@thomaseizinger
Contributor

> We currently rely on prost-codec in various crates. Now that we are no longer using prost, that will have to be renamed (and re-released) to quick-protobuf-codec. I'd like to verify that we can actually build the same abstraction, i.e. an implementation of asynchronous-codec::Encoder and asynchronous-codec::Decoder for structs generated using quick-protobuf.

@mxinden I've been thinking. Instead of releasing another crate, would you accept a PR to https://github.com/mxinden/asynchronous-codec with a feature-flag for decoding protobuf messages via quick-protobuf? The library already has support for JSON and CBOR so we could also add other formats to it. That would save us from maintaining and releasing one more crate.

mergify bot commented Jan 11, 2023

This pull request has merge conflicts. Could you please resolve them @kckeiks? 🙏

@mxinden commented Jan 11, 2023

> @mxinden I've been thinking. Instead of releasing another crate, would you accept a PR to https://github.com/mxinden/asynchronous-codec with a feature-flag for decoding protobuf messages via quick-protobuf? The library already has support for JSON and CBOR so we could also add other formats to it. That would save us from maintaining and releasing one more crate.

Sure thing! Happy to merge a patch into asynchronous-codec. I think what was holding me back back then, and what made me introduce a new crate (prost-codec) instead, was the fact that we prefix Protobufs with a varint, which at the same time was a libp2p concept to me. Thinking about it some more, it might as well be a generally useful thing and thus worth merging into asynchronous-codec directly.

@kckeiks mentioned this pull request Jan 12, 2023
@thomaseizinger
Contributor

> Sure thing! Happy to merge a patch into asynchronous-codec. I think what was holding me back back then, and what made me introduce a new crate (prost-codec) instead, was the fact that we prefix Protobufs with a varint, which at the same time was a libp2p concept to me. Thinking about it some more, it might as well be a generally useful thing and thus worth merging into asynchronous-codec directly.

I see your concern but given that protobuf serializations themselves aren't self-descriptive in that way, some strategy is needed. I think exposing a codec that always uses varint for that is a sensible choice. We can try and generalize that once someone comes along with a different use case.

@kckeiks Would you be up for extending asynchronous-codec with this discussed feature?

@kckeiks commented Jan 12, 2023

> @kckeiks Would you be up for extending asynchronous-codec with this discussed feature?

Why not :)

@kckeiks commented Jan 25, 2023

@thomaseizinger @mxinden I'm running into an issue over in mxinden/asynchronous-codec. prost-codec::Codec uses UviBytes which depends on asynchronous-codec so I'm running into a conflict with the traits.... I was thinking of replacing UviBytes with LengthCodec but UviBytes looks more robust. Any suggestions?

@thomaseizinger
Contributor

> @thomaseizinger @mxinden I'm running into an issue over in mxinden/asynchronous-codec. prost-codec::Codec uses UviBytes which depends on asynchronous-codec so I'm running into a conflict with the traits....

Ah damn. I see three solutions:

  1. Don't activate the asynchronous_codec codec feature in unsigned-varint and re-implement the codec in asynchronous-codec so you can use it for the new protobuf-codec as an implementation detail
  2. Do (1) but also make it publicly accessible and send a PR to unsigned-varint that removes the codec from there
  3. Create a dedicated crate as we did before where we can combine the two

Looking at it from first principles, (3) is the cleanest but requires us to have another crate again. Really, asynchronous-codec should probably not offer any codecs that require dependencies outside of the standard library. I'd say let's go with (3) and have a crate quick-protobuf-codec in misc, replacing prost-codec.

> I was thinking of replacing UviBytes with LengthCodec but UviBytes looks more robust. Any suggestions?

We can't do that because it would not be compatible on the network layer with other peers / implementations.

@kckeiks commented Jan 25, 2023

Sounds good. I prefer (3) as well.
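
For reference, a hedged sketch of what such a quick-protobuf-codec could look like; it assumes the asynchronous-codec 0.6 trait shapes, leans on the unsigned-varint crate for the prefix, skips maximum-length enforcement, and only works for messages generated without borrowed fields:

    use asynchronous_codec::{Decoder, Encoder};
    use bytes::BytesMut;
    use quick_protobuf::{MessageRead, MessageWrite};
    use std::{io, marker::PhantomData};

    /// Varint-length-prefixed protobuf codec, mirroring what prost-codec did.
    pub struct Codec<M> {
        _marker: PhantomData<M>,
    }

    impl<M> Codec<M> {
        pub fn new() -> Self {
            Codec { _marker: PhantomData }
        }
    }

    impl<M: MessageWrite> Encoder for Codec<M> {
        type Item = M;
        type Error = io::Error;

        fn encode(&mut self, item: M, dst: &mut BytesMut) -> Result<(), Self::Error> {
            // `serialize_into_vec` already emits `<varint length><message bytes>`,
            // which matches the framing prost-codec put on the wire.
            let bytes = quick_protobuf::serialize_into_vec(&item)
                .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?;
            dst.extend_from_slice(&bytes);
            Ok(())
        }
    }

    impl<M: for<'a> MessageRead<'a>> Decoder for Codec<M> {
        type Item = M;
        type Error = io::Error;

        fn decode(&mut self, src: &mut BytesMut) -> Result<Option<M>, Self::Error> {
            // Peek at the varint length prefix; wait for more data if it is incomplete.
            let (len, remaining) = match unsigned_varint::decode::usize(&src[..]) {
                Ok(ok) => ok,
                Err(unsigned_varint::decode::Error::Insufficient) => return Ok(None),
                Err(e) => return Err(io::Error::new(io::ErrorKind::InvalidData, e)),
            };
            let prefix_len = src.len() - remaining.len();
            if remaining.len() < len {
                return Ok(None); // the full frame has not been buffered yet
            }

            // `deserialize_from_slice` expects the same `<varint length><message>` layout.
            let frame = src.split_to(prefix_len + len);
            let message = quick_protobuf::deserialize_from_slice(&frame)
                .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
            Ok(Some(message))
        }
    }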

@chunningham mentioned this pull request Feb 2, 2023
@thomaseizinger
Contributor

Closing here in favor of #3312.

Successfully merging this pull request may close these issues:

  • Migrate from protoc to quick-protobuf and remove the buildscripts