
Avoid proto reflection for builtin types #54

Open · misberner opened this issue Aug 16, 2022 · 6 comments
Labels: enhancement (New feature or request)

Comments

@misberner (Contributor)
The optimized generated code for marshal, unmarshal, clone, etc. still resorts to generic, protoreflect-based logic for builtin/well-known types such as google.protobuf.Timestamp, google.protobuf.Duration, etc. Since these types are well known, it would be nice if optimized, unrolled code could be used for operations on them as well - especially as there seems to be a significant performance penalty for the "context switch" to reflection on each individual proto.Clone invocation (benchmark).

I'm happy to send a PR, but one thing I'd ask the library maintainers to chime in on is whether it's acceptable to have the generated code reference global helper functions from a package within this module (e.g., github.com/planetscale/vtprotobuf/support/...), or whether it would be preferable to just generate package-private helper functions for all referenced types on demand.
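For illustration, a shared helper along these lines is the kind of thing I have in mind; the package name and function names below are a hypothetical sketch, not existing vtprotobuf API. google.protobuf.Timestamp is just two varint fields (seconds = 1, nanos = 2), so the unrolled encoder is tiny:

```go
// Hypothetical support package; the name and API are illustrative only.
package wellknown

import "google.golang.org/protobuf/types/known/timestamppb"

// AppendTimestamp appends the wire-format encoding of ts to buf without
// touching protoreflect. google.protobuf.Timestamp has exactly two
// varint fields: seconds (field 1) and nanos (field 2).
func AppendTimestamp(buf []byte, ts *timestamppb.Timestamp) []byte {
	if ts == nil {
		return buf
	}
	if ts.Seconds != 0 {
		buf = append(buf, 0x08) // tag: field 1, wire type varint
		buf = appendVarint(buf, uint64(ts.Seconds))
	}
	if ts.Nanos != 0 {
		buf = append(buf, 0x10) // tag: field 2, wire type varint
		buf = appendVarint(buf, uint64(ts.Nanos)) // sign-extends negative nanos
	}
	return buf
}

func appendVarint(buf []byte, v uint64) []byte {
	for v >= 0x80 {
		buf = append(buf, byte(v)|0x80)
		v >>= 7
	}
	return append(buf, byte(v))
}
```

Generated code for any message with a Timestamp field could then call wellknown.AppendTimestamp directly instead of falling back to reflection.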

@vmg (Member) commented Aug 16, 2022

This would be a welcome contribution, because Vitess does use some well-known proto types.

> I'm happy to send a PR, but one thing I'd ask the library maintainers to chime in on is whether it's acceptable to have the generated code reference global helper functions from a package within this module

This would be something new, as the generated protobuf files don't have any dependencies, but at the same time it feels more elegant than spitting out the same (de)serialization code over and over for the same well-known types in different packages. Since this package doesn't really have any dependencies besides protobuf itself, I say let's try it this way.

If you have any arguments why generating package-private functions would be better, I'd love to hear them too. Cheers!

@howardjohn (Contributor)

I was looking into this a bit and it seems fairly complex.

I think we either need to use our own copy of the well-known types, which isn't great for compatibility (I think this is how gogo did it), OR we have totally special logic for them and do not implement the typical interfaces - since, unfortunately, we cannot just add a method on a foreign type in Go.

WDYT?
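To make the constraint concrete: Go only allows methods on types defined in the same package, so generated code cannot attach MarshalVT to *timestamppb.Timestamp. A rough sketch of the dispatch this forces (AppendTimestamp is the hypothetical helper sketched in the issue description):

```go
import (
	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/types/known/timestamppb"
)

// This does NOT compile, because Timestamp lives in a foreign package:
//
//	func (ts *timestamppb.Timestamp) MarshalVT() ([]byte, error) { ... }
//
// So generated code has to special-case well-known types via a type
// switch and call free functions instead of the usual MarshalVT method.
func marshalField(buf []byte, m proto.Message) ([]byte, error) {
	switch msg := m.(type) {
	case interface{ MarshalVT() ([]byte, error) }:
		b, err := msg.MarshalVT() // our own vtprotobuf-generated types
		return append(buf, b...), err
	case *timestamppb.Timestamp:
		return AppendTimestamp(buf, msg), nil // reflection-free fast path
	default:
		b, err := proto.Marshal(m) // protoreflect fallback for the rest
		return append(buf, b...), err
	}
}
```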

@howardjohn (Contributor)

I did some experimentation on this (marshaling https://github.com/envoyproxy/envoy/blob/5fefd9d4f78d169e3cb71f4e2c53a1b5d575334b/api/envoy/config/endpoint/v3/endpoint_components.proto#L88, fwiw). We have a variety of benchmarks for different sizes of protobufs.

Not all of these are pure protobuf numbers: the end-to-end benchmarks do a bit more than just protobuf encoding - but it's ~75% proto.

With just plain vtprotobuf, we only saw a 5-10% improvement.

With well-known types handled, we saw a 35-65% improvement in CPU, and up to an 80% reduction in allocations(!). That is actually enough to bring protobuf time to under 50% of our costs (from 75%).

Note this is just a hacky prototype. I have it at https://github.com/howardjohn/vtprotobuf/pull/new/gen/wellknown for reference. It currently generates the _vtproto.pb.go files, but they are then copied into a fork of google.golang.org/protobuf to work around the issues in #54 (comment)

@howardjohn (Contributor)

Interestingly, I think the motivation for reflection in core is to reduce binary sizes (?). Builtin types are used almost everywhere, so it would have a big impact to just have these in core (replacing Marshal, I mean, not the VT extension).

We could probably even have completely hand-written logic for some of the trivial ones like wrapper.Uint32...
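For example, a hand-written encoder for google.protobuf.UInt32Value would only be a few lines, since the wrapper is a single uint32 `value` field (number 1, varint). A sketch (the function name is hypothetical):

```go
import "google.golang.org/protobuf/types/known/wrapperspb"

// AppendUInt32Value hand-encodes google.protobuf.UInt32Value. The whole
// message is one tag byte (0x08) followed by the value as a varint; the
// zero value is omitted on the wire per proto3 rules.
func AppendUInt32Value(buf []byte, w *wrapperspb.UInt32Value) []byte {
	if w == nil || w.Value == 0 {
		return buf
	}
	buf = append(buf, 0x08) // tag: field 1, wire type varint
	v := uint64(w.Value)
	for v >= 0x80 {
		buf = append(buf, byte(v)|0x80)
		v >>= 7
	}
	return append(buf, byte(v))
}
```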

@collinforsyth

We're also not getting the full value proposition out of this, since we're parsing several million timestamppb.Timestamp messages per second.

Happy to help if there's anything I can do on my end to get this through.
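For the Timestamp-parsing hot path specifically, a reflection-free decoder is similarly small. A sketch using the protowire package, assuming well-formed input; a real decoder would skip unknown fields rather than reject them:

```go
import (
	"fmt"

	"google.golang.org/protobuf/encoding/protowire"
	"google.golang.org/protobuf/types/known/timestamppb"
)

// unmarshalTimestamp decodes a serialized google.protobuf.Timestamp
// into ts without protoreflect. Sketch only: unknown fields are
// rejected here, whereas a compliant decoder must skip them.
func unmarshalTimestamp(buf []byte, ts *timestamppb.Timestamp) error {
	for len(buf) > 0 {
		num, typ, n := protowire.ConsumeTag(buf)
		if n < 0 {
			return protowire.ParseError(n)
		}
		buf = buf[n:]
		if typ != protowire.VarintType {
			return fmt.Errorf("unexpected wire type %v for field %d", typ, num)
		}
		v, n := protowire.ConsumeVarint(buf)
		if n < 0 {
			return protowire.ParseError(n)
		}
		buf = buf[n:]
		switch num {
		case 1:
			ts.Seconds = int64(v)
		case 2:
			ts.Nanos = int32(v)
		default:
			return fmt.Errorf("unknown field %d", num)
		}
	}
	return nil
}
```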

@collinforsyth

I also noticed #80, which is another attempt to support WKTs.
