Sharing proto related techniques from opentelemetry-java #1996
anuraaga
started this conversation in
Show and tell
There have been some big changes in dealing with protos and gRPC in the opentelemetry-java repo. Not too sure what a good way of sharing among SIGs is so trying out creating a discussion in this repo. This is to give possible ideas and share knowledge in case it can help other SIGs - the techniques are broadly applicable I think, even if the implementation may take some effort. Languages with difficult gRPC libraries, perhaps because they just wrap the C++ library into a scripting language, may particularly benefit from Idea 3.
1. Marshaler from SDK data types to OTLP binary without protobuf library
We have implemented serializing of SDK data types to OTLP binary without using the protobuf library or generated code from the protoc compiler.
https://github.com/open-telemetry/opentelemetry-java/blob/main/exporters/otlp/common/src/main/java/io/opentelemetry/exporter/otlp/internal/traces/SpanMarshaler.java
This work was started last year primarily as a performance optimization: transforming from one data type to another data type (the generated proto struct) and then to binary involves one transformation in the middle that is not needed. By implementing the `Marshaler` framework and invoking `CodedOutputStream` from hand-written marshaler code, we generally see 1.5-2x performance improvements.
If you compare to the old version that went through the protobuf library, the structure actually isn't so different:
https://github.com/open-telemetry/opentelemetry-java/blob/v1.4.1/exporters/otlp/common/src/main/java/io/opentelemetry/exporter/otlp/internal/SpanAdapter.java
In both cases you have to know which field to store data into, so the overall "lines of code" of both approaches is similar. Marshalers do tend to carry the extra overhead of code for computing the size of a serialized message, because protobuf requires the size of an embedded message to be written before its contents. There are techniques to avoid this that involve writing the proto from end to beginning, but unfortunately gRPC does not provide hooks to allow us to try this.
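To make the size-then-write pattern concrete, here is a minimal sketch of a hand-written marshaler. It is not the opentelemetry-java `Marshaler` API: the class name `TinySpanMarshaler`, its methods, and the field numbers are all made up for illustration. Only the varint, tag and length-delimited encoding follows the protobuf wire format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch, not the opentelemetry-java Marshaler API.
final class TinySpanMarshaler {
  // Illustrative field numbers; a real marshaler takes these from the .proto definitions.
  static final int NAME_FIELD = 1; // string
  static final int KIND_FIELD = 2; // enum, encoded as a varint
  static final int WIRE_VARINT = 0;
  static final int WIRE_LENGTH_DELIMITED = 2;

  private final byte[] nameUtf8;
  private final int kind;

  TinySpanMarshaler(String name, int kind) {
    this.nameUtf8 = name.getBytes(StandardCharsets.UTF_8);
    this.kind = kind;
  }

  // Size of the message body. It must be known up front because an embedded
  // message is written as tag, length, body.
  int serializedSize() {
    int size = 0;
    if (nameUtf8.length > 0) {
      size += varintSize(tag(NAME_FIELD, WIRE_LENGTH_DELIMITED))
          + varintSize(nameUtf8.length)
          + nameUtf8.length;
    }
    if (kind != 0) {
      size += varintSize(tag(KIND_FIELD, WIRE_VARINT)) + varintSize(kind);
    }
    return size;
  }

  // Writes the message body, mirroring serializedSize() field by field.
  void writeTo(ByteArrayOutputStream out) {
    if (nameUtf8.length > 0) {
      writeVarint(out, tag(NAME_FIELD, WIRE_LENGTH_DELIMITED));
      writeVarint(out, nameUtf8.length);
      out.write(nameUtf8, 0, nameUtf8.length);
    }
    if (kind != 0) {
      writeVarint(out, tag(KIND_FIELD, WIRE_VARINT));
      writeVarint(out, kind);
    }
  }

  // Writes this message as an embedded field of some parent message.
  void writeAsField(ByteArrayOutputStream out, int parentFieldNumber) {
    writeVarint(out, tag(parentFieldNumber, WIRE_LENGTH_DELIMITED));
    writeVarint(out, serializedSize());
    writeTo(out);
  }

  static int tag(int fieldNumber, int wireType) {
    return (fieldNumber << 3) | wireType;
  }

  static int varintSize(long value) {
    int size = 1;
    while ((value & ~0x7FL) != 0) {
      size++;
      value >>>= 7;
    }
    return size;
  }

  static void writeVarint(ByteArrayOutputStream out, long value) {
    while ((value & ~0x7FL) != 0) {
      out.write((int) ((value & 0x7F) | 0x80)); // low 7 bits plus continuation bit
      value >>>= 7;
    }
    out.write((int) value);
  }

  public static void main(String[] args) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    new TinySpanMarshaler("a", 2).writeAsField(out, 1);
    System.out.println(Arrays.toString(out.toByteArray()));
  }
}
```

Note how `serializedSize()` and `writeTo()` have to stay in sync field by field. That duplication is the extra code overhead mentioned above.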
While the original implementation referenced generated protoc code to access the field numbers for serialization, we found users needing a more lightweight artifact, for example for use in an Android app. Protobuf and the generated protos add up to 2+MB of overhead, which gives Android users pause. As we mostly only needed field numbers, not the entire generated API of the protos nor base classes such as `Message`, we implemented a plugin for the Wire proto compiler to generate lightweight stubs with only the information we need. The reason for not using the official protoc is that Wire is implemented on the JVM, so it's trivial to write a plugin for it in our build without any IPC, etc.
https://github.com/open-telemetry/opentelemetry-java/blob/main/buildSrc/src/main/kotlin/io/opentelemetry/gradle/ProtoFieldsWireHandler.kt
The generated stub looks like this. No dependency on the protobuf library and a very small amount of bytecode with just what we need. We could then eliminate the protobuf dependency completely by vendoring in the very small portion of its encoding logic that we use:
https://github.com/open-telemetry/opentelemetry-java/blob/main/exporters/otlp/common/src/main/java/io/opentelemetry/exporter/otlp/internal/CodedOutputStream.java
Well, at ~500 lines it is not exactly tiny, but it is a drop in the bucket compared to the original CodedOutputStream and the rest of the protobuf library.
2. Binary and JSON Serializer
We support exporting via JSON logs (not the logs signal, but e.g. spans converted to OTLP-JSON and output to java.util.logging). For this we were previously using a heavyweight solution that I had written a few years ago, which generates serialization code at runtime using Byte Buddy. The upstream protobuf library's JSON serialization is problematic because its use of reflection makes it quite slow, but more importantly, it offers no customization, whereas we need to be able to serialize trace and span IDs as hex instead of base64.
While this was ok, it relied on the generated protos, which we wanted to stop using completely to reduce the amount of code we maintain (the above `SpanAdapter` is completely deleted now). Thanks to our Marshaler framework, we were able to abstract binary vs JSON serialization into a `Serializer` interface.
https://github.com/open-telemetry/opentelemetry-java/blob/main/exporters/otlp/common/src/main/java/io/opentelemetry/exporter/otlp/internal/Serializer.java
The `Serializer` has two jobs: 1) implement proto3 semantics by not writing out values that match the field's default, and 2) be implemented for both binary and JSON serialization. Previously the marshalers called binary proto's `CodedOutputStream` directly; we simply replaced all of those calls with calls to the `Serializer`. With it, we get JSON support "for free": each output format only has to be implemented once in a `Serializer`, while the marshaling logic is completely independent of binary vs JSON. All with no dependencies on the protobuf library.
Implementing 2) is dependent on 1).
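A rough sketch of this split is shown below. Everything in it is hypothetical (the `SketchSerializer` classes, the method names, and field numbers 1 and 2) and it is far smaller than the real `Serializer`. It only illustrates the shape: default-skipping lives in the base class, each format is a subclass, and the marshaling code talks only to the base type.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names and signatures are not the real opentelemetry-java Serializer.
abstract class SketchSerializer {
  // Job 1: skip values equal to the proto3 default, once for every format.
  final void serializeString(int fieldNumber, String jsonName, String value) {
    if (!value.isEmpty()) {
      writeString(fieldNumber, jsonName, value);
    }
  }

  final void serializeTraceId(int fieldNumber, String jsonName, byte[] traceId) {
    if (traceId.length > 0) {
      writeTraceId(fieldNumber, jsonName, traceId);
    }
  }

  // Job 2: one implementation per output format.
  abstract void writeString(int fieldNumber, String jsonName, String value);

  abstract void writeTraceId(int fieldNumber, String jsonName, byte[] traceId);
}

final class BinarySketchSerializer extends SketchSerializer {
  final ByteArrayOutputStream out = new ByteArrayOutputStream();

  @Override
  void writeString(int fieldNumber, String jsonName, String value) {
    writeBytes(fieldNumber, value.getBytes(StandardCharsets.UTF_8));
  }

  @Override
  void writeTraceId(int fieldNumber, String jsonName, byte[] traceId) {
    writeBytes(fieldNumber, traceId); // raw bytes in the binary format
  }

  private void writeBytes(int fieldNumber, byte[] bytes) {
    writeVarint((fieldNumber << 3) | 2); // wire type 2: length-delimited
    writeVarint(bytes.length);
    out.write(bytes, 0, bytes.length);
  }

  private void writeVarint(int value) {
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.write(value);
  }
}

final class JsonSketchSerializer extends SketchSerializer {
  private final StringBuilder json = new StringBuilder("{");

  @Override
  void writeString(int fieldNumber, String jsonName, String value) {
    // Minimal escaping for the sketch: backslash and double quote only.
    appendField(jsonName, value.replace("\\", "\\\\").replace("\"", "\\\""));
  }

  @Override
  void writeTraceId(int fieldNumber, String jsonName, byte[] traceId) {
    StringBuilder hex = new StringBuilder();
    for (byte b : traceId) {
      hex.append(String.format("%02x", b & 0xFF)); // hex instead of base64
    }
    appendField(jsonName, hex.toString());
  }

  private void appendField(String name, String value) {
    if (json.length() > 1) {
      json.append(',');
    }
    json.append('"').append(name).append("\":\"").append(value).append('"');
  }

  String finish() {
    return json + "}";
  }
}

final class SpanSketchMarshaler {
  // Format-independent marshaling logic: it only talks to the serializer.
  static void marshal(SketchSerializer s, byte[] traceId, String name) {
    s.serializeTraceId(1, "traceId", traceId);
    s.serializeString(2, "name", name);
  }

  public static void main(String[] args) {
    JsonSketchSerializer json = new JsonSketchSerializer();
    marshal(json, new byte[] {0x0a, 0x0b}, "hi");
    System.out.println(json.finish());
  }
}
```

The same `marshal` call can be handed a `BinarySketchSerializer` instead, which is the sense in which JSON support comes "for free" once the marshalers only depend on the serializer abstraction.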
3. Direct gRPC wire format encoding
gRPC is confusingly two key concepts: a wire format, and a server/client framework that implements the wire format. The server/client framework is highly generic and powerful, making it easy to generate code for APIs and supporting powerful features like fully bi-directional streaming. OTLP, at least for now, is far simpler: it only uses unary gRPC, and for SDKs, only serialization is needed. I suspect this will remain the general practice, because the baseline for OTLP is the http/protobuf version, which by nature can't take advantage of things like streaming.
But in fact, unary gRPC is also just a very small protocol on top of HTTP/2. We have implemented the ability to export without depending on the gRPC library at all.
https://github.com/open-telemetry/opentelemetry-java/blob/main/exporters/otlp/common/src/main/java/io/opentelemetry/exporter/otlp/internal/grpc/GrpcRequestBody.java#L65
https://github.com/open-telemetry/opentelemetry-java/blob/main/exporters/otlp/common/src/main/java/io/opentelemetry/exporter/otlp/internal/grpc/OkHttpGrpcExporter.java#L165
The total code for "doing gRPC" is less than 200 LoC I think, which is pretty low code overhead. It allows removing gRPC and its heavyweight dependencies entirely - we plan on using this in a slim distribution of the Java agent, which will clock in at 10MB total size instead of the current 14MB with the gRPC dependency.
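To give a feel for how small the unary protocol is, here is a sketch of the parts that don't depend on any particular HTTP/2 client. The class and method names (`UnaryGrpcSketch`, `frame`, `isOk`) are hypothetical and this is not the code from `GrpcRequestBody` or `OkHttpGrpcExporter`. The 5-byte message prefix, the request headers and the `grpc-status` convention come from the gRPC-over-HTTP/2 protocol.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Map;

// Hypothetical helper names; only the wire-level details come from the gRPC protocol.
final class UnaryGrpcSketch {
  // A unary OTLP trace export is an HTTP/2 POST to this path with these headers.
  static final String PATH = "/opentelemetry.proto.collector.trace.v1.TraceService/Export";
  static final String CONTENT_TYPE = "application/grpc";
  static final String TE = "trailers";

  // gRPC message framing: 1-byte compressed flag, 4-byte big-endian length, then the message.
  static byte[] frame(byte[] serializedMessage) {
    ByteBuffer buf = ByteBuffer.allocate(5 + serializedMessage.length); // big-endian by default
    buf.put((byte) 0); // 0 = not compressed
    buf.putInt(serializedMessage.length);
    buf.put(serializedMessage);
    return buf.array();
  }

  // The call outcome arrives as "grpc-status", where "0" means OK. It normally sits in
  // the trailers, but a response without a body may carry it in the headers instead.
  static boolean isOk(Map<String, String> headers, Map<String, String> trailers) {
    String status = headers.containsKey("grpc-status")
        ? headers.get("grpc-status")
        : trailers.get("grpc-status");
    return "0".equals(status);
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(frame(new byte[] {1, 2, 3})));
  }
}
```

The framed bytes become the POST body. The HTTP/2 client then only needs to send the headers above and hand back the response trailers, which is why trailer support is the hard requirement on the library.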
The requirement is a full HTTP/2 library; in our case we used the popular OkHttp. Specifically, the library must expose "trailers" (headers that come after the response body instead of before) and support plain-text HTTP/2 connections. The JDK's built-in `HttpClient` cannot be used because it does not support these.
While I was hoping most languages would have a library that supports this in 2021, it looks like I was overoptimistic. In Python, `httpx` appears to be the only option for HTTP/2 but doesn't seem to expose trailers. I can't find any obvious choice for an HTTP/2 library in Ruby. No wonder gRPC sticks to the C++ wrapper in these languages...
Note that 3) does not depend on 1) or 2). While we use the Marshaler in our own implementation of the gRPC encoding, we could just as easily call into a generated proto message instead.
Hope this blog post can help provide ideas and share knowledge. Let me know if anything's not clear.