Did you ever want to perfectly decode unknown protocol buffer encoded data when you do not have access to the message definitions? Well, you can't. But at least, with this library, you can try.
ProtoUntyped can help you read unknown protocol buffer encoded data. The result will not be the same as with the message definitions, because message definitions are required to properly read the binary wire format. ProtoUntyped uses conventions and heuristics to identify the types from the binary data.
ProtoUntyped can also be used to dynamically generate protocol buffer encoded data.
The core type of the library is ProtoObject, which is the protobuf equivalent of the JsonObject from the System.Text.Json API.
This library is based on the excellent protobuf-net serializer.
Given the following message definition:
message SearchRequest {
  string query = 1;
  int32 page_number = 2;
  int32 result_per_page = 3;
}
and the following message:
var message = new SearchRequest { Query = "/users", PageNumber = 8, ResultPerPage = 50 };
The encoded data can be read using:
var protoObject = ProtoObject.Decode(bytes);
Console.WriteLine(protoObject.ToString());
which will print:
message {
- 1 = "/users"
- 2 = 8
- 3 = 50
}
ProtoUntyped uses conventions to decode the wire format. Most of them can be configured using the ProtoDecodeOptions class.
Any ProtoObject can be encoded back to bytes without data loss:
var bytes = protoObject.Encode();
Because ProtoObject instances can also be created manually, it is possible to dynamically generate protobuf encoded data:
var protoObject = new ProtoObject(
    new ProtoField(1, "/users"),
    new ProtoField(2, 8, WireType.Varint),
    new ProtoField(3, 50, WireType.Varint)
);
var bytes = protoObject.Encode();
protobuf-net supports the serialization of a few .NET types like Guid, DateTime or decimal. These types are not part of the protocol buffer format, so protobuf-net uses a custom wire format based on embedded messages.
ProtoUntyped does not decode these BCL types by default, but decoding can be enabled using ProtoDecodeOptions.
ProtoWireObject is a low-level alternative to ProtoObject. Here are the differences between the two types:
Type | Decodes LEN fields | Converts fixed types to floats | Groups repeated fields | Decodes BCL types |
---|---|---|---|---|
ProtoWireObject | ✅ | ❌ | ❌ | ❌ |
ProtoObject | ✅ | ✅ | ✅ | ✅ |
Under the hood, the wire format is always parsed as ProtoWireObject first.
ProtoObject.ToString generates a proto-like string representation of the object, for example:
message {
- 1 = "/users"
- 2 = 8
- 3 = 50
}
ProtoObject can also be exported as protoscope, using ToProtoscopeString, for example:
1: {"/users"}
2: 8
3: 50
Because ProtoObject can contain types that are not valid in the protobuf wire format (e.g. decimal, Guid), you should generate protoscope strings from ProtoWireObject if you need strict protoscope format.
The protocol buffer wire format contains a sequence of fields. Every field contains the field number, the wire type, and the field value. The field numbers and the wire types can be safely read, so you will always get a valid list of top level field meta-data. Then ProtoUntyped uses heuristics to decode the field values depending on the wire type.
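To illustrate why the top-level field meta-data can always be read, here is a minimal sketch that walks the raw bytes of the SearchRequest example and lists each field number and wire type without interpreting the values. It does not use ProtoUntyped; the WireTagWalker class and its methods are hypothetical names introduced for this example, and the bytes were written out by hand from the wire format rules.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of ProtoUntyped: lists the tags of the top-level fields.
public static class WireTagWalker
{
    public static List<(int FieldNumber, int WireType)> ReadTopLevelTags(byte[] data)
    {
        var tags = new List<(int FieldNumber, int WireType)>();
        int pos = 0;
        while (pos < data.Length)
        {
            // A tag is a varint: the low 3 bits are the wire type, the rest is the field number.
            ulong tag = ReadVarint(data, ref pos);
            int wireType = (int)(tag & 0x7);
            tags.Add(((int)(tag >> 3), wireType));

            // Skip the value; its size is fully determined by the wire type.
            switch (wireType)
            {
                case 0: ReadVarint(data, ref pos); break; // varint
                case 1: pos += 8; break;                  // 64-bit fixed
                case 2:                                   // length-delimited
                    int length = (int)ReadVarint(data, ref pos);
                    pos += length;
                    break;
                case 5: pos += 4; break;                  // 32-bit fixed
                default:
                    throw new NotSupportedException($"Wire type {wireType} is not handled in this sketch.");
            }
        }
        return tags;
    }

    // No bounds checking: truncated input throws IndexOutOfRangeException.
    private static ulong ReadVarint(byte[] data, ref int pos)
    {
        ulong value = 0;
        int shift = 0;
        while (true)
        {
            byte b = data[pos++];
            value |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) return value;
            shift += 7;
        }
    }

    public static void Main()
    {
        // 0A 06 "/users" | 10 08 | 18 32
        byte[] bytes = { 0x0A, 0x06, 0x2F, 0x75, 0x73, 0x65, 0x72, 0x73, 0x10, 0x08, 0x18, 0x32 };
        foreach (var (fieldNumber, wireType) in ReadTopLevelTags(bytes))
            Console.WriteLine($"field {fieldNumber}: wire type {wireType}");
    }
}
```

For the example message, this reports field 1 with wire type 2, then fields 2 and 3 with wire type 0. Everything beyond that, i.e. what the values mean, is where the heuristics come in.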
Type 0 can be used for int32, int64, uint32, uint64, sint32, sint64, bool, enum. The declared size does not really matter for decoding, but knowing whether the value is signed does, and the wire format does not carry that information. ProtoUntyped always decodes type 0 as signed.
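The ambiguity can be seen by reading the same raw varint in different ways. The sketch below is independent of ProtoUntyped, and VarintInterpretations is a hypothetical name; it assumes "signed" means the plain two's complement reading used by int32 and int64, and shows the ZigZag reading used by sint32 and sint64 for comparison.

```csharp
using System;

// Hypothetical helper, not part of ProtoUntyped.
public static class VarintInterpretations
{
    // int32/int64 style: reinterpret the 64 raw bits as two's complement.
    public static long AsSigned(ulong raw) => unchecked((long)raw);

    // sint32/sint64 style: ZigZag maps 0, 1, 2, 3... to 0, -1, 1, -2...
    public static long AsZigZag(ulong raw) => (long)(raw >> 1) ^ -(long)(raw & 1);

    public static void Main()
    {
        ulong small = 3;
        Console.WriteLine(AsSigned(small));  // 3
        Console.WriteLine(AsZigZag(small));  // -2

        ulong allOnes = ulong.MaxValue;      // ten 0xFF-style varint bytes on the wire
        Console.WriteLine(allOnes);          // 18446744073709551615 if the field was uint64
        Console.WriteLine(AsSigned(allOnes)); // -1 if the field was int64
    }
}
```

Small non-negative values of int32, int64, uint32, uint64, bool and enum fields all read the same way, so the choice only changes the result for large unsigned values and for sint fields.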
Type 1 can be used for fixed64, sfixed64, double. Because protobuf-net uses type 0 for integers by default, ProtoUntyped interprets type 1 as System.Double by default, but this can be configured using ProtoDecodeOptions.Fixed64DecodingMode.
Type 2 can be used for string, bytes, embedded messages, packed repeated fields. ProtoUntyped will try to interpret the data as an embedded message. If the decoding fails, the data will be interpreted as a System.String. This can be configured using ProtoDecodeOptions.PreferredStringDecodingModes.
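The "try as a message, fall back to a string" idea can be sketched as follows. This is a simplified illustration under stated assumptions, not ProtoUntyped's actual implementation: LenHeuristicSketch and its methods are hypothetical names, and the only check performed is whether the payload can be walked as a sequence of well-formed fields that ends exactly at the end of the buffer.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of ProtoUntyped.
public static class LenHeuristicSketch
{
    // True if the payload parses as a run of well-formed fields with nothing left over.
    public static bool LooksLikeMessage(byte[] data)
    {
        int pos = 0;
        while (pos < data.Length)
        {
            if (!TryReadVarint(data, ref pos, out ulong tag)) return false;
            if ((tag >> 3) == 0) return false; // field number 0 is not valid

            switch ((int)(tag & 0x7))
            {
                case 0:
                    if (!TryReadVarint(data, ref pos, out _)) return false;
                    break;
                case 1: pos += 8; break;
                case 2:
                    if (!TryReadVarint(data, ref pos, out ulong length)) return false;
                    if (length > (ulong)(data.Length - pos)) return false;
                    pos += (int)length;
                    break;
                case 5: pos += 4; break;
                default: return false; // groups and unknown wire types are rejected here
            }
            if (pos > data.Length) return false; // fixed-width value ran past the end
        }
        return true;
    }

    public static string Describe(byte[] data) =>
        LooksLikeMessage(data) ? "embedded message" : $"string \"{Encoding.UTF8.GetString(data)}\"";

    private static bool TryReadVarint(byte[] data, ref int pos, out ulong value)
    {
        value = 0;
        for (int shift = 0; shift < 64 && pos < data.Length; shift += 7)
        {
            byte b = data[pos++];
            value |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) return true;
        }
        return false; // truncated or over-long varint
    }

    public static void Main()
    {
        Console.WriteLine(Describe(Encoding.UTF8.GetBytes("/users")));     // string "/users"
        Console.WriteLine(Describe(new byte[] { 0x10, 0x08, 0x18, 0x32 })); // embedded message
    }
}
```

The check is necessarily ambiguous: some strings and byte arrays happen to form a valid field sequence, which is why any such heuristic can misclassify data and why the decoding preference is configurable.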
Type 5 can be used for fixed32, sfixed32, float. Because protobuf-net uses type 0 for integers by default, ProtoUntyped interprets type 5 as System.Single by default, but this can be configured using ProtoDecodeOptions.Fixed32DecodingMode.
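For types 1 and 5, the bytes on the wire are identical whether the field was declared as an integer or as a floating-point number; only the interpretation differs. The following sketch, which uses only the .NET base class library and a hypothetical FixedWidthInterpretations class, reads the same little-endian bytes both ways.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper, not part of ProtoUntyped.
public static class FixedWidthInterpretations
{
    // Wire type 1: 8 little-endian bytes, read as sfixed64 and as double.
    public static (long AsInt64, double AsDouble) ReadFixed64(ReadOnlySpan<byte> data)
    {
        long bits = BinaryPrimitives.ReadInt64LittleEndian(data);
        return (bits, BitConverter.Int64BitsToDouble(bits));
    }

    // Wire type 5: 4 little-endian bytes, read as sfixed32 and as float.
    public static (int AsInt32, float AsSingle) ReadFixed32(ReadOnlySpan<byte> data)
    {
        int bits = BinaryPrimitives.ReadInt32LittleEndian(data);
        return (bits, BitConverter.Int32BitsToSingle(bits));
    }

    public static void Main()
    {
        // The double 1.5 encoded as a type 1 value.
        var (asInt64, asDouble) = ReadFixed64(new byte[] { 0, 0, 0, 0, 0, 0, 0xF8, 0x3F });
        Console.WriteLine($"{asInt64} or {asDouble}"); // 4609434218613702656 or 1.5

        // The float 1.5 encoded as a type 5 value.
        var (asInt32, asSingle) = ReadFixed32(new byte[] { 0, 0, 0xC0, 0x3F });
        Console.WriteLine($"{asInt32} or {asSingle}"); // 1069547520 or 1.5
    }
}
```

A floating-point default tends to give plausible values for data produced by protobuf-net, while data written with explicit fixed32 or fixed64 integer fields would need the integer interpretation instead.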