Replies: 5 comments 3 replies
-
I like the option in which we replace `google.protobuf.Struct`. I think the bigger issue is the support for timestamps.

The last option would be to do nothing. By "do nothing" I mean that we would document the supported types in a record (which are the types that are supported by `google.protobuf.Struct`). I think it's valid to ask "What do we gain if we support timestamps?"

I'm wondering if record schemas would help us solve this issue. If a source would supply a schema for the payload and we cache it, we could safely figure out if there's a timestamp anywhere in the record and encode/decode it with less overhead (the first time we see a schema we can prepare the encoding/decoding function and then reuse it). At that point we could support any type we wanted, since the encoding and the type information would be separated.
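A minimal Go sketch of that schema-caching idea, assuming a hypothetical `Schema` type and a map-based payload (none of these names are an existing Conduit API): the first time a schema is seen, an encoder function is prepared and cached, so later records with the same schema only pay for a lookup.

```go
package schemaenc

import (
	"sync"
	"time"
)

// Schema is a hypothetical payload schema supplied by a source connector.
type Schema struct {
	ID     string
	Fields map[string]string // field name -> type name, e.g. "created_at" -> "timestamp"
}

// encoder converts a structured payload into a map containing only types
// that google.protobuf.Struct can represent (timestamps become strings).
type encoder func(payload map[string]any) map[string]any

var (
	mu       sync.RWMutex
	encoders = map[string]encoder{} // cache keyed by schema ID
)

// encoderFor builds an encoder for the schema on first use and reuses it afterwards.
func encoderFor(s Schema) encoder {
	mu.RLock()
	enc, ok := encoders[s.ID]
	mu.RUnlock()
	if ok {
		return enc
	}

	// Walk the schema once to find timestamp fields; per-record work is then a map lookup.
	tsFields := map[string]bool{}
	for name, typ := range s.Fields {
		if typ == "timestamp" {
			tsFields[name] = true
		}
	}

	enc = func(payload map[string]any) map[string]any {
		out := make(map[string]any, len(payload))
		for k, v := range payload {
			if t, isTime := v.(time.Time); isTime && tsFields[k] {
				out[k] = t.Format(time.RFC3339Nano) // only top-level fields, for brevity
				continue
			}
			out[k] = v
		}
		return out
	}

	mu.Lock()
	encoders[s.ID] = enc
	mu.Unlock()
	return enc
}
```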
-
I'd say primarily developer experience and, to a smaller extent, also making it easier to auto-generate schemas (which we do in the Kafka Connect wrapper). I believe that, ideally, the SDK makes it possible to use all the built-in types (minus some which make no sense, such as channels). Timestamps are used often and, with that, I think it makes sense to make them available. In most Java frameworks, for example, this is done out of the box, and if customization is needed, it can easily be done with custom (un)marshallers, which are applied as the JSON is being parsed (so there's no performance hit).

I also realized that what was being used as a workaround in some connectors (to use raw payloads, i.e. JSON strings) is actually not the best solution. We can do better by returning structured data and using a custom serialization for just the fields which are not supported (e.g. timestamps). The conversion of timestamps to strings happens anyway in the raw payloads, so we're not losing anything on that side but still get structured data.
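A rough Go illustration of that idea (the map-based payload shape is an assumption, not the SDK's actual record type): keep the payload structured and convert only the timestamp fields to strings before building the `google.protobuf.Struct`, instead of serializing the whole payload into one raw JSON string.

```go
package payloadconv

import (
	"time"

	"google.golang.org/protobuf/types/known/structpb"
)

// toStruct keeps the payload structured and only rewrites time.Time values
// as RFC 3339 strings, since google.protobuf.Struct has no timestamp kind.
// (Only top-level fields are handled here, for brevity.)
func toStruct(payload map[string]any) (*structpb.Struct, error) {
	converted := make(map[string]any, len(payload))
	for k, v := range payload {
		if t, ok := v.(time.Time); ok {
			converted[k] = t.Format(time.RFC3339Nano)
			continue
		}
		converted[k] = v
	}
	return structpb.NewStruct(converted)
}
```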
I'm now for this option. :) As I mentioned above, I believe that having the built-in types available makes sense and is worth the performance cost. I tried finding a way to write a custom marshaller, but failed to do so. So, with the above, I think not doing anything is not a good option; developers having to transform from and to timestamps themselves is not a great experience.
-
Related to this: golang/protobuf#414
-
I found some more information on this:
I think at this point the question is if
What do we think about this approach?
-
Support for timestamps was added as part of schema support.
-
This is in the context of #866.

The underlying issue is with standalone connectors using gRPC and `google.protobuf.Struct` to represent a record. The mentioned Protobuf message doesn't support timestamps. The supported values are documented here. I see a few options:

- Encode timestamps as a special struct within `google.protobuf.Struct` which describes a timestamp (e.g. `{"type": "timestamp", "value": "12345"}`).
- Replace `google.protobuf.Struct` with a message of similar structure, but with a timestamp.

It looks like custom marshallers are not possible (see this and this).
Encoding timestamps using special structs is possible, but I'm worried about performance. It would require recursively going into a record to check not only most of its fields, but also nested fields.
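To make the performance concern concrete, here is a minimal Go sketch of the recursive walk this first option would imply, assuming a map-based record value (not the SDK's actual type): every nested map and slice has to be visited on every record just to find possible `time.Time` values.

```go
package tsencode

import "time"

// encodeTimestamps walks a record value and replaces every time.Time with a
// special struct such as {"type": "timestamp", "value": "..."}. The recursion
// into nested maps and slices is the per-record overhead mentioned above.
func encodeTimestamps(v any) any {
	switch val := v.(type) {
	case time.Time:
		return map[string]any{
			"type":  "timestamp",
			"value": val.Format(time.RFC3339Nano),
		}
	case map[string]any:
		out := make(map[string]any, len(val))
		for k, item := range val {
			out[k] = encodeTimestamps(item)
		}
		return out
	case []any:
		out := make([]any, len(val))
		for i, item := range val {
			out[i] = encodeTimestamps(item)
		}
		return out
	default:
		return v
	}
}
```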
Replacing `google.protobuf.Struct` would look something like the message sketched below. The replacement is also backwards compatible: connectors using the current version of the API (i.e. `google.protobuf.Struct`) are still able to send it, since the field tags are kept. This buys us some time until all connectors are updated. It also gives us more control over the struct and more flexibility to change it in the future (if needed).
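A minimal sketch of what such a replacement could look like, assuming it simply mirrors the shape and field tags of `google.protobuf.Struct` and adds a timestamp kind on a previously unused tag (the package name, message names, and the new tag number are assumptions, not the actual Conduit definition):

```proto
syntax = "proto3";

package conduit.sketch.v1; // hypothetical package name

import "google/protobuf/timestamp.proto";

// Same shape and field tags as google.protobuf.Struct/Value/ListValue, so
// connectors that still send a google.protobuf.Struct stay wire-compatible.
message Struct {
  map<string, Value> fields = 1;
}

message Value {
  oneof kind {
    NullValue null_value = 1;
    double number_value = 2;
    string string_value = 3;
    bool bool_value = 4;
    Struct struct_value = 5;
    ListValue list_value = 6;
    // New: a first-class timestamp on a tag that google.protobuf.Value does not use.
    google.protobuf.Timestamp timestamp_value = 7;
  }
}

enum NullValue {
  NULL_VALUE = 0;
}

message ListValue {
  repeated Value values = 1;
}
```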