
Feature: create trace span for kafkajs eachBatch #3400

Conversation


@allanpaiste allanpaiste commented Jul 17, 2023

What does this PR do?

Enhances the kafkajs instrumentation to cover eachBatch the same way the current instrumentation covers eachMessage, so that both consumption paths get the same kind of span representation in APM flamegraphs (and the spans become searchable).

It also provides an experimental environment flag, DD_EXPERIMENTAL_KAFKAJS_PLUGIN_TRACE_FIRST_BATCH_MESSAGE, which gives developers at least some level of automatic trace ID propagation for batch processing: the context from the first message of each batch is used as the trace parent.
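The idea behind the flag can be sketched roughly as follows. This is illustrative plain JavaScript, not the actual plugin code, and the helper name is made up; kafkajs delivers header values as Buffers, so they need to be decoded into a plain text map before a propagator can extract the parent context from the batch's first message:

```javascript
// Illustrative helper (not the actual plugin code): build a text-map
// carrier from the first message of a kafkajs batch. kafkajs header
// values arrive as Buffers (or strings), so normalize them to strings.
function firstMessageCarrier (batch) {
  const first = batch && batch.messages && batch.messages[0]
  if (!first || !first.headers) return {}
  const carrier = {}
  for (const [key, value] of Object.entries(first.headers)) {
    carrier[key] = Buffer.isBuffer(value) ? value.toString('utf8') : String(value)
  }
  return carrier
}
```

A text-map propagator could then extract the parent span context from this carrier and use it as the parent of the eachBatch span.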

Motivation

Have both consumption strategies offered by kafkajs show up in APM flamegraphs and span search, rather than half of the usage cases being skipped (imagine a setup where half of the consumption is done in batches). In the skipped cases, traces start from whatever comes after the consume step (a db interaction, http calls), which can leave the user confused.

Plugin Checklist

  • Unit tests.

Additional Notes

There was a test added in dd-trace that used eachBatch in a scenario that was supposed to indicate that the user has not provided any callback at all, which leaves me thinking that whoever authored the original instrumentation was at least aware that eachBatch was a thing, but then decided to skip implementing it. It would be good to know why that choice was made.

I am aware that when it comes to trace ID propagation, all kinds of batch processing solutions leave a lot to be desired, which makes me wonder: was it left without implementation because the authors expect dd-trace users who decide to go with eachBatch to implement tracing within the batch processing themselves?

If that is so, then feel free to decline this PR.

@allanpaiste allanpaiste changed the title make sure that all kafkajs usage is properly traced Make sure that all kafkajs usage is properly traced Jul 17, 2023
allanpaiste and others added 4 commits July 17, 2023 14:43
@allanpaiste allanpaiste marked this pull request as ready for review July 17, 2023 12:06
@allanpaiste allanpaiste requested a review from a team as a code owner July 17, 2023 12:06
@allanpaiste allanpaiste changed the title Make sure that all kafkajs usage is properly traced kafkajs eachBatch tracing (with optional experimental auto-propagation) Jul 17, 2023
@allanpaiste allanpaiste changed the title kafkajs eachBatch tracing (with optional experimental auto-propagation) Feature: tracing for kafkajs eachBatch (with optional experimental auto-propagation) Jul 17, 2023

@rochdev rochdev left a comment


was it left without implementation because the authors expect dd-trace users who decide to go with eachBatch to implement tracing within the batch processing themselves?

Yes, unfortunately this is the case. We cannot support batches with our current data model, since it's not possible for a single handler to follow from multiple messages from potentially different traces/services. This means that when looping through the batch messages, it's up to the user to create individual spans that follow from each corresponding message. This is unfortunately a limitation of the product and not something we can address at the library level. This PR would only mitigate the issue for the first message in the batch; all other messages would still have the same issue, which IMO is even more confusing than the current behaviour of not instrumenting batches at all by default.
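For readers wondering what that user-side instrumentation might look like, here is a sketch against dd-trace's public tracer API (tracer.extract, tracer.startSpan). The handler shape, span name, and tag keys are illustrative, and header Buffer decoding is omitted for brevity:

```javascript
// Illustrative user-level eachBatch handler (not the plugin): create one
// span per message, each following from that message's own upstream
// context. `tracer` is assumed to expose dd-trace's public
// extract/startSpan API.
function makeEachBatch (tracer, processMessage) {
  return async function eachBatch ({ batch }) {
    for (const message of batch.messages) {
      // Producer-injected trace context travels in the message headers.
      const childOf = tracer.extract('text_map', message.headers || {})
      const span = tracer.startSpan('kafka.consume', {
        childOf,
        tags: { 'kafka.topic': batch.topic, 'kafka.partition': batch.partition }
      })
      try {
        await processMessage(message)
      } finally {
        span.finish()
      }
    }
  }
}
```

This way each message's span follows from its own upstream trace, which is exactly what a single batch-level span cannot express.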

@tlhunter tlhunter requested a review from a team as a code owner August 15, 2023 22:07
@tlhunter tlhunter requested review from jbertran and removed request for a team August 15, 2023 22:07

allanpaiste commented Aug 29, 2023

Thank you for the comment @rochdev

Would it still be reasonable to keep the instrumentation, though, where it would just start a new trace (together with a specific operation span), so that the flamegraph would not start with some lower-level operation (like database access), which might leave the user wondering what preceded that database access?

It would still serve the purpose of allowing people to list all the Kafka operations, while, for obvious reasons, not guaranteeing an unbroken trace.

Currently, because there is no good way to retain traces per Kafka message in batch processing, we are also sacrificing visibility into the origin of some operations (some Kafka operations produce no spans and start no trace), so you cannot determine which Kafka message produced the operations that followed.

Custom instrumentation could also still be accommodated by allowing (and documenting) the batch instrumentation (with its completely new trace) to be disabled if need be.

CC: @jbertran

@allanpaiste allanpaiste reopened this Aug 29, 2023
@allanpaiste allanpaiste changed the title Feature: tracing for kafkajs eachBatch (with optional experimental auto-propagation) Feature: tracing for kafkajs eachBatch Aug 29, 2023
@allanpaiste allanpaiste changed the title Feature: tracing for kafkajs eachBatch Feature: create trace span for kafkajs eachBatch Aug 29, 2023
@allanpaiste allanpaiste requested a review from rochdev August 29, 2023 13:11

rochdev commented Sep 8, 2023

@allanpaiste So basically you would keep the span, but when there are multiple messages it would simply not be connected to the upstream services? Generally speaking that would probably make sense. Could the topic still be used as the resource name, or do batches allow messages from different topics?


allanpaiste commented Sep 8, 2023

@rochdev

The topic can still be used as the resource name, yes. The batch will always hold just one partition of one topic, so that should not be a problem (a consumer could be handling multiple partitions of a topic, but they'd all end up being passed to eachBatch as separate objects).
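Since a batch is always scoped to a single topic-partition, deriving the span's resource name and tags is straightforward. A quick sketch (the function name and tag keys are illustrative):

```javascript
// Illustrative: a kafkajs batch is always one partition of one topic,
// so the topic alone is a stable resource name for the batch span.
function batchSpanMetadata (batch) {
  return {
    resource: batch.topic,
    tags: {
      'kafka.topic': batch.topic,
      'kafka.partition': batch.partition,
      'kafka.batch_size': batch.messages.length
    }
  }
}
```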

Just for reference, here are the relevant kafkajs typings as well.

eachBatchPayload:

https://github.com/tulios/kafkajs/blob/ff3b1117f316d527ae170b550bc0f772614338e9/types/index.d.ts#L990-L999

... and the Batch itself:

https://github.com/tulios/kafkajs/blob/ff3b1117f316d527ae170b550bc0f772614338e9/types/index.d.ts#L876-L886

@Qard Qard requested a review from a team as a code owner June 17, 2024 23:42
@tlhunter

Blessed by #4645
