Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-30401 JTrace should optionally suppress trace/span ids generation #17935

Conversation

rpastrana
Copy link
Member

@rpastrana rpastrana commented Oct 23, 2023

  • Feature request somewhat ambiguous
  • Copies global.tracing to container helm values
  • Generates traceid if server doesn't receive remote parent

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

@rpastrana rpastrana requested a review from ghalliday October 23, 2023 02:54
@github-actions
Copy link

@rpastrana
Copy link
Member Author

@ghalliday it wasn't clear exactly when we wanted to force traceids. This PR is not complete but wanted to get in synch with you before spending any more time down this path...

@rpastrana
Copy link
Member Author

@ghalliday sample output from first commit doesn't make much sense. Hoping 2nd commit is closer to your intention.

"!!No remoteParent received, generating remoteParent IDs: Trace '30d658bd5fbc5c6583c3dedca16b8ba0' span '208c54bca6cd4d0a' "
00000078 PRG INF 2023-10-23 13:52:02.392     7   286 UNK     "Span start: {"Type":"Server","Name":"request","GlobalID":"JD8LowRvXm3ujvmHS82NAT","LocalID":"JD8LowSwvRDqbiZ7dRDJdt","SpanID":"621030d51af8737e","TraceID":"30d658bd5fbc5c6583c3dedca16b8ba0","ParentSpanID":"208c54bca6cd4d0a"}

Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpastrana looks good. A couple of questions.

@@ -32,7 +32,7 @@ Pass in dict with root and me
{{- $thorAgentScope := dict "name" .thorAgentName "replicas" .thorAgentReplicas "maxActive" .me.maxGraphs | merge (pick .me "keepJobs") }}
{{- $eclAgentResources := .me.eclAgentResources | default dict -}}
{{- $hthorScope := dict "name" $hthorName "jobMemorySectionName" "eclAgentMemory" | merge (pick .me "multiJobLinger" "maxGraphStartupTime") | merge (dict "resources" (deepCopy $eclAgentResources)) }}
{{- $thorScope := omit .me "eclagent" "thoragent" "hthor" "logging" "env" "eclAgentResources" "eclAgentUseChildProcesses" "eclAgentReplicas" "thorAgentReplicas" "eclAgentType" }}
{{- $thorScope := omit .me "eclagent" "thoragent" "hthor" "logging" "Tracing" "env" "eclAgentResources" "eclAgentUseChildProcesses" "eclAgentReplicas" "thorAgentReplicas" "eclAgentType" }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be a lower case "tracing"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely, good catch

{
bool createLocalId = !isEmptyString(hpccGlobalId);
if (createLocalId)
hpccLocalId.set(ln_uid::createUniqueIdString().c_str());

auto provider = opentelemetry::trace::Provider::GetTracerProvider();
//we don't always want to create trace/span IDs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this preserve trace-ids if this flags are false?

Copy link
Member Author

@rpastrana rpastrana Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this preserve trace-ids if this flags are false?

Good point, opentel remote/local parentSpans won't have a span to associate with...

We use the otel span object to track many attributes including parent trace/span ID, do we care to keep track of them ourselves if !hasMask(flags, SpanFlags::EnsureTraceId) && !queryTraceManager().alwaysCreateTraceIds()

Update, we actually do track
opentelemetry::v1::trace::SpanContext remoteParentSpanCtx and
CSpan * localParentSpan
We just have to ensure tolog/tostring etc draws from these members and not just the span member

Copy link
Member Author

@rpastrana rpastrana Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing this comment to avoid confusion

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest amended commit simply inherits traceID from parent when no parent otel span available:

sample output for no traceID generated mode serverSpan(propagated parent)->clientSpanserverchild->internalSpan->internalSpan2

15:07:40.802583 27843 Span end: {"Type":"Internal","Name":"internalSpan2","TraceID":"beca49ca8f3138a2842e5cf21402bfff","ParentSpanID": ""}
15:07:40.802618 27843 Span end: {"Type":"Internal","Name":"internalSpan","TraceID":"beca49ca8f3138a2842e5cf21402bfff","ParentSpanID": ""}
15:07:40.802628 27843 Span end: {"Type":"Client","Name":"clientSpanserverchild","TraceID":"beca49ca8f3138a2842e5cf21402bfff","ParentSpanID": ""}
15:07:40.802637 27843 Span end: {"Type":"Server","Name":"propegatedServerSpan","GlobalID":"IncomingUGID","CallerID":"IncomingCID","LocalID":"JDbF4xnv7LSWDV4Eug1SpJ","TraceID":"beca49ca8f3138a2842e5cf21402bfff","ParentSpanID":"4b960b3e4647da3f"}

@@ -637,6 +638,33 @@ class CServerSpan : public CSpan
opts.parent = remoteParentSpanCtx;
}
}

/*if ((!httpHeaders || !httpHeaders->hasProp("traceparent")) && queryTraceManager().alwaysCreateTraceIds())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When this is a full PR this should be moved elsewhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When this is a full PR this should be moved elsewhere.

I initially assumed this is what we wanted to do in this feature, but quickly realized forcing a remote trace/span ID didn't really make sense but left the commented out block to inspire this conversation...

Do we ever envision forcing a synthetic remote trace/span ID?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you ever want to create a synthetic remote id.

@ghalliday
Copy link
Member

I'm not sure why "Feature request somewhat ambiguous" is in the commit. If it is ambiguous it is worth clarifying before creating the PR.

@rpastrana
Copy link
Member Author

I'm not sure why "Feature request somewhat ambiguous" is in the commit. If it is ambiguous it is worth clarifying before creating the PR.

Because I found the description somewhat ambiguous and the creator was away and this feature was identified as priority.
Anyway, the point of that statement was for the reviewer to understand why the draft PR might have missed the gist of the feature request before significant time was spent on the implementation.

@rpastrana rpastrana force-pushed the HPCC-30401-JTrace-optional-traceid-support2 branch from 2528c90 to 623b1a7 Compare October 26, 2023 19:15
@rpastrana rpastrana marked this pull request as ready for review October 26, 2023 19:19
@rpastrana rpastrana changed the title HPCC-30401 WIP HPCC-30401 JTrace should optionally suppress trace/span ids generation Oct 26, 2023
@rpastrana rpastrana requested a review from ghalliday October 31, 2023 20:51
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpastrana I think this is close, but it isn't quite right yet.

@@ -16,7 +16,7 @@
############################################################################## */



//#include "opentelemetry/sdk/trace/random_id_generator_factory.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: remove

@@ -185,18 +190,19 @@ class CSpan : public CInterfaceOf<ISpan>
if (!isEmptyString(hpccLocalId.get()))
out.append(",\"LocalID\":\"").append(hpccLocalId.get()).append("\"");

//traceID not always tied to span object
out.append(",\"TraceID\":\"").append(traceID.get()).append("\"");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this check for isEmptyString()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpastrana I think this question is still relevant

@@ -637,6 +638,33 @@ class CServerSpan : public CSpan
opts.parent = remoteParentSpanCtx;
}
}

/*if ((!httpHeaders || !httpHeaders->hasProp("traceparent")) && queryTraceManager().alwaysCreateTraceIds())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you ever want to create a synthetic remote id.

//what if tracerName is empty?
auto tracer = provider->GetTracer(tracerName.get());
//we don't always want to create trace/span IDs
if (hasMask(flags, SpanFlags::EnsureTraceId) || queryTraceManager().alwaysCreateTraceIds())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem quite right. You still want to create a span if a trace-id was provided by the caller. It is only omitted if it is not provided and ids are not created by default.

@@ -171,6 +171,11 @@ class CSpan : public CInterfaceOf<ISpan>
return spanID.get();
}

const char * getTraceID() const
{
return traceID.get();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't really be stored for non-server spans, similar to caller and local spans. The span class structure would benefit from some refactoring (but as a separate PR)

@rpastrana rpastrana force-pushed the HPCC-30401-JTrace-optional-traceid-support2 branch from 623b1a7 to 2d5d45b Compare November 3, 2023 21:20
@rpastrana rpastrana requested a review from ghalliday November 3, 2023 21:21
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpastrana one question comment, but otherwise it looks ready to merge.

@@ -185,18 +190,19 @@ class CSpan : public CInterfaceOf<ISpan>
if (!isEmptyString(hpccLocalId.get()))
out.append(",\"LocalID\":\"").append(hpccLocalId.get()).append("\"");

//traceID not always tied to span object
out.append(",\"TraceID\":\"").append(traceID.get()).append("\"");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpastrana I think this question is still relevant

@rpastrana rpastrana requested a review from ghalliday November 7, 2023 04:12
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rpastrana please squash and rebase. Thanks.

@rpastrana rpastrana force-pushed the HPCC-30401-JTrace-optional-traceid-support2 branch 2 times, most recently from e1c7fba to 0668208 Compare November 9, 2023 14:23
@ghalliday
Copy link
Member

@rpastrana unfortunately needs rebasing again because I merged your other test.
Also, can you clarify what "disabled" means. Does it mean don't propagate anything? How does it differ from the createXXX options?

@rpastrana rpastrana force-pushed the HPCC-30401-JTrace-optional-traceid-support2 branch from 0668208 to 2806c9f Compare November 10, 2023 14:05
@rpastrana
Copy link
Member Author

@rpastrana unfortunately needs rebasing again because I merged your other test. Also, can you clarify what "disabled" means. Does it mean don't propagate anything? How does it differ from the createXXX options?

Rebased, but yes, the introduction of the create

@rpastrana unfortunately needs rebasing again because I merged your other test. Also, can you clarify what "disabled" means. Does it mean don't propagate anything? How does it differ from the createXXX options?

Rebased.
Yes, the newly created alwaysCreateTraceIds and its ultimate implementation does cloud the definition of the global.tracing.disabled flag...

Initially, disabled would suppress configuration of Otel's tracer(span processor/exporter, etc), and thereby avoid all otel trace/span creation/processing/exporting.
The other flags control non-otel functionality.
As-is, the disabled and alwaysCreateTraceIds flags have an overlap.... if disabled, no valid trace IDs can be generated even if we've been instructed to "Always create TraceIds".

@rpastrana rpastrana force-pushed the HPCC-30401-JTrace-optional-traceid-support2 branch from 2806c9f to 07783ea Compare November 15, 2023 15:36
@rpastrana
Copy link
Member Author

@ghalliday this was showing a conflict that shouldn't have been there. After re-rebasing, I manually resolved the conflict.

@ghalliday
Copy link
Member

@rpastrana the commits are not quite right. I suspect you may need to rebase your branch onto the most recent 9.4.x

- Copies global.tracing to component helm values
- Revises logic to create otel span
-- Span created if per-span flag set
-- or global/component config flag set
-- or serverSpan receives valid remote parent
- Adds cppunit (cannot test scenario where no span created yet)

Signed-off-by: Rodrigo Pastrana <[email protected]>
@rpastrana rpastrana force-pushed the HPCC-30401-JTrace-optional-traceid-support2 branch from 21bd30b to b8d0e49 Compare November 16, 2023 15:08
@ghalliday ghalliday merged commit 11c0506 into hpcc-systems:candidate-9.4.x Nov 16, 2023
42 of 45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants