-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HPCC4J-611 Add OpenTelemetry tracing to dfsclient #722
Conversation
Jira Issue: https://hpccsystems.atlassian.net/browse/HPCC4J-611 Jirabot Action Result: |
@rpastrana Wrapped up code review changes, please review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpmcmu looks good, but left you a few suggestions. Let's discuss.
@@ -129,6 +161,15 @@ public boolean hasOperation() | |||
|
|||
public void startOperation(String operationName) | |||
{ | |||
if (taskSpan != null) | |||
{ | |||
taskSpan = Utils.createChildSpan(taskSpan, operationName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it makes sense to have a single Utils.createSpan(parentSpan, spanName)? If parentSpan != null, it sets it as the parent
} | ||
|
||
taskSpan.end(); | ||
|
||
long totalOperationTime = System.nanoTime(); | ||
totalOperationTime -= operationStart; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some of these context values seem worth reporting in the span
@@ -1095,7 +1149,8 @@ private static void performRead(String[] args, TaskContext context) | |||
for (int i = 0; i < datasets.length; i++) | |||
{ | |||
String datasetName = datasets[i]; | |||
context.startOperation("Read " + datasetName); | |||
context.startOperation("FileUtility.Read_" + datasetName); | |||
context.setTaskSpanAttributes(Attributes.of(AttributeKey.stringKey("server.url"), connString)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a way to add individual attributes after this initial set attributes call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did we expose addTaskSpanAttribute(Attribute)?
context.startOperation("Read Test " + datasetName); | ||
context.startOperation("FileUtility.ReadTest_" + datasetName); | ||
|
||
context.setTaskSpanAttributes(Attributes.of(AttributeKey.stringKey("server.url"), connString)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there other attributes we might want to report to help track issues?
@@ -1463,7 +1521,9 @@ private static void performCopy(String[] args, TaskContext context) | |||
String srcFile = copyPairs[i]; | |||
String destFile = copyPairs[i+1]; | |||
|
|||
context.startOperation("Copy " + srcFile + " -> " + destFile); | |||
context.startOperation("FileUtility.Copy_ " + srcFile + " -> " + destFile); | |||
context.setTaskSpanAttributes(Attributes.of(AttributeKey.stringKey("server.src.url"), srcURL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does the context.staroperation call add any relevant attributes to the span?
|
||
private static FileWriteContext constructReadContext(FieldDef recordDef, CompressionAlgorithm fileCompression, int connectTimeoutMs, int socketOpTimeoutMs) | ||
{ | ||
FileWriteContext context = new FileWriteContext(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do all/some of these end up tracked in a span?
} | ||
|
||
public static Span createSpan(String name) | ||
{ | ||
return Utils.getTelemetryTracer().spanBuilder(name) | ||
Span span = Utils.getTelemetryTracer().spanBuilder(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we combine the two functions?
Span createSpan(String name, Span parent)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we combining these 2?
@rpastrana please review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpmcmu reviwed
public AtomicLong bytesRead = new AtomicLong(0); | ||
public AtomicLong bytesWritten = new AtomicLong(0); | ||
|
||
public Span taskSpan = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the operation span?
private Stack<TaskOperation> operations = new Stack<TaskOperation>(); | ||
public List<JSONObject> operationResults = new ArrayList<JSONObject>(); | ||
|
||
public void setTaskSpanAttributes(Attributes attributes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setCurrentOperationSpanAttributes?
} | ||
} | ||
|
||
public void addWarn(String warn) | ||
{ | ||
synchronized(warnMessages) | ||
if (operations.empty()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a big deal, but since you created the hasOperation function, why not use it here?
{ | ||
return; | ||
} | ||
TaskOperation op = operations.peek(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since you created the getOperation() function, why not use it here and in all the TaskContext methods
private static class TaskOperation | ||
{ | ||
public String currentOperationDesc = ""; | ||
public long operationStart = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor, but the var name is somewhat ambiguous, operationStartTimeNS ? or startTimeNS?
double writeBandwidth = (double) bytesWritten.get() / (1_000_000.0 * timeInSeconds); | ||
results.put("Write Bandwidth", String.format("%.2f MB/s", writeBandwidth)); | ||
|
||
TaskOperation op = getOperation(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this makes sense, but could L291-294 be condensed into the following?
operationResults.add(endCurrentOperation());
@@ -659,6 +764,10 @@ private static void executeTasks(Runnable[] tasks, int numThreads) throws Except | |||
|
|||
public void run() | |||
{ | |||
// Make the task span is current for the thread, otherwise spans created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a typo in the comment
@@ -1095,7 +1149,8 @@ private static void performRead(String[] args, TaskContext context) | |||
for (int i = 0; i < datasets.length; i++) | |||
{ | |||
String datasetName = datasets[i]; | |||
context.startOperation("Read " + datasetName); | |||
context.startOperation("FileUtility.Read_" + datasetName); | |||
context.setTaskSpanAttributes(Attributes.of(AttributeKey.stringKey("server.url"), connString)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did we expose addTaskSpanAttribute(Attribute)?
return true; | ||
} | ||
|
||
public TaskOperation getOperation() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and if for nothing other than symmetry, addOperation? setOperation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want all new operations to go through startOperation / endOperation. So I don't want to expose an addOperation etc
} | ||
|
||
public static Span createSpan(String name) | ||
{ | ||
return Utils.getTelemetryTracer().spanBuilder(name) | ||
Span span = Utils.getTelemetryTracer().spanBuilder(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we combining these 2?
- Updated event attribute reporting - Fixed trace structure in dfsclient protocol - Added tracing to FileUtility - Improved operation tracking in TaskContext - Added FileReadContext & FileWriteContext to DFSClient APIs Signed-off-by: James McMullan [email protected]
@jakesmith once we get dafilesrv instrumentation we'll be able to track these transactions end-to-end. We're able to deploy small, local Elastic stack based trace aggregator which would be very helpful in the instrumentation process. We can share that info if interested. |
Signed-off-by: James McMullan [email protected]
Type of change:
Checklist:
Testing: