-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add task resource tracking service to cluster service #14681
Changes from 3 commits
d1a6038
d00b2df
720cefb
14ff852
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -45,7 +45,7 @@ public SearchQueryRecord(final StreamInput in) throws IOException, ClassCastExce | |
measurements = new HashMap<>(); | ||
in.readMap(MetricType::readFromStream, StreamInput::readGenericValue) | ||
.forEach(((metricType, o) -> measurements.put(metricType, metricType.parseValue(o)))); | ||
this.attributes = in.readMap(Attribute::readFromStream, StreamInput::readGenericValue); | ||
this.attributes = Attribute.readAttributeMap(in); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Has something changed in the ser/deser logic? If yes, can that cause issue while upgrading from 2.15 to 2.16? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hey @jainankitk ! Actually this is a bug we didn't caught. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. And I took this from Siddhant's PR when moving categorization code to the plugin: https://github.com/opensearch-project/OpenSearch/pull/14528/files#r1669900434 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jainankitk This was also discussed in detail in the PR to move categorization to plugin : https://github.com/opensearch-project/OpenSearch/pull/14528/files#r1669900434 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Got it, thanks for the details! Given the serialization was failing already, should not result into compatibility issues. |
||
} | ||
|
||
/** | ||
|
@@ -134,7 +134,11 @@ public XContentBuilder toXContent(final XContentBuilder builder, final ToXConten | |
public void writeTo(final StreamOutput out) throws IOException { | ||
out.writeLong(timestamp); | ||
out.writeMap(measurements, (stream, metricType) -> MetricType.writeTo(out, metricType), StreamOutput::writeGenericValue); | ||
out.writeMap(attributes, (stream, attribute) -> Attribute.writeTo(out, attribute), StreamOutput::writeGenericValue); | ||
out.writeMap( | ||
attributes, | ||
(stream, attribute) -> Attribute.writeTo(out, attribute), | ||
(stream, attributeValue) -> Attribute.writeValueTo(out, attributeValue) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same goes for the write part as well |
||
); | ||
} | ||
|
||
/** | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1109,6 +1109,7 @@ protected Node( | |
clusterService.getClusterSettings(), | ||
threadPool | ||
); | ||
clusterService.setTaskResourceTrackingService(taskResourceTrackingService); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we just initialize Also should |
||
|
||
final SearchBackpressureSettings searchBackpressureSettings = new SearchBackpressureSettings( | ||
settings, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,6 +16,7 @@ | |
import org.opensearch.ExceptionsHelper; | ||
import org.opensearch.action.search.SearchShardTask; | ||
import org.opensearch.common.SuppressForbidden; | ||
import org.opensearch.common.annotation.PublicApi; | ||
import org.opensearch.common.inject.Inject; | ||
import org.opensearch.common.settings.ClusterSettings; | ||
import org.opensearch.common.settings.Setting; | ||
|
@@ -51,6 +52,7 @@ | |
/** | ||
* Service that helps track resource usage of tasks running on a node. | ||
*/ | ||
@PublicApi(since = "2.16.0") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understand why you are marking this public in that it is now exposed publicly from ClusterService. Is this the right level of visibility for this service? Can we instead expose only the required functionality from ClusterService instead the whole of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can do a wrapper of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ansjcy the
That seem to be the problem that core implementation has to fix, why the task level resources are not refreshed (on coordinator node)? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @reta ! thanks for the input. Currently the Let me think about this more. Instead of exposing the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks @ansjcy
It does not sound right, the task is still ongoing so its usage won't be correct.
The logical point (at least to me) of capturing task usages seems to be the moment task ends. It looks to me you are trying to chime in somewhere in between (while task is still executing), that does not look like the correct way. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
@reta We are currently working on a PR based on the above discussion. For the last point you made, what other justification and work is required to make the API public? We are trying to get all the query insights changes in 2.16 and this is the only PR that is dangling currently. Want to make sure we reach a path forward. Please let us know your suggestions. In the meantime will finalize the above draft PR if there are not concerns with this approach? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks @deshsidd
I did not design the original APIs, you may ask the contributor if he has any concerns. On the second point, if you need to make it public, apply the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks @reta! @reta Looks like you had initially reduced the visibility of the API. For now I am going to continue with the approach that Reta and Chenyang had discussed above and work on the following PR. Will also make the SearchRequestOperationsListener @publicapi as part of these changes. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
@deshsidd yes, you will understand why once try to apply There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
@SuppressForbidden(reason = "ThreadMXBean#getThreadAllocatedBytes") | ||
public class TaskResourceTrackingService implements RunnableTaskExecutionListener { | ||
|
||
|
@@ -357,6 +359,7 @@ public TaskResourceInfo getTaskResourceUsageFromThreadContext() { | |
/** | ||
* Listener that gets invoked when a task execution completes. | ||
*/ | ||
@PublicApi(since = "2.16.0") | ||
public interface TaskCompletionListener { | ||
void onTaskCompleted(Task task); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we change to
attributes.put(Attribute.SOURCE, request.source());
here itself so we can use this in the future?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe let's do it as part of your PR? to ensure this PR only contains the necessary urgent fixes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Thanks!