Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add read/write bytes metrics #803

Merged
merged 3 commits into from
Oct 25, 2024

Conversation

ykmr1224
Copy link
Collaborator

@ykmr1224 ykmr1224 commented Oct 23, 2024

Description

  • Add read/write bytes metrics
    • It uses SparkListener to collect read/write bytes/records for each task, and emit the sum once query completes.
  • Added HistoricGauge to emit each data points separately with timestamp.
    • Codahale stores metrics and those are emitted periodically. If application emit metrics multiple times within a period, those datapoints would be summarized or simply ignored other than last value. It is not ideal when we want to track each datapoints (such as when we want to gather metrics for each API call, etc.).
    • HistoricGausge stores each data point with timestamp, and emit each data points to CW using those timestamps.
  • Replaced QueryExecutionTimeMetric with HistoricGauge

Sample metrics:
input totalBytesRead

Related Issues

n/a

Check List

  • Updated documentation (docs/ppl-lang/README.md)
  • Implemented unit tests
  • Implemented tests for combination with other commands
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Tomoyuki Morita <[email protected]>
@ykmr1224 ykmr1224 force-pushed the metrics/read_write_bytes branch from 7ed5ff7 to 4348e58 Compare October 24, 2024 05:53
@ykmr1224 ykmr1224 marked this pull request as ready for review October 24, 2024 15:07
Signed-off-by: Tomoyuki Morita <[email protected]>
@ykmr1224 ykmr1224 requested a review from noCharger as a code owner October 24, 2024 22:00
Copy link
Collaborator

@seankao-az seankao-az left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

long timestamp;
}

private final List<DataPoint> dataPoints = Collections.synchronizedList(new LinkedList<>());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be a trivial question..
what could go wrong if use a regular list instead of synchronized list?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As metric emit is run in separate thread, there could be concurrency issue.

Signed-off-by: Tomoyuki Morita <[email protected]>
@ykmr1224 ykmr1224 requested a review from mengweieric as a code owner October 25, 2024 04:31
@ykmr1224 ykmr1224 merged commit 2a647d4 into opensearch-project:main Oct 25, 2024
4 checks passed
@opensearch-trigger-bot
Copy link

The backport to 0.5-nexus failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/opensearch-spark/backport-0.5-nexus 0.5-nexus
# Navigate to the new working tree
pushd ../.worktrees/opensearch-spark/backport-0.5-nexus
# Create a new branch
git switch --create backport/backport-803-to-0.5-nexus
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 2a647d43264cdcbb65f95b22c8373dee50497937
# Push it to GitHub
git push --set-upstream origin backport/backport-803-to-0.5-nexus
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/opensearch-spark/backport-0.5-nexus

Then, create a pull request where the base branch is 0.5-nexus and the compare/head branch is backport/backport-803-to-0.5-nexus.

ykmr1224 added a commit to ykmr1224/opensearch-spark that referenced this pull request Oct 29, 2024
* Add read/write bytes metrics

Signed-off-by: Tomoyuki Morita <[email protected]>

* Add unit test

Signed-off-by: Tomoyuki Morita <[email protected]>

* Address comments

Signed-off-by: Tomoyuki Morita <[email protected]>

---------

Signed-off-by: Tomoyuki Morita <[email protected]>

(cherry picked from commit 2a647d4)
seankao-az pushed a commit that referenced this pull request Oct 30, 2024
* Add read/write bytes metrics

Signed-off-by: Tomoyuki Morita <[email protected]>

* Add unit test

Signed-off-by: Tomoyuki Morita <[email protected]>

* Address comments

Signed-off-by: Tomoyuki Morita <[email protected]>

---------

Signed-off-by: Tomoyuki Morita <[email protected]>

(cherry picked from commit 2a647d4)
kenrickyap pushed a commit to Bit-Quill/opensearch-spark that referenced this pull request Dec 11, 2024
* Add read/write bytes metrics

Signed-off-by: Tomoyuki Morita <[email protected]>

* Add unit test

Signed-off-by: Tomoyuki Morita <[email protected]>

* Address comments

Signed-off-by: Tomoyuki Morita <[email protected]>

---------

Signed-off-by: Tomoyuki Morita <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants