Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dimensions to cloudwatch metric and metric filter option for CWSink. #176

Merged
merged 1 commit into from
Nov 28, 2023

Conversation

vmmusings
Copy link
Member

@vmmusings vmmusings commented Nov 23, 2023

Description

  • Added option to include metric filter in CWSink.
  • CloudWatchSDK is not available during runtime, so adding the dependency explicitly.
  • Example Usage :--conf spark.metrics.conf.*.sink.cloudwatch.regex=.*Flint.* property would publish only metrics with the regex Given.
  • Streamlined implementation of adding dimensions to metricNames.
  • No need to review DimensionedName Class: It is picked up from https://github.com/azagniotov/codahale-aggregated-metrics-cloudwatch-reporter.

Usage of Dimensioned Name for below image:

final DimensionedName dimensionedName = DimensionedName
        .withName("FlintOpensearchErrorCount")
        .withDimension("domainIdent", "88888")
        .build();

metricRegistry.counter(dimensionedName.encode()).inc();
metricRegistry.counter(derivedDimensionedName.encode()).inc();

Tested on emr-serverless by publishing dummy metrics.

Screenshot 2023-11-22 at 4 20 46 PM

Issues Resolved

List any issues this PR will resolve, e.g. Closes [...].

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@vmmusings vmmusings self-assigned this Nov 23, 2023
@vmmusings vmmusings added the enhancement New feature or request label Nov 23, 2023
@vmmusings vmmusings marked this pull request as ready for review November 23, 2023 00:17
@penghuo
Copy link
Collaborator

penghuo commented Nov 23, 2023

what is the relationship of metrics dimension and screenshot?

final DimensionedName dimensionedName = DimensionedName
        .withName("test")
        .withDimension("key1", "val1")
        .withDimension("key2", "val2")
        .withDimension("key3", "val3")
        .build();

@vmmusings
Copy link
Member Author

what is the relationship of metrics dimension and screenshot?

final DimensionedName dimensionedName = DimensionedName
        .withName("test")
        .withDimension("key1", "val1")
        .withDimension("key2", "val2")
        .withDimension("key3", "val3")
        .build();

Let me change the PR description.
Spark Dropwizard metrics doesn't support dimensions, we encode dimensions in metric name and add them when pushing to CW.
For the screenshot.

DimensionName dimensionedName = DimensionedName
.withName("FlintOpensearchErrorCount")
.withDimension("domainIdent", "88888")
.build()

@vmmusings vmmusings closed this Nov 23, 2023
@vmmusings vmmusings reopened this Nov 23, 2023
@@ -59,6 +59,8 @@ lazy val flintCore = (project in file("flint-core"))
exclude ("org.apache.logging.log4j", "log4j-api"),
"com.amazonaws" % "aws-java-sdk" % "1.12.397" % "provided"
exclude ("com.fasterxml.jackson.core", "jackson-databind"),
"com.amazonaws" % "aws-java-sdk-cloudwatch" % "1.12.593"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance this can be provided or not?

meanwhile I'm just thinking any impact on open source user who use Flint as library. Does all classes added have to be in Flint core module?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Provided is not working...cloudwatch is not available.
  • I still didn't get the idea behind flint-core,[I only see OS Client] should I one more package as flint-common-utils?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There won't be any impact to opensource user. We emit metrics but nothing will be pushed unless one configures a sink.

Copy link
Collaborator

@dai-chen dai-chen Nov 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Provided is not working...cloudwatch is not available.
  • I still didn't get the idea behind flint-core,[I only see OS Client] should I one more package as flint-common-utils?

For now we only have FlintOpenSearchClient. We may add FlintFileSystemClient later (for storing Flint index data on S3). I was thinking for user who use Flint library in their own application.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even for them we are actually providing benefit by giving out metrics similar to driver and executor metrics. CWSink they can leverage if they wish, otherwise they can ignore. We can discuss offline on this and will post the summary over here.

@vmmusings vmmusings merged commit 0616865 into opensearch-project:main Nov 28, 2023
8 of 10 checks passed
@penghuo penghuo mentioned this pull request Dec 7, 2023
33 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants