Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite to use the new Prometheus Java client #25

Merged
merged 3 commits into from
Jun 27, 2024

Conversation

mimaison
Copy link
Contributor

@mimaison mimaison commented Jun 25, 2024

This introduces some changes:

  • Quantiles names are now serialized without trailing zeros. So we now have labels like quantile="0.5". Previously the label was quantile="0.50".
  • I've added sum series to Summary metrics. They now have quantiles, count and sum. Sum series have the _sum suffix.
  • Removed _count suffix from Counter metrics. As per the Counter documentation, Counter metrics should use the _total suffix. Previously we were injecting _count and Prometheus was adding _total. So now we have kafka_server_kafka_server_brokertopicmetrics_bytesinpersec_total while before we had kafka_server_kafka_server_brokertopicmetrics_bytesinpersec_count_total.
  • The new client has a dedicated metric type of non-numeric metrics, InfoSnapshot. Under the covers this still emits a Gauge metric with a value set to 1, like we do today. This is because Prometheus does not support non-numeric metrics. It's preferable to use the dedicated type, however this new type uses the _info suffix. So now we have kafka_server_kafka_server_kafkaserver_clusterid_info{ClusterId="mHGu-1meQOSC_RpQHdCpOw"} while before we had kafka_server_kafka_server_kafkaserver_clusterid{ClusterId="mHGu-1meQOSC_RpQHdCpOw"}. The suffix helps identify metrics whose values should not be graphed (as they are constant) and where the useful information is in the labels instead.
  • Gauge metrics don't have a suffix anymore, as per the OpenMetrics specification. The previous Prometheus client was injecting a _total suffix. We now have kafka_server_txn_marker_channel_metrics_response, while previously we had kafka_server_txn_marker_channel_metrics_request_total.
  • The new client does not insert a trailing comma after the last label. We now have kafka_server_alter_partition_metrics_connection_close_rate{BrokerId="0"}, while we previously had kafka_server_alter_partition_metrics_connection_close_rate{BrokerId="0",}.

The new client allows exposing metrics in the OpenMetrics format and serialized with Protobuf. When Prometheus scrapes the endpoint, it negotiates the format using the Accept HTTP header. You can check different formats using the debug query parameter on the GET /metrics endpoint. For example:

  • /metrics?debug=openmetrics: View OpenMetrics text format.
  • /metrics?debug=text: View Prometheus text format.
  • /metrics?debug=prometheus-protobuf: View a text representation of the Prometheus protobuf format.

Related resources:

@mimaison mimaison force-pushed the new-client branch 2 times, most recently from a9e3cc9 to c1b97e5 Compare June 26, 2024 10:30
@mimaison mimaison marked this pull request as ready for review June 26, 2024 10:37
*/
@Override
public MetricSnapshots collect() {
Map<String, GaugeSnapshot.Builder> gaugeBuilders = new HashMap<>();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the MetricSnapshotBuilder interface does not have a build() method. I opened a PR against prometheus/client_java to change that.

In the meantime, I think we have to keep a collection per metric type.

* @param labels Labels associated with the datapoint
* @return The {@link CounterSnapshot.CounterDataPointSnapshot} datapoint
*/
public static CounterSnapshot.CounterDataPointSnapshot convert(Number number, Labels labels) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super happy with the various methods especially as convert(double value, Labels labels) and convert(Number number, Labels labels) shadow each other. Ideally I wanted to keep this file "Yammer free" but I ended up having to use Timer and Histogram.

So maybe we should tweak some of these convert() methods to accept the Yammer and Kafka metric types.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you had sum and count parameters that were Suppliers and a value Function, you could hide yammerness from this class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I shuffled a few things around so this class does not depend on Yammer anymore.

}

private String metricName(MetricName metricName) {
return PrometheusNaming
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I fully understand how to use the various methods in PrometheusNaming especially sanitizeMetricName() and prometheusName().

}

// for testing
KafkaPrometheusMetricsReporter(PrometheusRegistry registry) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PrometheusRegistry has no methods to clear it. For this reason, I introduced constructors to inject a registry instance. This simplifies testing.
Same in YammerPrometheusMetricsReporter

*/
public PrometheusMetricsReporterConfig(Map<?, ?> props) {
public PrometheusMetricsReporterConfig(Map<?, ?> props, PrometheusRegistry registry) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not ideal to have a registry field here. But it's required when start the HTTP server.


@BeforeEach
public void setup() {
labels = new HashMap<>();
labels.put("key", "value");
tagsMap = new LinkedHashMap<>();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setup of the 2 Collector tests is still a bit wonky with the labels/tagsMap fields. I'll see if I can tidy this up a little bit.

pom.xml Outdated
@@ -309,6 +309,23 @@
</archive>
</configuration>
</plugin>
<!-- Temporarily re-add the assembly plugin before we fix the packaging -->
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be addressed in #24

@@ -169,17 +169,17 @@

<dependency>
<groupId>io.prometheus</groupId>
<artifactId>simpleclient</artifactId>
<artifactId>prometheus-metrics-model</artifactId>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try to avoid that. It makes it hard to override dependencies etc.


private static Quantiles quantiles(Sampling samplingMetric) {
Quantiles.Builder quantilesBuilder = Quantiles.builder();
for (double quantile : Arrays.asList(0.50, 0.75, 0.95, 0.98, 0.99, 0.999)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extract constant

* @return The {@link InfoSnapshot.InfoDataPointSnapshot} datapoint
*/
public static InfoSnapshot.InfoDataPointSnapshot convert(Object value, Labels labels, String metricName) {
labels = labels.add(PrometheusNaming.sanitizeLabelName(metricName), String.valueOf(value));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to mutate the caller's labels?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Labels.add() returns a new instance, but yeah we don't want to reassign labels. Fixed.

Copy link

@k-wall k-wall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Comment on lines 8 to 11
<suppress checks="CyclomaticComplexity" files="YammerMetricsCollector.java"/>
<suppress checks="NPathComplexity" files="YammerMetricsCollector.java"/>
<suppress checks="JavaNCSS" files="YammerMetricsCollector.java"/>
<suppress checks="ClassFanOutComplexity" files="YammerMetricsCollector.java"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you use annotations to suppress these instead of suppressing it here? That is what we normally do -> it is better coupled with the code and easier to see and maintain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, if it's what Strimzi usually does, yes let's use annotations. Done

@scholzj scholzj added this to the 0.1.0 milestone Jun 27, 2024
@mimaison
Copy link
Contributor Author

I rebased on main and removed my pom.xml changes related to packaging as it's been resolved in #27

@mimaison mimaison merged commit 4c1fd9e into strimzi:main Jun 27, 2024
7 checks passed
@mimaison mimaison deleted the new-client branch June 27, 2024 19:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants