Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the allowlist reconfigurable #64

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

mimaison
Copy link
Contributor

@mimaison mimaison commented Nov 5, 2024

I've packed quite a few changes in this PR.

First let's cover the form. I moved a bunch of classes under a new package collector as we were starting to have many classes in the top level package. That way all logic about collecting metrics is regrouped and we only have the 2 reporter implementations and the config in the top level package.

Now I'll try to explain the logic. The main issue is that only MetricsReporter (used for Kafka metrics) is reconfigurable. KafkaMetricsReporter (used for Yammer metrics) is not reconfigurable. Internally Kafka has some custom logic to make the JMX implementation of KafkaMetricsReporter reconfigurable but we can't do that. To work around this issue, the logic relies on PrometheusCollector which is a singleton and holds references to both reporter implementations.

So when a user triggers a reconfiguration with the Admin API, KafkaPrometheusMetricsReporter.reconfigure() is invoked with the new configuration values. We build a new PrometheusMetricsReporterConfig instance just to parse the allowlist and pass the allowlist to the PrometheusCollector instance which in turns passes it to KafkaCollector and YammerCollector. Now that the allowlist is in the collector classes I moved the filtering logic there too (it used to be in KafkaPrometheusMetricsReporter and YammerPrometheusMetricsReporter).

Another complication is the need to handle cases when the scope of the allowlist is increased. For that reason, we need to keep track of all existing metrics (the otherMetrics field in MetricsCollector), so if the user decides to collect more metrics we can retrieve them. I don't think there's an alternative to this and this is actually what JmxReporter in Kafka does too (https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/metrics/JmxReporter.java).

Since both KafkaCollector and YammerCollector have to keep track of all metrics and do the same filtering logic, I moved that into MetricsCollector and change it from a interface to an abstract class.

Moving the logic into the collectors causes a significant change in behavior when you run multiple instances of the reporter in the same JVM, for example in Kafka Connect, Kafka Streams, or a Kafka application that starts multiple clients. In this case KafkaCollector being a singleton shared by all instances, the allowlist is now global to all clients. So if a user configures different allowlists, the last client starting will overwrite the previous allowlist. For Kafka Streams and Kafka Connect, users should set the allowlist simply using prometheus.metrics.reporter.allowlist and not set it per client using the producer., consumer. or admin. prefixes. In other applications when starting multiple clients they should all have the same allowlist.

So for example with the following configuration should be avoided:

metric.reporters=io.strimzi.kafka.metrics.KafkaPrometheusMetricsReporter
prometheus.metrics.reporter.listener=http://:8082
admin.metric.reporters=io.strimzi.kafka.metrics.KafkaPrometheusMetricsReporter
admin.prometheus.metrics.reporter.listener=http://:8082
producer.metric.reporters=io.strimzi.kafka.metrics.KafkaPrometheusMetricsReporter
producer.prometheus.metrics.reporter.listener=http://:8082
consumer.metric.reporters=io.strimzi.kafka.metrics.KafkaPrometheusMetricsReporter
consumer.prometheus.metrics.reporter.listener=http://:8082
consumer.prometheus.metrics.reporter.allowlist=kafka_consumer_consumer.*

Using it, you only get kafka_consumer_consumer metrics.

One thing I'm considering is making PrometheusMetricsReporterConfig a singleton. That way we would recommend setting all metrics-reporter configurations at the top level (without any prefix). We would lose the ability to assign different port to different client but I don't think that's a very useful "feature", it's probably best in most cases to expose a single endpoint.

Finally, note that the reconfiguration feature is only supported on brokers, that's not a limitation of the metrics-reporter, it's just the way it works in Apache Kafka.

@mimaison mimaison added this to the 0.2.0 milestone Nov 5, 2024
@mimaison mimaison requested a review from k-wall November 6, 2024 17:20
Signed-off-by: Mickael Maison <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant