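The AbstractIntakeApiHandler supports retries and backoff, so I think it would be a good idea to change the log level at apm-agent-java/apm-agent-core/src/main/java/co/elastic/apm/agent/report/AbstractIntakeApiHandler.java, line 123 in e010c35:
logger.error("Error trying to connect to APM Server at {}. Although not necessarily related to SSL, some related SSL " +
to WARN for the first couple of connection issues, and then fall back to using ERROR. What do you think?
Same goes for IntakeV2ReportingEventHandler.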
2024-04-20 18:54:21,338 [elastic-apm-server-reporter] ERROR co.elastic.apm.agent.report.AbstractIntakeApiHandler - Error trying to connect to APM Server at http://127.0.0.1:8200/intake/v2/events. Although not necessarily related to SSL, some related SSL configurations corresponding the current connection are logged at INFO level.
2024-04-20 18:54:21,338 [elastic-apm-server-reporter] INFO co.elastic.apm.agent.report.AbstractIntakeApiHandler - Backing off for 36 seconds (+/-10%)
If I understand it correctly, you'd like to have the following:
- For the first few occurrences of a connection issue: just issue a warning.
- If the connection issues persist: issue an error.
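For reference, the two-step behavior summarized above could be sketched roughly as follows. This is only an illustration of the proposal, not code from the agent: the class name `EscalatingLogLevel`, its methods, and the `warnLimit` threshold are all hypothetical, and the sketch returns a level instead of calling a real logger so it stays self-contained.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of apm-agent-java: chooses a log level
// based on how many connection attempts have failed in a row.
class EscalatingLogLevel {

    public enum Level { WARN, ERROR }

    // Number of consecutive failures that are still reported as WARN (assumed value, caller-chosen).
    private final int warnLimit;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    EscalatingLogLevel(int warnLimit) {
        this.warnLimit = warnLimit;
    }

    // Call on every failed connection attempt; returns the level to log at.
    Level onFailure() {
        int failures = consecutiveFailures.incrementAndGet();
        return failures <= warnLimit ? Level.WARN : Level.ERROR;
    }

    // Call after a successful connection so the next outage starts at WARN again.
    void onSuccess() {
        consecutiveFailures.set(0);
    }
}
```

A caller would invoke `onFailure()` in the connection error path, log at the returned level, and invoke `onSuccess()` once a request goes through.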
The problem I see here is that whether this should be considered an ERROR or just a WARNING is very context-sensitive: it depends on the application and on the user's expectations, so it is very hard to come up with a common rule. For example, when the APM Server can't be reached, some applications have a very light load and can buffer for a while, while others have lots of traffic and will drop data very quickly.
In addition, when querying log messages, a filter on the log level is often applied first. Unless the query is based on the log message itself, it could be confusing to have the same message reported at two different log levels, and a user focused on one level (WARN or ERROR) could miss the occurrences logged at the other.
So I think it is not worth modifying the current behavior, and this should stay an ERROR. If, however, you are in an environment where such error messages are too frequent, that is very likely a symptom of an underlying issue, such as high load on the APM Server or a network problem, which should not be ignored.