Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIGSEGV Tomcat 10.1 Java 17 #3521

Closed
fr19k opened this issue Feb 9, 2024 · 29 comments
Closed

SIGSEGV Tomcat 10.1 Java 17 #3521

fr19k opened this issue Feb 9, 2024 · 29 comments
Labels
agent-java community Issues and PRs created by the community triage

Comments

@fr19k
Copy link

fr19k commented Feb 9, 2024

Hi,

Elastic Apm agent (version 1.46.0) causes SIGSEGV crash in JVM (17.0.10)

Then crash occures usually in 30min after kubernetes pod starts
My application is using tomcat 10.1.18 docker image

In attachment you can find Error Report from JVM

Application works properly with disabled/removed Elastic APM Agent

is there any workaround for that? maybe some fix?

hss_err_pid1.txt

@github-actions github-actions bot added agent-java community Issues and PRs created by the community triage labels Feb 9, 2024
@jackshirazi
Copy link
Contributor

Does it still happen if you use a different garbage collector?

@fr19k
Copy link
Author

fr19k commented Feb 9, 2024

I didn't try it. My application uses G1GC

Should i try SerialGC?

@fr19k
Copy link
Author

fr19k commented Feb 9, 2024

On another enviroment my application uses SerialGC, but the problem still exists

@jackshirazi
Copy link
Contributor

Any chance of producing a test we can reproduce?

@jackshirazi
Copy link
Contributor

Can you post a couple of the other crashlogs too please

@jackshirazi
Copy link
Contributor

Also just to confirm, you haven't enabled inferred spans? You can see the properties that are set in the startup info (agent output with lines including StartupInfo)

@fr19k
Copy link
Author

fr19k commented Feb 9, 2024

I think that i didn't enable inferred spans

2024-02-06 21:00:51,269 [main] INFO co.elastic.apm.agent.util.JmxUtils - Found JVM-specific OperatingSystemMXBean interface: com.sun.management.OperatingSystemMXBean 2024-02-06 21:00:51,372 [main] INFO co.elastic.apm.agent.util.JmxUtils - Found JVM-specific ThreadMXBean interface: com.sun.management.ThreadMXBean 2024-02-06 21:00:51,395 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - Starting Elastic APM 1.46.0 as app on Java 17.0.10 Runtime version: 17.0.10+7 VM version: 17.0.10+7 (Eclipse Adoptium) Linux 5.15.4-1.el7.elrepo.x86_64 2024-02-06 21:00:51,396 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - service_name: 'app' (source: Java System Properties) 2024-02-06 21:00:51,396 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - secret_token: 'XXXX' (source: Environment Variables) 2024-02-06 21:00:51,397 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - server_urls: 'https://server1:8200,https://server1:8200' (source: Java System Properties) 2024-02-06 21:00:51,397 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - application_packages: 'package1,package2,package3' (source: Java System Properties) 2024-02-06 21:00:53,178 [main] INFO co.elastic.apm.agent.impl.ElasticApmTracer - Tracer switched to RUNNING state 2024-02-06 21:00:54,004 [elastic-apm-server-healthcheck] INFO co.elastic.apm.agent.report.ApmServerHealthChecker - Elastic APM server is available: { "build_date": "2022-01-28T08:52:26Z", "build_sha": "a1459eb3f24e7c054794a82f6c19b66975098866", "publish_ready": true, "version": "7.17.0"} 2024-02-06 21:00:54,047 [elastic-apm-server-healthcheck] INFO co.elastic.apm.agent.report.ApmServerHealthChecker - Elastic APM server is available: { "build_date": "2022-01-28T08:52:26Z", "build_sha": "a1459eb3f24e7c054794a82f6c19b66975098866", "publish_ready": true, "version": "7.17.0"}

What is the most important for you in other crashlogs? Problematic frame?

J 20648 c2 co.elastic.apm.agent.loginstr.reformatting.AbstractEcsReformattingHelper.onAppendEnter(Ljava/lang/Object;)Z (124 bytes) @ 0x00007f20ae058090 [0x00007f20ae057ce0+0x00000000000003b0]

@jackshirazi
Copy link
Contributor

Are all the stacks the same in the crashlogs? If any are different we'd like to see any other stacks.

Two suggestions we'd like you to try separately:

  1. Generate a JITC log eg -XX:+UnlockDiagnosticVMOptions -Xlog:class+load=info -XX:+LogCompilation (more details here ) and specifically identify if there are any JITC actions just before the crash (or include the log here, we can look)
  2. Exclude the crashing method from JITC, eg adding to the command line
    -XX:CompileCommand=exclude,co/elastic/apm/agent/collections/CachedKeyWeakConcurrentMap.getLookupKey

@fr19k
Copy link
Author

fr19k commented Feb 14, 2024

Ad.2 I added

-XX:CompileCommand=exclude,co/elastic/apm/agent/collections/CachedKeyWeakConcurrentMap.getLookupKey

Now the problematic frame is : # J 2489 c2 co.elastic.apm.agent.weakconcurrent.CachedLookupKey.get(Ljava/lang/Object;)Lco/elastic/apm/agent/weakconcurrent/CachedLookupKey; (16 bytes) @ 0x00007f1168fcfbc6 [0x00007f1168fcfac0+0x0000000000000106]

ErrorFile in attachment

hss_err_pid1_2.txt

@jackshirazi
Copy link
Contributor

Did you get any info from the JITC logging?

Can we try more generic JITC disabling, by now adding to the command line

-XX:CompileCommand="exclude,co/elastic/apm/agent/collections/CachedKeyWeakConcurrentMap.*" -XX:CompileCommand="exclude,co/elastic/apm/agent/weakconcurrent/CachedLookupKey.*" -XX:CompileCommand="exclude,co/elastic/apm/agent/loginstr/reformatting/AbstractEcsReformattingHelper.*" 

It's not at all clear why this is crashing, the hotspot logging is itself crashing (hence the incomplete crashlog) but it's successfully seeing everything in the Elastic data structure, so it's something outside that, it's a process of narrowing down by excluding other things

If you need an urgent workaround, you can disable the reformatting eg -Delastic.apm.disable_instrumentations=log-reformatting

@fr19k
Copy link
Author

fr19k commented Feb 15, 2024

JITC log will be available soon...

I Added below instruction to command line
-XX:CompileCommand="exclude,co/elastic/apm/agent/collections/CachedKeyWeakConcurrentMap.*" -XX:CompileCommand="exclude,co/elastic/apm/agent/weakconcurrent/CachedLookupKey.*" -XX:CompileCommand="exclude,co/elastic/apm/agent/loginstr/reformatting/AbstractEcsReformattingHelper.*"

and now problematic frame is:
J 875 java.lang.System.identityHashCode(Ljava/lang/Object;)I [email protected] (0 bytes) @ 0x00007f5d84f568ce [0x00007f5d84f568a0+0x000000000000002e]

New error file is available in attachment

hss_err_pid1_3.txt

@jackshirazi
Copy link
Contributor

Yes, this is crashing inside the JVM native code in ObjectSynchronizer::FastHashCode. It's not at all clear why that would happen, we'll continue to look into it but I don't expect any quick solution, it's not even clear this is an agent bug. The best we can suggest is the workaround of disabling the log-reformatting given above.

If you can create a standalone example that you can pass to us to reproduce this crash, that would be awesome!

@JonasKunz
Copy link
Contributor

JonasKunz commented Feb 21, 2024

Hi @fr19k ,

We have another user with the same crash: Both crashes occur due to java.util logging reformatting on the same System.identityHashCode invocation.

One thing I noticed is that both you and the other user specify the -java.util.logging.manager=org.apache.juli.ClassLoaderLogManager flag to configure the log manager.

Could you maybe try again without that flag (so using the default logmanager) and check whether the error still occurs?

@fr19k
Copy link
Author

fr19k commented Feb 21, 2024

OK, i will check your suggestion with log manager

I can confirm that workaround with -Delastic.apm.disable_instrumentations=log-reformatting works properly (application is not crashing)

@fr19k
Copy link
Author

fr19k commented Feb 23, 2024

What is the default java.util.logging.manager ?

Tomcat seems to allows to overwrite java.util.logging.manager via LOGGING_MANAGER env, but i have specify proper class

@JonasKunz
Copy link
Contributor

I think it should be java.util.logging.LogManager, though not 100% sure here. That class has a protected constructor, so we might need to create a subclass to reset it.

Alternatively you might be able to simply remove the part where it configures java.util.logging.manager from the tomcat startup script.

@fr19k
Copy link
Author

fr19k commented Feb 23, 2024

If i set env variable LOGGING_MANAGER to java.util.logging.LogManager , application crashes at startup with below message

Error: Main method not found in class java.util.logging.LogManager, please define the main method as:
   public static void main(String[] args)
or a JavaFX application class must extend javafx.application.Application

@JonasKunz
Copy link
Contributor

Based on this SO thread it looks like you need to provide the full arguments for the LOGGING_MANAGER env variable:
LOGGING_MANAGER="-Djava.util.logging.manager=java.util.logging.LogManager"

If that doesn't work, I think we can also trick tomcat by setting LOGGING_MANAGER="-Ddummy.prop=foobar": I'm assuming here that the tomcat startup script checks if LOGGING_MANAGER is empty and only in that case applies it's own log manager.

As @jackshirazi mentioned earlier, it would most likely be easier for us to find the root cause if you could provide us an application / docker image to allow us to reproduce this crash ourselves.

@SylvainJuge
Copy link
Member

Hi @fr19k,

Given that the crash seems quite fast to trigger here, we'd like to have a bit more information on your app/setup to enable us attempting to reproduce it.

could you provide the following:

  • agent configuration: some config is passed through JVM variables, but the log reformatting seems enabled through env variables.
  • few implementation details on the app: what are the dependencies, does it uses java.util.logging for logging directly or is it only for the Tomcat part ?
  • how much activity is there on the app before it gets triggered ? for example is that an "intensive load test for 30min", or does it just crashes with few hundreds of transactions over 30min ? The number of transactions between the start and the time it crashed in Kibana could provide a good hint here. Also, do you have a rough estimate on the GC activity in terms of overhead or executions per minute ?

@fr19k
Copy link
Author

fr19k commented Mar 12, 2024

Hi,

agent configuration: some config is passed through JVM variables, but the log reformatting seems enabled through env variables.

Via env variables i'm passing ELASTIC_APM_SECRET_TOKEN, any other configs are passed through JVM variables

few implementation details on the app: what are the dependencies, does it uses java.util.logging for logging directly or is it only for the Tomcat part ?

if i set LOGGING_MANAGER to, "-Djava.util.logging.manager=java.util.logging.LogManager" or "-Ddummy.prop=foobar" i receive many errors from tomcat like

java.lang.ClassNotFoundException: 2localhost.org.apache.juli.AsyncFileHandler at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641) at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525) at java.logging/java.util.logging.LogManager.createLoggerHandlers(LogManager.java:1005) at java.logging/java.util.logging.LogManager$4.run(LogManager.java:975) at java.logging/java.util.logging.LogManager$4.run(LogManager.java:971) at java.base/java.security.AccessController.doPrivileged(AccessController.java:318) at java.logging/java.util.logging.LogManager.loadLoggerHandlers(LogManager.java:971) at java.logging/java.util.logging.LogManager.initializeGlobalHandlers(LogManager.java:2424) at java.logging/java.util.logging.LogManager$RootLogger.addHandler(LogManager.java:2514) at java.logging/java.util.logging.LogManager.setLoggerHandlers(LogManager.java:989) at java.logging/java.util.logging.LogManager$4.run(LogManager.java:974) at java.logging/java.util.logging.LogManager$4.run(LogManager.java:971) at java.base/java.security.AccessController.doPrivileged(AccessController.java:318) at java.logging/java.util.logging.LogManager.loadLoggerHandlers(LogManager.java:971) at java.logging/java.util.logging.LogManager.addLogger(LogManager.java:1204) at java.logging/java.util.logging.LogManager$2.run(LogManager.java:398) at java.base/java.security.AccessController.doPrivileged(AccessController.java:318) at java.logging/java.util.logging.LogManager.ensureLogManagerInitialized(LogManager.java:383) at java.logging/java.util.logging.LogManager.getLogManager(LogManager.java:431) at java.logging/sun.util.logging.internal.LoggingProviderImpl.demandJULLoggerFor(LoggingProviderImpl.java:409) at java.logging/sun.util.logging.internal.LoggingProviderImpl.demandLoggerFor(LoggingProviderImpl.java:438) at java.base/jdk.internal.logger.DefaultLoggerFinder.getLogger(DefaultLoggerFinder.java:159) at java.base/jdk.internal.logger.LazyLoggers.getLoggerFromFinder(LazyLoggers.java:410) at java.base/jdk.internal.logger.LazyLoggers.getLazyLogger(LazyLoggers.java:465) at java.base/sun.util.logging.PlatformLogger.getLogger(PlatformLogger.java:297) at java.base/sun.net.www.protocol.http.HttpURLConnection.<clinit>(HttpURLConnection.java:436) at java.base/sun.net.www.protocol.http.Handler.openConnection(Handler.java:62) at java.base/java.net.URL.openConnection(URL.java:1144) at co.elastic.apm.agent.util.UrlConnectionUtils.openUrlConnectionThreadSafely(UrlConnectionUtils.java:47) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider.executeRequest(CloudMetadataProvider.java:418) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider.getAwsMetadata(CloudMetadataProvider.java:222) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider.access$000(CloudMetadataProvider.java:53) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider$1.call(CloudMetadataProvider.java:155) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider$1.call(CloudMetadataProvider.java:149) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at co.elastic.apm.agent.util.ExecutorUtils$2.run(ExecutorUtils.java:99) at java.base/java.lang.Thread.run(Thread.java:840) Can't load log handler "3manager.org.apache.juli.AsyncFileHandler" java.lang.ClassNotFoundException: 3manager.org.apache.juli.AsyncFileHandler java.lang.ClassNotFoundException: 3manager.org.apache.juli.AsyncFileHandler at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641) at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525) at java.logging/java.util.logging.LogManager.createLoggerHandlers(LogManager.java:1005) at java.logging/java.util.logging.LogManager$4.run(LogManager.java:975) at java.logging/java.util.logging.LogManager$4.run(LogManager.java:971) at java.base/java.security.AccessController.doPrivileged(AccessController.java:318) at java.logging/java.util.logging.LogManager.loadLoggerHandlers(LogManager.java:971) at java.logging/java.util.logging.LogManager.initializeGlobalHandlers(LogManager.java:2424) at java.logging/java.util.logging.LogManager$RootLogger.addHandler(LogManager.java:2514) at java.logging/java.util.logging.LogManager.setLoggerHandlers(LogManager.java:989) at java.logging/java.util.logging.LogManager$4.run(LogManager.java:974) at java.logging/java.util.logging.LogManager$4.run(LogManager.java:971) at java.base/java.security.AccessController.doPrivileged(AccessController.java:318) at java.logging/java.util.logging.LogManager.loadLoggerHandlers(LogManager.java:971) at java.logging/java.util.logging.LogManager.addLogger(LogManager.java:1204) at java.logging/java.util.logging.LogManager$2.run(LogManager.java:398) at java.base/java.security.AccessController.doPrivileged(AccessController.java:318) at java.logging/java.util.logging.LogManager.ensureLogManagerInitialized(LogManager.java:383) at java.logging/java.util.logging.LogManager.getLogManager(LogManager.java:431) at java.logging/sun.util.logging.internal.LoggingProviderImpl.demandJULLoggerFor(LoggingProviderImpl.java:409) at java.logging/sun.util.logging.internal.LoggingProviderImpl.demandLoggerFor(LoggingProviderImpl.java:438) at java.base/jdk.internal.logger.DefaultLoggerFinder.getLogger(DefaultLoggerFinder.java:159) at java.base/jdk.internal.logger.LazyLoggers.getLoggerFromFinder(LazyLoggers.java:410) at java.base/jdk.internal.logger.LazyLoggers.getLazyLogger(LazyLoggers.java:465) at java.base/sun.util.logging.PlatformLogger.getLogger(PlatformLogger.java:297) at java.base/sun.net.www.protocol.http.HttpURLConnection.<clinit>(HttpURLConnection.java:436) at java.base/sun.net.www.protocol.http.Handler.openConnection(Handler.java:62) at java.base/java.net.URL.openConnection(URL.java:1144) at co.elastic.apm.agent.util.UrlConnectionUtils.openUrlConnectionThreadSafely(UrlConnectionUtils.java:47) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider.executeRequest(CloudMetadataProvider.java:418) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider.getAwsMetadata(CloudMetadataProvider.java:222) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider.access$000(CloudMetadataProvider.java:53) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider$1.call(CloudMetadataProvider.java:155) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider$1.call(CloudMetadataProvider.java:149) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at co.elastic.apm.agent.util.ExecutorUtils$2.run(ExecutorUtils.java:99) at java.base/java.lang.Thread.run(Thread.java:840) Can't load log handler "4host-manager.org.apache.juli.AsyncFileHandler" java.lang.ClassNotFoundException: 4host-manager.org.apache.juli.AsyncFileHandler java.lang.ClassNotFoundException: 4host-manager.org.apache.juli.AsyncFileHandler at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641) at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525) at java.logging/java.util.logging.LogManager.createLoggerHandlers(LogManager.java:1005) at java.logging/java.util.logging.LogManager$4.run(LogManager.java:975) at java.logging/java.util.logging.LogManager$4.run(LogManager.java:971) at java.base/java.security.AccessController.doPrivileged(AccessController.java:318) at java.logging/java.util.logging.LogManager.loadLoggerHandlers(LogManager.java:971) at java.logging/java.util.logging.LogManager.initializeGlobalHandlers(LogManager.java:2424) at java.logging/java.util.logging.LogManager$RootLogger.addHandler(LogManager.java:2514) at java.logging/java.util.logging.LogManager.setLoggerHandlers(LogManager.java:989) at java.logging/java.util.logging.LogManager$4.run(LogManager.java:974) at java.logging/java.util.logging.LogManager$4.run(LogManager.java:971) at java.base/java.security.AccessController.doPrivileged(AccessController.java:318) at java.logging/java.util.logging.LogManager.loadLoggerHandlers(LogManager.java:971) at java.logging/java.util.logging.LogManager.addLogger(LogManager.java:1204) at java.logging/java.util.logging.LogManager$2.run(LogManager.java:398) at java.base/java.security.AccessController.doPrivileged(AccessController.java:318) at java.logging/java.util.logging.LogManager.ensureLogManagerInitialized(LogManager.java:383) at java.logging/java.util.logging.LogManager.getLogManager(LogManager.java:431) at java.logging/sun.util.logging.internal.LoggingProviderImpl.demandJULLoggerFor(LoggingProviderImpl.java:409) at java.logging/sun.util.logging.internal.LoggingProviderImpl.demandLoggerFor(LoggingProviderImpl.java:438) at java.base/jdk.internal.logger.DefaultLoggerFinder.getLogger(DefaultLoggerFinder.java:159) at java.base/jdk.internal.logger.LazyLoggers.getLoggerFromFinder(LazyLoggers.java:410) at java.base/jdk.internal.logger.LazyLoggers.getLazyLogger(LazyLoggers.java:465) at java.base/sun.util.logging.PlatformLogger.getLogger(PlatformLogger.java:297) at java.base/sun.net.www.protocol.http.HttpURLConnection.<clinit>(HttpURLConnection.java:436) at java.base/sun.net.www.protocol.http.Handler.openConnection(Handler.java:62) at java.base/java.net.URL.openConnection(URL.java:1144) at co.elastic.apm.agent.util.UrlConnectionUtils.openUrlConnectionThreadSafely(UrlConnectionUtils.java:47) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider.executeRequest(CloudMetadataProvider.java:418) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider.getAwsMetadata(CloudMetadataProvider.java:222) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider.access$000(CloudMetadataProvider.java:53) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider$1.call(CloudMetadataProvider.java:155) at co.elastic.apm.agent.impl.metadata.CloudMetadataProvider$1.call(CloudMetadataProvider.java:149) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at co.elastic.apm.agent.util.ExecutorUtils$2.run(ExecutorUtils.java:99) at java.base/java.lang.Thread.run(Thread.java:840)

My application uses logback for logging. The class java.util.logging is used only for tomcat part

how much activity is there on the app before it gets triggered ? for example is that an "intensive load test for 30min", or does it just crashes with few hundreds of transactions over 30min ? The number of transactions between the start and the time it crashed in Kibana could provide a good hint here. Also, do you have a rough estimate on the GC activity in terms of overhead or executions per minute ?

My application crashes in every 30-60mins after normal traffic. When I disable log-reformatting, pod is stable (15days without restart)

There was no intensive load tests, it's just normal production traffic.
In other application i have many messages from APM agent like
2024-03-12 11:27:54,279 [Worker-45] WARN co.elastic.apm.agent.impl.transaction.Span - Max spans (200) for transaction 'ACM_JOB' 00-74dd6dc51fe53e21518509858f7f1eca-e59cd06e8969cc62-01 (49052de7) has been reached. For this transaction and possibly others, further spans will be dropped. See config param 'transaction_max_spans'.
But Agent is working stable.
In application(which crash every 30-60mins), there is no logs like above

GcLogs looks good (there are no full gc pauses), throughput ~99.5%

Below you can see a chart with typical(production) number of transactions
image

@SylvainJuge
Copy link
Member

Thanks for your feedback, with that we should be able to attempt reproducing it.

One interesting part is that while the application uses logback for logging, the JUL logging API used by Tomcat should have very light usage in production.

@fr19k
Copy link
Author

fr19k commented Mar 14, 2024

Logback is used due there is plugin(logback-logstash-encoder) to sending logs directly to logstash

@jackshirazi
Copy link
Contributor

Hi @fr19k, an Oracle engineer has suggested running the test with -Xint - which although it would be much slower, could indicate whether or not this is a compiler bug. Would you be willing to run that test? It would likely need to run for much longer than previous tests

@jackshirazi
Copy link
Contributor

Hi, no need to try this, we've reproduced the crash and found that disabling the c2 compiler (with -XX:TieredStopAtLevel=1) stops it crashing, so most likely this is a c2 compiler bug. We'll continue to work on a resolution for this

@jackshirazi
Copy link
Contributor

To be closed after JVM release with https://bugs.openjdk.org/browse/JDK-8322726 fixed (we also expect a backport to 17 and 21, so wait for those for tracking purposes before closing)

@jackshirazi
Copy link
Contributor

Hi @yuvalbar84 this is not a jar option, the c2 compiler is a JIT compiler in the JVM. Your options are any of these

  • run the JVM with the command line flag -XX:TieredStopAtLevel=1
  • run the agent with disable_instrumentations=log-reformatting, eg if setting the property on the command line this would be -Delastic.apm.disable_instrumentations=log-reformatting
  • wait for the next release of the JVM, the JVM bug fix has been merged into 23 and backported to 21 and 17 so will be included in the next release of each of these JVM versions

@JonasKunz
Copy link
Contributor

Should be fixed now in JDK 17.0.12 according to the release notes.

@yuvalbar84
Copy link

@JonasKunz thank you !

@JonasKunz
Copy link
Contributor

JonasKunz commented Aug 14, 2024

MInor correction:
The issue is fixed for the oracle bundled patch releases 17.0.11.0.1 and 17.0.12.0.1. For OpenJDK the fix will ship with 17.0.13. (JVM Bug)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agent-java community Issues and PRs created by the community triage
Projects
None yet
Development

No branches or pull requests

5 participants