Spark triggering massive amounts of "ThreadDump" safepoints, causing stutter/lag #458

Open
pietro-lopes opened this issue Sep 19, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@pietro-lopes

Description

Some ATM10 players are seeing lag spikes. I asked them to turn on safepoint and GC logging to see what is going on, and it turns out this is happening:

With spark https://mclo.gs/q9v2q3W

Without spark https://mclo.gs/XpTnbfu

Reproduction Steps

Happens just by having spark installed (maybe the background profiler?).

It seems to affect very few people; I can't reproduce it on Linux (PopOS) or Windows 10.

Expected Behaviour

Not sure. Is it suffering from safepoint bias (at least on Windows)?

Platform Information

  • Minecraft Version: 1.21.1
  • Platform Type: client
  • Platform Brand: Neoforge
  • Platform Version: Neo 21.1.47

Spark Version

1.10.97

Logs and Configs

No response

Extra Details

Here is a spark report from that player, in case you need PC/config specs:
https://spark.lucko.me/fPQnwEqJ2K

@pietro-lopes pietro-lopes added the bug Something isn't working label Sep 19, 2024
@SirYwell
Contributor

It seems like the "Reaching safepoint" time is pretty high every now and then. It might be related to GC (I'm also seeing allocation stalls, which might indicate that memory just isn't sufficient). Does it also happen with other GCs or with more memory assigned?

The way spark takes thread dumps without async-profiler requires threads to be at a safepoint, but safepoint bias is more about less precise measurements than performance overhead/lag spikes.
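
For illustration only, here is a minimal sketch of safepoint-based sampling, assuming a Java-level sampler that polls `ThreadMXBean.dumpAllThreads`. It is not spark's actual code, and the class and method names (`DumpSampler`, `sample`) are hypothetical. On HotSpot, each such call is carried out as a "ThreadDump" VM operation at a safepoint, so every sample has to wait until all threads have reached one.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical sampler, for illustration; not taken from spark.
class DumpSampler {
    // Takes `samples` thread dumps, `intervalMillis` apart, and returns the
    // total wall-clock time spent inside the dump calls, in nanoseconds.
    static long sample(int samples, long intervalMillis) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long spentNanos = 0;
        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();
            // Runs as a VM operation: all Java threads must reach a safepoint first.
            ThreadInfo[] infos = bean.dumpAllThreads(false, false);
            spentNanos += System.nanoTime() - start;
            if (infos.length == 0) {
                throw new IllegalStateException("no threads reported");
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                break;
            }
        }
        return spentNanos;
    }

    public static void main(String[] args) {
        long nanos = sample(50, 10);
        System.out.printf("50 dumps took %.2f ms in total%n", nanos / 1e6);
    }
}
```

Running this with safepoint logging enabled (`-Xlog:safepoint`) should show one "ThreadDump" entry per sample, which makes it easy to compare the "Reaching safepoint" times on an affected machine against an unaffected one.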

@pietro-lopes
Author

Another person
https://spark.lucko.me/gghY5nDptL (for spec references)

With spark (this time I didn't ask them to enable the GC debug option, only safepoint)
https://mclo.gs/NX3UTPO

(~21s of pause for ThreadDump alone, in an application running for 232s)

No spark
https://mclo.gs/gETRTlg
(now a total of ~2s of pause in an app running for ~236s)
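
Totals like the ones above can be reproduced from a safepoint log with a small script. The sketch below is one way to do it, assuming the log was written by JDK unified logging (`-Xlog:safepoint`) and that each entry has the shape `Safepoint "<operation>", ..., Total: <n> ns`; that line shape is an assumption and may differ between JDK versions. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for summarizing a safepoint log.
class SafepointSummary {
    // Assumed entry shape:
    // [12.345s][info][safepoint] Safepoint "ThreadDump", ..., Total: 1234567 ns
    private static final Pattern LINE =
        Pattern.compile("Safepoint \"([^\"]+)\".*Total:\\s*(\\d+) ns");

    // Sums the "Total" pause per safepoint operation name, in nanoseconds.
    // Lines that do not match the assumed shape are ignored.
    static Map<String, Long> totalNanosByOperation(List<String> lines) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.find()) {
                totals.merge(m.group(1), Long.parseLong(m.group(2)), Long::sum);
            }
        }
        return totals;
    }

    // Share of the run spent paused, as a percentage.
    static double pausePercent(double pausedSeconds, double uptimeSeconds) {
        return 100.0 * pausedSeconds / uptimeSeconds;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: SafepointSummary <log file>");
            return;
        }
        List<String> lines = Files.readAllLines(Path.of(args[0]));
        totalNanosByOperation(lines).forEach((op, ns) ->
            System.out.printf("%-30s %.3f s%n", op, ns / 1e9));
    }
}
```

Using the figures quoted above, `pausePercent(21, 232)` comes to roughly 9% of the run spent paused with spark, against under 1% for `pausePercent(2, 236)` without it.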

@pietro-lopes
Author

Another player just had the same issue and fixed it by disabling the background profiler.
We will ship that config with the background profiler disabled by default for now.


2 participants