Skip to content
This repository has been archived by the owner on May 5, 2022. It is now read-only.

Statuser causes main-thread jank itself #32

Open
vdjeric opened this issue Feb 2, 2016 · 3 comments
Open

Statuser causes main-thread jank itself #32

vdjeric opened this issue Feb 2, 2016 · 3 comments
Labels

Comments

@vdjeric
Copy link

vdjeric commented Feb 2, 2016

Statuser reported a few BHR hangs whose stack was entirely composed of an XRE_Main frame. I got curious and captured a profile -- if the BHR uptime timestamps are to be believed, the hang was caused by numGeckoThreadHangs() in Statuser itself.

Can this function be split up and made lighter?

@chutten
Copy link
Owner

chutten commented Feb 2, 2016

Aside from the Services.telemetry.threadHangStats getter, the whole thing could be thrown into a Worker. However, the amount of processing we're doing is really low. If the runtime is that high, the number of hanging threads is high so find() is taking too long, the number of counts buckets is high so the foreach is taking too long, or some combination of both. Either way, it's the size of the input that's causing the slowdown, so it'll be expensive to send that much input to a worker.

IOW, I'm not sure that off-threading it would help.

Breaking it into pieces we could do. Throw a setTimeout(1) in the middle (after the refactor to Promises, no problem) and that might help.

We could refactor the counts.forEach to run backwards. We know we're looking for all contents of all buckets above a certain value, so we can early return going that way.

@Uberi , do you have any ideas to add?

@chutten chutten added the bug label Feb 2, 2016
@avih
Copy link

avih commented Feb 2, 2016

well.. we could assess this by measuring how long this call takes (from within the addon itself) and if it's more than, say, 2ms or so, do a console.log("numGeckoThreadHangs() took " + duration + " ms");

@Uberi
Copy link
Collaborator

Uberi commented Feb 2, 2016

  • I agree that sending all that data to a worker would itself probably be slower than it is now. While the reverse forEach trick would have a small improvement, I'm not sure it would be noticeably faster.
  • Splitting it up with a setTimeout seems fine to me to work around this hang. Seems kind of messy though.
  • Is this using the latest child-hangs branch of the addon? It's got a lot of changes that might cause a different outcome. Here's an XPI for testing (unsigned, built from latest child-hangs branch):
    @statuser-0.1.4.xpi.zip
  • If so, does the profiler say that the hang is caused by the numGeckoHangs function itself, or one of the anonymous functions inside it (I couldn't reproduce the issue)? We don't seem to be doing any significant computation in the main body...

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants