
Hot Reload Hang on MacOS #79

Closed
nick-thompson opened this issue Jul 20, 2020 · 22 comments · Fixed by #81

@nick-thompson
Collaborator

I'm finding that during a hot reload on macOS, Xcode just hangs in a permanent loop and the project freezes.

Dropping the debugger in shows that we're stuck in the Duktape finalizer callstack searching through the lambdaReleasePool, but it seems like there are a million things to be finalized. It looks like there indeed might be a bug with the lambdaReleasePool, because it's huge, but I'm not sure that's exactly what's hanging up Xcode here.

@JoshMarler have you seen this one?

@nick-thompson nick-thompson added the bug Something isn't working label Jul 20, 2020
@nick-thompson nick-thompson added this to the beta milestone Jul 20, 2020
@JoshMarler
Owner

JoshMarler commented Jul 20, 2020

Interesting, I have seen this previously when the "animate" prop was set on the canvas element. The clearing of the duk context and release pool on a reload appeared to resolve the issues I was seeing, but I'm not using animated draws. I'll be back at my desk shortly, so I'll fire this up on Mac and take a look.

@nick-thompson
Collaborator Author

Ahh right, you mentioned those fixes before. Yea, they sound like a good idea to me.

I just noticed that I can accelerate the lock-up by resizing the window a bunch in the gain plugin example. Just making a note of that here.

@JoshMarler
Owner

Yeah. I seem to remember it appeared to be mostly antagonised by the animate functionality on CanvasView, but I'll have to take another look. Does seem there might be some little lurking thing between the release pool and duktape. I think the exponential growth of the pool existed from the first Canvas prototype, but if I've introduced something mad, apologies!

I'm just on a separate thing at the moment, but I will for sure take a look at this today. I'll let you know if I get anywhere with it.

@nick-thompson
Collaborator Author

Yea, I think you're right. I don't think it was something you introduced, I think the finalizer stuff I wrote is probably buggy and not always removing lambdas correctly. If you have time I would definitely appreciate the look! I'll try to put some time into it later this week as well

@JoshMarler
Owner

JoshMarler commented Jul 21, 2020

Yeah no worries. I'll take a look today/tomorrow for sure. Heh, last time I worked it all through under gdb it made my brain melt ...

@nick-thompson
Collaborator Author

:D Yea, this might be a gnarly one. Thank you!

@JoshMarler
Owner

JoshMarler commented Jul 28, 2020

Heh. Writing a little level meter component at the minute and getting freezes and bad performance .... I have a sneaking suspicion this has just bitten me. Into the bug dive I go! (I might take a good hard look and profile over the Canvas component whilst I'm in there too).

@nick-thompson
Collaborator Author

😁 good luck! I haven't had a chance to dig in yet; hope it's not too gnarly

@JoshMarler
Owner

JoshMarler commented Jul 29, 2020

Looking through this at the moment. At a first pass, it almost looks like there is a significant lag in the call to the finalizers. From Duktape docs:

"Finalizers are executed for unreachable objects detected by reference counting or mark-and-sweep. The finalizer may not execute immediately, however, not even when reference counting detects that the object became unreachable."

I wonder if it literally is taking this long for the finalizers to be called after garbage collection. Hence our filling up of the pool, which does eventually get cleared, though clearing it blocks the UI/message thread.

So high-level thoughts are: can we increase the GC rate? Can we reduce the number of LambdaHelpers being created for repeated Canvas redraws? If this lag is unavoidable, should the release pool live on some background thread which can be incrementally cleared without blocking the message loop? Might be hard if we have to use a mutex to protect the pool ... we'll still block.

This comment is more me thinking aloud than anything at the moment..., will continue digging.

@JoshMarler
Owner

JoshMarler commented Jul 29, 2020

Based on a look under the debugger, it does look like the rate at which we push lambdas into the pool when painting on a timer is far higher than the rate of GC. We push enough in there for the loops over the pool to significantly block the UI thread. Effectively we're doing an O(N²) lookup, i.e. on garbage collect, for every lambda to be cleared we iterate the entire vector in the worst case.

A first experiment might be trying a different data structure for the pool to speed up the lookups. An unordered_set of pointers might be worth a try, perhaps preallocating some sensible amount of memory?
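Something like this is what I mean (just a sketch of the data-structure swap, not the actual Blueprint code; LambdaHelper here is a stand-in):

    #include <algorithm>
    #include <unordered_set>
    #include <vector>

    struct LambdaHelper {};  // stand-in for the helper that keeps a C++ lambda alive for duktape

    // Current shape: every finalizer callback scans the whole pool,
    // so clearing N lambdas in one GC pass is O(N^2) overall.
    std::vector<LambdaHelper*> vectorPool;

    void finalizeFromVector (LambdaHelper* h)
    {
        auto it = std::find (vectorPool.begin(), vectorPool.end(), h);

        if (it != vectorPool.end())
        {
            delete *it;
            vectorPool.erase (it);
        }
    }

    // Proposed shape: erase by key is amortised O(1), so the same GC pass is roughly O(N).
    std::unordered_set<LambdaHelper*> setPool;

    void finalizeFromSet (LambdaHelper* h)
    {
        if (setPool.erase (h) > 0)
            delete h;
    }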

Failing that maybe we need a different scheme for the canvas onDraw property function invocations.

@JoshMarler
Owner

JoshMarler commented Jul 29, 2020

So yeah, a quick hack switching to std::unordered_set<LambdaHelper*> resolves the freeze. Obviously this sort of approach would need some tidying. We could add a key to LambdaHelper using a static counter and use unordered_map with that key to avoid lifetime traps.
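A rough sketch of that static-counter idea (names are illustrative, not the current Blueprint API):

    #include <atomic>
    #include <cstdint>
    #include <memory>
    #include <unordered_map>

    struct LambdaHelper
    {
        LambdaHelper() : id (nextId++) {}

        const uint64_t id;  // unique key, so the finalizer never dereferences a stale pointer

    private:
        inline static std::atomic<uint64_t> nextId { 0 };
    };

    std::unordered_map<uint64_t, std::unique_ptr<LambdaHelper>> releasePool;

    uint64_t addToPool (std::unique_ptr<LambdaHelper> helper)
    {
        const auto key = helper->id;
        releasePool[key] = std::move (helper);
        return key;  // the key could be stashed on the duk function object for its finalizer
    }

    void finalizeByKey (uint64_t key)
    {
        releasePool.erase (key);  // O(1) average; destroys the helper via unique_ptr
    }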

I might think about this more though and work out if anything nicer can be done. That's a lot of small allocations and deallocations flying around.

Any thoughts on this one? I'm going to do some general profiling around canvas timer draws this eve so will let you know what hotspots I find.

@JoshMarler
Owner

JoshMarler commented Jul 29, 2020

Might also be interesting to manually call duk_gc after invoking the lambdas here.

svaarala/duktape#1550
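For reference, a minimal sketch of forcing a collection right after the draw invocation — duk_gc() is part of the public Duktape API, and the docs suggest calling it twice so that objects with finalizers actually get collected (invokeOnDrawAndCollect is just an illustrative wrapper, not an existing function):

    #include "duktape.h"

    void invokeOnDrawAndCollect (duk_context* ctx)
    {
        // ... assumes onDraw and its single argument have already been pushed onto the stack ...
        if (duk_pcall (ctx, 1) != DUK_EXEC_SUCCESS)
            duk_safe_to_string (ctx, -1);  // coerce the error so it could be logged

        duk_pop (ctx);  // discard the result (or the error)

        // The first pass may only queue finalizable objects; the second runs their finalizers.
        duk_gc (ctx, 0);
        duk_gc (ctx, 0);
    }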

I wonder if the "light function" concept is also worth some investigation, to avoid creating JS function objects in some cases which appear to require periodic garbage collection to destroy rather than being immediately released via ref counting.

"Reference counting relies on mark-and-sweep to handle reference cycles. For example, every ECMAScript function instance is required to be in a reference loop with an automatic prototype object created for the function. You can break this loop manually if you wish. For internal technical reasons, named function expressions are also in a reference loop; this loop cannot be broken from user code and only mark-and-sweep can collect such functions."

https://wiki.duktape.org/howtolightfuncs
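For context, a lightfunc binding for one canvas method might look something like this (just a sketch — the open question is that a lightfunc can only wrap a plain duk_c_function, not a capturing C++ lambda, which is exactly why it needs no heap allocation or finalizer):

    #include "duktape.h"

    static duk_ret_t canvasRect (duk_context* ctx)
    {
        // would read x, y, width, height from the value stack and draw
        return 0;
    }

    void bindRectAsLightFunc (duk_context* ctx, duk_idx_t contextObjIndex)
    {
        contextObjIndex = duk_normalize_index (ctx, contextObjIndex);

        // nargs/length of 4 for rect(x, y, width, height); magic unused here
        duk_push_c_lightfunc (ctx, canvasRect, 4, 4, 0);
        duk_put_prop_string (ctx, contextObjIndex, "rect");
    }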

I need to read into this all in more detail ....

@JoshMarler
Owner

JoshMarler commented Jul 29, 2020

Also interesting: svaarala/duktape#1872

"I'm not 100% sure yet, but I suspect our issue may be due at least in part to the sheer number of live objects we have at any one time. As you state, the more objects there are to mark, the more time it takes to run the algorithm. We can probably play tricks like hiding a manual GC during other expensive operations, but we're seeing GC hitches of as much as 200ms sometimes, which is really hard to hide in an application that needs to run at 30fps or higher."

Sounds very much like we might be seeing similar behaviour. I suspect GC with all these function objects being created is causing performance to drop off a cliff here. Having a bit of trouble getting anything useful out of perf at the moment. Will try Mac in a second and maybe use the gain plugin example, as there's less going on in there than in this app.

I might try disabling mark-and-sweep and manually invoking gc inside readVarFromDukStack just to see what happens to the performance.

@JoshMarler
Owner

JoshMarler commented Jul 30, 2020

I can drop the unordered_map fix over as it does fix the animate case in the GainPlugin example, but it's worth noting I'm basically dispatching events to JS from C++ for meter updates on a 50ms timer, which results in canvas redraws. This may just be asking too much of duktape/Blueprint, but I feel like this should be resolvable ...

In my particular context this hack seems to further improve UI responsiveness. To be expected with stateless components until this optimisation is made, I expect:

    void ViewManager::setViewProperty(ViewId viewId, const juce::Identifier& name, const juce::var& value)
    {
        const auto& [view, shadow] = getViewHandle(viewId);

        view->setProperty(name, value);

        // For now, we just assume that any new property update means we
        // need to redraw or lay out our tree again. This is an easy future
        // optimization.
        if (name.toString() != "meterLevel" && name.toString() != "value") {
            shadow->setProperty(name, value);
            performRootShadowTreeLayout();
            view->repaint();
        }
    }

@JoshMarler
Owner

JoshMarler commented Jul 30, 2020

I'm also wondering whether we can avoid pushing a new canvas context object to the duk stack for every single invocation and if we can instead store a canvas context object permanently in the global namespace or similar ...

I'd be interested to see what happens if this approach is possible. There are also a few other performance improvements to be had like the performRootShadowTreeLayout optimisation, potential removal of vector temporaries in things like EcmascriptEngine::invoke and possibly usage of "light functions".
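Thinking aloud, the lookup could be something like this (a sketch only — buildCanvasContextObject is a hypothetical helper standing in for wherever we currently build the context object and bind its methods):

    #include "duktape.h"

    // hypothetical: builds the context object (with fillRect, strokeText, etc.) and leaves it on the stack
    void buildCanvasContextObject (duk_context* ctx);

    void pushCachedCanvasContext (duk_context* ctx)
    {
        duk_push_global_stash (ctx);

        if (!duk_get_prop_string (ctx, -1, "canvasContext"))
        {
            duk_pop (ctx);                      // drop the undefined we just got
            buildCanvasContextObject (ctx);     // leaves the new object on the stack
            duk_dup (ctx, -1);                  // keep one copy to store
            duk_put_prop_string (ctx, -3, "canvasContext");
        }

        duk_remove (ctx, -2);                   // remove the stash, leaving only the context object
    }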

@JoshMarler
Owner

JoshMarler commented Jul 30, 2020

Sorry I'm spamming you a bit but I've been profiling so I'll register findings here.

- Rendering text via strokeText and fillText on the canvas object is currently a hotspot. Removing that made a fair amount of difference. I've got some text overlaid on this meter. I should probably have two components with one using a higher z-order; I'm not sure how z-order is intended to be controlled via component props at this stage? Is this still a thing to be implemented @nick-thompson ? Relative z orderings on views?

- The canvas drawImage function is also pretty costly, no great surprises there either.

- There are hotspots in various locations in duk_pcall inside EcmascriptEngine::readVarFromDukStack in CanvasView::paint. No great surprises there. Most time is spent in garbage collection functions, object property lookup (duk_hobject_getprop) and property setting (maybe from repeated calls to bindCanvasContextProperties? I have an idea for that).

Enabling these badboys (at least on 32-bit ARM) has also helped nicely (see the sketch below):

- DUK_USE_LIGHTFUNC_BUILTINS
- DUK_USE_VALSTACK_UNSAFE
- DUK_USE_ALLOW_UNDEFINED_BEHAVIOR
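I think these are plain on/off defines, so enabling them amounts to making sure the generated duk_config.h defines them (or passing the equivalent options through Duktape's config tooling), e.g.:

    /* in duk_config.h (or via Duktape's config tooling) */
    #define DUK_USE_LIGHTFUNC_BUILTINS
    #define DUK_USE_VALSTACK_UNSAFE
    #define DUK_USE_ALLOW_UNDEFINED_BEHAVIOR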

@JoshMarler
Owner

JoshMarler commented Jul 30, 2020

OK ... I'm clearly also redrawing way too many things when I should just be drawing my meter ... some prop/state lifecycle thing is happening that I've yet to catch/grasp.

EDIT: Yep. Removed an unrelated component which draws an image, to test. Performance back up ... Silly me. Well, I've got various performance fixes to send over to you off the back of this regardless (and probably some useful tips for the docs). This has been most enlightening.

So, summary of points after all this spew @nick-thompson.

  1. Use unordered_set for the lambdaReleasePool rather than a vector.
  2. We should avoid computing layout on the entire shadow tree in response to every property change.
  3. There may be some refinements to make in some of the canvas implementation functions.
  4. I think there is a way to avoid calling bindCanvasContextProperties for every canvas draw function; this is a hotspot.
  5. I still think removing the use of std::vector<juce::var> temporaries in EcmascriptEngine might be worth consideration;
    the approach using std::array with sizeof... on the template parameter pack, combined with juce::var::NativeFunctionArgs,
    might be useful (see the sketch below).
  6. The duk config options I mentioned are worth consideration (even if just in the blueprint docs); they gave me some pretty decent performance boosts.
  7. Should we look into whether we're able to apply "light functions"? It looks like they could result in a significant performance increase. GC is the main hotspot with our canvas redraws, so it's worth seeing whether we can refine its behaviour somehow.
  8. I need to learn React.

After changing some of these things, performance drawing a very busy meter on an embedded device is pretty darn good, so I think there are strong hopes in regard to #28.
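For point 5, the shape I have in mind is roughly this (illustrative only — invokeWithStackArgs is not an existing Blueprint function):

    #include <array>
    #include <juce_core/juce_core.h>

    template <typename... Args>
    juce::var invokeWithStackArgs (juce::var::NativeFunction fn, Args&&... args)
    {
        // sizeof...(Args) is a compile-time constant, so no heap-allocating vector temporary
        std::array<juce::var, sizeof...(Args)> argArray { juce::var (std::forward<Args> (args))... };

        juce::var::NativeFunctionArgs callArgs (juce::var(),
                                                argArray.data(),
                                                static_cast<int> (argArray.size()));
        return fn (callArgs);
    }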

@nick-thompson
Collaborator Author

nick-thompson commented Jul 30, 2020

@JoshMarler you're the man, this is so awesome. I'll try to address everything in a few high level thoughts, to start:

  1. Yep, I'm down for unordered_set, that's a good call
  2. Yes indeed! Optimize layout computation #5
  3. I'd be curious to see a before/after profile of how much we get from that. If it's really significant, I'm interested
  4. I didn't know about these, but I'm very interested to have them in the docs or if we can find a way to integrate duktape better (Enable Custom Duktape Integrations #62), enabling these things by default
  5. Definitely when looking into this, make sure you're using shouldComponentUpdate or the new hooks API to ensure components don't update when they don't need to. That's always a huge performance help.

High level I'm thinking basically that there are two types of functions that we push to javascript: persistent functions via registerNativeMethod, and the more transient functions like rect() on the CanvasContext that we prepare and push onto the stack for invoking onDraw in the CanvasView. In the first case, we obviously need to hold onto that lambda to keep the function implementation alive. I think that's a strong use case for the duktape finalizers. The latter case, I feel like we could probably find a way to push the CanvasContext onto the stack, call onDraw, and then immediately kill the lambdas that we had registered for that call. After onDraw that CanvasContext won't be on the duk stack anymore, and it's almost certainly the case that the last reference has been dropped after that call completes. We could manually invoke the GC right there, but I think that's overkill. I'm wondering if there's a way to distinguish that we're taking the "transient" path here, avoid ever pushing these functions into the lambdaReleasePool, and allow their destruction when the current C++ stack frame unwinds.
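Something like this is what I'm picturing (very rough sketch, not the current EcmascriptEngine code — the per-call method registration is elided):

    #include <functional>
    #include <memory>
    #include <vector>
    #include "duktape.h"

    // Stand-in for the helper that keeps a C++ lambda alive behind a duk native call.
    struct LambdaHelper { std::function<void()> fn; };

    void callOnDrawTransiently (duk_context* ctx, duk_idx_t onDrawIdx)
    {
        // Helpers for rect(), fillRect(), etc. get registered against this local
        // container instead of the engine-wide lambdaReleasePool.
        std::vector<std::unique_ptr<LambdaHelper>> transientHelpers;

        duk_dup (ctx, onDrawIdx);   // the user's onDraw function
        duk_push_object (ctx);      // the CanvasContext argument (method binding elided)

        if (duk_pcall (ctx, 1) != DUK_EXEC_SUCCESS)
            duk_safe_to_string (ctx, -1);  // error message left on the stack for logging

        duk_pop (ctx);  // pop the result (or the error)

        // transientHelpers goes out of scope here, destroying the helpers immediately --
        // no finalizer or GC pass needed for this per-frame garbage.
    }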

I need to think on this for a bit, but I feel like there's likely an elegant solution there...

Thanks a million for the detail here @JoshMarler, I'll write back with more thoughts soon

@JoshMarler
Owner

JoshMarler commented Jul 30, 2020

@nick-thompson,

No problem! It's been massively valuable to dig into this. I found the causes for my wasted reconciles. Indeed, I reckon the PureComponent idea might be worth spattering around my code base to assist when others make similar mistakes etc.

The idea around CanvasContext is interesting, I totally agree. I wondered about the whole "light function" thing and whether that is something we could somehow leverage, but I'm still not sure yet. I did briefly experiment with having a single CanvasContext registered as a global property; it sort of worked, with much odd behaviour. Basically ReactApplicationRoot owned the instance and passed it into the CanvasView's constructor. I don't think that's a particularly good idea though.

I'll muse over the "transient" function objects idea too once I've gotten other tasks out of the way ... that does feel like the right approach, maybe even coupled with the "light-function" facility. If we can distinguish these "transient" function objects, it feels like light-functions might be a very good idea; I suspect they could improve performance for this sort of thing dramatically (if there is indeed no associated GC etc.).

If we could work out some way for things like canvas onDraw calls running on timers to put less stress on the GC, I think that would be a HUGE win. The performance is actually looking really good around all of this now.

I'll get the unordered_set fix over to you and deal with #5 as a start.

@JoshMarler
Owner

@nick-thompson, just in case you missed it: any thoughts on the z-order thing? I've managed to get position: absolute working for some use cases, but it'd be cool to sort that little issue out.

@JoshMarler
Owner

JoshMarler commented Jul 31, 2020

@nick-thompson, PR with fix for the freeze here: #81

Will follow up with a PR for that ViewManager::setViewProperty improvement shortly. I suspect I can probably handle #4 and #5 in the same PR if that's cool with you? Having constant identifiers for the various yoga/layout properties will probably make #5 easier.

@nick-thompson
Collaborator Author

Ahh yea sorry, I skipped right over the z-order question. So z-order is already sort of implemented based on literally the order of react children:

    <View>
      <View />
      <View />
      <Modal /> // Move this to the top; the first child of the parent `View` will have the top-most z order
    </View>

Implementing a proper z-index property is on the TODO list, and should be quite straightforward: by default we continue with the current z-order behavior (based on the order of the children in the parent), but allow a z-index property (https://developer.mozilla.org/en-US/docs/Web/CSS/z-index) whose behavior is implemented by walking the parent's list of children and moving the child appropriately using https://docs.juce.com/master/classComponent.html#a1b0cca8b9a9fea673aefc84a42602bcd.
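In JUCE terms that walk could look roughly like this (a sketch — getZIndexFor is a hypothetical accessor for the view's z-index prop):

    #include <algorithm>
    #include <vector>
    #include <juce_gui_basics/juce_gui_basics.h>

    int getZIndexFor (juce::Component* child);  // hypothetical: read the view's z-index prop

    void applyZOrder (juce::Component& parent)
    {
        std::vector<juce::Component*> children;

        for (int i = 0; i < parent.getNumChildComponents(); ++i)
            children.push_back (parent.getChildComponent (i));

        // stable_sort keeps the existing child order for equal z-index values
        std::stable_sort (children.begin(), children.end(),
                          [] (auto* a, auto* b) { return getZIndexFor (a) < getZIndexFor (b); });

        // Raise each child in turn; the last one brought to the front ends up top-most.
        for (auto* child : children)
            child->toFront (false);
    }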

Definitely happy to have you tackle #5 and #4. Same PR is cool with me. For #5 I want to dig into the React HostConfig a little more to see if there really is no "commit" callback, as if to say "we've now applied all property updates, commit the update transaction." That certainly seems like the ideal time to do the layout. If not, I think we should use something like https://docs.juce.com/master/classAsyncUpdater.html to coalesce the updates. For example, if I set width: 100, we enqueue the layout callback via triggerAsyncUpdate. But then maybe another property change for height: 100 comes through and we call that same triggerAsyncUpdate, and if this is all synchronous then the previous pending update will still be sitting there and we get automatic coalescence. Mixing this into the updates from #4, and only triggering the async update if the set of changed properties involves layout-related properties, seems like the big winner.
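The coalescing piece would be pretty small — something like this (sketch only; performRootShadowTreeLayout here is just a placeholder hook for the real layout pass):

    #include <juce_events/juce_events.h>

    class LayoutCoalescer : private juce::AsyncUpdater
    {
    public:
        // Called from setViewProperty whenever a layout-affecting prop changes.
        void layoutPropertyChanged()
        {
            triggerAsyncUpdate();  // repeated synchronous calls collapse into one callback
        }

    private:
        void handleAsyncUpdate() override
        {
            performRootShadowTreeLayout();
        }

        void performRootShadowTreeLayout() {}  // placeholder for the sketch
    };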

I'll take a look at the PR now, thanks so much for digging into it. I want to spend some time as well playing with Duktape, lightfuncs, and exploring this GC thing and "transient" function stack objects. I'll write back if I find anything

nick-thompson pushed a commit that referenced this issue Jul 31, 2020
 This fix resolves #79. The freeze was caused by an O(N^2) lookup
 on the LambdaReleasePool vector when GC kicked in and triggered our
 LambdaReleasePool finalizers.

 Note this also bumps juce from 5.4.1 to 5.4.2 as std::hash
 specialisation for juce::Uuid was added to the Uuid header in 5.4.2.