Disable Python Garbage Collection #142

Merged (1 commit) Jun 18, 2021

Conversation

jakirkham (Collaborator)

There doesn't appear to be an environment variable to disable garbage collection, so this disables garbage collection in sitecustomize.py to ensure it is off at process startup. Note that we will need to reinstall this file on the machine running the benchmarks for it to take effect.
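
For reference, a minimal sketch of what such a sitecustomize.py could contain; the exact file in the PR may differ, this just mirrors the settings discussed later in the thread:

    # sitecustomize.py -- Python imports this module automatically at startup
    # if it is on sys.path, so GC is turned off before any user code runs.
    import gc

    gc.disable()         # turn off automatic garbage collection
    gc.set_threshold(0)  # threshold0=0 keeps automatic collection disabled
                         # even if something later calls gc.enable()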

@jrbourbeau (Contributor) left a comment

cc'ing @gjoseph92 for visibility

@jakirkham (Collaborator, Author)

@quasiben mentioned Gabe had turned off GC and this had a notable effect. Interested in seeing whether we can reproduce this in our run as well. Also curious what the new bottlenecks are with this change.

@jakirkham (Collaborator, Author)

Though more generally, if GC's effect is this pronounced, we may want to consider adding this as a config parameter in Distributed. Or perhaps take a more active role in controlling when GC happens in the Scheduler. Maybe you or Gabe have thoughts on this :)
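
Purely to illustrate the idea, a sketch of what such a knob might look like at scheduler startup; the config key below is hypothetical, not an existing Distributed option:

    # Hypothetical sketch: consult a Dask config flag at scheduler startup.
    # "distributed.scheduler.disable-gc" is an invented key, used only to
    # illustrate the suggestion above.
    import gc

    import dask

    if dask.config.get("distributed.scheduler.disable-gc", default=False):
        gc.disable()
        gc.set_threshold(0)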

@gjoseph92 (Contributor) commented May 14, 2021

@jakirkham I'll post results and my code soon, but this is what I was doing:

    def disable_gc():
        # https://github.com/benfred/py-spy/issues/389#issuecomment-833903190
        import gc

        gc.disable()
        gc.set_threshold(0)

    print("Disabling GC on scheduler")
    client.run_on_scheduler(disable_gc)

I think the gc.set_threshold(0) is important. If you look at benfred/py-spy#389 (comment), he links to a blog post from Instagram about disabling GC. The weirdly fortunate thing I happened to see in that post was:

With some debugging with GDB, we found that apparently one of the third-party libraries we used (msgpack) calls gc.enable() to bring it back, so gc.disable() at bootstrapping was washed.

That'll probably save us a couple days of confusion!

Though more generally, if GC's effect is this pronounced, we may want to consider adding this as a config parameter in Distributed.

I doubt that disabling GC is a solution per se. (The Instagram post is a good comparison, because it's very different from our case, in that they have lots of forked child processes and are trying to retain shared memory pages between children and the parents, and the pages would otherwise be dirtied by mutating GC pointers on PyObjects, causing the kernel to copy the pages. Whereas we're not forking or worried about shared pages here.) I haven't looked at memory stats at all, but I'd think that if GC is taking so long, it has a lot of garbage to collect, and if it doesn't collect it, memory usage might be a problem. I'm more interested in figuring out what's causing the reference cycles, and how we can avoid creating them in the first place.
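
As an aside on finding those cycles: one low-tech approach is to have the collector retain everything it would have freed and then tally the types involved, e.g. run on the scheduler via client.run_on_scheduler (a sketch, not part of this PR):

    # Sketch: ask the GC to retain collected objects in gc.garbage so the
    # types participating in reference cycles can be inspected.
    import gc
    from collections import Counter

    gc.set_debug(gc.DEBUG_SAVEALL)  # collected objects go to gc.garbage
    gc.collect()

    # Tally which types show up most often among the retained garbage
    print(Counter(type(obj).__name__ for obj in gc.garbage).most_common(10))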

The main reason I disabled GC initially is because it can lead to skewed profiling results: whatever function happens to be running when GC pauses the world will show up as having a hugely exaggerated runtime, and you might end up going off and trying to optimize some code that profiling accuses of being slow, when in fact the code itself wasn't the issue at all.

@jakirkham (Collaborator, Author)

Thanks for the update Gabe 😄

@jakirkham I'll post results and my code soon

Eager to see them! 😁

I think the gc.set_threshold(0) is important

Ah good point. Added 👍

a blog post from Instagram about disabling GC

Got it. Yep have read this post before. Wouldn't have guessed it would be relevant to our use case.

The weirdly fortunate thing I happened to see in that post was:

With some debugging with GDB, we found that apparently one of the third-party libraries we used (msgpack) calls gc.enable() to bring it back, so gc.disable() at bootstrapping was washed.

Interesting. Does this only happen at import? If so, maybe we just import msgpack in this file. Admittedly a hack, but the whole file is a hack 😛

I doubt that disabling GC is a solution per se.

Likewise, which is what I was hinting at near the end. Anyways it's good to know GC is part of the problem.

I haven't looked at memory stats at all, but I'd think that if GC is taking so long, it has a lot of garbage to collect, and if it doesn't collect it, memory usage might be a problem. I'm more interested in figuring out what's causing the reference cycles, and how we can avoid creating them in the first place.

Agreed this sounds important. Though I guess I'm not too surprised given all the different references to other objects held in the various *State and Task* classes. Am wondering to what extent this can be addressed.
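
One standard way to avoid such cycles, sketched here on toy classes rather than Distributed's actual *State and Task* types, is to hold back-references weakly:

    # Toy illustration: a weak back-reference does not form a reference
    # cycle, so these objects can be freed by reference counting alone.
    import weakref

    class TaskGroupLike:
        def __init__(self):
            self.tasks = []

        def add(self, task):
            task.group = weakref.ref(self)  # weak back-reference;
            self.tasks.append(task)         # dereference with task.group()

    class TaskLike:
        def __init__(self, key):
            self.key = key
            self.group = None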

The main reason I disabled GC initially is because it can lead to skewed profiling results: whatever function happens to be running when GC pauses the world will show up as having a hugely exaggerated runtime, and you might end up going off and trying to optimize some code that profiling accuses of being slow, when in fact the code itself wasn't the issue at all.

Am interested to see where the time is spent once it is disabled 🙂

@jakirkham (Collaborator, Author)

The weirdly fortunate thing I happened to see in that post was:

With some debugging with GDB, we found that apparently one of the third-party libraries we used (msgpack) calls gc.enable() to bring it back, so gc.disable() at bootstrapping was washed.

Interesting. Does this only happen at import? If so, maybe we just import msgpack in this file. Admittedly a hack, but the whole file is a hack 😛

It seems msgpack used to do this, but no longer does (msgpack/msgpack-python@235b928). Not seeing any more references to GC, but feel free to point them out if I'm missing them.
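
A quick throwaway check of that claim (assuming msgpack is installed):

    # Verify that importing msgpack no longer re-enables garbage collection.
    import gc

    gc.disable()
    import msgpack  # noqa: F401

    print("GC enabled after importing msgpack?", gc.isenabled())  # expect False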

@gjoseph92 (Contributor)

Nice find! I didn't even bother to look in msgpack or try without the set_threshold, just copied those settings and moved on.

@jakirkham (Collaborator, Author)

Somewhat tangential, but something that has been on the back of my mind is using freelists for allocation of Cython extension types like *State and Task* objects. This could cut down on the number of allocations/deallocations done and even reuse some memory. Though I don't have a clear sense atm of how big a freelist is useful without just hogging a bunch of memory from the outset.

Until now there also hasn't been much value in doing something like this (allocation/deallocation didn't appear to take that much time). It might be worth profiling memory usage by type to see if the extension objects we are creating for the graph are eating up a lot of space. If we find that to be true, we can pick some freelist size to collect them together. This would also cut down on any fragmentation we might be seeing (which statistically would just look like used memory, even though there is a subtle distinction!).
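
For concreteness, a rough sketch of Cython's freelist support in pure-Python mode, on a toy class rather than Distributed's actual types (requires Cython to be installed; the decorator only takes effect once the module is compiled):

    # Sketch of @cython.freelist: when compiled, up to 8 freed instances are
    # kept on a freelist and reused by later allocations instead of being
    # returned to the allocator.
    import cython

    @cython.freelist(8)
    @cython.cclass
    class TaskStateLike:
        key: object

        def __init__(self, key):
            self.key = key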

@jakirkham (Collaborator, Author)

Results can be seen in issue #144.

@jakirkham merged commit fb1783b into quasiben:main on Jun 18, 2021
@jakirkham deleted the disable_gc branch on June 18, 2021 at 17:19