Use dill for serialization #121
Looking good 👍
One question I would like to validate is whether `dill` supports custom serialization via `__reduce__`, which would be needed to benefit from changes like encode/httpx#3108.
It's not clear from a quick search, and the documentation doesn't seem to mention it either, so it might be useful to add a test to validate this on our side.
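A minimal sketch of what such a test could look like. The `Client` class and the test name are hypothetical and exist only for illustration; the sketch assumes `dill` is installed and relies on `dill.dumps`/`dill.loads` mirroring the `pickle` API.

```python
import dill


class Client:
    """Hypothetical object that controls its own serialization."""

    reduce_calls = 0  # counts how often __reduce__ is invoked

    def __init__(self, base_url):
        self.base_url = base_url
        # Transient state that should be rebuilt rather than serialized.
        self.cache = {"warm": True}

    def __reduce__(self):
        Client.reduce_calls += 1
        # Reconstruct from the constructor arguments only.
        return (Client, (self.base_url,))


def test_dill_honors_reduce():
    original = Client("https://example.com")
    original.cache["extra"] = 1  # must not survive the round trip

    restored = dill.loads(dill.dumps(original))

    # The hook was consulted, and the object was rebuilt via __init__.
    assert Client.reduce_calls >= 1
    assert restored.base_url == original.base_url
    assert restored.cache == {"warm": True}
```

The assertions avoid `isinstance` on purpose: when a class lives in `__main__`, `dill` may serialize the class itself by value, so the restored instance is not guaranteed to share the original class object.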
I would say one advantage of using `dill` instead of `pickle` is that it's an open-source project separate from Python; should we ever need to make changes to it, we could do so without being tied to the serialization
Another question that came to mind: should we replace other uses of `pickle` with `dill` (for example)?
Let's hold off on merging this until we have a better idea of the capabilities and trade-offs of
Let's stick with
This is out of sync. I'll reboot if/when necessary.
This PR introduces dill for serialization of coroutine state, replacing pickle from the standard library.
From the dill README:
Dispatch supports serializing coroutines (including generators) and their frames, so that's a non-issue.
The fact that dill can serialize cell vars means that this PR fixes #117.
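To illustrate the cell var point, here is a small sketch using a plain closure rather than the coroutines from #117. `make_counter` is a made-up example function, and the sketch assumes `dill` is installed.

```python
import pickle

import dill


def make_counter(start):
    count = start  # captured by the inner function as a cell var

    def increment(step=1):
        nonlocal count
        count += step
        return count

    return increment


counter = make_counter(10)
counter()  # advances the captured count to 11

# The standard library pickles functions by reference, so a nested
# function (and the cells it closes over) is rejected.
try:
    pickle.dumps(counter)
    pickle_ok = True
except (pickle.PicklingError, AttributeError):
    pickle_ok = False

# dill serializes the function by value, including its closure cells.
restored = dill.loads(dill.dumps(counter))
next_value = restored()  # continues from the captured count
```

The restored function carries its own copy of the captured state, so calling it does not affect the original `counter`.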
One thing I like about dill is the built-in tracing. The `DISPATCH_TRACE` environment variable can be used to enable dill tracing. Below is an example trace when serializing the state of the functions from #117.
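How the environment variable is wired up is not shown in this description. As a rough sketch, assuming it simply toggles `dill.detect.trace`, it could look like the following. The helper name `enable_trace_from_env` and the treatment of `"0"` and the empty string as "off" are assumptions, not the project's actual behavior.

```python
import os

import dill
import dill.detect


def enable_trace_from_env(environ=os.environ):
    """Toggle dill's pickling trace based on DISPATCH_TRACE.

    Hypothetical helper: the accepted values are an assumption.
    """
    enabled = environ.get("DISPATCH_TRACE", "") not in ("", "0")
    # dill.detect.trace(True) makes dill log each object it visits
    # while pickling; trace(False) switches the logging off again.
    dill.detect.trace(enabled)
    return enabled


if enable_trace_from_env():
    # With tracing on, this dumps call logs a nested trace of the lambda.
    dill.dumps(lambda x: x + 1)
```

Running a script with `DISPATCH_TRACE=1` set would then emit the trace while serializing.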
Although the size of the outermost object is reported as `2 MB`, the serialized state in this case is ~2 KB, which is only slightly larger than the equivalent state when using `pickle`. It's not a fair comparison though; `pickle` cannot serialize cell vars, so I had to move the functions to the top level in order to compare state.

The library also provides tooling for inspecting state offline, which may come in handy in future.
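As a rough illustration of measuring and inspecting serialized state, the sketch below uses the standard library's `pickletools` rather than dill's own tooling. This is a swapped-in technique that works because dill emits a regular pickle stream. `make_adder` is a made-up example, and `dill` is assumed to be installed.

```python
import pickletools
from collections import Counter

import dill


def make_adder(n):
    def add(x):
        return x + n  # n is captured as a cell var

    return add


# Serialize once; the payload can be written to disk and examined later.
payload = dill.dumps(make_adder(3))

# Walk the pickle opcodes without executing the payload.
opcodes = Counter(op.name for op, arg, pos in pickletools.genops(payload))

print(f"{len(payload)} bytes, {sum(opcodes.values())} opcodes")
for name, count in opcodes.most_common(5):
    print(f"{name:>16} {count}")
```

Because `pickletools.genops` only parses the stream, this kind of inspection is safe to run on state without importing or executing the code it refers to.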