Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiprocessing failure on Windows Python 3 #10

Closed
seanthegeek opened this issue Apr 2, 2019 · 17 comments
Closed

Multiprocessing failure on Windows Python 3 #10

seanthegeek opened this issue Apr 2, 2019 · 17 comments

Comments

@seanthegeek
Copy link

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\sean.whalen\GitHub\checkdmarc\venv\lib\site-packages\multiprocess\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\sean\GitHub\checkdmarc\venv\lib\site-packages\multiprocess\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
  File "C:\Users\sean\GitHub\checkdmarc\venv\lib\site-packages\dill\_dill.py", line 305, in load
    obj = pik.load()
TypeError: __init__() missing 2 required positional arguments: 'max_len' and 'max_age_seconds'
@bitranox
Copy link
Owner

bitranox commented Apr 2, 2019

can You also give me the callers code or short example ?
is that really related to the timeout decorator ?

@seanthegeek
Copy link
Author

I'm trying to replace timeout_decorator here https://github.com/domainaware/checkdmarc/blob/master/checkdmarc.py#L1770

It seems to be failing here. https://github.com/domainaware/checkdmarc/blob/master/checkdmarc.py#L78

This looks more like a multiprocessing bug?

@bitranox
Copy link
Owner

bitranox commented Apr 2, 2019

just try to load dill on top of the file (without using the timeout decorator) - because loading dill registers another (better) pickler. It might that the pickler (dill) is the reason. Lets narrow it down a bit.

@bitranox
Copy link
Owner

bitranox commented Apr 2, 2019

looks like that can not be pickled :

DNS_CACHE = ExpiringDict(max_len=200000, max_age_seconds=1800)
TLS_CACHE = ExpiringDict(max_len=200000, max_age_seconds=1800)
STARTTLS_CACHE = ExpiringDict(max_len=200000, max_age_seconds=1800)

just put those definitions into an extra Module and refer in the main code to that objects :

cache_objects.py:

from expiringdict import ExpiringDict

DNS_CACHE = ExpiringDict(max_len=200000, max_age_seconds=1800)
TLS_CACHE = ExpiringDict(max_len=200000, max_age_seconds=1800)
STARTTLS_CACHE = ExpiringDict(max_len=200000, max_age_seconds=1800)

checkdmarc.py

from cache_objects import * 

...
...
...

then it can be probably pickled. Just a guess - I did not test this.
Try to import dill explicitly BEFORE importing Expiringdict
Try dill.dumps(), dill.loads() and dill.detect.badtypes(object_to_pickle)

let me know !

@bitranox
Copy link
Owner

bitranox commented Apr 2, 2019

this kind of errors can only occur under windows - i wonder, did it work with the original timeout_decorator under Windows ?

@seanthegeek
Copy link
Author

I just tried importing the cache objects from a separate module and importing dill first, and got the exact same exception chain.

the original timeout_decorator does not work on Windows at all at all, because it relies on UNIX signals, which lead me to look for alternatives .

@bitranox
Copy link
Owner

bitranox commented Apr 2, 2019

hang on, I look into it - but its not an issue of the wrapt_timeout_decorator itself.
It is probably because ExpiringDict can not be pickled.
Just let me fork expiringdict and test ...

@bitranox
Copy link
Owner

bitranox commented Apr 2, 2019

the original timeout_decorator does not work on Windows at all at all, because it relies on UNIX signals, > which lead me to look for alternatives .

well, it kind of does - You need to pass @timeout(5, use_signals=True) but is even more picky when it comes to which objects can be pickled.

@bitranox
Copy link
Owner

bitranox commented Apr 2, 2019

ok, got it -
without diving to deep into it, ExpiringDict can not be pickled.
But the solution is easy - just copy the ExpiringDict to an OrderedDict before passing it to the decorated function.
here the small Test I wrote :

from expiringdict import ExpiringDict
import dill

try:
    from collections import OrderedDict
except ImportError:
    # Python < 2.7
    from ordereddict import OrderedDict


def test_pickle_ordered_dict():
    """
    >>> # test to pickle ordered dict
    >>> dict_test=OrderedDict()
    >>> dict_test['test'] = 1
    >>> pickled_object = dill.dumps(dict_test)
    >>> original_object = dill.loads(pickled_object)
    >>> original_object['test']
    1
    """
    pass


def test_pickle_expiring_dict():
    """
    >>> # test to pickle ordered dict
    >>> dict_test=ExpiringDict(max_len=200*1E3, max_age_seconds=30*60)
    >>> dict_test['test'] = 1
    >>> pickled_object = dill.dumps(dict_test)
    >>> original_object = dill.loads(pickled_object)  # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE
    Traceback (most recent call last):
    ...
    TypeError: __init__() missing 2 required positional arguments: 'max_len' and 'max_age_seconds'

    """
    pass


def test_pickle_expiring_dict_pass():
    """
    >>> # test to pickle ordered dict
    >>> dict_test=ExpiringDict(max_len=200*1E3, max_age_seconds=30*60)
    >>> dict_test['test'] = 1
    >>> dict_test_pickable = OrderedDict(dict_test)
    >>> pickled_object = dill.dumps(dict_test_pickable)
    >>> original_object = dill.loads(pickled_object)  # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE
    >>> original_object['test']
    1

    """
    pass


def main():
    pass


if __name__ == "__main__":
    main()

I also notified the Author of ExpiringDict, see :
mailgun/expiringdict#35

@seanthegeek
Copy link
Author

seanthegeek commented Apr 2, 2019

Thanks. That fixed that problem, and unfortunately uncovered a new one, this time around SSLContext objects from ssl:

domainaware/checkdmarc@9115659

File "C:\Users\sean\GitHub\checkdmarc\venv\lib\site-packages\wrapt_timeout_decorator\wrapt_timeout_decorator.py", line 129, in wrapper
    return timeout_wrapper(*args, **kwargs)
  File "C:\Users\sean\GitHub\checkdmarc\venv\lib\site-packages\wrapt_timeout_decorator\wrap_function_multiprocess.py", line 46, in __call__
    return self.value
  File "C:\Users\sean\GitHub\checkdmarc\venv\lib\site-packages\wrapt_timeout_decorator\wrap_function_multiprocess.py", line 70, in value
    raise result
NameError: name 'create_default_context' is not defined

So then I tired to fix that by passing a SSLContext object before the decorated method, but it turns out SSLContext objects can't be pickled.

domainaware/checkdmarc@ff32a33

  File "C:\Users\sean\GitHub\checkdmarc\venv\lib\site-packages\dill\_dill.py", line 902, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "C:\Program Files\Python36\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Program Files\Python36\lib\pickle.py", line 847, in _batch_setitems
    save(v)
  File "C:\Program Files\Python36\lib\pickle.py", line 496, in save
    rv = reduce(self.proto)
TypeError: can't pickle SSLContext objects

Any suggestions on a workaround?

@bitranox
Copy link
Owner

bitranox commented Apr 3, 2019

in general, You should narrow down what really needs to be timed-out :

# instead of : 

@timout(5)
def do_everything():
    do_something_1()
    do_something_2()
    do_something_that_really_needs_timeout()
    do_something_4()

def do_something_1():
    ...
...

# You should do : 
def do_everything():
    do_something_1()
    do_something_2()
    do_something_that_really_needs_timeout()
    do_something_4()

@timout(5)
def do_something_that_really_needs_timeout(): 
    # needs to be pickable because of timeout decorator in windows
    ...

Under Linux it is no problem to generously spray the code with timeouts - under Windows You need to take care, because the decorated function needs to be pickable. So for multiplatform code, I tend to develop and test under Windows first - because it is the most limited system (no fork, no signals) - and after that, test under Linux - most likely it will run there anyway.
So You should narrow down with testing which part of Your code really needs timeouts.
Then You can concentrate on making that function pickable.

I will add a small conveniance function for You to show the cause of non-pickable objects or types when possible (dill does not cover all cases unfortunately)

I looked a bit at Your project and I did not find a test to provoke the timeout ?
When it occurs ?
I would suggest :

  • add codecov (You might look at my travis config)
  • add travis tests for windows (You might look at my travis config)
  • add some unittests for the timeout, check were it is really needed
  • add tests for 100% code coverage to make sure nothing is hiding
  • and then we can look to make that function / method pickable

@bitranox
Copy link
Owner

bitranox commented Apr 3, 2019

I extended the Class expiringdict for You - it can be pickled now with dill.
look at : mailgun/expiringdict#36
or
https://github.com/bitranox/expiringdict

so You can let Your code unchanged if You use that version.

it looks like the project is not taking care of - they almost never merge anything, probably I will lauch my own version, it somehow fits to the wrapt_timeout_decorator

@seanthegeek
Copy link
Author

seanthegeek commented Apr 3, 2019

Thanks. It's hard to have a test for this that reliably times out, because it's based on what the server at the other end of the TCP connection does. If a SMTP server does not exist, most will refuse the connection outright, but some will let the connection hang and time out. I know of a few servers that reliably do this, but I'd rather not have my tests depend on a third-party's misconfigured server.

That said, adding code coverage would be good. I'll work on that.

I've refactored my use of the timeout decorator to only apply to the actual SMTP connection, as you suggested, which eliminates the need to pickle ExpiringDict objects at all.

domainaware/checkdmarc@a9c7365

Unfortunately, dill cannot currently pickle SSLContext objects, so I've created an issue with that project.

uqfoundation/dill#308

It looks like that project is still pretty active?

@bitranox
Copy link
Owner

bitranox commented Apr 3, 2019

but some will let the connection hang and time out

then You should create some misbehaving SMTP Servers Yourself (probably more then one, with different misbehaving), it should not be that hard, in order to have an idependent full testsuite.

I also created a convenience function to be able to quickly test in dreampie or whatever You use:

https://github.com/bitranox/wrapt_timeout_decorator#convenience-function-to-detect-pickle-errors

I will release it shortly.

@seanthegeek
Copy link
Author

Will do at some point.

One last question for you, mostly out of curiosity: Why isn't this package on PyPI yet? It seems mature and stable, and I noticed that you have a placeholder badge for PyPI in the README.

@bitranox
Copy link
Owner

bitranox commented Apr 3, 2019

Will do at some point.

I consider not 100% covered as broken ;-)
I am always amazed about the stupid things I find in my own code, when I do TDD and get everything covered and tested - it also forces me to use a highly decoupled modular style ...

However - it is released now, You may update, the pickle analyzer should work.
It can not cover everything, but it might help !

One last question for you, mostly out of curiosity: Why isn't this package on PyPI yet? It seems mature > and stable, and I noticed that you have a placeholder badge for PyPI in the README.

I did not catch the time - I have a lot to code ... I might look into it in the future.
I derived it from the original timeout_decorator, so it was not TDD - but restructuring and restructuring quite a lot. I guess now it is mature enough to release it on PyPi.

Some small tests for the "setup.py test" are still missing - not a priority ATM, also I want to integrate WINE testing - not that it is important for that program.

I gave up on jython - if You know how to test, let me know.

Have fun !

@bitranox
Copy link
Owner

FYI the package is now released on PyPi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants