Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REF: Making pyproj completely threadsafe #782

Closed
snowman2 opened this issue Feb 26, 2021 · 8 comments
Closed

REF: Making pyproj completely threadsafe #782

snowman2 opened this issue Feb 26, 2021 · 8 comments

Comments

@snowman2
Copy link
Member

pyproj CRS/Proj/Transformer objects aren't meant to be shared across threads, but they can exist if created within the thread (ref). Sharing the same object across threads can cause issues (#589, corteva/geocube#41)

The only way to entirely solve this is to be able to have a single PROJ context per thread and to be able to dynamically switch to that context depending on the thread they are in.

Current concerns to overcome:

  1. How to be able to have a thread local context? The threading.local object is a possibility. However it may require transitioning from cython to python to cython types. This has caused stability issues in the past (PROJ 6.2+ required; remove global context; autoclose database #412), so need to be careful.
  2. How to dynamically swap contexts based on the thread? The PJ * objects have a context assigned at creation. Would need to use proj_assign_context each time you use the context object.
@snowman2
Copy link
Member Author

snowman2 commented Feb 27, 2021

This kills making the pyproj objects thread safe: OSGeo/PROJ#1047

Seems the PJ* is not thread safe. So, you would have to construct a PJ* per thread per object. It would simpler and more stable to just tell users of pyproj to use threading,local to construct a pyproj object per thread.

Since this is the case, pyproj really cannot do anything at this time.

@snowman2
Copy link
Member Author

Just had another thought, The CRS and Transformer classes have both a Python and a Cython version.

Following the logic in geopandas/geopandas#1846,

Instead of having the Python CRS class inherit from the Cython _CRS class, it could instead store the Cython class on a threading.local in the class.

class CRSLocal(threading.local):
    def __init__(self):
        self._crs = None  # Initialises in each thread

class CRS:
    self._local = CRSLocal()
...

Then it could use the srs attribute to create a _CRS in each thread.

@property
def _local_crs(self):
    if self._local._crs is None:
        self._local._crs = _CRS(self.srs)
    return self._local._crs

A similar concept could be applied to the Transformer since it already stores the Cython class as a property.

@jorisvandenbossche
Copy link
Contributor

Instead of having the Python CRS class inherit from the Cython _CRS class, it could instead store the Cython class on a threading.local in the class.

Indeed, that's what I tried to suggest in geopandas/geopandas#1842 (comment) ;)

It might be require quite some changes in pyproj to do this, but on the other hand, would avoid similar changes in every package using pyproj.

@snowman2
Copy link
Member Author

I guess that you were one step ahead of me. I was stuck on another design and missed that. Glad we're on the same page now 😂

@snowman2
Copy link
Member Author

@jorisvandenbossche since you had the idea first, you have first dibs on implementing this. Is this something you would like to take a stab at? (no worries if you would like to pass on that)

@jorisvandenbossche
Copy link
Contributor

"Would like" in theory yes ;)
But I am currently completely swamped with pandas work (at least still for another month), so on the short term I won't be able to do that (but certainly can help review!)

@snowman2
Copy link
Member Author

snowman2 commented Mar 3, 2021

Sounds good. PR reviews and testing will be much appreciated 👍

@snowman2
Copy link
Member Author

Still some classes that aren't: CoordinateOperation, Datum, etc.. but I think that there isn't really a huge need for that. Can revisit if it arises. Calling this done for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants