
Support multithreading #496

Merged · 6 commits into pydata:master · Sep 13, 2024

Conversation

@emmaai (Contributor) commented Aug 26, 2024

As per the title (ref: #494), the change:

  • replaces _numexpr_last with a dictionary-like object that is aware of the calling context (a sketch of the idea follows this list)

Reasoning:

  • I like the re_evaluate feature and want it to stay, and to be thread-safe as well.
  • It opens a pathway to async evaluate/re_evaluate, which further caters to my specific use case.
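This is not the PR's code, just a minimal sketch of the idea: per-thread "last evaluation" state instead of a single global, assuming thread identity is the context (the ContextDict name is illustrative):

```python
import threading

class ContextDict:
    """A dict-like object whose contents are local to the current thread,
    so each thread sees its own 'last evaluation' state."""

    def __init__(self):
        self._local = threading.local()

    def _store(self):
        # Lazily create a plain dict for the calling thread.
        if not hasattr(self._local, "data"):
            self._local.data = {}
        return self._local.data

    def __getitem__(self, key):
        return self._store()[key]

    def __setitem__(self, key, value):
        self._store()[key] = value

    def get(self, key, default=None):
        return self._store().get(key, default)

    def clear(self):
        self._store().clear()
```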

Benchmark case:
It's based on my reality: most of the time I have a large amount of data of which only chunks fit into memory, and most of my CPUs are idle while I/O (especially input) happens. If the data fits into memory and the CPUs are fully utilized, I guess there isn't much difference between threading over chunks with numpy and using numexpr. Certainly I could implement what numexpr achieves by further chunking each chunk, but why? Specifically, I use dask threading to schedule the tasks; all I have to do is pack ne.evaluate nicely into a "task", provided thread safety is taken care of (see the sketch below).
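To make the intended pattern concrete, here is a hedged sketch (not from the PR; load_chunk and the expression are made up) of driving ne.evaluate from dask's threaded scheduler:

```python
import dask
import numexpr as ne
import numpy as np

def load_chunk(i):
    # Stand-in for a real I/O-bound read (e.g. from HDF5/zarr).
    rng = np.random.default_rng(i)
    return rng.random(1_000_000)

@dask.delayed
def task(i):
    a = load_chunk(i)
    b = load_chunk(i + 1)
    # Each dask worker thread runs its own evaluate(); with the
    # context-aware cache, concurrent calls no longer trample
    # each other's state.
    return ne.evaluate("a * b + 2.5 * a",
                       local_dict={"a": a, "b": b}).sum()

results = dask.compute(*[task(i) for i in range(8)],
                       scheduler="threads", num_workers=4)
```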

@FrancescAlted (Contributor)

Cool. Can you add a benchmark example (you can put it in bench/) where the new feature is exercised and where we can see the advantage of the new method? It would be nice if you could add the results from your machine as a docstring at the beginning of the script; alternatively, you could use a Jupyter notebook. Thanks!

@gdementen (Contributor)

I am sorry if my comment is dumb, and since I am not the maintainer of numexpr my opinion is not worth much, but I have the impression your benchmark compares apples to oranges: numpy with 2 threads vs numexpr with 32 (?) threads (2 explicit threads x 16 builtin threads). How does numexpr's builtin threading interact with manual threading, anyway? I would also be interested in a benchmark against "normal"/builtin numexpr threading, which I think is more interesting than one against numpy (see the sketch below for the comparison I mean). Unless there is something I don't understand (very likely), I don't expect much difference.
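A sketch of such an apples-to-apples setup, assuming ne.set_num_threads is used to pin numexpr's builtin pool (array sizes, the expression, and the 2-thread split are illustrative):

```python
import threading
import numexpr as ne
import numpy as np

a = np.random.random(8_000_000)
b = np.random.random(8_000_000)
out = np.empty_like(a)

# Baseline: builtin parallelism only (a single call using the pool).
ne.set_num_threads(16)
r_builtin = ne.evaluate("a * b + 2.5 * a")

# Explicit threading only: pin the builtin pool to one thread and let
# two worker threads each evaluate half of the data.
ne.set_num_threads(1)

def run(sl):
    out[sl] = ne.evaluate("x * y + 2.5 * x",
                          local_dict={"x": a[sl], "y": b[sl]})

half = len(a) // 2
ts = [threading.Thread(target=run, args=(slice(0, half),)),
      threading.Thread(target=run, args=(slice(half, None),))]
for t in ts:
    t.start()
for t in ts:
    t.join()
assert np.allclose(out, r_builtin)
```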

@emmaai (Contributor, Author) commented Aug 27, 2024 via email

@FrancescAlted (Contributor)

Thanks @emmaai for your example. It was more for me (and others!) to understand the way you want to use this feature. Now it is much clearer, and it sounds good to me. The only thing I'd ask is to add a new test exercising the new feature; tests are the way we ensure we don't introduce regressions in the future.

@emmaai (Contributor, Author) commented Aug 28, 2024

I added a test that verifies thread safety by always manifesting the race condition (a sketch of the idea follows).
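The PR's actual test forces the interleaving deterministically; the following is only a hedged sketch of the idea, with hypothetical expressions, under the assumption that evaluate/re_evaluate share per-call state:

```python
import threading
import numexpr as ne
import numpy as np

def worker(expr, func, failures):
    a = np.arange(100.0)
    for _ in range(1000):
        ne.evaluate(expr, local_dict={"a": a})
        # re_evaluate re-runs "the last" expression; with a single
        # global cache it can pick up the *other* thread's expression.
        out = ne.re_evaluate(local_dict={"a": a})
        if not np.array_equal(out, func(a)):
            failures.append(expr)
            return

failures = []
threads = [
    threading.Thread(target=worker, args=("a + 1", lambda a: a + 1, failures)),
    threading.Thread(target=worker, args=("a * 2", lambda a: a * 2, failures)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert not failures, f"race condition detected in {failures}"
```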

@gdementen (Contributor)

Thanks @emmaai for the added explanation.

@emmaai (Contributor, Author) commented Sep 12, 2024

I'm following up on this PR, wondering if there is any concern preventing it from being merged?

@FrancescAlted (Contributor)

I've just activated the tests in CI, and Mac OSX is reporting a failure. Can you address it?

@emmaai (Contributor, Author) commented Sep 12, 2024

Sorry, I forgot to commit the change in another file when pushing the test. It should pass now.

@FrancescAlted merged commit a99412e into pydata:master on Sep 13, 2024
4 of 5 checks passed
@FrancescAlted (Contributor)

Thanks @emmaai!
