Data for different CPU architectures #2
Comments
Do the benchmarks not run directly on macOS? I amended the README to say that I don't really understand why pinning to a single core speeds up thread-brigade. I mean, sure, I can guess that cross-core traffic is too slow or whatever, but that's not the same as actually knowing what specifically is happening.
I ran the tests in a Linux VM to keep the environment consistent with the one described in the README. Running natively on macOS:
Looks like there are some scheduler policy differences between Linux and macOS leading to the difference.
Wow, they're about the same. And, if I may continue to impose, how about one-thread-brigade, to see how much time is due to the I/O alone?
macOS (M1):
Ubuntu (VM on M1):
...
Ubuntu in a VM is outperforming native macOS? Does seem like a weird result.
It seems that the results vary a lot on different CPU architectures.
Testing on an Ubuntu VM (kernel version 5.4.0-65-generic) running on Apple M1 with the thread-brigade and async-brigade tests:
So it's a 90% speedup, not a 30% one.
Pinning to a single CPU core brings the threaded version closer to async though:
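For anyone wanting to experiment with the pinning effect, here is a minimal sketch of restricting a process to one core on Linux. It uses Python's os.sched_setaffinity rather than whatever the benchmark runs actually used (most likely a command-line tool such as taskset, but that is an assumption). The helper name pin_to_single_core is hypothetical, and the call is not available on macOS, so the sketch returns None there.

```python
import os


def pin_to_single_core(core=None):
    """Restrict the current process to a single CPU core.

    Hypothetical helper for illustration only. Returns the resulting
    affinity set, or None on platforms without an affinity API
    (for example macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # no affinity API exposed by this platform

    allowed = os.sched_getaffinity(0)  # cores this process may run on now
    if core is None:
        core = min(allowed)  # assumption: any currently allowed core will do

    # pid 0 means "the calling process"; threads started afterwards
    # inherit the same affinity mask.
    os.sched_setaffinity(0, {core})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print("affinity after pinning:", pin_to_single_core())
```

With every thread confined to one core, a context switch never has to migrate work or wake another CPU, which is one plausible reason the threaded version moves closer to the async one. That explanation is a guess, not something measured in this thread.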