GPU support and PortBLAS #799
Comments
DGEMM cannot and will not be performance portable unless your standards for performance are low. There are decades of empirical results confirming this.
I think it is possible with a pinch of autotuning. For example, see what Triton does for matrix multiplication: no hard-coding of important parameters, and a higher-level abstraction. Nothing in it is specific to Nvidia anymore. If your definition of performance portability is the same binary with no hardware-specific optimizations, I agree with you. But I guess my definition is not dissimilar to BLIS's own version of it: https://github.com/flame/blis/tree/master/config
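To make the autotuning idea above concrete, here is a minimal sketch in plain Python. It is neither Triton nor BLIS code; the names `blocked_matmul` and `autotune_tile` and the candidate tile sizes are hypothetical, chosen only for illustration. The kernel leaves the tile size as a parameter, and a small search times each candidate on the current machine and keeps the fastest.

```python
import time


def blocked_matmul(a, b, n, tile):
    """Multiply two n x n matrices (lists of lists) using square tiles."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                # Work on one tile; min() handles edges when tile does not divide n.
                for i in range(ii, min(ii + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + tile, n)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + tile, n)):
                            row_c[j] += a_ik * row_b[j]
    return c


def _time_once(a, b, n, tile):
    """Return the wall-clock time of a single blocked multiply."""
    start = time.perf_counter()
    blocked_matmul(a, b, n, tile)
    return time.perf_counter() - start


def autotune_tile(n, candidates, repeats=3):
    """Time every candidate tile size and return the fastest one."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    best_tile, best_time = None, float("inf")
    for tile in candidates:
        # Take the minimum over a few repeats to reduce timing noise.
        elapsed = min(_time_once(a, b, n, tile) for _ in range(repeats))
        if elapsed < best_time:
            best_tile, best_time = tile, elapsed
    return best_tile


if __name__ == "__main__":
    # The winning tile size depends on the machine, which is the point.
    print("best tile:", autotune_tile(32, [4, 8, 16, 32]))
```

A real autotuner would search more parameters than one tile size and would cache the result per device, but the structure is the same: the kernel source stays unchanged and only the tuned parameters differ between machines.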
We welcome your contributions of new configurations, to accompany the ones you've discovered in
Unfortunately BLIS doesn't compile to any sort of GPU :(
I see in your README that you're interested in GPU support. A library similar to yours, PortBLAS, promises performance portability across different GPUs: https://github.com/codeplaysoftware/portBLAS. It uses the SYCL standard and used to be called sycl-blas. I wonder whether some inspiration (or code) can be taken from it to support GPUs in BLIS.
I'm no expert in BLAS, but I'm happy to contribute with a bit of hand-holding.