Vectorisation sprint #654
Only works when kernel is a Loopy kernel.
…he tree vectorisation flag for our vectorisation anyways.
Don't vectorise if there are complex arguments. Check whether a vectorisation strategy is specified; otherwise don't vectorise.
I think there are going to be a few naming errors from my refactoring. I've labelled where they are.
…ross caller-callee is a bit involved and loopy can't deal with it yet.
Only one comment. Otherwise looks good AFAICT. You should definitely run the Firedrake test suite to make sure you haven't broken anything by accident.
N.B. This is currently set to use PYOP2_TIME as the configuration option. This is misleading and should be changed.
Connorjward/add nbytes
Don't add py-cpuinfo
Loopy now requires py3.8
Closing as #677 is a newer version of the same things.
Implements automatic cross-element vectorisation (this is the work from TJ and Kaushik).
See firedrakeproject/firedrake#2365 for firedrake CI runs. The corresponding loopy PR is inducer/loopy#557.
Big thanks to @kaushikcfd for working hard on an update of this so that we can get it merged.
The mechanism in PyOP2
CVectorExtensionsTarget
lp.VectorizeTag(lp.OpenMPSIMDTag()), where VectorizeTag indicates that we try to use vector extensions first, but if an instruction can't be vectorised we use the fallback OpenMPSIMDTag, which wraps the instruction in OpenMP SIMD pragmas
Kernels which cannot be vectorised
Single instructions which cannot be vectorised
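
To make the two code-generation paths described above concrete, here is a minimal hand-written C sketch of the same update in both styles: once with GCC/Clang vector extensions, and once as a scalar loop wrapped in an OpenMP SIMD pragma. This is not the code loopy emits. The names `VLEN`, `vec4d`, `axpy_vecext` and `axpy_ompsimd`, and the batch width of 4 doubles, are assumptions made for illustration only.

```c
/* Assumed batch width: 4 elements processed together (hypothetical value). */
#define VLEN 4

/* GCC/Clang vector extension type: VLEN doubles packed into one SIMD value. */
typedef double vec4d __attribute__((vector_size(VLEN * sizeof(double))));

/* Vector-extension path: a single statement updates all VLEN lanes.
 * The scalar `a` is broadcast across the lanes by the compiler. */
static void axpy_vecext(double a, const vec4d *x, vec4d *y)
{
    *y = a * (*x) + *y;
}

/* Fallback path: the same update written as a scalar loop over the lanes,
 * wrapped in an OpenMP SIMD pragma. The pragma is honoured when compiling
 * with -fopenmp-simd (or -fopenmp) and ignored otherwise, so the result is
 * the same either way. */
static void axpy_ompsimd(double a, const double *x, double *y)
{
    #pragma omp simd
    for (int i = 0; i < VLEN; ++i)
        y[i] = a * x[i] + y[i];
}
```

Both functions compute the same result; the difference is only in how the SIMD intent is communicated to the C compiler. The pragma form leaves the vectorisation decision to the compiler, which is why it can serve as a fallback for instructions that cannot be written directly in terms of vector types.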