Add support for compiling to SoA ABI for CUDA #1103
base: branch-24.03
Conversation
/ok to test
typings/llvmlite/ir/builder.pyi (outdated):

```python
        self,
        agg: Value,
        idx: Union[Iterable[int], int],
        name: Optional[str] = "",
```
Nit: Frustratingly, Optional[str] is not a reference to the fact that the parameter has a default, it's just an alias for Union[str, None]. So if the function doesn't actually accept None (which would typically be the case if it's set to "" if not provided), it would be more appropriate to type it simply as str.
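To illustrate the point above, here is a minimal sketch (the `build` function below is hypothetical, not part of this PR):

```python
from typing import Optional, Union, get_args

# Optional[str] is just sugar for Union[str, None]; it says nothing about
# whether the parameter has a default value.
assert Optional[str] == Union[str, None]
assert get_args(Optional[str]) == (str, type(None))

# If a function substitutes "" and never accepts None, plain `str` with a
# default expresses the intent more precisely:
def build(name: str = "") -> str:
    return name or "<anon>"

print(build())  # <anon>
```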
```diff
@@ -0,0 +1 @@
+from llvmlite import ir
```
I think these typings files should also get copyright headers?
cunumeric/numba_utils.py (outdated):

```python
    # values as SoA and some simplifications to keep it short
    if not device:
        raise NotImplementedError(
            "Only device functions can be compiled for " "the SoA ABI"
```
Suggested change:

```diff
-            "Only device functions can be compiled for " "the SoA ABI"
+            "Only device functions can be compiled for the SoA ABI"
```
@manopapad Many thanks for having a look over this - I'll resolve the issues you noted shortly. It looks like tests also failed because Numba is not a dependency of cuNumeric, but it will need to be (at least for …)
Can you try adding numba around here? https://github.com/nv-legate/cunumeric/blob/branch-24.01/conda/conda-build/meta.yaml#L148
Thanks, just giving that a try now. |
/ok to test
What are the Python dependencies being introduced here? Just a note that we may want to make some features optional if they significantly increase packaging complexity.
The dependency being added is Numba, which also depends on llvmlite. These are both available on PyPI as wheels and on conda-forge for Linux x86_64, ppc64le, AArch64, macOS x86_64 and arm64, and Windows on x86_64. In general I think projects adding Numba as a dependency don't tend to have issues with increased packaging complexity - are there any special cuNumeric packaging requirements I should think about / elaborate on here?
As @gmarkall noted, numba is widely available, so I'm not very worried about making it a hard dependency. That said, if there's pushback we can certainly make it optional (only necessary if the user wants to use …)
@bryevdv we've gotten a bit bogged down with mypy errors on this; would you have some bandwidth to help resolve them?
@manopapad These errors are because the stubs under … For example, the following change:

```diff
diff --git a/typings/numba/core/codegen.pyi b/typings/numba/core/codegen.pyi
index a4288cce..409c94c0 100644
--- a/typings/numba/core/codegen.pyi
+++ b/typings/numba/core/codegen.pyi
@@ -1,4 +1,7 @@
 class CodeLibrary:
     codegen: "Codegen"
+    @property
+    def name(self) -> str: ...
+
 class Codegen: ...
```

fixes this error:
Either the stubs need to be expanded to cover the usage in this file, or else this file needs to be added to the override exclusions in …
@gmarkall Maybe for now it would be best to add the numba modules that are causing issues to the …
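For reference, a per-module mypy override of the kind being discussed might look like this; the section location and module list below are assumptions, not the project's actual configuration:

```toml
# Hypothetical pyproject.toml excerpt -- module names are illustrative.
[[tool.mypy.overrides]]
module = ["numba.*", "llvmlite.*"]
ignore_missing_imports = true
```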
In the end I couldn't figure out how to make the overrides work, so I ended up fixing up the typing. I've merged branch-24.03 and got CI green, so I think this is now ready for consideration / feedback.
@manopapad This is fairly self-contained, so it should be straightforward to port to internal without much conflict, but at this point, should it just land in internal to begin with?
This adds a new function, `compile_ptx_soa()`, which works similarly to Numba's `compile_ptx()` function, but the compiled code writes the elements of tuple return types into SoA data structures passed in as pointers.

Example
The Python function:
compiled with the `compile_ptx_soa()` function, where the arguments are of `int32` type and the return type is `(int32, int32)`, compiles to PTX equivalent to the C function:
Or, as the actual PTX produced:
Returning a heterogeneous tuple is also possible. For example, where the return type is specified as a tuple of `(int32, float32)`, we get:

(Note the `st.u32` for the first return value vs. `st.f32` for the second.)
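As a rough illustration of the calling convention (plain Python standing in for the generated PTX; the function names here are hypothetical, not from this PR):

```python
# Illustrative sketch of the SoA ABI: instead of returning a tuple, each
# tuple element is written through a separate output pointer, giving one
# contiguous array per element (structure-of-arrays).

def add_sub(a, b):
    # A device-style function returning a heterogeneous result.
    return a + b, a - b

def add_sub_soa(out0, out1, idx, a, b):
    # SoA-style wrapper: each return element goes to its own output
    # array (standing in for the pointers the compiled code receives).
    r0, r1 = add_sub(a, b)
    out0[idx] = r0
    out1[idx] = r1

out0 = [0] * 4
out1 = [0] * 4
for i in range(4):
    add_sub_soa(out0, out1, i, i, 1)

print(out0)  # [1, 2, 3, 4]
print(out1)  # [-1, 0, 1, 2]
```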