-
Notifications
You must be signed in to change notification settings - Fork 740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Add support for -foffload-fp32-prec-div/sqrt options. #15836
base: sycl
Are you sure you want to change the base?
Conversation
to be applied to OpenMP too.
if (Args.getLastArg(options::OPT_fno_offload_fp32_prec_div)) | ||
CmdArgs.push_back("-fno-offload-fp32-prec-div"); | ||
else | ||
CmdArgs.push_back("-foffload-fp32-prec-div"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (Args.getLastArg(options::OPT_fno_offload_fp32_prec_div)) | |
CmdArgs.push_back("-fno-offload-fp32-prec-div"); | |
else | |
CmdArgs.push_back("-foffload-fp32-prec-div"); | |
if (!Args.hasFlag(option::OPT_foffload_fp32_prec_div, | |
option::OPT_fno_offload_fp32_prec_div, true)) | |
CmdArgs.push_back("-fno-offload-fp32-prec-div"); |
Since -foffload-fp32-prec-div
is default
if (Args.getLastArg(options::OPT_fno_offload_fp32_prec_sqrt)) | ||
CmdArgs.push_back("-fno-offload-fp32-prec-sqrt"); | ||
else | ||
CmdArgs.push_back("-foffload-fp32-prec-sqrt"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar comment to above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@premanandrao can you review this please?
function instead of adding a JobAction to handle it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has to withdraw my review as have 2 questions.
clang/lib/CodeGen/CGCall.cpp
Outdated
(FuncName == "sqrt" && !getLangOpts().OffloadFP32PrecSqrt && | ||
IsFloat32Type); | ||
bool isFP32FdivFunction = | ||
(FuncName == "fdiv" && !getLangOpts().OffloadFP32PrecDiv && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually though, that the request is done to replace fdiv instruction with the intrinsic, not fdiv function. Do we know if users actually use such function? I don't see any mentioning of it in SYCL or OpenCL specifications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gmlueck could you please comment on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intent of -foffload-fp32-prev-div
is to affect the native divide operation (i.e. /
). There is no SYCL function named fdiv
. Is there a standard C / C++ function with that name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK there is no standard function float FP division. There is std::div, but it works only on integers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no C/C++ fdiv function.
bool hasFPAccuracyFuncMap = hasAccuracyRequirement(FuncName); | ||
bool hasFPAccuracyVal = !getLangOpts().FPAccuracyVal.empty(); | ||
bool isFp32SqrtFunction = | ||
(FuncName == "sqrt" && !getLangOpts().OffloadFP32PrecSqrt && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we compare with un-mangled sqrt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FuncName is the output of FD->getName() which returns a simple identifier. https://github.com/intel/llvm/blob/sycl/clang/include/clang/AST/Decl.h#L280
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So clang/test/CodeGenSYCL/offload-fp32-div-sqrt.cpp will pass even with extern "C"
removed from sqrt
function declaration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the user has a function in their own namespace that happens to be named "sqrt"?
bool IsFp32PrecDivSqrtAllowed = JA.isDeviceOffloading(Action::OFK_SYCL) && | ||
!JA.isDeviceOffloading(Action::OFK_Cuda) && | ||
!JA.isOffloading(Action::OFK_HIP); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offlne, something like:
bool IsFp32PrecDivSqrtAllowed = JA.isDeviceOffloading(Action::OFK_SYCL) && | |
!JA.isDeviceOffloading(Action::OFK_Cuda) && | |
!JA.isOffloading(Action::OFK_HIP); | |
bool IsFp32PrecDivSqrtAllowed = JA.isDeviceOffloading(Action::OFK_SYCL) && | |
TC.getTriple().isSPIROrSPIRV(); |
instruction gets the precision set instead of the fdiv function.
@MrSidims and @gmlueck I have removed the restriction for |
In email thread I've replied, that I'm planning to take care of the precise option propagating to CUDA and HIP drivers. I can take a look what should be done for the implementation of non-precise intrinsics. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM assuming that the code doesn't affect C stdlib's div
function, see the comment above.
// ROUNDED-SQRT-PREC-DIV: call reassoc nnan ninf nsz arcp afn float @llvm.fpbuiltin.sqrt.f32(float {{.*}}) #[[ATTR_SQRT:[0-9]+]] | ||
// ROUNDED-DIV-PREC-SQRT: call reassoc nnan ninf nsz arcp afn spir_func nofpclass(nan inf) float @sqrt(float noundef nofpclass(nan inf) {{.*}}) | ||
// ROUNDED-DIV-ROUNDED-SQRT-FAST: call reassoc nnan ninf nsz arcp afn float @llvm.fpbuiltin.sqrt.f32(float {{.*}}) #[[ATTR_SQRT:[0-9]+]] | ||
// LOW-PREC-DIV: call float @llvm.fpbuiltin.sqrt.f32(float {{.*}}) #[[ATTR_SQRT_LOW:[0-9]+]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just want to check if I understand this correctly. In this case we pass: -fno-offload-fp32-prec-div -ffp-builtin-accuracy=high flags. And 1.0 ULP fpbuiltin-error attribute for llvm.fpbuiltin.sqrt.f32 was generated in response of ffp-builtin-accuracy=high flag and we don't expect here precise calculations as high != precise, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct.
clang/lib/CodeGen/CGCall.cpp
Outdated
StringRef FPAccuracyVal = llvm::fp::getAccuracyForFPBuiltin( | ||
ID, FuncType, convertFPAccuracy(getLangOpts().FPAccuracyVal)); | ||
StringRef FPAccuracyVal; | ||
if (!getLangOpts().OffloadFP32PrecDiv && Name == "div") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not that familiar with the code, so would like to double check - "div" is just an alias to "llvm.fpbuiltin.fdiv" or is it some function call @div(...)
. If it's an intrinsic, then why renaming from fdiv
to div
was required?
Basically I'm asking if these lines may affect C stdlib's div
function that have 'long integer' operands. If you say that this code won't affect it - that would be enough explanation for me :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes div
is just an alias. You can look at the code in ScalarExprEmitter::EmitDiv
at about line #3788 that the intrinsic's ID is set to llvm::Intrinsic::fpbuiltin_fdiv
. The call to CreateBuiltinCallWithAttr
is made with the hard code name div
.
In order to remove this confusion. I will name it back to fdiv
as it was. I thought that it would be clearer to rename it div
so that we know that this is the /
function instead of the fdiv
function that responds to these options.
The FE work has been approved. But this is an attempt to fix the LIT test |
Add support for options
-f[no]-offload-fp32-prec-div
and-f[no-]-offload-fp32-prec-sqrt
.These options are added to allow users to control whether
fdiv
andsqrt
operations in offload device code are required to return correctly rounded results. In order to communicate this to the device code, we need the front end to generate IR that reflects the choice.When the correctly rounded setting is used, we can just generate the
fdiv
instruction andllvm.sqrt
intrinsic, because these operations are required to be correctly rounded by default in LLVM IR.When the result is not required to be correctly rounded, the front end should generate a call to the
llvm.fpbuiltin.fdiv
orllvm.fpbuiltin.sqrt
intrinsic with thefpbuiltin-max-error
attribute set. For single precisionfdiv
, the setting should be2.5
. For single-precision sqrt, the setting should be3.0
.If the -ffp-accuracy option is used, we should issue warnings if the settings conflict with an explicitly set
-foffload-fp32-prec-div
or-foffload-fp32-prec-sqrt
option.