-
Notifications
You must be signed in to change notification settings - Fork 149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
There are several functions in the FilteringFunctions that should not have DSP extended macros. #193
Comments
@JoleenSun I agree that the test with ARM_MATH_DSP is not useful here and could be removed (What is tested in fact here is that we want a function that may be unrolled or not). So the problem is : you'd like to have a function that is not impacted by ARM_MATH_LOOP_UNROLL. Because if I just disable the ARM_MATH_DSP cases when ARM_MATH_LOOP_UNROLL is defined, then it selects the C version of the function where the loop is never unrolled (whatever the value of ARM_MATH_LOOP_UNROLL). Unfortunately, the effect of ARM_MATH_LOOP_UNROLL and the manual unrolling is highly compiler dependent. Most of the times it improves the performances but there are some cases where it does not. And relying on the compiler to do the automatic unrolling has shown (in our tests) that most of the case it is even less predictable that doing the manual unrolling. So, we will always have cases where enabling ARM_MATH_LOOP_UNROLL may decrease the performance of a specific function on a specific version of a compiler. If it is a case where it always decrease the performance (whatever the compiler version) then the manually unrolled version should be removed. But it would require tests with several compiler versions to be really sure. |
@christophe0606 Thanks for your reply. I think I get what you mean. But I have another question: why can't we remove the "ARM_MATH_DSP "macro but keep "ARM_MATH_LOOP_UNROLL", so that the C version have both a version with loop_unrolling and a version without. In this case, maybe we only need to test the two C versions both without loop unrolling: based on the current fuction, a version is define the "ARM_MATH_DSP " but disable the "ARM_MATH_LOOP_UNROLL", the other is just disable the "ARM_MATH_DSP "(also disable RISCV_MATH_VECTOR). |
@JoleenSun I can remove the Now the problem is that `ARM_MATH_LOOP_UNROLL" applies to several functions. So if you unset it because it gives better performance for this function perhaps it will give worse performance for another function you use. One should have a better granularity and be able to set this for each function. But I have not yet found a developer friendly way to do it. |
Thanks! I just mean that the "ARM_MATH_DSP " maybe can be removed for functions with float-type(in which case "ARM_MATH_LOOP_UNROLL" also can be reserved), not desiring the granularity of "ARM_MATH_LOOP_UNROLL" needs to be small enough as can be set for each function separately. And all I said is just related to functions arm_conv_f32/arm_conv_partial_f32/arm_correlate_f32/arm_correlate_f16. |
@JoleenSun I think I understand the issue. But i think it is not just restricted to arm_conv_f32/arm_conv_partial_f32/arm_correlate_f32/arm_correlate_f16 since the But for the float function, I totally agree than Anyway, your feedback is taken into account and I have tagged this issue as enhancement. |
Haha, thank you so much. Actually, the point is that the “ARM_MATH_DSP ” but not "ARM_MATH_LOOP_UNROLL ", as long as the use of "ARM_MATH_LOOP_UNROLL "is not bound to the "ARM_MATH_DSP ", and can be set independently, the current adaptation and support are already perfect(personal opinion ^_^ ). |
P extension instruction does not work with floating point types.
But function arm_conv_f32/arm_conv_partial_f32/arm_correlate_f32/arm_correlate_f16 has "#if defined(ARM_MATH_DSP)" ,which impacts on the performance when testing without any extension but the "ARM_MATH_LOOPUNROLL" is defined.
The text was updated successfully, but these errors were encountered: