Enoki does not generate fma instruction for fmadd with Array<float, 1> and Clang #127

Open
robinchrist opened this issue May 22, 2022 · 0 comments


Working example here:
https://godbolt.org/z/rPbr91Gfb

Consider the following code:

#include <enoki/array.h>

template<class VecT>
void fma_foo(
    void* x,
    void* y,
    void* z,
    void* target
) {
    VecT res = enoki::fmadd(enoki::load<VecT>(x), enoki::load<VecT>(y), enoki::load<VecT>(z));

    enoki::store(target, res);
}

The following two instantiations work as expected:

//Correct
template
void fma_foo<float>(
    void* x,
    void* y,
    void* z,
    void* target
);

//Correct
template
void fma_foo<enoki::Array<float, 8>>(
    void* x,
    void* y,
    void* z,
    void* target
);

Both generate a vfmadd132.. instruction.

This one does not:

//Uh Oh
template
void fma_foo<enoki::Array<float, 1>>(
    void* x,
    void* y,
    void* z,
    void* target
);

Instead of a fused multiply-add, it generates a separate multiply and add:

vmulss  xmm0, xmm0, dword ptr [rsi]
vaddss  xmm0, xmm0, dword ptr [rdx]

This issue is present on all versions of Clang (9+), but does not seem to exist with GCC.
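
To help narrow down where the difference comes from, a reduced test case without Enoki might be useful. The sketch below is hypothetical and not taken from Enoki's source: it assumes (unverified) that the Array<float, 1> path ends up as a plain a * b + c expression rather than an explicit FMA call. The function names fma_explicit and fma_plain are made up for illustration. Comparing the assembly of the two functions under Clang and GCC with the same flags as the Godbolt example could show whether the behaviour depends on the compiler's floating-point contraction settings (for example -ffp-contract).

```cpp
#include <cmath>

// Hypothetical reduced test case, not part of Enoki.

// Explicit fused multiply-add: std::fma computes a * b + c with a single
// rounding, so the compiler may lower it to a hardware FMA instruction
// when the target supports one.
float fma_explicit(float a, float b, float c) {
    return std::fma(a, b, c);
}

// Plain expression: whether this is contracted into one FMA instruction
// is up to the compiler and its floating-point contraction settings.
float fma_plain(float a, float b, float c) {
    return a * b + c;
}
```

If fma_plain compiles to vmulss + vaddss on Clang while fma_explicit yields an FMA instruction, that would point at contraction rather than at Enoki's Array<float, 1> handling specifically.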

Can you fix this on your side or should I file a bug with Clang?
