Simplification of vector handling: importance of helper functions? #159

Open
hikari-no-yume opened this issue Apr 4, 2022 · 0 comments
I've been thinking about ways the C backend could be simplified. I think a lot of complexity comes from trying to handle so many details of the translation in a single pass. By splitting things into multiple passes (operating on an IR, probably LLVM IR), maybe it could be easier to work with.

Something that could be moved to a pass is the handling of vector operations. The LLVM Scalarizer pass can lower most vector operations to simple scalar operations for us, meaning we can remove the handling for vector addition, multiplication etc, leaving just things like generating structs for them, converting GEPs, and a few other things like that.

It's a pretty simple change to use the scalariser:

--- a/lib/Target/CBackend/CTargetMachine.cpp
+++ b/lib/Target/CBackend/CTargetMachine.cpp
@@ -19,6 +19,8 @@
 #include "llvm/Transforms/Utils.h"
 #endif
 
+#include "llvm/Transforms/Scalar/Scalarizer.h"
+
 namespace llvm {
 
 bool CTargetMachine::addPassesToEmitFile(PassManagerBase &PM,
@@ -53,6 +55,8 @@ bool CTargetMachine::addPassesToEmitFile(PassManagerBase &PM,
   // Lower atomic operations to libcalls
   PM.add(createAtomicExpandPass());
 
+  PM.add(createScalarizerPass());
+
   PM.add(new llvm_cbe::CWriter(Out));
   return false;
 }

The main difference in the generated C code is essentially that what would otherwise be the body of a helper function like llvm_fmul_f32x4 instead gets inlined at the call-site.
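To make that difference concrete, here is a rough sketch of the two shapes of output. It is not actual backend output: the `struct f32x4` layout and the `mul_scalarised` name are assumptions made up for illustration, and only the helper name `llvm_fmul_f32x4` comes from the text above.

```c
/* Hypothetical stand-in for an LLVM <4 x float>; the struct the backend
   really emits may be named and laid out differently. */
struct f32x4 {
    float v[4];
};

/* Helper-function style: the vector multiply is a single call, and the
   per-lane work lives inside the helper. */
static struct f32x4 llvm_fmul_f32x4(struct f32x4 a, struct f32x4 b) {
    struct f32x4 r;
    for (int i = 0; i < 4; i++) {
        r.v[i] = a.v[i] * b.v[i];
    }
    return r;
}

/* Scalarised style (hypothetical caller): after the Scalarizer pass there is
   no vector fmul left, so the four scalar multiplies appear directly at what
   would have been the call site. */
static struct f32x4 mul_scalarised(struct f32x4 a, struct f32x4 b) {
    float r0 = a.v[0] * b.v[0];
    float r1 = a.v[1] * b.v[1];
    float r2 = a.v[2] * b.v[2];
    float r3 = a.v[3] * b.v[3];
    struct f32x4 r = {{r0, r1, r2, r3}};
    return r;
}
```

Both functions compute the same lane-wise product; the only difference is whether the scalar operations sit behind a function boundary or directly in the caller.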

I'm wondering whether there's a disadvantage to this approach. For a simple matrix multiplication test I wrote, clang seemed to produce similarly good code for the helper-function and non-helper-function versions (i.e. it successfully re-vectorises both). But it might be the case that in a more complex program, switching to scalarisation like this would produce worse code.

Any thoughts?
