mulsw etc functions not working correctly? #54
Comments
One downside of my solution seems to be that the compiler adds an ext.l before the muls, because 'a' is copied to 'r', which is an int. So far I haven't found a solution that avoids both problems: it either adds some unnecessary extra code or drops upper bits that are needed.
I ran into this again because I needed to remove some 32-bit multiplies. The support code as it stands doesn't return the full 32-bit result unless you use the version I provided above.
I haven't really found a satisfying solution either. It's always either non-optimal code or broken code.
The problem with the current code is that if the result doesn't fit in 16 bits, you get the wrong value.
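To make the "wrong value" concrete, here is a small portable sketch of the arithmetic. It is plain C rather than the actual support code, and the names mul16_truncated and mul16_full are made up for illustration. It only models what happens when a 16x16 product is squeezed back into a 16-bit type.

```c
#include <stdint.h>

/* Hypothetical model of a helper whose result type is only 16 bits wide:
   the product is computed in 32 bits, then only the low word survives.
   (Narrowing an out-of-range value to int16_t is implementation-defined
   in C; GCC and Clang wrap modulo 2^16.) */
static int16_t mul16_truncated(int16_t a, int16_t b) {
    int32_t full = (int32_t)a * (int32_t)b;
    return (int16_t)(uint16_t)full;
}

/* Hypothetical model of the intended 16x16=32 multiply:
   the whole 32-bit product is returned. */
static int32_t mul16_full(int16_t a, int16_t b) {
    return (int32_t)a * (int32_t)b;
}
```

For example, 300 * 300 is 90000, which needs more than 16 bits. mul16_full(300, 300) gives 90000, while mul16_truncated(300, 300) keeps only 90000 - 65536 = 24464.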
I've been trying to use mulsw to make sure the compiler does a 16x16=32 multiplication. But as things are, the compiler seems to assume that mulsw results can be truncated to 16 bits, and I'm seeing generated code that does exactly that. I think it's because the inline asm uses 'a' for the result, and 'a' is a short. So the compiler optimizes away the upper bits you would hope to get in the returned int.
Well, anyway I think this alternative works:
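inline int mulsw(short a, short b) {
	int r = a; // widen to int first so the asm operand is a full 32-bit value (this copy is where the ext.l comes from)
	// multiply the low 16 bits of r by b; the 32-bit product replaces r
	asm("mulsw %1,%0":"+d"(r): "mid"(b): "cc");
	return r; // r is an int, so the compiler cannot assume the result fits in 16 bits
}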
But you will probably want to check that yourself and do the same for muluw.
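For reference, a rough sketch of what "the same for muluw" might look like. This is an assumption on my part, not the project's actual code: the unsigned types, the mulu.w spelling (Motorola syntax, written with the dot here) and the reuse of the "mid" constraint are simply carried over by analogy from mulsw above. The non-m68k branch is a plain C fallback added only so the logic can be checked on a host machine.

```c
/* Hypothetical unsigned counterpart of mulsw: 16x16 -> 32-bit product. */
static inline unsigned int muluw(unsigned short a, unsigned short b) {
#if defined(__m68k__)
    /* Widen first so the in/out operand is a full 32-bit register and the
       compiler cannot treat the result as a 16-bit value. Assumes a 32-bit
       int (i.e. not built with -mshort). */
    unsigned int r = a;
    /* mulu.w multiplies the low words and writes the 32-bit product to %0. */
    __asm__("mulu.w %1,%0" : "+d"(r) : "mid"(b) : "cc");
    return r;
#else
    /* Host fallback (not part of the original code): same result in plain C. */
    return (unsigned int)a * (unsigned int)b;
#endif
}
```

With this, muluw(300, 300) should come out as 90000 rather than the truncated 24464. As with mulsw, the widening copy into r will probably cost an extra instruction on m68k, so it is worth checking the generated code.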