Implement the loopfilter support #2

Open
lu-zero opened this issue Oct 18, 2016 · 4 comments

Comments

@lu-zero
Owner

lu-zero commented Oct 18, 2016

(TBD: prepare the list of functions)

@sukrosono

start?

@luctrudeau
Collaborator

luctrudeau commented Jun 27, 2018

Probably only worth it to implement vpx_lpf_horizontal_16_dual_c and vpx_lpf_vertical_16_dual_c

This is the percentage of time libvpx spends in these functions when encoding a 1080p video:
0.35% vpx_lpf_horizontal_16_dual_c
0.35% vpx_lpf_vertical_16_dual_c
0.00% vpx_lpf_vertical_8_c
0.00% vpx_lpf_horizontal_8_c

@lu-zero
Owner Author

lu-zero commented Jun 27, 2018

Agreed, even if it is suspiciously low in the list.

@shawnl

shawnl commented May 15, 2019

Just looking at the C code, I don't think vpx_lpf_horizontal_16_dual_c or vpx_lpf_vertical_16_dual_c will get any faster with VSX, as VSX lacks a vector-gather instruction. These loads are all over the place:

    const int8_t flat2 =
        flat_mask5(1, s[-8 * p], s[-7 * p], s[-6 * p], s[-5 * p], p0, q0,
                   s[4 * p], s[5 * p], s[6 * p], s[7 * p]);

    filter16(mask, *thresh, flat, flat2, s - 8 * p, s - 7 * p, s - 6 * p,
             s - 5 * p, s - 4 * p, s - 3 * p, s - 2 * p, s - 1 * p, s,
             s + 1 * p, s + 2 * p, s + 3 * p, s + 4 * p, s + 5 * p, s + 6 * p,
             s + 7 * p);

I ran into the same issue when vectorizing log/logf for glibc: https://sourceware.org/ml/libc-alpha/2019-05/msg00192.html
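
To make the access pattern in the snippet concrete, here is a minimal sketch. It is not taken from libvpx, and the helper name gather_taps16 is hypothetical. It assumes s points at the first sample past the edge and p is the pitch in bytes, as the indexing above suggests. The 16 taps s[-8 * p] through s[7 * p] then sit p bytes apart in memory, so collecting them into one contiguous array takes 16 separate scalar loads:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only (not part of libvpx).
 * Copies the 16 taps s[-8 * p] .. s[7 * p] into a contiguous array.
 * Each tap is p bytes away from the previous one, so every iteration
 * is an independent strided load. */
static void gather_taps16(const uint8_t *s, ptrdiff_t p, uint8_t taps[16]) {
  assert(s != NULL && taps != NULL);
  for (int i = -8; i < 8; ++i) {
    taps[i + 8] = s[i * p];
  }
}
```

Whether this strided pattern actually prevents a speedup depends on how neighbouring edge positions are laid out in memory, which this sketch does not cover.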
