Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Block product #48

Open
antoine-levitt opened this issue Aug 15, 2019 · 12 comments
Open

Block product #48

antoine-levitt opened this issue Aug 15, 2019 · 12 comments

Comments

@antoine-levitt
Copy link

The following is slow

using BenchmarkTools
using LazyArrays

X = randn(1000,100)
Yh = hcat(X,X)
YH = Hcat(X,X)
@btime $Yh'*$Yh
@btime $YH'*$YH

Am I doing something wrong? I was hoping this package would cleverly dispatch to BLAS here.

@antoine-levitt
Copy link
Author

Hm, I guess the problem is that it's tricky to do when the subarray boundaries don't align. I'd be willing to try my hand on a PR to implement this in the case where the boundaries are aligned (I need this functionality pretty badly), but is it OK to error when they are not? I'm guessing the users of this package are here for performance, so it seems acceptable. If not, I'm not sure how to call the fallback from the specialization...

@dlfivefifty
Copy link
Member

Yes it's fine to throw an error. Note that MemoryLayout for Vcat is already implemented:

julia> LazyArrays.MemoryLayout(typeof(YH'))
LazyArrays.VcatLayout{Tuple{LazyArrays.DenseRowMajor,LazyArrays.DenseRowMajor}}()

Thus the easiest way is to add an HcatLayout and overload materialize!, something like:

function materialize!(M::MulAdd{<:VcatLayout,<:HcatLayout})
   α,A,B,β,C =  M.α,M.A,M.B,M.β,M.C
   T = eltype(C)
   _fill_lmul!(β,C) # this is temporary until strong β = false is supported
   for (a,b) in zip(A.arrays,B.arrays)
       materialize!(MulAdd(α,a,b,one(T),C))
   end
   C
end

@antoine-levitt
Copy link
Author

I'm confused, this is for the lazy Mul, right? It won't do anything for my MWE above?

@dlfivefifty
Copy link
Member

As long as we add @lazymul Vcat it will automatically lower to materialize!(::MulAdd):

julia> LazyArrays.@lazymul Vcat

julia> @which YH'YH # calls materialize!(Mul(YH',YH)), which defaults to MulAdd
*(A::Vcat, B::AbstractArray{T,2} where T, C...) in Main at /Users/solver/Projects/LazyArrays.jl/src/linalg/lazymul.jl:18

The "power" of the lazy memory layout approach is it avoids ambiguity issues.

@antoine-levitt
Copy link
Author

That feels evil. I don't actually need the lazy broadcasting stuff, I just want to use the lazy arrays approach to improve the readability of my code. I was going to just override mul! for Vcat/Hcat, so there's no need for @lazymul. Should I not do that and use @lazymul instead?

@dlfivefifty
Copy link
Member

Definitely do @lazymul, especially if it’s a PR. You get * , mul! for free and it automatically will work with transposes/adjoints.

@lazymul is thoroughly tested as it’s used for BandedMatrices.jl (where i needed it so subviews behaved like banded matrices).

@antoine-levitt
Copy link
Author

OK. In that case should it be exported and documented? Using an unexported macro with global side effects is extremely scary :p

@dlfivefifty
Copy link
Member

Sorry I meant call it inside LazyArrays.jl

@dlfivefifty
Copy link
Member

dlfivefifty commented Aug 16, 2019

I'll do the change now, then you can see what it looks like for future ref. Though your fear of @lazymul is unwarranted, it just overloads the relevant commands for the custom type:

https://github.com/JuliaArrays/LazyArrays.jl/blob/master/src/linalg/lazymul.jl

Errrr I'll do the simpler Hcat * Vcat. Your Vcat * Hcat is a bit more involved in the actual details....

@antoine-levitt
Copy link
Author

Yes I understand now, I thought it was supposed to be a user-facing thing.

@antoine-levitt
Copy link
Author

antoine-levitt commented Aug 17, 2019

OK, I tried to address this but in the end there are too many combinations and things that I don't understand on this codebase, and the time I'm able to devote to this is running out. I won't be able to make a good PR in a reasonable timeframe and I'll probably just end up monkey-patching the specific functionality I care about for my application. Here are some tests if someone is interesting in tackling this

using Test
using LazyArrays

# Matmat
N = 200
H = Hcat(randn(2N,N), randn(2N,N))
V = Vcat(randn(N,2N), randn(N,2N))
D = randn(2N,2N)
x_D = randn(2N)
x_V = Vcat(x_D[1:N], x_D[1:N])
timings = []
for A = (H, V, D)
    for B = (H, V, D)
        A*B
        A*B
        T = @elapsed A*B
        push!(timings, T)
        println(T, " ", nameof(typeof(A)), " ", nameof(typeof(B)))
        @test norm(A * B - Array(A) * Array(B)) <= sqrt(eps())
    end
end
@test maximum(timings) < 5*minimum(timings)

# Matvec and vecmat
N = 1000
H = Hcat(randn(2N,N), randn(2N,N))
V = Vcat(randn(N,2N), randn(N,2N))
D = randn(2N,2N)
x_D = randn(2N)
x_V = Vcat(x_D[1:N], x_D[1:N])
timings = []
for A = (H, V, D)
    for x in (x_D, x_V)
        A*x
        A*x
        T = @elapsed A*x
        push!(timings, T)
        println(T, " ", nameof(typeof(A)), " * ", nameof(typeof(x)))
        @test norm(A * x - Array(A) * Array(x)) <= sqrt(eps())
        x'*A
        x'*A
        T = @elapsed x'*A
        push!(timings, T)
        println(T, " ", nameof(typeof(x)), " '* ", nameof(typeof(A)))
        @test norm(x' * A - Array(x') * Array(A)) <= sqrt(eps())
    end
end
@test maximum(timings) < 5*minimum(timings)

@dlfivefifty
Copy link
Member

You can just make a WIP PR and I’ll try to complete it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants