Test failure in LinearAlgebra/lapack.jl in comparing LAPACK.sytri! to inv for an almost-symmetric matrix #1101
The failure reported above is reproducible for me on both |
Minimal reproducer should be:
using LinearAlgebra
A = Float32[0.5165111 0.40158212 0.5574516 0.8424667 0.73727703 0.99013567 0.027817607 0.7097305 0.5340295 0.8029875; 0.73383516 0.21861029 0.79692626 0.31265068 0.82242167 0.96243966 0.32410073 0.043915033 0.39456594 0.27034175; 0.3094836 0.52017105 0.19609857 0.8344968 0.3720402 0.93413603 0.18523753 0.22515428 0.87819797 0.2643844; 0.1071493 0.19598752 0.45016456 0.69307053 0.26639366 0.30572063 0.052308798 0.64963084 0.070156276 0.18587488; 0.12925565 0.3273763 0.6430414 0.20246255 0.56501704 0.31385344 0.51954776 0.26923156 0.28905785 0.008479714; 0.91714334 0.056781054 0.4873765 0.7959971 0.08249408 0.61231 0.6228918 0.73107404 0.49783945 0.027323008; 0.25009543 0.2422458 0.21773565 0.25646633 0.4824924 0.7195719 0.54094857 0.9027088 0.7447457 0.5782057; 0.78010947 0.016479433 0.43123013 0.6893958 0.36214143 0.9177889 0.47928184 0.11119521 0.43025148 0.5782057; 0.74608874 0.5170252 0.9612415 0.7183566 0.41310024 0.9337084 0.25631362 0.40240157 0.45893312 0.8839705; 0.6821535 0.53460234 0.7399696 0.0029093027 0.42698807 0.22273403 0.8483786 0.7350969 0.8365097 0.8839705];
A = A + transpose(A); #symmetric!
B = copy(A);
B,ipiv = LAPACK.sytrf!('U',B);
isapprox(triu(inv(A)), triu(LAPACK.sytri!('U',B,ipiv)); rtol=eps(cond(A)))
which fails for me also on
julia> versioninfo()
Julia Version 1.12.0-DEV.1431
Commit fc40e629b1d (2024-10-18 19:33 UTC)
Build Info:
Official https://julialang.org release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 384 × AMD EPYC 9654 96-Core Processor
WORD_SIZE: 64
LLVM: libLLVM-18.1.7 (ORCJIT, znver4)
Threads: 1 default, 0 interactive, 1 GC (on 384 virtual cores)
This system is using
For what it's worth:
using LinearAlgebra
A = Float32[0.5165111 0.40158212 0.5574516 0.8424667 0.73727703 0.99013567 0.027817607 0.7097305 0.5340295 0.8029875; 0.73383516 0.21861029 0.79692626 0.31265068 0.82242167 0.96243966 0.32410073 0.043915033 0.39456594 0.27034175; 0.3094836 0.52017105 0.19609857 0.8344968 0.3720402 0.93413603 0.18523753 0.22515428 0.87819797 0.2643844; 0.1071493 0.19598752 0.45016456 0.69307053 0.26639366 0.30572063 0.052308798 0.64963084 0.070156276 0.18587488; 0.12925565 0.3273763 0.6430414 0.20246255 0.56501704 0.31385344 0.51954776 0.26923156 0.28905785 0.008479714; 0.91714334 0.056781054 0.4873765 0.7959971 0.08249408 0.61231 0.6228918 0.73107404 0.49783945 0.027323008; 0.25009543 0.2422458 0.21773565 0.25646633 0.4824924 0.7195719 0.54094857 0.9027088 0.7447457 0.5782057; 0.78010947 0.016479433 0.43123013 0.6893958 0.36214143 0.9177889 0.47928184 0.11119521 0.43025148 0.5782057; 0.74608874 0.5170252 0.9612415 0.7183566 0.41310024 0.9337084 0.25631362 0.40240157 0.45893312 0.8839705; 0.6821535 0.53460234 0.7399696 0.0029093027 0.42698807 0.22273403 0.8483786 0.7350969 0.8365097 0.8839705];
A = A + transpose(A); #symmetric!
B = copy(A);
B,ipiv = LAPACK.sytrf!('U',B);
c1 = triu(inv(A));
c2 = triu(LAPACK.sytri!('U',B,ipiv));
@show norm(c1 - c2) / max(norm(c1), norm(c2))
@show eps(cond(A))
results in
|
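The two checks used in this thread are closely related: for arrays, `isapprox(x, y; rtol)` with the default `atol` of zero compares `norm(x - y)` against `rtol * max(norm(x), norm(y))`. A minimal sketch of that relationship, using a small made-up matrix rather than the `A` from the reproducer; `reldiff` is a hypothetical helper name, not something from LinearAlgebra:

```julia
using LinearAlgebra

# Hypothetical helper (not from the thread): relative difference in the Frobenius norm,
# the same quantity as norm(c1 - c2) / max(norm(c1), norm(c2)) above.
reldiff(x, y) = norm(x - y) / max(norm(x), norm(y))

x = Float32[1 2; 3 4]     # illustrative data only
y = x .+ 1f-6             # small perturbation of every entry
rtol = 1f-5

@show reldiff(x, y)
@show isapprox(x, y; rtol=rtol)
@show reldiff(x, y) <= rtol
```

Printing the relative difference next to the tolerance, as done above, shows how close a given platform is to the threshold instead of only reporting pass or fail.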
Locally, I obtain
julia> using LinearAlgebra
julia> A = Float32[0.5165111 0.40158212 0.5574516 0.8424667 0.73727703 0.99013567 0.027817607 0.7097305 0.5340295 0.8029875; 0.73383516 0.21861029 0.79692626 0.31265068 0.82242167 0.96243966 0.32410073 0.043915033 0.39456594 0.27034175; 0.3094836 0.52017105 0.19609857 0.8344968 0.3720402 0.93413603 0.18523753 0.22515428 0.87819797 0.2643844; 0.1071493 0.19598752 0.45016456 0.69307053 0.26639366 0.30572063 0.052308798 0.64963084 0.070156276 0.18587488; 0.12925565 0.3273763 0.6430414 0.20246255 0.56501704 0.31385344 0.51954776 0.26923156 0.28905785 0.008479714; 0.91714334 0.056781054 0.4873765 0.7959971 0.08249408 0.61231 0.6228918 0.73107404 0.49783945 0.027323008; 0.25009543 0.2422458 0.21773565 0.25646633 0.4824924 0.7195719 0.54094857 0.9027088 0.7447457 0.5782057; 0.78010947 0.016479433 0.43123013 0.6893958 0.36214143 0.9177889 0.47928184 0.11119521 0.43025148 0.5782057; 0.74608874 0.5170252 0.9612415 0.7183566 0.41310024 0.9337084 0.25631362 0.40240157 0.45893312 0.8839705; 0.6821535 0.53460234 0.7399696 0.0029093027 0.42698807 0.22273403 0.8483786 0.7350969 0.8365097 0.8839705];
julia> A = A + transpose(A); #symmetric!
julia> B = copy(A);
julia> B,ipiv = LAPACK.sytrf!('U',B);
julia> isapprox(triu(inv(A)), triu(LAPACK.sytri!('U',B,ipiv)); rtol=eps(cond(A)))
true
norm(c1 - c2) / max(norm(c1), norm(c2)) = 8.0597636f-7
eps(cond(A)) = 1.9073486f-6
julia> using OpenBLAS_jll, LinearAlgebra
julia> strip(unsafe_string(ccall(((:openblas_get_config64_), libopenblas), Ptr{UInt8}, () )))
"OpenBLAS 0.3.28 USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell MAX_THREADS=512"
julia> versioninfo()
Julia Version 1.12.0-DEV.1438
Commit e08280a24fb (2024-10-20 01:04 UTC)
Build Info:
Official https://julialang.org release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 12 × Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz
WORD_SIZE: 64
LLVM: libLLVM-18.1.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 12 virtual cores)
Environment:
JULIA_EDITOR = subl |
Adding a few top-level tests, which basically make the same LAPACK calls:
julia> using OpenBLAS_jll, LinearAlgebra
julia> strip(unsafe_string(ccall(((:openblas_get_config64_), libopenblas), Ptr{UInt8}, () )))
"OpenBLAS 0.3.27 USE64BITINT DYNAMIC_ARCH NO_AFFINITY SkylakeX MAX_THREADS=512"
julia> A = Float32[0.5165111 0.40158212 0.5574516 0.8424667 0.73727703 0.99013567 0.027817607 0.7097305 0.5340295 0.8029875; 0.73383516 0.21861029 0.79692626 0.31265068 0.82242167 0.96243966 0.32410073 0.043915033 0.39456594 0.27034175; 0.3094836 0.52017105 0.19609857 0.8344968 0.3720402 0.93413603 0.18523753 0.22515428 0.87819797 0.2643844; 0.1071493 0.19598752 0.45016456 0.69307053 0.26639366 0.30572063 0.052308798 0.64963084 0.070156276 0.18587488; 0.12925565 0.3273763 0.6430414 0.20246255 0.56501704 0.31385344 0.51954776 0.26923156 0.28905785 0.008479714; 0.91714334 0.056781054 0.4873765 0.7959971 0.08249408 0.61231 0.6228918 0.73107404 0.49783945 0.027323008; 0.25009543 0.2422458 0.21773565 0.25646633 0.4824924 0.7195719 0.54094857 0.9027088 0.7447457 0.5782057; 0.78010947 0.016479433 0.43123013 0.6893958 0.36214143 0.9177889 0.47928184 0.11119521 0.43025148 0.5782057; 0.74608874 0.5170252 0.9612415 0.7183566 0.41310024 0.9337084 0.25631362 0.40240157 0.45893312 0.8839705; 0.6821535 0.53460234 0.7399696 0.0029093027 0.42698807 0.22273403 0.8483786 0.7350969 0.8365097 0.8839705];
julia> A = A + transpose(A); #symmetric!
julia> B = bunchkaufman(A);
julia> R = bunchkaufman(A,true); # rook pivoting
julia> isapprox(inv(B), inv(A); rtol=eps(cond(A)))
false
julia> isapprox(B.P'*inv(B.U')*inv(B.D)*inv(B.U)*B.P, inv(A); rtol=eps(cond(A)))
true
julia> isapprox(inv(R), inv(A); rtol=eps(cond(A)))
true
|
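The explicit product in the second check relies on the Bunch-Kaufman identity `P*A*P' = U*D*U'`, so that `inv(A) = P' * inv(U') * inv(D) * inv(U) * P`. A minimal sketch of both identities on a small, well-conditioned symmetric matrix (made-up values in `Float64`, not the `A` from the thread):

```julia
using LinearAlgebra

# Small symmetric example chosen for illustration only.
A = Float64[4 1 2; 1 3 0; 2 0 5]
S = bunchkaufman(A)

# Reconstruct A from the factors: A = P' * U * D * U' * P.
Arec = S.P' * S.U * S.D * S.U' * S.P
@show Arec ≈ A

# Invert factor by factor, as in the check above.
Ainv = S.P' * inv(S.U') * inv(S.D) * inv(S.U) * S.P
@show Ainv ≈ inv(A)
```

Both routes start from the same factors, so a discrepancy between `inv(B)` and this product points at the inversion step rather than at the factorization.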
Yes, the test passes for me on Haswell:
julia> using OpenBLAS_jll, LinearAlgebra
julia> strip(unsafe_string(ccall(((:openblas_get_config64_), libopenblas), Ptr{UInt8}, () )))
"OpenBLAS 0.3.28 USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell MAX_THREADS=512"
julia> A = Float32[0.5165111 0.40158212 0.5574516 0.8424667 0.73727703 0.99013567 0.027817607 0.7097305 0.5340295 0.8029875; 0.73383516 0.21861029 0.79692626 0.31265068 0.82242167 0.96243966 0.32410073 0.043915033 0.39456594 0.27034175; 0.3094836 0.52017105 0.19609857 0.8344968 0.3720402 0.93413603 0.18523753 0.22515428 0.87819797 0.2643844; 0.1071493 0.19598752 0.45016456 0.69307053 0.26639366 0.30572063 0.052308798 0.64963084 0.070156276 0.18587488; 0.12925565 0.3273763 0.6430414 0.20246255 0.56501704 0.31385344 0.51954776 0.26923156 0.28905785 0.008479714; 0.91714334 0.056781054 0.4873765 0.7959971 0.08249408 0.61231 0.6228918 0.73107404 0.49783945 0.027323008; 0.25009543 0.2422458 0.21773565 0.25646633 0.4824924 0.7195719 0.54094857 0.9027088 0.7447457 0.5782057; 0.78010947 0.016479433 0.43123013 0.6893958 0.36214143 0.9177889 0.47928184 0.11119521 0.43025148 0.5782057; 0.74608874 0.5170252 0.9612415 0.7183566 0.41310024 0.9337084 0.25631362 0.40240157 0.45893312 0.8839705; 0.6821535 0.53460234 0.7399696 0.0029093027 0.42698807 0.22273403 0.8483786 0.7350969 0.8365097 0.8839705];
julia> A = A + transpose(A); #symmetric!
julia> B = bunchkaufman(A);
julia> R = bunchkaufman(A,true); # rook pivoting
julia> isapprox(inv(B), inv(A); rtol=eps(cond(A)))
true
julia> isapprox(B.P'*inv(B.U')*inv(B.D)*inv(B.U)*B.P, inv(A); rtol=eps(cond(A)))
true
julia> isapprox(inv(R), inv(A); rtol=eps(cond(A)))
true |
Thank you very much for the confirmation! Could you kindly compare your outputs to mine in the JLD2 file below? The JLD2 file was generated in Julia as follows:
julia> using JLD2
julia> save("56255-1.jld2",Dict("A" => A, "B" => B, "R" => R, "inv(B)" => inv(B), "inv(A)" => inv(A), "inv(R)" => inv(R),"cond(A)" => cond(A), "eps(cond(A))" => eps(cond(A))))
And its contents can be inspected as follows:
julia> using JLD2
julia> L = load("56255-1.jld2")
Dict{String, Any} with 8 entries:
"cond(A)" => 31.674
"B" => BunchKaufman{Float32, Matrix{Float32}, Vector{Int64}}(Float32[0.858494 -0.220169 … 0.218334 0.84004…
"A" => Float32[1.03302 1.13542 … 1.28012 1.48514; 1.13542 0.437221 … 0.911591 0.804944; … ; 1.28012 0.9115…
"inv(R)" => Float32[1.16483 0.256459 … -0.827415 0.334034; 0.256459 -0.671429 … 0.51427 -0.288833; … ; -0.82741…
"inv(B)" => Float32[1.16483 0.25646 … -0.827415 0.334034; 0.25646 -0.671429 … 0.51427 -0.288833; … ; -0.827415 …
"inv(A)" => Float32[1.16483 0.256459 … -0.827415 0.334033; 0.256459 -0.671429 … 0.51427 -0.288833; … ; -0.82741…
"R" => BunchKaufman{Float32, Matrix{Float32}, Vector{Int64}}(Float32[1.92042 0.663282 … 1.42713 0.84004; 1…
"eps(cond(A))" => 1.90735f-6
julia> L["cond(A)"]
31.67395f0
Although in my case
JLD2 File
|
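The tolerance `eps(cond(A))` is the spacing between adjacent `Float32` values at `cond(A)`, so it is a step function of the condition number: every value in `[16, 32)` yields the same tolerance, and the tolerance doubles at 32. A short sketch using the `cond(A)` value reported above:

```julia
c = 31.67395f0                     # cond(A) as reported in the thread
@show eps(c)                       # spacing of Float32 values near c
@show eps(c) == ldexp(1f0, -19)    # 2^-19, shared by every Float32 in [16, 32)
@show eps(16f0) == eps(c)
@show eps(32f0) == 2 * eps(c)      # the tolerance doubles once cond(A) reaches 32
```

Since this `cond(A)` sits just below 32, the tolerance in use is the smaller of two neighbouring steps, which may be part of why the comparison is sensitive to platform-dependent rounding.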
Here's what I obtain:
Comparison between matrices
julia> L["cond(A)"] == cond(A)
true
julia> L["A"] == A
true
julia> L["inv(A)"] == inv(A)
false
julia> L["inv(A)"] - inv(A)
10×10 Matrix{Float32}:
-1.19209f-7 -2.38419f-7 1.78814f-7 -1.78814f-7 5.96046f-8 -2.68221f-7 -1.19209f-7 2.98023f-7 4.17233f-7 0.0
-1.19209f-7 1.78814f-7 1.19209f-7 0.0 -1.78814f-7 1.49012f-7 -2.98023f-8 -2.08616f-7 -5.96046f-8 2.98023f-8
1.19209f-7 1.19209f-7 -2.08616f-7 2.98023f-8 5.96046f-8 1.71363f-7 4.47035f-8 -1.19209f-7 -1.19209f-7 -1.3411f-7
-2.98023f-8 -1.78814f-7 2.98023f-8 1.19209f-7 5.96046f-8 -5.96046f-8 -5.96046f-8 5.96046f-8 -5.30854f-8 2.98023f-8
5.96046f-8 -1.78814f-7 2.98023f-8 2.98023f-8 5.96046f-8 -2.38419f-7 -5.96046f-8 1.93715f-7 1.19209f-7 5.96046f-8
-7.45058f-9 1.19209f-7 0.0 -5.96046f-8 -5.96046f-8 1.19209f-7 2.98023f-8 -9.68575f-8 -5.96046f-8 0.0
5.96046f-8 8.9407f-8 -7.45058f-8 8.9407f-8 0.0 -2.98023f-8 0.0 -8.9407f-8 -1.19209f-7 8.9407f-8
5.96046f-8 -1.19209f-7 -5.96046f-8 5.96046f-8 2.98023f-8 -2.23517f-8 -2.98023f-8 5.02914f-8 2.98023f-8 4.47035f-8
1.19209f-7 5.96046f-8 -1.78814f-7 -4.93601f-8 1.19209f-7 2.98023f-7 5.96046f-8 -8.9407f-8 -1.49012f-7 -2.08616f-7
-5.96046f-8 -5.96046f-8 2.98023f-8 0.0 -1.49012f-8 5.96046f-8 -4.47035f-8 1.49012f-8 5.96046f-8 0.0
julia> L["B"].U - B.U
10×10 Matrix{Float32}:
0.0 -1.78814f-7 2.98023f-7 0.0 -1.19209f-7 -2.38419f-7 -1.49012f-7 -2.98023f-7 4.47035f-8 0.0
0.0 0.0 -1.19209f-7 2.98023f-8 1.78814f-7 1.19209f-7 1.19209f-7 1.19209f-7 1.49012f-8 0.0
0.0 0.0 0.0 5.96046f-8 5.96046f-8 0.0 -1.19209f-7 -1.19209f-7 1.19209f-7 0.0
0.0 0.0 0.0 0.0 -2.38419f-7 -1.19209f-7 0.0 -1.19209f-7 -5.96046f-8 0.0
0.0 0.0 0.0 0.0 0.0 7.17118f-8 -5.96046f-8 -2.98023f-8 2.98023f-8 0.0
0.0 0.0 0.0 0.0 0.0 0.0 -1.19209f-7 -1.19209f-7 1.19209f-7 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.19209f-7 1.19209f-7 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
julia> L["B"].D - B.D
10×10 Tridiagonal{Float32, Vector{Float32}}:
3.57628f-7 0.0 ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅
0.0 -4.76837f-7 0.0 ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅
⋅ 0.0 0.0 0.0 ⋅ ⋅ ⋅ ⋅ ⋅ ⋅
⋅ ⋅ 0.0 9.53674f-7 0.0 ⋅ ⋅ ⋅ ⋅ ⋅
⋅ ⋅ ⋅ 0.0 -5.96046f-8 0.0 ⋅ ⋅ ⋅ ⋅
⋅ ⋅ ⋅ ⋅ 0.0 4.76837f-7 0.0 ⋅ ⋅ ⋅
⋅ ⋅ ⋅ ⋅ ⋅ 0.0 0.0 0.0 ⋅ ⋅
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 0.0 5.96046f-8 0.0 ⋅
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 0.0 -5.96046f-8 0.0
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 0.0 0.0
julia> L["inv(B)"] - inv(B)
10×10 Matrix{Float32}:
-4.76837f-7 1.19209f-7 8.34465f-7 -5.96046f-8 -7.7486f-7 7.45058f-9 5.36442f-7 -1.04308f-6 1.13249f-6 -7.15256f-7
1.19209f-7 3.57628f-7 -2.08616f-7 0.0 -1.78814f-7 4.47035f-8 8.9407f-8 -2.98023f-8 -2.38419f-7 5.96046f-8
8.34465f-7 -2.08616f-7 1.66893f-6 2.98023f-8 -6.55651f-7 -9.08971f-7 -2.38419f-7 -2.92063f-6 1.84774f-6 -6.25849f-7
-5.96046f-8 0.0 2.98023f-8 5.96046f-8 5.96046f-8 -5.96046f-8 -2.98023f-8 -1.49012f-7 1.46218f-7 5.96046f-8
-7.7486f-7 -1.78814f-7 -6.55651f-7 5.96046f-8 5.36442f-7 4.76837f-7 2.98023f-7 1.72853f-6 -7.15256f-7 8.9407f-8
7.45058f-9 4.47035f-8 -9.08971f-7 -5.96046f-8 4.76837f-7 3.57628f-7 -1.78814f-7 1.57952f-6 -1.19209f-6 4.76837f-7
5.36442f-7 8.9407f-8 -2.38419f-7 -2.98023f-8 2.98023f-7 -1.78814f-7 -4.17233f-7 -5.96046f-8 -2.98023f-7 3.12924f-7
-1.04308f-6 -2.98023f-8 -2.92063f-6 -1.49012f-7 1.72853f-6 1.57952f-6 -5.96046f-8 5.24521f-6 -3.12924f-6 1.08778f-6
1.13249f-6 -2.38419f-7 1.84774f-6 1.46218f-7 -7.15256f-7 -1.19209f-6 -2.98023f-7 -3.12924f-6 1.78814f-6 -7.7486f-7
-7.15256f-7 5.96046f-8 -6.25849f-7 5.96046f-8 8.9407f-8 4.76837f-7 3.12924f-7 1.08778f-6 -7.7486f-7 5.36442f-7
julia> L["inv(R)"] - inv(R)
10×10 Matrix{Float32}:
0.0 2.08616f-7 -5.96046f-8 3.8743f-7 2.98023f-8 -5.02914f-7 0.0 1.19209f-7 -1.78814f-7 2.68221f-7
2.08616f-7 2.38419f-7 -1.19209f-7 1.78814f-7 0.0 -1.78814f-7 -2.98023f-8 -7.45058f-8 -2.38419f-7 5.96046f-8
-5.96046f-8 -1.19209f-7 1.04308f-7 -2.38419f-7 -5.96046f-8 1.56462f-7 4.47035f-8 -1.19209f-7 2.38419f-7 -1.3411f-7
3.8743f-7 1.78814f-7 -2.38419f-7 1.78814f-7 1.49012f-7 -5.96046f-8 -1.78814f-7 2.98023f-8 -3.51109f-7 1.78814f-7
2.98023f-8 0.0 -5.96046f-8 1.49012f-7 1.19209f-7 -1.19209f-7 -6.70552f-8 1.49012f-7 -1.19209f-7 4.47035f-8
-5.02914f-7 -1.78814f-7 1.56462f-7 -5.96046f-8 -1.19209f-7 3.57628f-7 1.49012f-7 -1.49012f-8 3.27826f-7 -5.96046f-8
0.0 -2.98023f-8 4.47035f-8 -1.78814f-7 -6.70552f-8 1.49012f-7 5.96046f-8 -1.19209f-7 2.98023f-8 5.96046f-8
1.19209f-7 -7.45058f-8 -1.19209f-7 2.98023f-8 1.49012f-7 -1.49012f-8 -1.19209f-7 1.49012f-7 -1.19209f-7 5.96046f-8
-1.78814f-7 -2.38419f-7 2.38419f-7 -3.51109f-7 -1.19209f-7 3.27826f-7 2.98023f-8 -1.19209f-7 3.20375f-7 -5.96046f-8
2.68221f-7 5.96046f-8 -1.3411f-7 1.78814f-7 4.47035f-8 -5.96046f-8 5.96046f-8 5.96046f-8 -5.96046f-8 -2.38419f-7
julia> L["R"].U - R.U
10×10 Matrix{Float32}:
0.0 0.0 1.78814f-7 1.86265f-7 -2.79397f-8 1.49012f-7 0.0 0.0 0.0 0.0
0.0 0.0 5.96046f-8 -2.38419f-7 0.0 -1.19209f-7 2.98023f-8 -5.96046f-8 0.0 0.0
0.0 0.0 0.0 -4.47035f-8 5.21541f-8 0.0 3.72529f-8 0.0 0.0 0.0
0.0 0.0 0.0 0.0 8.3819f-8 8.9407f-8 5.96046f-8 7.45058f-9 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 8.9407f-8 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 -5.96046f-8 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
julia> L["R"].D - R.D
10×10 Tridiagonal{Float32, Vector{Float32}}:
-1.19209f-7 0.0 ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅
0.0 -2.38419f-7 0.0 ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅
⋅ 0.0 -5.96046f-8 0.0 ⋅ ⋅ ⋅ ⋅ ⋅ ⋅
⋅ ⋅ 0.0 -5.96046f-8 0.0 ⋅ ⋅ ⋅ ⋅ ⋅
⋅ ⋅ ⋅ 0.0 7.45058f-8 0.0 ⋅ ⋅ ⋅ ⋅
⋅ ⋅ ⋅ ⋅ 0.0 -2.98023f-8 0.0 ⋅ ⋅ ⋅
⋅ ⋅ ⋅ ⋅ ⋅ 0.0 0.0 0.0 ⋅ ⋅
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 0.0 0.0 0.0 ⋅
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 0.0 0.0 0.0
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 0.0 0.0 |
Thank you very much again, @jishnub. Please see below for some general points and some points on your above comparison.
Regarding your comparison, I see (analysis below) that when the deltas to BunchKaufman's
Analysis of the comparison
# Original A matrix from above
julia> A = Float32[0.5165111 0.40158212 0.5574516 0.8424667 0.73727703 0.99013567 0.027817607 0.7097305 0.5340295 0.8029875; 0.73383516 0.21861029 0.79692626 0.31265068 0.82242167 0.96243966 0.32410073 0.043915033 0.39456594 0.27034175; 0.3094836 0.52017105 0.19609857 0.8344968 0.3720402 0.93413603 0.18523753 0.22515428 0.87819797 0.2643844; 0.1071493 0.19598752 0.45016456 0.69307053 0.26639366 0.30572063 0.052308798 0.64963084 0.070156276 0.18587488; 0.12925565 0.3273763 0.6430414 0.20246255 0.56501704 0.31385344 0.51954776 0.26923156 0.28905785 0.008479714; 0.91714334 0.056781054 0.4873765 0.7959971 0.08249408 0.61231 0.6228918 0.73107404 0.49783945 0.027323008; 0.25009543 0.2422458 0.21773565 0.25646633 0.4824924 0.7195719 0.54094857 0.9027088 0.7447457 0.5782057; 0.78010947 0.016479433 0.43123013 0.6893958 0.36214143 0.9177889 0.47928184 0.11119521 0.43025148 0.5782057; 0.74608874 0.5170252 0.9612415 0.7183566 0.41310024 0.9337084 0.25631362 0.40240157 0.45893312 0.8839705; 0.6821535 0.53460234 0.7399696 0.0029093027 0.42698807 0.22273403 0.8483786 0.7350969 0.8365097 0.8839705];
julia> A = A + transpose(A); #symmetric!
# Load the JLD file as well as define the deltas (Haswell to SkylakeX).
julia> using LinearAlgebra, JLD2
julia> L = load("56255-1.jld2");
julia> B = L["B"];
julia> dU = Matrix{Float32}([
0.0 -1.78814f-7 2.98023f-7 0.0 -1.19209f-7 -2.38419f-7 -1.49012f-7 -2.98023f-7 4.47035f-8 0.0
0.0 0.0 -1.19209f-7 2.98023f-8 1.78814f-7 1.19209f-7 1.19209f-7 1.19209f-7 1.49012f-8 0.0
0.0 0.0 0.0 5.96046f-8 5.96046f-8 0.0 -1.19209f-7 -1.19209f-7 1.19209f-7 0.0
0.0 0.0 0.0 0.0 -2.38419f-7 -1.19209f-7 0.0 -1.19209f-7 -5.96046f-8 0.0
0.0 0.0 0.0 0.0 0.0 7.17118f-8 -5.96046f-8 -2.98023f-8 2.98023f-8 0.0
0.0 0.0 0.0 0.0 0.0 0.0 -1.19209f-7 -1.19209f-7 1.19209f-7 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.19209f-7 1.19209f-7 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0]);
julia> dD = Matrix{Float32}([
3.57628f-7 0.0 0 0 0 0 0 0 0 0
0.0 -4.76837f-7 0.0 0 0 0 0 0 0 0
0 0.0 0.0 0.0 0 0 0 0 0 0
0 0 0.0 9.53674f-7 0.0 0 0 0 0 0
0 0 0 0.0 -5.96046f-8 0.0 0 0 0 0
0 0 0 0 0.0 4.76837f-7 0.0 0 0 0
0 0 0 0 0 0.0 0.0 0.0 0 0
0 0 0 0 0 0 0.0 5.96046f-8 0.0 0
0 0 0 0 0 0 0 0.0 -5.96046f-8 0.0
0 0 0 0 0 0 0 0 0.0 0.0]);
# The original algorithm fails the current rtol
julia> isapprox(inv(B), inv(A); rtol=eps(cond(A)))
false
julia> norm(inv(B)-inv(A))
8.977805f-6
# However, applying the diagonal and U deltas passes the test!
julia> B1 = BunchKaufman(B.LD + dD + dU, B.ipiv, 'U', true, false, 0);
julia> isapprox(inv(B1), inv(A); rtol=eps(cond(A)))
true
julia> norm(inv(B1)-inv(A))
4.343118f-6
Outcome:
|
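Extending the analysis in [1] given below for inversion, it seems that the required
The following script (for Gaussian matrices) seems to agree with this.
using LinearAlgebra
# Worst observed ratio of the relative inversion error to eps(cond(A))
# over N random symmetric 4×4 Float32 matrices.
function worstN(N)
    R = zeros(N)        # relative difference between inv(A) and inv(bunchkaufman(A))
    E = zeros(N)        # tolerance eps(cond(A)) used by the test
    B = zeros(Bool, N)  # whether the isapprox check passes for sample i
    for i in 1:N
        A = (X -> (X + X') / 2)(Float32.(randn(4, 4)))  # symmetrized Gaussian matrix
        A1 = inv(A)
        A2 = inv(bunchkaufman(A))
        R[i] = norm(A1 - A2) / max(norm(A1), norm(A2))
        E[i] = eps(cond(A))
        B[i] = isapprox(A1, A2; rtol=eps(cond(A)))
    end
    maximum(R ./ E)
end
for i = 1:100 display(worstN(1000000)); end
Note: Also applies to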
|
In the aarch64-apple-darwin job https://buildkite.com/julialang/julia-master/builds/41270#0192a51c-3afb-4a6f-a53f-3c3b9f537ed4, there is
Unfortunately, I am unable to reproduce it locally using the same seed on a different platform (x86_64-linux-gnu). The issue seems to be related to floating-point accuracy.