Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flipwind: reduce roundoff in phase winding a[n] calc #534

Merged
merged 5 commits into from
Sep 10, 2024
Merged

Conversation

ahbarnett
Copy link
Collaborator

reduce rounding error in onedim_fseries_kernel CPU code by avoiding catastrophic cancellation via (nf/2 - z[n]) being sent into the complex exp.
A minus sign is used instead. The reason for the nf/2 shift is obscure and to do with the find grid being {0,1,...,nf-1} but sitting physically in [-pi,pi), I think. It's as if the kernel were centered at nf/2 when its Fourier series is evaluated.

Here's an extreme case where tol is around epsmach * N1. Compare the rel l2-err figures (ignore one mode):
master:

(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e7 1e1 1e-9 0 2 1.25
test 1d type 1:
	10 NU pts to 10000000 modes in 0.535 s 	18.7 NU pts/s
	one mode: rel err in F[3700000] is 1.27e-08
	dirft1d: rel l2-err of result F is 6.99e-08
test 1d type 2:
	10000000 modes to 10 NU pts in 0.458 s 	21.8 NU pts/s
	one targ: rel err in c[5] is 6.57e-08
	dirft1d: rel l2-err of result c is 7.57e-08
test 1d type 3:
	10 NU to 10000000 NU in 1.15 s        	8.7e+06 tot NU pts/s
	one targ: rel err in F[5000000] is 1.02e-07
	dirft1d: rel l2-err of result F is 2.12e-07
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e7 1e1 1e-8 0 2 1.25
test 1d type 1:
	10 NU pts to 10000000 modes in 0.544 s 	18.4 NU pts/s
	one mode: rel err in F[3700000] is 7.87e-09
	dirft1d: rel l2-err of result F is 1.18e-07
test 1d type 2:
	10000000 modes to 10 NU pts in 0.479 s 	20.9 NU pts/s
	one targ: rel err in c[5] is 1.36e-07
	dirft1d: rel l2-err of result c is 1.37e-07
test 1d type 3:
	10 NU to 10000000 NU in 1.12 s        	8.94e+06 tot NU pts/s
	one targ: rel err in F[5000000] is 1.41e-08
	dirft1d: rel l2-err of result F is 2.48e-08
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e7 1e1 1e-7 0 2 1.25
test 1d type 1:
	10 NU pts to 10000000 modes in 0.536 s 	18.6 NU pts/s
	one mode: rel err in F[3700000] is 8.22e-09
	dirft1d: rel l2-err of result F is 1.27e-07
test 1d type 2:
	10000000 modes to 10 NU pts in 0.452 s 	22.1 NU pts/s
	one targ: rel err in c[5] is 5.62e-08
	dirft1d: rel l2-err of result c is 1.29e-07
test 1d type 3:
	10 NU to 10000000 NU in 1.06 s        	9.41e+06 tot NU pts/s
	one targ: rel err in F[5000000] is 2.95e-07
	dirft1d: rel l2-err of result F is 6.01e-07

flipwind:

(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e7 1e1 1e-9 0 2 1.25
test 1d type 1:
	10 NU pts to 10000000 modes in 0.527 s 	19 NU pts/s
	one mode: rel err in F[3700000] is 1.01e-08
	dirft1d: rel l2-err of result F is 9.82e-09
test 1d type 2:
	10000000 modes to 10 NU pts in 0.461 s 	21.7 NU pts/s
	one targ: rel err in c[5] is 1.04e-08
	dirft1d: rel l2-err of result c is 1.1e-08
test 1d type 3:
	10 NU to 10000000 NU in 1.17 s        	8.58e+06 tot NU pts/s
	one targ: rel err in F[5000000] is 4.82e-08
	dirft1d: rel l2-err of result F is 9.96e-08
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e7 1e1 1e-8 0 2 1.25
test 1d type 1:
	10 NU pts to 10000000 modes in 0.539 s 	18.5 NU pts/s
	one mode: rel err in F[3700000] is 1.48e-09
	dirft1d: rel l2-err of result F is 3.23e-08
test 1d type 2:
	10000000 modes to 10 NU pts in 0.468 s 	21.4 NU pts/s
	one targ: rel err in c[5] is 1.66e-08
	dirft1d: rel l2-err of result c is 2.51e-08
test 1d type 3:
	10 NU to 10000000 NU in 1.13 s        	8.86e+06 tot NU pts/s
	one targ: rel err in F[5000000] is 1.14e-07
	dirft1d: rel l2-err of result F is 2.55e-07
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e7 1e1 1e-7 0 2 1.25
test 1d type 1:
	10 NU pts to 10000000 modes in 0.536 s 	18.7 NU pts/s
	one mode: rel err in F[3700000] is 8.61e-09
	dirft1d: rel l2-err of result F is 1.26e-07
test 1d type 2:
	10000000 modes to 10 NU pts in 0.454 s 	22 NU pts/s
	one targ: rel err in c[5] is 4.43e-08
	dirft1d: rel l2-err of result c is 1.23e-07
test 1d type 3:
	10 NU to 10000000 NU in 1.13 s        	8.87e+06 tot NU pts/s
	one targ: rel err in F[5000000] is 2.35e-07
	dirft1d: rel l2-err of result F is 6.11e-07

Thus is seems to reduce the "bounce-back" of errors when tol reaches epsmach * max(N1,N2,N3).

@ahbarnett ahbarnett changed the title Flipwind Flipwind: reduce roundoff in phase winding a[n] calc Aug 20, 2024
@ahbarnett
Copy link
Collaborator Author

ahbarnett commented Aug 20, 2024

Thus is gains nearly a full digit in the most dramatic case (t1 and t2 mostly; effect on t3 is less, I believe because its deconv array does not use such a shift). There is one t3 case (1e-8) made a digit worse, strangely, but I don't think it deserved to be that good in master. To study.

@DiamonDinoia This should be brought into GPU t1 and t2 code, at some point.

@ahbarnett
Copy link
Collaborator Author

More typical comparison (maybe not enough NU pts in prev):

master:

(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e6 1e2 1e-9 0 2 2.0
test 1d type 1:
	100 NU pts to 1000000 modes in 0.079 s 	1.27e+03 NU pts/s
	one mode: rel err in F[370000] is 4.63e-10
	dirft1d: rel l2-err of result F is 1.39e-09
test 1d type 2:
	1000000 modes to 100 NU pts in 0.0583 s 	1.71e+03 NU pts/s
	one targ: rel err in c[50] is 5.89e-10
	dirft1d: rel l2-err of result c is 1.17e-09
test 1d type 3:
	100 NU to 1000000 NU in 0.185 s        	5.39e+06 tot NU pts/s
	one targ: rel err in F[500000] is 3.01e-10
	dirft1d: rel l2-err of result F is 1.86e-09
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e6 1e2 1e-10 0 2 2.0
test 1d type 1:
	100 NU pts to 1000000 modes in 0.0793 s 	1.26e+03 NU pts/s
	one mode: rel err in F[370000] is 1.28e-10
	dirft1d: rel l2-err of result F is 2.89e-10
test 1d type 2:
	1000000 modes to 100 NU pts in 0.0597 s 	1.68e+03 NU pts/s
	one targ: rel err in c[50] is 2.32e-11
	dirft1d: rel l2-err of result c is 2.45e-10
test 1d type 3:
	100 NU to 1000000 NU in 0.186 s        	5.37e+06 tot NU pts/s
	one targ: rel err in F[500000] is 1.31e-10
	dirft1d: rel l2-err of result F is 3.98e-10
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e6 1e2 1e-11 0 2 2.0
test 1d type 1:
	100 NU pts to 1000000 modes in 0.0803 s 	1.24e+03 NU pts/s
	one mode: rel err in F[370000] is 7.24e-11
	dirft1d: rel l2-err of result F is 1.56e-10
test 1d type 2:
	1000000 modes to 100 NU pts in 0.0616 s 	1.62e+03 NU pts/s
	one targ: rel err in c[50] is 3.21e-11
	dirft1d: rel l2-err of result c is 1.43e-10
test 1d type 3:
	100 NU to 1000000 NU in 0.184 s        	5.45e+06 tot NU pts/s
	one targ: rel err in F[500000] is 1.08e-10
	dirft1d: rel l2-err of result F is 3.12e-10

flipwind:

(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e6 1e2 1e-9 0 2 2.0
test 1d type 1:
	100 NU pts to 1000000 modes in 0.0803 s 	1.24e+03 NU pts/s
	one mode: rel err in F[370000] is 4.33e-10
	dirft1d: rel l2-err of result F is 1.37e-09
test 1d type 2:
	1000000 modes to 100 NU pts in 0.0566 s 	1.77e+03 NU pts/s
	one targ: rel err in c[50] is 6.04e-10
	dirft1d: rel l2-err of result c is 1.14e-09
test 1d type 3:
	100 NU to 1000000 NU in 0.215 s        	4.65e+06 tot NU pts/s
	one targ: rel err in F[500000] is 3.1e-10
	dirft1d: rel l2-err of result F is 1.84e-09
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e6 1e2 1e-10 0 2 2.0
test 1d type 1:
	100 NU pts to 1000000 modes in 0.081 s 	1.23e+03 NU pts/s
	one mode: rel err in F[370000] is 5.86e-11
	dirft1d: rel l2-err of result F is 1.7e-10
test 1d type 2:
	1000000 modes to 100 NU pts in 0.0584 s 	1.71e+03 NU pts/s
	one targ: rel err in c[50] is 2.58e-11
	dirft1d: rel l2-err of result c is 1.43e-10
test 1d type 3:
	100 NU to 1000000 NU in 0.187 s        	5.34e+06 tot NU pts/s
	one targ: rel err in F[500000] is 4.48e-11
	dirft1d: rel l2-err of result F is 2.96e-10
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e6 1e2 1e-11 0 2 2.0
test 1d type 1:
	100 NU pts to 1000000 modes in 0.0792 s 	1.26e+03 NU pts/s
	one mode: rel err in F[370000] is 3.32e-11
	dirft1d: rel l2-err of result F is 1.17e-10
test 1d type 2:
	1000000 modes to 100 NU pts in 0.0612 s 	1.63e+03 NU pts/s
	one targ: rel err in c[50] is 3.06e-11
	dirft1d: rel l2-err of result c is 1.16e-10
test 1d type 3:
	100 NU to 1000000 NU in 0.183 s        	5.47e+06 tot NU pts/s
	one targ: rel err in F[500000] is 3.06e-11
	dirft1d: rel l2-err of result F is 2.24e-10

0.3 digits better for tol = 1e-10, 1e-11.

@DiamonDinoia
Copy link
Collaborator

Hi Alex, this looks good to me. Shall we bring this in for 2.3.0? Or delay to the next release?

@ahbarnett
Copy link
Collaborator Author

ahbarnett commented Aug 21, 2024 via email

@DiamonDinoia
Copy link
Collaborator

I totally agree. There are a lot of changes for 2.3.0 already.

@DiamonDinoia DiamonDinoia added this to the 2.4 milestone Aug 27, 2024
@ahbarnett ahbarnett merged commit 311e419 into master Sep 10, 2024
90 of 91 checks passed
@ahbarnett ahbarnett deleted the flipwind branch September 10, 2024 21:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants