Flipwind: reduce roundoff in phase winding a[n] calc #534

ahbarnett · 2024-08-20T22:04:07Z

reduce rounding error in onedim_fseries_kernel CPU code by avoiding catastrophic cancellation via (nf/2 - z[n]) being sent into the complex exp.
A minus sign is used instead. The reason for the nf/2 shift is obscure and to do with the find grid being {0,1,...,nf-1} but sitting physically in [-pi,pi), I think. It's as if the kernel were centered at nf/2 when its Fourier series is evaluated.

Here's an extreme case where tol is around epsmach * N1. Compare the rel l2-err figures (ignore one mode):
master:

(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e7 1e1 1e-9 0 2 1.25
test 1d type 1:
	10 NU pts to 10000000 modes in 0.535 s 	18.7 NU pts/s
	one mode: rel err in F[3700000] is 1.27e-08
	dirft1d: rel l2-err of result F is 6.99e-08
test 1d type 2:
	10000000 modes to 10 NU pts in 0.458 s 	21.8 NU pts/s
	one targ: rel err in c[5] is 6.57e-08
	dirft1d: rel l2-err of result c is 7.57e-08
test 1d type 3:
	10 NU to 10000000 NU in 1.15 s        	8.7e+06 tot NU pts/s
	one targ: rel err in F[5000000] is 1.02e-07
	dirft1d: rel l2-err of result F is 2.12e-07
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e7 1e1 1e-8 0 2 1.25
test 1d type 1:
	10 NU pts to 10000000 modes in 0.544 s 	18.4 NU pts/s
	one mode: rel err in F[3700000] is 7.87e-09
	dirft1d: rel l2-err of result F is 1.18e-07
test 1d type 2:
	10000000 modes to 10 NU pts in 0.479 s 	20.9 NU pts/s
	one targ: rel err in c[5] is 1.36e-07
	dirft1d: rel l2-err of result c is 1.37e-07
test 1d type 3:
	10 NU to 10000000 NU in 1.12 s        	8.94e+06 tot NU pts/s
	one targ: rel err in F[5000000] is 1.41e-08
	dirft1d: rel l2-err of result F is 2.48e-08
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e7 1e1 1e-7 0 2 1.25
test 1d type 1:
	10 NU pts to 10000000 modes in 0.536 s 	18.6 NU pts/s
	one mode: rel err in F[3700000] is 8.22e-09
	dirft1d: rel l2-err of result F is 1.27e-07
test 1d type 2:
	10000000 modes to 10 NU pts in 0.452 s 	22.1 NU pts/s
	one targ: rel err in c[5] is 5.62e-08
	dirft1d: rel l2-err of result c is 1.29e-07
test 1d type 3:
	10 NU to 10000000 NU in 1.06 s        	9.41e+06 tot NU pts/s
	one targ: rel err in F[5000000] is 2.95e-07
	dirft1d: rel l2-err of result F is 6.01e-07

flipwind:

(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e7 1e1 1e-9 0 2 1.25
test 1d type 1:
	10 NU pts to 10000000 modes in 0.527 s 	19 NU pts/s
	one mode: rel err in F[3700000] is 1.01e-08
	dirft1d: rel l2-err of result F is 9.82e-09
test 1d type 2:
	10000000 modes to 10 NU pts in 0.461 s 	21.7 NU pts/s
	one targ: rel err in c[5] is 1.04e-08
	dirft1d: rel l2-err of result c is 1.1e-08
test 1d type 3:
	10 NU to 10000000 NU in 1.17 s        	8.58e+06 tot NU pts/s
	one targ: rel err in F[5000000] is 4.82e-08
	dirft1d: rel l2-err of result F is 9.96e-08
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e7 1e1 1e-8 0 2 1.25
test 1d type 1:
	10 NU pts to 10000000 modes in 0.539 s 	18.5 NU pts/s
	one mode: rel err in F[3700000] is 1.48e-09
	dirft1d: rel l2-err of result F is 3.23e-08
test 1d type 2:
	10000000 modes to 10 NU pts in 0.468 s 	21.4 NU pts/s
	one targ: rel err in c[5] is 1.66e-08
	dirft1d: rel l2-err of result c is 2.51e-08
test 1d type 3:
	10 NU to 10000000 NU in 1.13 s        	8.86e+06 tot NU pts/s
	one targ: rel err in F[5000000] is 1.14e-07
	dirft1d: rel l2-err of result F is 2.55e-07
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e7 1e1 1e-7 0 2 1.25
test 1d type 1:
	10 NU pts to 10000000 modes in 0.536 s 	18.7 NU pts/s
	one mode: rel err in F[3700000] is 8.61e-09
	dirft1d: rel l2-err of result F is 1.26e-07
test 1d type 2:
	10000000 modes to 10 NU pts in 0.454 s 	22 NU pts/s
	one targ: rel err in c[5] is 4.43e-08
	dirft1d: rel l2-err of result c is 1.23e-07
test 1d type 3:
	10 NU to 10000000 NU in 1.13 s        	8.87e+06 tot NU pts/s
	one targ: rel err in F[5000000] is 2.35e-07
	dirft1d: rel l2-err of result F is 6.11e-07

Thus is seems to reduce the "bounce-back" of errors when tol reaches epsmach * max(N1,N2,N3).

…; may be more accurate

ahbarnett · 2024-08-20T22:05:30Z

Thus is gains nearly a full digit in the most dramatic case (t1 and t2 mostly; effect on t3 is less, I believe because its deconv array does not use such a shift). There is one t3 case (1e-8) made a digit worse, strangely, but I don't think it deserved to be that good in master. To study.

@DiamonDinoia This should be brought into GPU t1 and t2 code, at some point.

ahbarnett · 2024-08-20T22:17:49Z

More typical comparison (maybe not enough NU pts in prev):

master:

(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e6 1e2 1e-9 0 2 2.0
test 1d type 1:
	100 NU pts to 1000000 modes in 0.079 s 	1.27e+03 NU pts/s
	one mode: rel err in F[370000] is 4.63e-10
	dirft1d: rel l2-err of result F is 1.39e-09
test 1d type 2:
	1000000 modes to 100 NU pts in 0.0583 s 	1.71e+03 NU pts/s
	one targ: rel err in c[50] is 5.89e-10
	dirft1d: rel l2-err of result c is 1.17e-09
test 1d type 3:
	100 NU to 1000000 NU in 0.185 s        	5.39e+06 tot NU pts/s
	one targ: rel err in F[500000] is 3.01e-10
	dirft1d: rel l2-err of result F is 1.86e-09
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e6 1e2 1e-10 0 2 2.0
test 1d type 1:
	100 NU pts to 1000000 modes in 0.0793 s 	1.26e+03 NU pts/s
	one mode: rel err in F[370000] is 1.28e-10
	dirft1d: rel l2-err of result F is 2.89e-10
test 1d type 2:
	1000000 modes to 100 NU pts in 0.0597 s 	1.68e+03 NU pts/s
	one targ: rel err in c[50] is 2.32e-11
	dirft1d: rel l2-err of result c is 2.45e-10
test 1d type 3:
	100 NU to 1000000 NU in 0.186 s        	5.37e+06 tot NU pts/s
	one targ: rel err in F[500000] is 1.31e-10
	dirft1d: rel l2-err of result F is 3.98e-10
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e6 1e2 1e-11 0 2 2.0
test 1d type 1:
	100 NU pts to 1000000 modes in 0.0803 s 	1.24e+03 NU pts/s
	one mode: rel err in F[370000] is 7.24e-11
	dirft1d: rel l2-err of result F is 1.56e-10
test 1d type 2:
	1000000 modes to 100 NU pts in 0.0616 s 	1.62e+03 NU pts/s
	one targ: rel err in c[50] is 3.21e-11
	dirft1d: rel l2-err of result c is 1.43e-10
test 1d type 3:
	100 NU to 1000000 NU in 0.184 s        	5.45e+06 tot NU pts/s
	one targ: rel err in F[500000] is 1.08e-10
	dirft1d: rel l2-err of result F is 3.12e-10

flipwind:

(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e6 1e2 1e-9 0 2 2.0
test 1d type 1:
	100 NU pts to 1000000 modes in 0.0803 s 	1.24e+03 NU pts/s
	one mode: rel err in F[370000] is 4.33e-10
	dirft1d: rel l2-err of result F is 1.37e-09
test 1d type 2:
	1000000 modes to 100 NU pts in 0.0566 s 	1.77e+03 NU pts/s
	one targ: rel err in c[50] is 6.04e-10
	dirft1d: rel l2-err of result c is 1.14e-09
test 1d type 3:
	100 NU to 1000000 NU in 0.215 s        	4.65e+06 tot NU pts/s
	one targ: rel err in F[500000] is 3.1e-10
	dirft1d: rel l2-err of result F is 1.84e-09
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e6 1e2 1e-10 0 2 2.0
test 1d type 1:
	100 NU pts to 1000000 modes in 0.081 s 	1.23e+03 NU pts/s
	one mode: rel err in F[370000] is 5.86e-11
	dirft1d: rel l2-err of result F is 1.7e-10
test 1d type 2:
	1000000 modes to 100 NU pts in 0.0584 s 	1.71e+03 NU pts/s
	one targ: rel err in c[50] is 2.58e-11
	dirft1d: rel l2-err of result c is 1.43e-10
test 1d type 3:
	100 NU to 1000000 NU in 0.187 s        	5.34e+06 tot NU pts/s
	one targ: rel err in F[500000] is 4.48e-11
	dirft1d: rel l2-err of result F is 2.96e-10
(base) alex@ross /home/alex/numerics/finufft> test/finufft1d_test 1e6 1e2 1e-11 0 2 2.0
test 1d type 1:
	100 NU pts to 1000000 modes in 0.0792 s 	1.26e+03 NU pts/s
	one mode: rel err in F[370000] is 3.32e-11
	dirft1d: rel l2-err of result F is 1.17e-10
test 1d type 2:
	1000000 modes to 100 NU pts in 0.0612 s 	1.63e+03 NU pts/s
	one targ: rel err in c[50] is 3.06e-11
	dirft1d: rel l2-err of result c is 1.16e-10
test 1d type 3:
	100 NU to 1000000 NU in 0.183 s        	5.47e+06 tot NU pts/s
	one targ: rel err in F[500000] is 3.06e-11
	dirft1d: rel l2-err of result F is 2.24e-10

0.3 digits better for tol = 1e-10, 1e-11.

DiamonDinoia · 2024-08-21T15:22:00Z

Hi Alex, this looks good to me. Shall we bring this in for 2.3.0? Or delay to the next release?

ahbarnett · 2024-08-21T15:28:15Z

Let's wait to post-2.3.0 tag otherwise there may be unintended consequences. It's a change in core math. Annoying to wait, I know.

…

On Wed, Aug 21, 2024 at 11:23 AM Marco Barbone ***@***.***> wrote: Hi Alex, this looks good to me. Shall we bring this in for 2.3.0? Or delay to the next release? — Reply to this email directly, view it on GitHub <#534 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ACNZRSV3ENGCLC32CG433T3ZSSWC7AVCNFSM6AAAAABM2YTDOSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMBSGM3DCNRSGQ> . You are receiving this because you authored the thread.Message ID: ***@***.***>

-- *-------------------------------------------------------------------~^`^~._.~' |\ Alex Barnett Center for Computational Mathematics, Flatiron Institute | \ http://users.flatironinstitute.org/~ahb 646-876-5942

DiamonDinoia · 2024-08-21T16:00:24Z

I totally agree. There are a lot of changes for 2.3.0 already.

ahbarnett added 3 commits August 20, 2024 17:25

onedim_fseries_kernel use minus sign instead of subtraction nf/2-z[n]…

afc4839

…; may be more accurate

use negation instead of nf/2 shift of kernel in onedim_fseries_kernel

20aeacf

remove extraneous minus inside sign-flipped a[n]

60e9ee8

ahbarnett changed the title ~~Flipwind~~ Flipwind: reduce roundoff in phase winding a[n] calc Aug 20, 2024

changelog re a[n]

754d5b6

DiamonDinoia added this to the 2.4 milestone Aug 27, 2024

Merge branch 'master' into flipwind

2dc001a

ahbarnett merged commit 311e419 into master Sep 10, 2024
90 of 91 checks passed

ahbarnett deleted the flipwind branch September 10, 2024 21:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Flipwind: reduce roundoff in phase winding a[n] calc #534

Flipwind: reduce roundoff in phase winding a[n] calc #534

ahbarnett commented Aug 20, 2024

ahbarnett commented Aug 20, 2024 •

edited

Loading

ahbarnett commented Aug 20, 2024

DiamonDinoia commented Aug 21, 2024

ahbarnett commented Aug 21, 2024 via email

DiamonDinoia commented Aug 21, 2024

Flipwind: reduce roundoff in phase winding a[n] calc #534

Flipwind: reduce roundoff in phase winding a[n] calc #534

Conversation

ahbarnett commented Aug 20, 2024

ahbarnett commented Aug 20, 2024 • edited Loading

ahbarnett commented Aug 20, 2024

DiamonDinoia commented Aug 21, 2024

ahbarnett commented Aug 21, 2024 via email

DiamonDinoia commented Aug 21, 2024

ahbarnett commented Aug 20, 2024 •

edited

Loading