Port potential Databox enhancements into Spiner from Singe -- transformations #95

BrendanKKrueger · 2024-08-26T22:30:43Z

PR Summary

Proposing changes to Spiner based on a wrapper that was written in Singe, in order to provide:

~~extrapolations beyond the edge of the data table;~~
transformations (e.g., log-log interpolation)

PR Checklist

Adapt example code from Singe to fit within Spiner
The Singe example hard-codes the transformation to be log-lin with both linear and logarithmic accessors for both axes; make this general and provide linear, logarithmic, NQT-log as example transformations.
Because Singe doesn't own Databox, this is a wrapper around Databox. Determine if these features should continue to be an extended wrapper around Databox or features of Databox itself.
Bring tests over from Singe or write new tests.
Code is formatted. (You can use the format_spiner make target.)
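
For readers unfamiliar with the idea, here is a minimal sketch of what a general transformation could look like. The `forward`/`reverse` names follow the `Transform::forward` and `Transform::reverse` calls mentioned later in this thread; the struct names and the `interpolate` helper are hypothetical and are not taken from Spiner or Singe.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stateless transformations: every member is static, so a
// transformation can be selected purely through a template parameter.
struct LinearTransform {
  static double forward(double x) { return x; }
  static double reverse(double u) { return u; }
};

struct LogTransform {
  // Assumption for this sketch: only strictly positive x is supported.
  static double forward(double x) {
    assert(x > 0.0);
    return std::log(x);
  }
  static double reverse(double u) { return std::exp(u); }
};

// Interpolate y(x) between two known points, linearly in the transformed
// variables. With LogTransform on both axes this is log-log interpolation.
template <typename TransformX, typename TransformY>
double interpolate(double x0, double y0, double x1, double y1, double x) {
  const double u0 = TransformX::forward(x0);
  const double u1 = TransformX::forward(x1);
  const double v0 = TransformY::forward(y0);
  const double v1 = TransformY::forward(y1);
  const double t = (TransformX::forward(x) - u0) / (u1 - u0);
  return TransformY::reverse(v0 + t * (v1 - v0));
}
```

With the power law y = x^2 sampled at x = 1 and x = 100, `interpolate<LogTransform, LogTransform>` recovers y(10) = 100 to rounding error, whereas `interpolate<LinearTransform, LinearTransform>` gives about 910.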

BrendanKKrueger · 2024-08-26T22:46:51Z

@dholladay00 and @Yurlungur, these are some features from Singe that Jonah and I discussed transferring over to Spiner to make them more widely accessible. My main question at the moment is whether you want these features to (a) be added directly to Databox, (b) be added to a wrapper around or extension to Databox, or (c) kept in Singe.

Yurlungur · 2024-08-27T00:16:11Z

Broadly I think I'm supportive of these features moving into spiner... but looking at the header file you added for this MR I wonder how much work it would be because I notice that your version pretty explicitly talks about, e.g., density and temperature, which I don't think we want to do with spiner. It should be more generic.

BrendanKKrueger · 2024-08-27T12:31:18Z

Oh, absolutely it should be (and will be) more generic. The first commit was just copying over the Singe file to show a starting point, but there's plenty in that file that's specific to Singe and that will need to be cleaned up to be more appropriate for Spiner.

BrendanKKrueger · 2024-09-13T23:04:05Z

A few implementation questions that came up as I started adapting RegularGrid1D to have transformations:

Do we ever expect a transformation to have runtime state? The current implementation requires that transformation state (if there is any) be available at compile time, because it simplifies the interface.
- Allowing runtime state would also require that either (a) the transformation have a default constructor, or (b) users would be forced to update all their code to provide a transformation (even when there is a default constructor).
- ~~The current implementation assumes no runtime state, which means in the future there would be bigger changes if a transformation ever needed runtime state.~~
- Edit 2024-10-23: We decided that we don't want to allow transformations to have runtime state.
Do we consider x (the untransformed variable) or u (the transformed variable) to be the "ground truth"? Transformations may not always be perfectly symmetric, which creates the possibility of small gaps appearing. And because those gaps are at critical points (the endpoints of the domain), you could hit them more often than you might think.
- Edit 2024-10-23: See discussions below on handling this.
~~Should umin, umax, and du be accessible by the user?~~
- Edit 2024-10-23: I made (nearly?) everything about u hidden from the user in RegularGrid1D, so that it's treated as an internal detail that the user shouldn't be expected to know about.
The quantity dx is now poorly defined, because now du is fixed and dx is derived from the transformation, and it may vary across the domain. Do we need the dx function? The best we can do is probably to take an index or an x value and return the dx value for that specific interval. It probably makes more sense to remove it entirely, but that changes the interface that's available to the user.
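
To illustrate the dx question, the sketch below shows a grid that is uniform in u, so the spacing in x depends on the interval. `TransformedGrid`, `LogTransform`, and the per-interval `dx(i)` are hypothetical names for this illustration only; this is not the RegularGrid1D implementation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical logarithmic transformation (valid for x > 0 only).
struct LogTransform {
  static double forward(double x) { return std::log(x); }
  static double reverse(double u) { return std::exp(u); }
};

// Hypothetical grid that is uniform in u = Transform::forward(x).
template <typename Transform>
class TransformedGrid {
 public:
  TransformedGrid(double xmin, double xmax, int npoints) : n_(npoints) {
    assert(npoints > 1 && xmin < xmax);
    umin_ = Transform::forward(xmin);
    du_ = (Transform::forward(xmax) - umin_) / (npoints - 1);
  }
  // x coordinate of grid point i, derived from u.
  double x(int i) const { return Transform::reverse(umin_ + i * du_); }
  // Width in x of the interval [x(i), x(i+1)]; it varies with i in general.
  double dx(int i) const {
    assert(i >= 0 && i < n_ - 1);
    return x(i + 1) - x(i);
  }

 private:
  int n_;
  double umin_;
  double du_;
};
```

For a four-point logarithmic grid on [1, 1000], the grid points are roughly 1, 10, 100, 1000, so `dx(0)` is about 9 while `dx(2)` is about 900. A single grid-wide dx has no meaningful value in that case.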

Yurlungur · 2024-09-30T17:38:41Z

> Instead, I've opted for a "safer" option: write new methods get_data_value and set_data_value (I'm not attached to these names), which are classic accessors to the dependent variable values, which allow the DataBox to handle the transformations correctly.

This makes complete sense to me. I'm in favor.

Regarding `operator()`

I feel pretty strongly that `operator()` needs to stay, as DataBox sometimes plays the role of a multi-D array, not an interpolator, and that's what some of the internal metadata is for. That said, I think I'm okay with your compromise solution, so long as we have a public accessor for the underlying pointer, which I think we do.

Yurlungur · 2024-09-30T17:39:02Z

> Related to the operator() / accessor discussion: Should set_data_value (or whatever we name it) be const? I made it const because there's already operator() const that returns mutable reference. This implies that when a DataBox is const, the metadata of the DataBox is const, but the dependent variable data is not constant.

I think that's right
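
The const semantics discussed above can be sketched as follows. `BoxView` is a hypothetical stand-in, not the DataBox class; the sketch also assumes that the stored values are the transformed ones, which this thread does not state explicitly. The method names `get_data_value` and `set_data_value` are the provisional names from the comment above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical logarithmic transformation (valid for y > 0 only).
struct LogTransform {
  static double forward(double y) { return std::log(y); }
  static double reverse(double v) { return std::exp(v); }
};

// Hypothetical non-owning view: the pointer and size (the metadata) are
// const in a const object, but the pointed-to data remains writable.
template <typename Transform>
class BoxView {
 public:
  BoxView(double *data, int n) : data_(data), n_(n) {}
  // Array-style access to the raw stored value, with no transformation.
  double &operator()(int i) const {
    assert(i >= 0 && i < n_);
    return data_[i];
  }
  // Accessors that apply the transformation on the way in and out.
  double get_data_value(int i) const { return Transform::reverse((*this)(i)); }
  void set_data_value(int i, double y) const {
    (*this)(i) = Transform::forward(y);
  }
  // Public accessor for the underlying pointer.
  double *data() const { return data_; }

 private:
  double *data_;
  int n_;
};
```

Here `set_data_value` can be const because it writes through the pointer and never modifies the view's own members, mirroring the existing `operator() const` that returns a mutable reference.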

BrendanKKrueger · 2024-09-30T17:55:32Z

The head of main isn't formatted correctly. I ran the clang-format, but now I'm rolling back changes to lines I didn't mess with. I'll leave it to y'all if you want to run clang-format completely or not.

Yurlungur · 2024-09-30T17:57:05Z

> The head of main isn't formatted correctly. I ran the clang-format, but now I'm rolling back changes to lines I didn't mess with. I'll leave it to y'all if you want to run clang-format completely or not.

Let's just submit a separate MR to main that formats the code in one sweep.

Yurlungur

Thanks for adding this! Some comments below but overall I'm very happy with how this came out. As an aside, should I add the NQT logs and friends now in this PR? Or save them for a later implementation?

spiner/transformations.hpp

spiner/databox.hpp

BrendanKKrueger · 2024-10-02T22:42:21Z

> should I add the NQT logs and friends now in this PR? Or save them for a later implementation?

Whatever makes you happy. I just didn't feel like getting into adding the NQT stuff as a dependency, because dependency management is often a tricky question and usually best left to a core team member. I'm fine if you add commits to this MR or if you add another MR that depends on this one.

Yurlungur · 2024-10-11T21:37:49Z

> > should I add the NQT logs and friends now in this PR? Or save them for a later implementation?
>
> Whatever makes you happy. I just didn't feel like getting into adding the NQT stuff as a dependency, because dependency management is often a tricky question and usually best left to a core team member. I'm fine if you add commits to this MR or if you add another MR that depends on this one.

I'll add them in a later MR... Very busy at the moment so that will keep things moving along.

BrendanKKrueger · 2024-10-22T22:58:55Z

Earlier I stated:

> Do we consider x (the untransformed variable) or u (the transformed variable) to be the "ground truth"? Transformations may not always be perfectly symmetric, which creates the possibility of small gaps appearing. And because those gaps are at critical points (the endpoints of the domain), you could hit them more often than you might think.

Thinking about this more, I think I should clarify my thinking and get your thoughts. For additional clarity, let's define some notation:

- x_in is the value of x that the user inputs into a given function.
- u_in is the value of x_in after transformation: u_in = Transform::forward(x_in).
- x_tr is the value of x_in after a round-trip transformation: x_tr = Transform::reverse(u_in).

Analytically, x_in and x_tr are equal, but with finite-precision numerics they may not be equal.
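
As a rough illustration of the round trip, the sketch below measures the relative difference between x_in and x_tr. `LogTransform` and `roundtrip_relative_error` are hypothetical names for this example; the error observed depends on the platform's math library, so no specific values are claimed here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical logarithmic transformation (valid for x > 0 only).
struct LogTransform {
  static double forward(double x) { return std::log(x); }
  static double reverse(double u) { return std::exp(u); }
};

// Relative difference between x_in and x_tr = reverse(forward(x_in)).
// Analytically this is zero; in floating point it is small but may be nonzero.
template <typename Transform>
double roundtrip_relative_error(double x_in) {
  assert(x_in != 0.0);
  const double u_in = Transform::forward(x_in);
  const double x_tr = Transform::reverse(u_in);
  return std::abs(x_tr - x_in) / std::abs(x_in);
}
```

Any nonzero result at a domain endpoint is exactly the kind of gap described above: the bound the user passed in and the bound recovered from u no longer coincide.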

Option 1: x is the "ground truth"

The bounds on u should be shifted slightly away from u_in so that when the user sets x_in to the lower bound or upper bound that the user passed into the constructor, we are guaranteed that those values will fall within the bounds of u_in regardless of how theoretically accurate u_in is and whether or not x_tr does something weird.
In this case, the Databox should save the x_in values for the lower and upper bounds, and the methods that query the bounds should return these saved values even if x_tr for the lower and upper bounds is different.

Option 2: u is the "ground truth"

The bounds on u are exactly the bounding values of u_in. This means that the bounds on x are the x_tr bounds rather than the x_in bounds, which may slightly shift the range of x. That means that the bounds on x that the user passed to the constructor as x_in values may or may not actually lie within the range of the Databox.
In this case we're already storing the u values, and calling the methods that query the bounds will just call Transform::reverse on the bounding u values.

The current implementation is option 2 (u is the "ground truth"), but thinking about it more, I think it should be option 1 (x is the "ground truth"). If you agree with that, then I'll have to put in some logic to shift the bounds in u to ensure that the bounds in x_in are within the bounds of x_tr. It's the end of the day, so I may try to address that tomorrow.

BrendanKKrueger · 2024-10-23T14:39:00Z

I thought about this more last night, and realized that my comment from the end of the day yesterday was incorrect. I now think the following is correct:

Note: f(x) = forward transform; r(u) = reverse transform.
Since ulo is derived from xlo, then f(xlo) = ulo and f(anything less than xlo) will be outside the u bounds. Similarly on the high end. So if x is the ground truth, we don't need to do any corrections to the transformations.
If u is the ground truth, then we do need to muck about with things.
- f(r(ulo)) is not guaranteed to be equal to ulo. If you call the method to query the lower bound, that will return r(ulo). If you then interpolate to that point, you would be interpolating to the point u = f(r(ulo)), which may or may not be within the u bounds.
- r(f(xlo)) is not guaranteed to be equal to xlo. If you call the constructor with xlo then ulo will be set as f(xlo). If you then call the method to query the lower bound, you get r(f(xlo)) back, which may or may not match xlo.
- There are two critical points (the lower and upper bounds), and two equations to resolve (f(r(u)) = u, r(f(x)) = x), so you would apply a linear offset to the forward and reverse transformations in order to get everything to work. But that could mess up any careful calibration of the transform itself (for example, the stuff I did with the logarithmic transform to ensure that r(f(0)) = 0).

I think that tells us:

We need to treat x as the "ground truth" representation, which makes sense since the user works in x-space and u-space should be treated like an internal detail not inherently visible to the user.
I need to check the code and ensure that the xlo and xhi values from the constructor are saved and used for the lower-bound and upper-bound query methods.
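
A minimal sketch of the "x is the ground truth" approach, assuming a monotonically increasing forward transform. `BoundedGrid`, `contains`, and `LogTransform` are hypothetical names chosen for this illustration; this is not the actual RegularGrid1D code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical logarithmic transformation (valid for x > 0 only).
struct LogTransform {
  static double forward(double x) { return std::log(x); }
  static double reverse(double u) { return std::exp(u); }
};

// Hypothetical grid bounds where x is the "ground truth": the x bounds
// passed to the constructor are saved and returned verbatim.
template <typename Transform>
class BoundedGrid {
 public:
  BoundedGrid(double xlo, double xhi)
      : xlo_(xlo),
        xhi_(xhi),
        ulo_(Transform::forward(xlo)),
        uhi_(Transform::forward(xhi)) {
    assert(xlo < xhi);
  }
  // Bound queries return the saved inputs, never reverse(u).
  double min() const { return xlo_; }
  double max() const { return xhi_; }
  // Bounds checks happen in u-space. Because ulo_ was computed as
  // forward(xlo), forward(min()) reproduces ulo_ and no offset is needed.
  bool contains(double x) const {
    const double u = Transform::forward(x);
    return u >= ulo_ && u <= uhi_;
  }

 private:
  double xlo_, xhi_;
  double ulo_, uhi_;
};
```

With this layout the reverse transform is never used to report bounds, so r(f(xlo)) differing from xlo cannot leak out to the user.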

PS: The GitHub interface is driving me nuts, because I can't seem to make conversation threads, which means that every conversation gets interleaved with every other conversation.

BrendanKKrueger · 2024-10-23T16:57:53Z

@Yurlungur, I updated how the bounds are handled, and added some additional testing. I found that we end up with an issue in RegularGrid1D::x(const int i), because that's derived from u so you have to make some sort of decision. I added a "TODO" comment with a few possible ways to handle this and very brief comments on each. I'd be interested in your thoughts on this.
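
To make the issue concrete: if x(i) is always computed as the reverse transform of a u value, then x(0) may differ slightly from the saved lower bound. The sketch below shows one conceivable way to handle it, returning the saved bounds at the two end indices. This is an assumption for illustration and not necessarily one of the options listed in the TODO comment; `EndpointGrid` and `LogTransform` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical logarithmic transformation (valid for x > 0 only).
struct LogTransform {
  static double forward(double x) { return std::log(x); }
  static double reverse(double u) { return std::exp(u); }
};

// Hypothetical grid, uniform in u, that keeps the user's x bounds.
template <typename Transform>
class EndpointGrid {
 public:
  EndpointGrid(double xlo, double xhi, int n) : xlo_(xlo), xhi_(xhi), n_(n) {
    assert(n > 1 && xlo < xhi);
    ulo_ = Transform::forward(xlo);
    du_ = (Transform::forward(xhi) - ulo_) / (n - 1);
  }
  // Interior points come from u; the end points return the saved x bounds
  // so that x(0) and x(n - 1) match the constructor arguments exactly.
  double x(int i) const {
    assert(i >= 0 && i < n_);
    if (i == 0) return xlo_;
    if (i == n_ - 1) return xhi_;
    return Transform::reverse(ulo_ + i * du_);
  }

 private:
  double xlo_, xhi_;
  int n_;
  double ulo_, du_;
};
```

The trade-off is that the end intervals are then no longer exactly consistent with a uniform spacing in u, which may or may not matter depending on how x(i) is used.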

BrendanKKrueger added the enhancement label Aug 26, 2024

BrendanKKrueger self-assigned this Aug 26, 2024

BrendanKKrueger marked this pull request as draft August 26, 2024 22:30

BrendanKKrueger force-pushed the bkk_wrapper branch from 9afa8d4 to 3229fe1 Compare September 13, 2024 22:54

BrendanKKrueger added 7 commits September 19, 2024 08:22

Copy databox_wrapper.hpp from Singe as a starting point for discussion.

216d115

Notes

ac86b2f

Notes again

7d0865a

Start adding the transformations.

f99c241

Started writing the transformations.

e0e8b0a

Modify RegularGrid1D to allow for transformations.

5ce03cb

Forgot include

be1acd1

BrendanKKrueger force-pushed the bkk_wrapper branch from a87e220 to be1acd1 Compare September 19, 2024 14:22

BrendanKKrueger added 14 commits September 19, 2024 11:08

Forgot (a) some templates and (b) static.

fc0f83f

I removed dx() from RegularGrid1D, so update the tests.

622c8fb

Tests for transformations.

2eccab5

Add Transform template to DataBox (currently doesn't do anything yet).

becaae9

Update TODO items

18cd066

Add dependent variable transformation.

af53549

Update TODO items

4f8fbe2

Update TODO items

6e292a0

Add GPU testing for transformations.

fa2dd5b

Delete a bunch of stuff we no longer need.

c54a6a5

Move existing RegularGrid1D test to a new file, so I can add new testing of the new feature without further adding to the overcrowdedness of test.cpp.

15aa48a

Add some testing with transformations.

222bdae

Add a TODO

0bc4e32

test tags

db0323c

BrendanKKrueger added 2 commits September 30, 2024 11:59

format

c7afa64

missing header

28bcf26

BrendanKKrueger marked this pull request as ready for review September 30, 2024 18:03

BrendanKKrueger changed the title ~~Port potential Databox enhancements into Spiner from Singe~~ Port potential Databox enhancements into Spiner from Singe -- transformations Oct 1, 2024

Yurlungur approved these changes Oct 2, 2024

BrendanKKrueger and others added 8 commits October 2, 2024 16:43

Update spiner/transformations.hpp

51d11c8

Co-authored-by: Jonah Miller

force inline

df8f052

constexpr

2f39150

Documentation

8d76c55

documentation again

09c3aed

Change logarithmic transform's epsilon value.

45fe193

format

8578f89

Merge branch 'main' into bkk_wrapper

9aaa291

Switch from SFINAE to static_assert so that we can give a better error message.

5942d20

BrendanKKrueger added 7 commits October 23, 2024 08:50

x is the 'ground truth' representation, so fix the bounds.

624d244

Cleaning up things that should (probably?) be private.

0486a9d

More cleanup

aba2fd4

Expanded tests for RegularGrid1D.

82fe518

Expand TODO comment.

ab02e9f

format

1955937

format (again... doing it manually because clang-format here and on GitHub don't always agree)

fce0f56
