`ReducedFunctional.optimize_tape` will optionally _replace_ self.tape with the optimized tape, without modifying original tape. #171

JHopeCollins · 2024-10-11T10:54:17Z

This PR addresses #170

Currently ReducedFunctional.optimize_tape will modify self.tape in-place. This means that if anything else is using the same tape, then they now also have a modified tape.

This PR adds an optional replace argument to optimize_tape to give the option of copying the tape and replacing self.tape with the optimized tape, without modifying the original tape. The optimized tape is also returned (whether a copy was made or not).
I did it this way so that the default call has the same behaviour as before, so it shouldn't break any existing code - unless someone is relying on it returning None, but seeing as this is implicit and non-documented behaviour I don't think we care.

@dham also suggested that the default behaviour should change so that it doesn't modify self.tape, but just returns the optimized tape. Is this something we want to do? If so, should it raise a warning for now that the default behaviour will change in the future, and delay making the change for a bit?
I think if this behaviour does change, it would be good to have an optimize_tape(modify=False) argument to optionally still update self.tape, otherwise we'll probably see Jhat.tape = Jhat.optimize_tape() popping up all over.

…h optimized tape, but not modify original tape

jrmaddison · 2024-10-11T11:22:49Z

I think if this behaviour does change, it would be good to have an optimize_tape(modify=False) argument to optionally still update self.tape, otherwise we'll probably see Jhat.tape = Jhat.optimize_tape() popping up all over.

Is optimizing the tape actually expensive (it involves just a few passes over the tape)? I'd expect it's cheap enough (compared e.g. to a gradient calculation) that is isn't necessary to cache the optimized tape.

JHopeCollins · 2024-10-11T11:31:49Z

I think if this behaviour does change, it would be good to have an optimize_tape(modify=False) argument to optionally still update self.tape, otherwise we'll probably see Jhat.tape = Jhat.optimize_tape() popping up all over.

Is optimizing the tape actually expensive (it involves just a few passes over the tape)? I'd expect it's cheap enough (compared e.g. to a gradient calculation) that is isn't necessary to cache the optimized tape.

The optional argument to update self.tape is so that you can still have (close to) the same behaviour as you have currently, i.e. optimize_tape actually does modify self.tape.

Otherwise anywhere that Jhat.optimize_tape() is currently called will inevitably get replaced with Jhat.tape = Jhat.optimize_tape(). This feels like a bit of a code smell because what if at some point the implementation changes and there's other things that ReducedFunctional needs to do if self.tape changes?

I guess the answer in that case would be to make ReducedFunctional.tape a property, but the Jhat.tape = Jhat.optimize_tape() still feels clunky compared to just having Jhat.optimize_tape(modify=True) (or whatever you want to call the argument).

jrmaddison · 2024-10-11T14:29:47Z

What I mean is that, since it's (probably) cheap, the activity analysis can be performed each time a derivative is computed. i.e. I think it makes more sense as an argument, ReducedFunctional.derivative(..., optimize=True), than as a method modifying the ReducedFunctional. The optimized version of the tape could easily be cached if that's still wanted, but not as ReducedFunctional.tape.

Modifying ReducingFunctional.tape probably also breaks Hessian actions.

dham · 2024-10-14T08:45:03Z

Modifying ReducingFunctional.tape probably also breaks Hessian actions.

Can you explain this? I don't understand why it would.

jrmaddison · 2024-10-14T10:17:47Z

Can you explain this? I don't understand why it would.

You're correct, I've worked through the second order and I think it's OK.

…r, and just transfers any package data that doesn't have a copy method

JHopeCollins · 2024-10-25T12:43:10Z

I've tested out having ReducedFunctional.__init__ take a copy of the tape and optimize it, so that ReducedFunctional only ever has a view over the most minimal part of the tape required.

The main issue I ran into is that Tape.copy creates an incomplete copy that only works in the most simple situations, mainly down to not copying checkpoints. The current copy doesn't transfer over the checkpoint manager, so there were errors anywhere a manager was used. I made two changes in Tape.copy to "fix" this. Better fixes are almost certainly needed.

The checkpoint manager is attached to the new tape. This fixed the errors for in-memory checkpointing.
The private _checkpoint_manager member is set explicitly because Tape.enable_checkpointing will throw an error if the tape already has blocks on. The checkpoint manager now has a reference to the wrong tape (but at the time of the copy that wrong tape still has all the right blocks for the new Tape in it). I imagine this is a Very Bad Thing to do, open to suggestions for the Right Thing.
I could have a go at implementing a copy for CheckpointManager, I imagine it would need to take a tape argument and try to copy only the bits of the schedule that correspond to that tape (and error if it doesn't have something it needs).
Any items in the package_data that don't have a copy method just get transferred to the package_data for the new Tape. This fixes the errors for disk checkpointing.
This was to get around the fact that firedrake.DiskCheckpointer doesn't implement copy. Again, open to suggestions for the Right Thing.

JHopeCollins · 2024-11-22T10:39:32Z

Closing this PR as it isn't clear how to deal with the issues in the comment above, particularly what it means to copy and optimize a tape with a checkpoint manager.
The aim of this PR (multiple ReducedFunctionals with very localised and independent tapes) can be achieved with liberal use of set_working_tape and stop_annotating.

ReducedFunctional.optimize_tape will optionally replace self.tape wit…

86197a8

…h optimized tape, but not modify original tape

JHopeCollins added enhancement New feature or request question Further information is requested labels Oct 11, 2024

JHopeCollins requested a review from dham October 11, 2024 10:54

JHopeCollins self-assigned this Oct 11, 2024

JHopeCollins linked an issue Oct 11, 2024 that may be closed by this pull request

ReducedFunctional.optimize_tape should replace self.tape with the optimized tape without modifying the original self.tape. #170

Closed

ReducedFunctional.optimize_tape returns optimized tape

5846632

JHopeCollins added 3 commits October 24, 2024 10:28

ReducedFunctional uses an optimized copy of the tape

71ed944

should copying a control give the current or original value?

3ba7929

Tape.copy now also copies over the checkpoint manager and progress ba…

1f728f5

…r, and just transfers any package data that doesn't have a copy method

tape.copy copies over latest checkpoint

87d1ea4

JHopeCollins closed this Nov 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`ReducedFunctional.optimize_tape` will optionally _replace_ self.tape with the optimized tape, without modifying original tape. #171

`ReducedFunctional.optimize_tape` will optionally _replace_ self.tape with the optimized tape, without modifying original tape. #171

JHopeCollins commented Oct 11, 2024

jrmaddison commented Oct 11, 2024

JHopeCollins commented Oct 11, 2024

jrmaddison commented Oct 11, 2024

dham commented Oct 14, 2024

jrmaddison commented Oct 14, 2024

JHopeCollins commented Oct 25, 2024

JHopeCollins commented Nov 22, 2024

ReducedFunctional.optimize_tape will optionally _replace_ self.tape with the optimized tape, without modifying original tape. #171

ReducedFunctional.optimize_tape will optionally _replace_ self.tape with the optimized tape, without modifying original tape. #171

Conversation

JHopeCollins commented Oct 11, 2024

jrmaddison commented Oct 11, 2024

JHopeCollins commented Oct 11, 2024

jrmaddison commented Oct 11, 2024

dham commented Oct 14, 2024

jrmaddison commented Oct 14, 2024

JHopeCollins commented Oct 25, 2024

JHopeCollins commented Nov 22, 2024

`ReducedFunctional.optimize_tape` will optionally _replace_ self.tape with the optimized tape, without modifying original tape. #171

`ReducedFunctional.optimize_tape` will optionally _replace_ self.tape with the optimized tape, without modifying original tape. #171