TAO interface #143
Conversation
I think that even if you are going to go via Numpy, this should be done by giving …
I think the 'one-or-more' pyadjoint interfaces mean that copying is going to be needed somewhere, e.g. there's nothing to stop a mixture of …
Force-pushed from f771894 to a976ee9
OK, I can see what's going on here. I actually think it would be an easy fix to do this without the global gather; it just requires a few very simple methods to be added to OverloadedType, but I guess that can be delayed. Please add a health warning about the global gather in the class docstring, because the comment on the PR will not be visible to users.
It's not even a global gather; we just need to know the number of process-local (owned) degrees of freedom, and …
OK, I'm confused again. I think the current implementation does a global gather onto a single numpy array. Is that not correct?
You're correct. I'd missed (or forgotten) how … This obviously won't scale, but I also don't see an alternative via the current …
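To make the scaling concern concrete, here is a toy sketch (plain NumPy, with MPI ranks simulated by a Python list, so no mpi4py is involved) of what gathering onto a single array means: every process ends up holding an array sized by the *global* number of dofs.

```python
import numpy as np

def gathered_global_array(local_arrays):
    """Simulate the current behaviour: a single NumPy array containing
    *all* degrees of freedom (per-rank arrays simulated with a list)."""
    return np.concatenate(local_arrays)

# Four simulated ranks, each owning 3 local dofs.
local_arrays = [np.arange(r * 3, (r + 1) * 3, dtype=float) for r in range(4)]
global_array = gathered_global_array(local_arrays)
assert global_array.shape == (12,)  # memory grows with the global size
```

With P ranks and N global dofs this costs O(N) memory per rank rather than O(N/P), which is the non-scalable part being discussed.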
Force-pushed from e851645 to 8996d71
Firedrake test requires firedrakeproject/firedrake#3657
pyadjoint/overloaded_type.py (outdated):
```
The method should implement a routine for assigning `other` to `self`.

Note:
    This method should not be employed for immutable types.
```
I think this line here highlights an issue with this approach. OverloadedType
actually conflates two sets of functionality:
- Things needed in order to tape a variable.
- Things needed in order to work with variables when computing taped operations (replay, tlm, adjoint etc.) and optimisation.
This functionality is part of the second set (it's not needed for taping). At that stage of operations, all variables are treated as immutable, so assignment looks like a strange operation at best. The fact that it doesn't work for some types is a further red flag.
Assignment itself is also not really what is meant here. The use case in question seems to be assigning a scalar to a vector, which is not well-defined (think about non-point-evaluation FunctionSpaces, for example). I think what is really meant here is interpolate.
On the mutability point, I think this should be more like _ad_init_object and return a new value. I.e. the signature would be:

```python
@classmethod
def _ad_interpolate(cls, obj, value):
```

and the required semantics would be to return a new object which is identical to obj except that its value approximates value.
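A minimal sketch of the proposed semantics, using a hypothetical `Scalar` backend type and a hypothetical `_approximate` hook purely for illustration (neither exists in pyadjoint):

```python
import copy

class OverloadedType:
    @classmethod
    def _ad_interpolate(cls, obj, value):
        """Proposed semantics (sketch): return a NEW object identical to
        `obj` except that its value approximates `value`; `obj` itself
        is never mutated."""
        new = copy.deepcopy(obj)
        new._approximate(value)  # hypothetical backend-specific hook
        return new

# Hypothetical scalar-valued backend type, for illustration only.
class Scalar(OverloadedType):
    def __init__(self, value=0.0):
        self.value = value

    def _approximate(self, value):
        self.value = float(value)

a = Scalar(1.0)
b = Scalar._ad_interpolate(a, 7.0)
assert a.value == 1.0 and b.value == 7.0  # `a` is untouched
```

The point of the classmethod-returning-new-object form is that it stays well-defined for immutable types, unlike in-place assignment.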
`_ad_iadd` and `_ad_imul` are existing in-place operations. However, I agree that an in-place operation is not needed here. I think the easiest fix is just to change `VecInterface.to_petsc` to handle the scalar case using `VecSetValues`.
I also don't think scalar bounds are clearly defined. These could mean:
- Bounds on the dof values. I think this is the intention, and is what `VecSetValues` would do.
- Bounds on function values. Too difficult, but see "Bound constraint possibly incorrect for higher order discretization" #145.
- Bounds on interpolated function values, but this is backend specific. Interpolation may not be meaningfully defined.
This would bring us back to the option of requiring the user to specify bounds of the control type. I think this is probably the right thing - Pyadjoint can't interpret the meaning of variables beyond basic Hilbert space operations (i.e. vector space operations + inner product).
I guess the corner case is the no-bounds case. We could either make that a user problem, or we'd have to add `_ad_` methods for producing variables that are element-wise minimal or maximal. That would be convenient for users, but I'm not sure it smells good.
> requiring the user to specify bounds of the control type

I'd definitely prefer that. Existing pyadjoint examples use scalars, though.

> I guess the corner case is the no bounds case.

That at least can be handled with `VecSetValues` and `np.finfo(PETSc.ScalarType).min`/`np.finfo(PETSc.ScalarType).max`.
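A sketch of the "no bounds supplied" fill, assuming a real-arithmetic PETSc build where `PETSc.ScalarType` is `np.float64` (that assumption lets the sketch use only NumPy, with plain arrays standing in for PETSc Vecs):

```python
import numpy as np

# Assumption: PETSc.ScalarType == np.float64 (real-arithmetic build).
ScalarType = np.float64

def default_bounds(n):
    """Fill lower/upper bound arrays for the unbounded case, mirroring
    what VecSet with np.finfo(...).min/.max would do on a PETSc Vec."""
    lb = np.full(n, np.finfo(ScalarType).min)
    ub = np.full(n, np.finfo(ScalarType).max)
    return lb, ub

lb, ub = default_bounds(5)
assert np.all(lb < 0) and np.all(ub > 0)
```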
I've implemented the 'scalar bound = bound on all dofs' case using `VecSet`. This might still lead to some surprises depending on the basis, but I think it is consistent with pyadjoint behaviour elsewhere.
Current pyadjoint behaviour is described here:

pyadjoint/pyadjoint/optimization/optimization_problem.py, lines 24 to 28 in 92121af:
```python
#: bounds: lower and upper bounds for the control (optional). None means
#: unbounded. if not None, then it must be a list of the same length as
#: the number controls for the reduced_functional. Each entry in the list
#: must be a tuple (lb, ub), where ub and lb are floats, or objects
#: of the same kind as the control.
```
The actual check allows `None` and `int` as well.

pyadjoint/pyadjoint/optimization/optimization_problem.py, lines 51 to 55 in 92121af:
```python
for b in bound:
    b = create_overloaded_object(b, suppress_warning=True)
    klass = control.tape_value().__class__
    if not (isinstance(b, (int, float, type(None), klass))):
        raise TypeError("This pair (lb, ub) should be None, a float, or a %s." % klass)
```
As discussed in the meeting, this should be fixed separately from this PR.
pyadjoint/optimization/tao_solver.py (outdated):
```python
def finalize_callback(*args):
    for arg in args:
        if arg is not None:
            arg.destroy()
```
@JDBetteridge says this needs to be `.delayed_destroy`.
Not available, but the possible deadlock is now documented.
@JDBetteridge is unconvinced. Let's discuss this next week.
This is intended to work around the following, which leaks memory on my machine (Ubuntu 24.04, run on two processes). Note the `vec_wo` context manager must bind its Vec (`as x_v`) for `mult` to work:

```python
from firedrake import *
from firedrake.petsc import PETSc, garbage_cleanup
import numpy as np
from itertools import count
import gc
import weakref


class Mat:
    def __init__(self, space):
        self._x = Function(space)
        self._test = TestFunction(space)
        # Ballast to make the leak visible quickly.
        self._memory = np.full(1000000, 1, dtype=np.uint8)

    def mult(self, x, y):
        with self._x.dat.vec_wo as x_v:
            x.copy(result=x_v)
        y_c = assemble(inner(self._x, self._test) * dx)
        with y_c.dat.vec_ro as y_v:
            y_v.copy(result=y)


def attach_destroy_finalizer(obj, *args):
    def finalize_callback(*args):
        for arg in args:
            if arg is not None:
                arg.destroy()
    finalize = weakref.finalize(obj, finalize_callback, *args)
    finalize.atexit = False


mesh = UnitSquareMesh(10, 10)
space = FunctionSpace(mesh, "Lagrange", 1)
u = Function(space, name="u")
with u.dat.vec_ro as u_v:
    n, N = u_v.getSizes()


class Test:
    pass


for _ in count():
    t = Test()
    A = Mat(space)
    mat = PETSc.Mat().createPython(((n, N), (n, N)), A, comm=u.comm)
    # attach_destroy_finalizer(t, mat)  # Uncomment to fix leak
    del t
    gc.collect()
    garbage_cleanup(u.comm)
```
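The finalizer pattern at the heart of the workaround can be exercised in pure Python, with a stand-in object that records `destroy()` calls in place of a PETSc handle (all names here are illustrative; the behaviour relies on CPython's prompt refcount-based collection):

```python
import weakref

destroyed = []

class Handle:
    """Stands in for a PETSc object with an explicit destroy()."""
    def destroy(self):
        destroyed.append(self)

def attach_destroy_finalizer(obj, *args):
    # Same shape as the PR's helper: when `obj` is collected,
    # destroy() every non-None argument.
    def finalize_callback(*args):
        for arg in args:
            if arg is not None:
                arg.destroy()
    finalize = weakref.finalize(obj, finalize_callback, *args)
    finalize.atexit = False

class Owner:
    pass

owner = Owner()
handle = Handle()
attach_destroy_finalizer(owner, handle)
assert destroyed == []  # finalizer has not fired yet
del owner               # dropping the owner fires the finalizer
assert destroyed == [handle]
```

Note that `weakref.finalize` keeps `handle` alive until the callback runs, which is exactly the lifetime coupling the PR wants: the PETSc object lives as long as its Python owner.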
Arguably it would be better to track down the source of the leak -- but the case where I've hit this was much harder to debug than the above, hence the defensive use of `destroy`. The main downside is if properties are accessed and retained, e.g.

```python
tao_solver = TAOSolver(...)
tao = tao_solver.tao
del tao_solver
# TaoDestroy has been called for tao
```
The previous example leaks because `garbage_cleanup(u.comm)` doesn't call `PETSc.garbage_cleanup(u.comm)` (is this a separate bug?).
pyadjoint/optimization/tao_solver.py (outdated):
```python
n = sum(vec.getLocalSize() for vec in vecs)
N = sum(vec.getSize() for vec in vecs)
_, isets = PETSc.Vec().concatenate(vecs)
for vec in vecs:
```
If the destructor is really not necessary, let's not do it. If it is necessary, then `_ad_to_petsc()` should probably be a context manager, so as to remove the footgun.
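A sketch of what a context-manager form might look like, with a stand-in `FakeVec` in place of a real petsc4py `Vec` (all names here are hypothetical, not the actual pyadjoint API):

```python
from contextlib import contextmanager

class FakeVec:
    """Stand-in for a petsc4py Vec; only destroy() matters here."""
    def __init__(self):
        self.destroyed = False

    def destroy(self):
        self.destroyed = True

@contextmanager
def ad_to_petsc(variable):
    """Sketch of a context-manager form of _ad_to_petsc(): the Vec is
    created on entry and reliably destroyed on exit, so callers can
    neither forget nor double the destroy."""
    vec = FakeVec()  # in the real interface this would copy `variable`
    try:
        yield vec
    finally:
        vec.destroy()

with ad_to_petsc(object()) as v:
    assert not v.destroyed  # usable inside the block
assert v.destroyed          # cleaned up on exit, even on exceptions
```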
Removed.
pyadjoint/optimization/tao_solver.py (outdated):
```python
x_sub.restoreSubVector(iset, x_sub)


class PETScOptions:
```
Pirate the PETSc Options handler from Firedrake.
Done.
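For illustration, here is a minimal pure-Python sketch in the spirit of a prefixed options handler (the global PETSc options database is simulated with a dict; this is not Firedrake's actual implementation, and the class and method names are made up):

```python
class PrefixedOptions:
    """Sketch of a prefixed options handler: each solver instance gets a
    unique prefix, so two instances cannot clobber each other's entries
    in a shared (here, simulated) global options database."""
    _database = {}  # stands in for the global PETSc options database

    def __init__(self, prefix, parameters):
        self.prefix = prefix
        self.parameters = dict(parameters)

    def insert(self):
        # Write every option into the shared database under our prefix.
        for key, value in self.parameters.items():
            self._database[self.prefix + key] = value

opts = PrefixedOptions("tao_0_", {"tao_type": "lmvm", "tao_gatol": 1e-8})
opts.insert()
assert PrefixedOptions._database["tao_0_tao_type"] == "lmvm"
```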
pyadjoint/optimization/tao_solver.py
Outdated
to_petsc(ub_vec, ubs) | ||
tao.setVariableBounds(lb_vec, ub_vec) | ||
lb_vec.destroy() | ||
ub_vec.destroy() |
Please don't destroy unless needed.
Done.
pyadjoint/optimization/tao_solver.py
Outdated
self, tao, H_matrix, M_inv_matrix, B_0_matrix_pc, B_0_matrix, x) | ||
|
||
@property | ||
def taoobjective(self): |
PEP 8: the name should have an underscore.
Fixed, although I followed `ROLSolver.rolobjective`.
pyadjoint/optimization/tao_solver.py (outdated):
```python
self.tao.solve()
self._vec_interface.from_petsc(self.x, M)
if self.tao.getConvergedReason() <= 0:
    raise RuntimeError("Convergence failure")
```
Is this the conventional exception for solver fails?
Updated to use a `TAOConvergenceError` with a message similar to Firedrake linear solve failures. pyadjoint doesn't seem to raise exceptions for optimizer failures in the SciPy or ROL interfaces. I'm pretty sure that's unsafe for the SciPy interface; see #166.
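A sketch of the convergence check, using an illustrative `check_converged` helper and exception class (the message format is made up; in petsc4py, `tao.getConvergedReason()` returns a positive value on convergence and a non-positive value otherwise):

```python
class TAOConvergenceError(Exception):
    """Raised when the TAO solver fails to converge (sketch)."""

def check_converged(reason):
    # Positive reason codes mean converged; zero means still iterating
    # and negative means diverged, so both are treated as failure here.
    if reason <= 0:
        raise TAOConvergenceError(
            f"TAO solver failed to converge (reason: {reason})")

check_converged(3)       # converged: no exception raised
try:
    check_converged(-4)  # diverged: raises
except TAOConvergenceError as e:
    msg = str(e)
assert "reason: -4" in msg
```

Raising a dedicated exception (rather than silently returning, as the SciPy and ROL interfaces apparently do) makes a failed optimisation impossible to mistake for a converged one.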
Co-authored-by: Daiane Iglesia Dolci <[email protected]>
…ove Firedrake specifics (03d8384)
Force-pushed from 03d8384 to cd51a38
Optimisation using PETSc/TAO.

Uses the pyadjoint `OverloadedType` interface to provide a generic implementation. However, these interfaces don't provide enough information about data decomposition with MPI parallelism. This is worked around in this PR using `OverloadedType._ad_assign_numpy`, but this requires a temporary, but global, NumPy array. Fixing this requires a change to `OverloadedType` and then other updates elsewhere (in particular in Firedrake).

Related: `OverloadedType._ad_to_list` returns a `list` rather than a NumPy `ndarray`.

Not added:
- `MaximisationProblem` support.
- (`TAOObjective.objective_gradient` followed by `TAOObjective.hessian`).