Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Summary of breakages when updating problems with conservation laws #952

Open
TorkelE opened this issue Jun 13, 2024 · 27 comments
Open

Summary of breakages when updating problems with conservation laws #952

TorkelE opened this issue Jun 13, 2024 · 27 comments
Labels

Comments

@TorkelE
Copy link
Member

TorkelE commented Jun 13, 2024

So at this stage I am starting to be uncertain exactly what is intended to work and what is not intended to work. At least here is a summary of the tests I have broken, and which works and which do not work. There are definitely some oddities going on here.

# Create model and fetch the conservation parameter (Γ).
t = default_t()
@parameters k1 k2
@species X1(t) X2(t)
rxs = [
    Reaction(k1, [X1], [X2]),
    Reaction(k2, [X2], [X1])
]
@named rs = ReactionSystem(rxs, t)
osys = convert(ODESystem, complete(rs); remove_conserved = true)
osys = complete(osys)
@unpack Γ = osys

# Creates an `ODEProblem`.
u0 = [X1 => 1.0, X2 => 2.0]
ps = [k1 => 0.1, k2 => 0.2]
oprob = ODEProblem(osys, u0, (0.0, 1.0), ps)

# Check `ODEProblem` content.
oprob[X1] == 1.0
oprob[X2] == 2.0
oprob.ps[k1] == 0.1
oprob.ps[k2] == 0.2
oprob.ps[Γ[1]] == 3.0

# Attempts to update problem  by giving new values for `X2` (broken)
@test_broken oprob[X2] = 20.0
@test_broken oprob_new = remake(oprob; u0 = [X1 => 10.0, X2 => 20.0])

# Updates problem using `remake` and check that the new values are correct.
oprob_new = remake(oprob; u0 = [X1 => 10.0])
@test oprob_new[X1] == 10.0
@test_broken oprob_new[X2] == 2.0 # Currently -7.
@test_broken oprob_new.ps[Γ[1]] == 12.0 # Currently 3.0.
integrator = init(oprob_new, Tsit5())
@test integrator[X1] == 10.0
@test_broken integrator[X2] == 2.0 # Currently -7.
@test_broken integrator.ps[Γ[1]] == 12.0 # Currently 3.0

# Updates problem using normal indexing (uncertain exactly what intended behaviour here should be).
oprob[X1] = 10.0
@test oprob[X1] == 10.0
@test_broken oprob[X2] == 2.0 # Currently -7.
@test_broken oprob.ps[Γ[1]] == 12.0 # Currently 3.0.
integrator = init(oprob, Tsit5())
@test integrator[X1] == 10.0
@test_broken integrator[X2] == 2.0 # Currently -7.
@test_broken integrator.ps[Γ[1]] == 12.0 # Currently 3.0.

I also have some general MTK tests, where remake actually works now (except that I thought that it wasn't meant to any longer).

# Checks that initial conditions/paraemters dpending on other initial conditions/parameters
# as defaults are updated correctly.
# Checks using normal indexing and `remake`.
# Checs effect on updated problem and integrators initiatedfrom it.
# An issue discussing what should, and should not, work is here: https://github.com/SciML/ModelingToolkit.jl/issues/2733
let
    # Creates the `ReactionSystem` (with defaults and an observable).
    @parameters k1
    @species X1(t)
    @parameters k2 = X1
    @species X2(t) = k1
    @variables O(t)
    rxs = [
        Reaction(k1, [X1], [X2]),
        Reaction(k2, [X2], [X1])
    ]
    observed = [O ~ X1 + X2 + k1 + k2]
    @named rs = ReactionSystem(rxs, t; observed)
    rs = complete(rs)

    # Creates the various problem types.
    u0 = [X1 => 1]
    ps = [k1 => 10]
    oprob = ODEProblem(rs, u0, (0.0, 1.0), ps)
    sprob = SDEProblem(rs, u0, (0.0, 1.0), ps)
    dprob = DiscreteProblem(rs, u0, (0.0, 1.0), ps)
    jprob = JumpProblem(rs, dprob, Direct())
    nprob = NonlinearProblem(rs, u0, ps)
    @test_broken false # ssprob = SteadyStateProblem(rs, u0, ps) # Cannot generate integrators from `SteadyStateProblem`s (https://github.com/SciML/SteadyStateDiffEq.jl/issues/79).
    probs = [oprob, sprob, jprob, nprob, ssprob]
    solvers = [Tsit5(), ImplicitEM(), SSAStepper(), NewtonRaphson(), DynamicSS(Tsit5())]

    # Checks that values depending on defaults are updated correctly. Checks values both in original
    # problem and in the initialised integrator. Only uses single symbolic when indexing (other
    # alternatives should have been checked elsewhere).
    for (prob, solver) in zip(deepcopy(probs), solvers)
        # Checks when updating using `remake` indexing. I *think* this *should* update values (in the
        # updated problem) that depend on default (the integrator should have the correct values).
        prob_new = remake(prob; u0 = [X1 => 2], p = [k1 => 20])
        @test prob_new[X1] == 2
        @test prob_new[X2] == 20
        @test prob_new.ps[k1] == 20
        @test prob_new.ps[k2]  == 2
        @test prob_new[O] == 44
        integrator = init(prob_new, solver)
        @test integrator[X1] == 2
        @test integrator[X2] == 20
        @test integrator.ps[k1] == 20
        @test integrator.ps[k2] == 2
        @test integrator[O] == 44

        # Checks when updating using normal indexing. I *think* this *should not* update values (in
        # the updated problem) that depend on default (the integrator should have the correct values).
        prob[X1] = 3
        prob.ps[k1] = 30
        @test prob[X1] == 3
        @test prob[X2] == 10
        @test prob.ps[k1] == 30
        @test prob.ps[k2] == 1
        @test prob[O] == 44
        integrator = init(prob, solver)
        @test integrator[X1] == 3
        @test_broken integrator[X2] == 30
        @test integrator.ps[k1] == 30
        @test_broken integrator.ps[k2] == 3
        @test_broken integrator[O] == 66
    end
end
@TorkelE TorkelE added the bug label Jun 13, 2024
@ChrisRackauckas
Copy link
Member

X2 is algebraic and simplified out, right? So it doesn't have an initial condition since it's not a state. You can give it a parameter for the initialization, though in this kind of case it's generally much better to not over-specify initial conditions anyways, i.e. you want the user to not supply an initial condition for X2.

@isaacsas
Copy link
Member

There is no way we can auto-determine the value of the new parameter associated with a conservation law without having initial values for all pre-elimination species.

@ChrisRackauckas
Copy link
Member

If x + y = 1 and you tell me that x = 1, then I know that y = 0. If you specify that x = 1 and y = 1 I will tell you that you cannot do that because that makes x + y = 1 impossible. That's what's going on here.

@isaacsas
Copy link
Member

But in our case Γ := x(t) + y(t) with Γ a new unknown parameter. We want to use x(0) and y(0) to set the value of Γ.

@ChrisRackauckas
Copy link
Member

Γ would need to be made a dependent parameter then, built from the initial conditions of the other two variables (which would need to be made into parameters) and then that would need to use the initialization system.

@isaacsas
Copy link
Member

But doesn't that mean users will have to be told that if they use Catalyst they have to explicitly specify parameters to represent all initial conditions? We have no way of knowing a priori which unknowns will appear in conservation laws, and we wouldn't want a situation where we sometimes inject new parameters in the background and then expect users to know about them and switch to setting initial conditions using them...

@TorkelE
Copy link
Member Author

TorkelE commented Jun 13, 2024

I might be misunderstanding, but weren't dependent parameters which were just expressions of other parameters? I.e. basically a way to write a commonly occurring parameter expression in a shorter form as a single symbol? I think this case is something else, i.e. a parameter which value is being initiated according to some other values.

The funny thing is that everything works fine, it is just the case when the problem is updated that there is a problem, but the initial creation is all good.

@isaacsas
Copy link
Member

@TorkelE what Chris was suggesting is we need to make the initial conditions parameters too for each species appearing within a conservation law, and have users update those parameters to change initial conditions.

@isaacsas
Copy link
Member

i.e.

@parameters A0
@species A(t)=A0

and then users update A0 to change the initial condition.

@ChrisRackauckas
Copy link
Member

No user's don't need to do that, Catalyst would need to. Or @AayushSabharwal do we support initial values in dependent parameters?

@AayushSabharwal
Copy link
Member

AayushSabharwal commented Jun 14, 2024

Yeah we do.

@variables x(t)
@parameters p = x

will initialize p to x if a value is not explicitly provided.

The reverse works as well.

@isaacsas
Copy link
Member

But wouldn't Catalyst adding parameters to represent initial conditions mean that any user using downstream functionality (like remake) would have to know and always update initial conditions via changing these parameters? Moreover, everything would then break silently should a user directly update an initial condition instead of updating it via its parameter.

Doesn't it also mean that there are possibly large numbers of extra parameters hanging around for users working on higher level analyses (sensitivity analysis, etc)? This seems undesirable.

@ChrisRackauckas
Copy link
Member

Since using initial conditions as parameters like Aayush shows works, making a dependent parameter Γ ~ x + y should be all that's needed?

@isaacsas
Copy link
Member

My understanding from @TorkelE is that when subsequently remaking a problem it won’t be correctly recalculated as described above.

@ChrisRackauckas
Copy link
Member

Okay so remake isn't tying into the initialization system properly. The docs for what should be happening is:

https://docs.sciml.ai/ModelingToolkit/stable/tutorials/initialization/

Let's assume we have an ODE with states x and y, and an algebraic equation x + y ~ gamma. Structural simplification will turn it into just an ODE D(x) ~ ... and y will be eliminated as y ~ gamma - x so it will become an observed variable, not in the ODEProblem and not solved for, just computed on demand.

However, when you have a DAE you need to ensure consistent initialization, and so there's a nonlinear system that is built for that initialization. If you say x => 2 and x + y ~ gamma, then you get a nonlinear system:

x - 2 ~ 0
x + y - gamma ~ 0

that is then ran to give the initial conditions for y as needed. In many cases the structural simplification there will automatically solve it for y, in other cases it needs to do a nonlinear solve. Changing initial conditions on x and remaking should be all fine. y can then not exist in the solve and things are all good.

If you say y => 2 and x + y ~ gamma, then you get a nonlinear system:

y - 2 ~ 0
x + y - gamma ~ 0

I.e. the user can specify the initial condition for y, but we know we actually aren't solving an ODE with y in it, just x, and from the condition we can know how to compute the initial condition for x. This is done automatically under the hood so that the user does not have to ever know what variables are simplified and which are not.

When you then add x => 2, y => 2 as an initial condition, it then needs to build the nonlinear system:

x - 2 ~ 0
y - 2 ~ 0
x + y - gamma ~ 0

which is then a NonlinearLeastSquaresProblem that is solved before the DAE integration starts, the residual is checked to make sure the initialization is consistent. If gamma in this case is 4, then it's effectively just checking all constraints are satisfied, while if gamma is 5 it will give an InitializationFailure letting you know that the system is not consistent.

So there's two things going on here. For one, gamma ~ x + y should be declared as a dependent parameter, which is effectively just an alias and is guaranteed to be kept true. That needs to be done on the Catalyst side. But for two, oprob_new = remake(oprob; u0 = [X1 => 10.0, X2 => 20.0]) needs to do the same thing as ODEProblem([X1 => 10.0, X2 => 20.0], ...), which is that it needs to construct a new initializationsystem / initializationprob if the u0map is different. The key is that changing the initial conditions that are given changes that initialization problem. If you give the initial value for x then it checks y is consistent, if you give an initial value for y it needs to compute the initial value for x from that, and if you give the initial value for x and y it needs to calculate a consistent solution. So as the u0map changes what values are given, you're changing the nonlinear system that has to be constructed. @AayushSabharwal I don't think remake is properly doing that right now.

What should then happen is that if gamma is a dependent parameter and remake is doing this properly, then you always get a consistent system and it has the right equations to prove that, remake should work then and all is good.

Likewise though, I don't expect oprob[y] = 20.0 to ever work, since if x is the variable in the problem, there is no y to be "set". y is implicitly defined through x, and in theory you could have x + y + z ~ gamma and so oprob[y] = 20.0, what does it mean? Do you change x and z? It only has a meaning in terms of the complete system, i.e. you also have to tell me how z changes and then I have complete knowledge to compute x. So while the prob setters are niceties to have, fundamentally with DAEs we cannot allow them to be used on all variables, only state variables, and those states are the minimal derived set given by structural simplification. In theory this could be made lazy, so oprob[y] = 20.0 could add a new equation, but that would need to generate a new nonlinear system, which has type issues and you wouldn't want to do it incrementally so ... it would be an absolute mess.

Likewise oprob_new.ps[Γ[1]] == 12.0 if gamma is a derived variable, it's not always clear how to invert it so setting it isn't necessarily well-defined. Only non-dependent parameters can be set directly, and dependent parameters are explicit functions of those. But for consistency in a Catalyst sense, I don't think you'd want people to be setting gamma independently of other variables anyways, so I think this restriction makes a lot of sense and is another reason why you'd want it to not be a normal parameter but instead a dependent parameter.

All of the same logic of course applies to the integrator type itself. Note that after callbacks, the DAE initialization has to be re-ran to check that there's always consistency at each step or error since having a non-consistent state is not recoverable by the solver. This is one of the reasons why MTK is going the route of more symbolic callbacks too, since then it can generate custom initializers for specific callbacks given the known equation changes. This then ends up being a really deep topic... which I think Catalyst can mostly avoid given the constrained use of DAEs it tends to have. In theory, all of Catalyst DAEs are linear constraints and so tearing always analytically solves them and so the deeper DAE stuff doesn't apply.

Hopefully that's a better explanation of all of the behaviors there, and points to one missing feature in remake.

@isaacsas
Copy link
Member

Thanks, that is very helpful and would be a perfect solution for us. I’m fine with just telling users to use remake and not pushing the convenience direct updating anyways. Making the constraint equation that defines the conservation constant a dependent parameter sounds good — we plan to add dependent parameter support after we get V14 out and settled.

@AayushSabharwal
Copy link
Member

gamma ~ x + y should be declared as a dependent parameter,

Dependent parameters can only depend on other parameters. What semantics would gamma ~ x + y have? Currently since y ~ gamma - x is an observed equation, this is always satisfied if we made it an observed. But what if both x and y are both unknowns? This is also why allowing parameters to be unknowns in the initialization problem is probably necessary. What if the user specifies x => 1.0, y => 2.0 and doesn't give a value for gamma? We can definitely solve for it. (SciML/ModelingToolkit.jl#2665)

I don't think remake is properly doing that right now.

I'll try and come up with an API for that. It's really easy to get an implicit circular dependency with this stuff.

oprob_new = remake(oprob; u0 = [X1 => 10.0, X2 => 20.0]) needs to do the same thing as ODEProblem([X1 => 10.0, X2 => 20.0], ...)

It's not exactly identical, since remake will fall back to using existing values in the problem over defaults unless use_defaults = true is provided. This is because use_defaults = true as the default behavior is breaking. Also, since packages like OrdinaryDiffEq re-export remake, breaking it in SciMLBase is also breaking for all packages that re-export it.

The key is that changing the initial conditions that are given changes that initialization problem.

I'll try and come up with a way to do this without turning it into a circular dependency.

@isaacsas
Copy link
Member

Bump, any updates on supporting parameters defined via initial conditions getting updated correctly? Could this be simplified by introducing a special notation to indicate the parameter is defined via an initial condition and not via a general value for a variable? i.e. p ~ X(0) + Y(0) or p ~ initialvalue(X) + initialvalue(Y)?

@AayushSabharwal
Copy link
Member

SciML/ModelingToolkit.jl#2928 will enable solving for parameters during initialization

@isaacsas
Copy link
Member

isaacsas commented Aug 6, 2024

Awesome, thanks! Is this for ODEs only or will it ultimately work with other system types too?

@AayushSabharwal
Copy link
Member

Right now we only solve the initialization for ODEs. I'm not sure if there are additional concerns required when solving it for other systems. @ChrisRackauckas?

@isaacsas
Copy link
Member

isaacsas commented Aug 6, 2024

We definitely need it for NonlinearProblems and SteadyStateProblems as they are both places conservation laws get used, and it would be nice for SDEs too (though not really high priority I'd say). I can't see a need for it with regards to jump problems right now since we don't support conservation law elimination there (and probably won't anytime soon as they aren't expected to generally offer performance benefits).

@ChrisRackauckas
Copy link
Member

SteadyStateProblems would use it because those use the ODEProblem. If they end up as a NonlinearProblem though it would be ignored since it should be inconsequential to the solution?

For SDEs, there's a bit more to work out anyways. SDAEs are still a research topic with very little actually written about them.

@TorkelE
Copy link
Member Author

TorkelE commented Aug 6, 2024

In our case, going through SDAEs shouldn't be a problem though, since Catalyst deals with conservation laws without going through the generation of algebraic equations.

@isaacsas
Copy link
Member

isaacsas commented Aug 6, 2024

There are no DAEs here since the point is that a user supplies initial values for all species, including eliminated species that were moved to observables, and that is used to determine a parameter's value before any solve is being called (irregardless of equation type). The value of the parameter is then known and available for use in equations passed to the associated solver.

For NonlinearProblems one can specify an initial guess via value mappings of unknowns right? That would be the place it would be nice to have the parameter calculated from that mapping (i.e. if users specify a mapping with all unknowns, even those that have been moved to be observables, it is used to generate the parameter value -- similar to how the value is generated from initial value mappings for ODEs).

@ChrisRackauckas
Copy link
Member

For NonlinearProblems one can specify an initial guess via value mappings of unknowns right?

Yes, and that could be useful since starting with the conservation supported should keep it supported through the stepping.

@isaacsas
Copy link
Member

isaacsas commented Aug 6, 2024

Exactly, hence why we'd like that use-case supported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants