Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

experiment: attach optional migration expression to actor (class) as actor [exp]? #4812

Open
wants to merge 54 commits into
base: master
Choose a base branch
from

Conversation

crusso
Copy link
Contributor

@crusso crusso commented Dec 10, 2024

Support the specification of an optional migration function on the definition of an actor or actor class.

The function must consume and produce records of stable types.

Roughly:

  • Fields in the domain of the migration function override any stable variables of the same name in the regular stable signature.
    This determines the pre stable signature of the actor.
  • The post stable signature is determined by the actors stable field declarations (as previously).
  • Fields in the codomain/range of the migration function that also occur in the regular stable signature of the actor must be consumable (subtypes, ignoring top-level var/non-var distinctions) by the stable variables. Irrelevant, new fields produce errors.
  • Fields neither in the domain nor codomain are treated as usual and transferred from the old actor, if present and a subtype, or initialized when absent.

Dynamically, on upgrade, the (implicit or explicit) post signature of the old actor must be compatible with the pre-signature of the new actor. This is either checked offline, statically in the classical compiler (and by Candid deserialization), and dynamically in the EOP compiler, using a pre signature type descriptor. All stable variables in the domain of the migration function must be non-null in the deserialized or transferred stable record. The migration function is applied to the subset of these fields to produce the codomain record. A new stable record is constructed using the values from the codomain and the values that were not overridden by the domain. Values in the domain are set to null (and will be initialized if present in the post signature).
(It would also be possible to transfer values from the domain, if not in the codomain, but required and compatible with the post-signature, but this design choice is not implemented)

A fresh install ignores the migration code; an upgrade applies it before entering the constructor.
All variables in the codomain must be present for the migration to succeed (since there is no way to provide an initial value).

You will need to remove the migration code to do a self-upgrade.

Follow on PRs implement support for EOP (#4829) and syntactic stable signatures (#4833) (both merged here but might simplify reviewing)

Extended stable signatures

A stable signature (stored in metadata) is now either a single interface of stable fields (as before) or (the new bit) a dual interface recording the pre and post upgrade interface when performing migration.

The pre and post fields are implicitly identical for a singleton interface and provide for backwards compatibility and ordinary upgrades (sans explicit migration).

Stability compatibility now checks the post-signature of the old actor is compatible with the pre-signature of the new actor.

<stab_sig> ::=
  <typ_dec>;* actor { <stab_field>;* }
  <typ_dec>;* actor ( { <stab_field>;* },  <stab_field>;* } )

Done this way, there should be no need to modify dfx, which defers to moc for the compatibility check anyway.

Example: adding record fields

To upgrade from:

persistent actor {
  type Card = {
    title : Text;
  };
  var map : [(Nat32, Card)] = [];
  var log : []
};

to incompatible stable type interface:

persistent actor {
  type Card = {
    title : Text;
    description : Text; // new field!
  };
  var map : [(Nat32, Card)] = [];
};
  1. Define a migration module and function that transform the old stable variable, at its current type, into the new stable variable at its new type.
// CardMigration.mo
import Array "mo:base/Array";

module CardMigration {
  type OldCard = {
    title : Text;
  };

  type NewCard = {
    title : Text;
    description : Text;
  };

  // our migration function
  public func migrate(old : {
      var map : [(Nat32, OldCard)] // old type
    }) :
    {
      var map : [(Nat32, NewCard)] // new type
    } {
    { var map : [(Nat32, NewCard)] =
        Array.map<(Nat32, OldCard), (Nat32, NewCard)>(
          old.map,
          func(key, { title }) { (key, { title; description = "<empty>" }) }) }
  }

}
  1. Specify the migration function as the migration expression of your actor declaration:
import CardMigration "CardMigration";

persistent actor
  [ CardMigration.migrate ] // Declare the migration function
  {
  type Card = {
    title : Text;
    description : Text;
  };

  var map : [(Nat32, Card)] = []; // Initialized by migration on upgrade
  var log : [Text] = []; // migrated implicitly as usual

};

After we have successfully upgraded to this new version, we can also upgrade once more to a version that drops the migration code.

persistent actor {
  type Card = {
    title : Text;
    description : Text;
  };

  var map : [(Nat32, Card)] = [];
  var log  : [Text] = []
};

TODOs

  • Decide on final syntax, if keeping this one, perhaps tuck the migration expression into the obj_sort (to reduce diffs)
  • Review and address remaining TODOS (some are actually done)
  • Normalize and promote types of the migration function as appropriate.
  • Check for field hash collisions
  • Use better names in code (pre_ty, post_ty etc)
  • Test actor classes
  • Test more scenarios: variable retyping, variable removal, swap etc
  • Devise some mechanism to detect repeated migration with the same migration code, or a coding pattern to prevent it?
    Some, but not all migrations might be reject as type-incompatible, but others may not be:
    * Maybe an implicit isMigrated field is enough, or a check that the post-signature of the new "not equals" the post-signature of the old canister?
  • review/adjust error codes.
  • fig grammar.sed to delete new production
  • add some tests to test/cmp (tests: extended --stable-compatible checks #4850)
  • avoid capture of shared pattern variables in migration function (or revise syntax)
  • doc (Draft doc: migration expressions #4845)
  • consider requiring migration exp to be static (thus pure)
  • test with dfx (tested manually using DFX_MOC_PATH - can't fork sdk tp test without a proper moc release)

@ggreif
Copy link
Contributor

ggreif commented Dec 18, 2024

Did you consider persistent actor A (with migration = mf) { ... };?

@crusso
Copy link
Contributor Author

crusso commented Dec 18, 2024

Did you consider persistent actor A (with migration = mf) { ... };?

Not yet, but I also don't want A to be in scope in mf.

Maybe

persistent actor (with migration = mf) A { ... };

Or

(with migration = mf)
persistent actor  { ... };

But I expect parsing hell from the last one.

Are parentheticals only meant to attach to expressions?

@ggreif
Copy link
Contributor

ggreif commented Dec 18, 2024

Are parentheticals only meant to attach to expressions?

Not specifically. Any AST node can be it.

src/mo_frontend/typing.ml Outdated Show resolved Hide resolved
Comment on lines +11636 to +11638
G.if1 I32Type
(Bool.lit true)
(Bool.lit false)
Copy link
Contributor

@ggreif ggreif Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See Bool.vanilla_lit

Suggested change
G.if1 I32Type
(Bool.lit true)
(Bool.lit false)

Eqz produces 0 or 1 only, which is exactly Bool.lit false/true!

and stab_sig' = (dec list * typ_field list) (* type declarations & stable actor fields *)
and stab_sig' = (dec list * stab_body) (* type declarations & stable actor fields *)
and stab_body = stab_body' Source.phrase (* type declarations & stable actor fields *)
and stab_body' =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one could have only PrePost and when both equal unparse it as version 1.0.0. Similarly parse that and set pre = post.

Comment on lines +1004 to +1005
note =
{ filename; trivia }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
note =
{ filename; trivia }}
note = { filename; trivia } }

Comment on lines +2585 to +2589
and infer_migration env exp_opt =
match exp_opt with
| Some exp ->
Some (infer_exp_promote { env with async = C.NullCap; rets = None; labs = T.Env.empty } exp)
| None -> None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an Option.map

local_error env exp.at "M0203"
"expected non-generic, local function type, but migration expression produces type%a"
display_typ_expand typ;
([], [])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant parens 3x

@@ -1035,7 +1035,7 @@ let rec is_explicit_exp e =
| ObjE (bases, efs) ->
List.(for_all is_explicit_exp bases
&& for_all (fun (ef : exp_field) -> is_explicit_exp ef.it.exp) efs)
| ObjBlockE (_, _, dfs) ->
| ObjBlockE (_, _e_opt, _, dfs) ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| ObjBlockE (_, _e_opt, _, dfs) ->
| ObjBlockE (_, _exp_opt, _, dfs) ->

Comment on lines +11705 to +11707
E.if1 I64Type
(Bool.lit true)
(Bool.lit false)
Copy link
Contributor

@ggreif ggreif Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically

𝗂𝟨𝟦.𝖾𝗑𝗍𝖾𝗇𝖽_𝗎/𝗂𝟥𝟤 | 𝟶𝚡𝙰𝙳 | [𝗂𝟥𝟤]→[𝗂𝟨𝟦]

even under the assumption that runtime tagging is happening.

Suggested change
E.if1 I64Type
(Bool.lit true)
(Bool.lit false)
G.i (Convert (Wasm_exts.Values.I64 I64Op.ExtendUI32))

/// The type is stored in the persistent metadata memory for later retrieval on canister upgrades.
/// The `new_type` value points to a blob encoding the new stable actor type.
#[ic_mem_fn]
pub unsafe fn assign_stable_type<M: Memory>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether it could make sense to split register_stable_type in two parts, checking compatibility and assigning/storing the new stable type. The logic of assign_stable_type is like a subset of register_stable_type. Just to reduce some code duplication. Not a big deal, just an idea.

@@ -11628,6 +11628,15 @@ and compile_prim_invocation (env : E.t) ae p es at =
SR.Vanilla,
StableMem.get_mem_size env ^^ BigNum.from_word64 env

| OtherPrim "rts_in_install", [] -> (* classical specific *)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a detail, feel free to disregard: I wonder what a good name for this helper function would be. IIUC, this is called during initialization and checks whether the initialization is actually an upgrade, not a fresh (re-)install?

@@ -452,7 +452,7 @@ let transform prog =
inspect = t_exp inspect;
low_memory = t_exp low_memory;
stable_record = t_exp stable_record;
stable_type = t_typ stable_type;
stable_type = {pre = t_typ stable_type.pre; post = t_typ stable_type.post};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this symmetry :)

@@ -480,7 +481,7 @@ and export_runtime_information self_id =
let scope_con2 = Cons.fresh "T2" (Abs ([], Any)) in
let bind1 = typ_arg scope_con1 Scope scope_bound in
let bind2 = typ_arg scope_con2 Scope scope_bound in
let gc_strategy =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, my trailing white space...

display_typ_expand typ;
([], [])
in
List.iter
Copy link
Contributor

@luc-blaeser luc-blaeser Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that slightly worries me is that the return type of the migration function can be super-type of the new actor's stable type which might cause accidental data loss in a case like the following:

Initial version:

persistent actor {
   var data1 : [Nat] = ...;
   var data2 : Text = ...;
}

Next version:

persistent actor [Migration.run] {
   var data1 : [Text] = ...;
   var data2 : Text = ...;
}

with Migration logic:

module {
   type OldActor = {
      data1 : [Nat];
      data2 : Text; // specifies all stable variables
   };
  type NewActor = {
      data1 : [Text];
      // accidentally forgot to mention `data2`
   }
   public void run(old: OldActor) : NewActor {
      ...
   }
}

With this, data2 is accidentally lost, i.e. re-initialized on migration. This is because the variable is mentioned in input of the migration function but not in the output of the migration function.

Maybe we should still require that migration function returns the full stable type of the actor, even if it would be less convenient for large actors?

Copy link
Contributor

@luc-blaeser luc-blaeser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Many thanks for this big effort.
I tested a few cases, all worked perfectly.

The only thing that I am thinking of is the accidental data loss that can happen if migration function skips a stable field in its return type, but that field is part of the input of the migration function (I left a more detailed comment above).

@@ -14,7 +14,7 @@

<obj_sort> ::=
'object'
'persistent'? 'actor'
'persistent'? 'actor' ('[' <exp> ']')?
Copy link
Contributor

@luc-blaeser luc-blaeser Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the syntax although some users might be confuse with array syntax. An alternative could be ('migrate' <exp>)? or, not sure, if we want to keep existing reserved keywords maybe('with' <exp>)?.

| OtherPrim "rts_in_install", [] -> (* EOP specific *)
assert (!Flags.enhanced_orthogonal_persistence);
SR.Vanilla,
EnhancedOrthogonalPersistence.load_stable_actor env ^^
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will trigger a trap because the memory is default-initialized in the runtime system on first initialization. If you run with sanity checks, it will trigger (*metadata).stable_type.assert_initialized(); in RTS persistence. (I believe this is also the reason for CI currently failing).

Comment on lines +11703 to +11707
EnhancedOrthogonalPersistence.load_stable_actor env ^^
compile_test I64Op.Eqz ^^
E.if1 I64Type
(Bool.lit true)
(Bool.lit false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this could be used:

Suggested change
EnhancedOrthogonalPersistence.load_stable_actor env ^^
compile_test I64Op.Eqz ^^
E.if1 I64Type
(Bool.lit true)
(Bool.lit false)
Persistence.use_enhanced_orthogonal_persistence env ^^
Bool.neg

Still something off in tests...

@@ -0,0 +1,13 @@
# SKIP ic-ref-run
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it would be good to also test all these with graph copy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants