Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deriving show for GADTs #290

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Deriving show for GADTs #290

wants to merge 5 commits into from

Conversation

keigoi
Copy link

@keigoi keigoi commented Oct 16, 2024

This PR introduces [@@deriving show] for a subset of GADTs.

Code might look a bit ugly for now, and there are some rooms to be improved. So, please discuss, suggest, and/or comment on this PR. Thanks!

Summary

Before thinking about the implementation, let me re-cap the "calling convention" for derivers on certain types (which I believe). For polymorphic types like 'a foo, Ppx_deriving generates functions like:

let pp_foo : 'a. (formatter -> 'a -> unit) -> formatter -> 'a foo -> unit = fun poly_a -> ...

See that here is an extra parameter poly_a : formatter -> 'a -> unit for pretty-printing its type parameter 'a, and let's call such a parameter (above poly_a) a sub-instance.

On calling pretty printers for such (non-GADT) types, the caller will supply sub-instances for the argument types. For example, the deriver will generate (pp_foo pp_int) for printing int foo where pp_int : formatter -> int -> unit.

Sub-instances are often unused for GADTs

The observation is that, such sub-instances like poly_a above are usually never invoked for certain GADTs. Instead, type parameters are eventually refined to a ground type and pretty-printed based on that (concrete) type without the need for supplying the sub-instances.

This PR will exploit the fact. Here, we would have a few options for that:

  1. Completely eliminate the parameter from the signature. This seems not working in practice, as the calling converntion for the polymorphic types explained above unconditionally gives extra argument for sub-instances, breaking the compatibility. This is even fatal for deriving instances for recursive types. Instead, I propose another strategy:
  2. Give more "liberal" signature for such functions, like (here, the sub-instance has completely unrelated type 'b):
let pp_foo : 'a 'b. (formatter -> 'b -> unit) -> formatter -> 'a foo -> unit = fun _poly_a -> ...

The current PR follows this strategy. By relying on this signature generation policy, a larger class of GADTs will (hopefully) be handled without introducing compatibility issues (see below).

Here are what this PR extends Ppx_deriving (show), but I suppose the same strategy would be effective for other derivers as well:

1. Generating "degenerate" pretty-printers (instances)

One of obstacles for handling GADTs in the current Ppx_deriving (show) is that it cannot generate valid code for type variables in the refined types and for existentials. For example, the Pair below has type variables 'x and 'y for which we have no way to generate pretty-printers (instances), and the deriver will just put nonexistent functions poly_x and poly_y which leads to a compile-time error.

type 'a baz = Pair : 'x baz * 'y baz -> ('x * 'y) baz | ...

Instead of that, based on the above "eventually grounded" assumption, this PR will generate a never-called, degenerate pretty-printer for both 'x and 'y:

(fun fmt -> (_:[`this_type_is_refined]) -> failwith "impossible")

which raises an exception, but is actually never called as it is guarded by the type [this_type_is_refined] polymorphic variant type which prevents accidental call. The case for Pair in the printer pp_baz will be:

| Pair(a,b) -> 
  fprintf "Pair(";
  fprintf "%a" (pp_baz (fun fmt -> (_:[`this_type_is_refined]) -> failwith "impossible")) a;
  fprintf ", "
  fprintf "%a" (pp_baz (fun fmt -> (_:[`this_type_is_refined]) -> failwith "impossible")) b;
  fprintf ")"

This typechecks thanks to the "more liberal" signature above for GADTs.

2. New option refined_params for declaring the "refining" type parameter

As I described above, we need some way to specify which parameters are "refining" at first. For this purpose, the current code provides the option refined_params=[ <list of parameter positions> ] which can give the positions for refined type parameters to the deriver. For example,

type ('a, 'b) bar
[@@deriving show {refined_params=[0]}]

will produce

val pp_bar : 'a 'b 'c. (formatter -> 'c -> unit) -> (formatter -> 'b -> unit) -> formatter -> ('a, 'b) bar -> unit

See that the first (0-th) argument of pp_bar is parametric over type 'c which is unrelated to the rest of signature.

(As @gasche mentioned in his comment, it would be more preferable if we could just put _ in the type declaration to specify this. However, It is not compatible with the current behaviour which just eliminates the sub-instance argument for the corresponding type parameter e.g. _ foo will lead to pp_foo : formatter -> 'a foo -> unit without argument poly_a.)

3. Semi-automatic inference for refined parameters

In the implementations (.ml), refined parameters are usually apparent from the return type of constructor types. For this, I implemented a quick code to decide which types are actually defined. However, I admit that this convention to use type variable names to specify refined parameters is rather convoluted...

See: https://github.com/keigoi/ppx_deriving/blob/d2ccd5fc983e3fb5f19fa4a2408ae923de8c06a8/src_plugins/show/ppx_deriving_show.ml#L247-L283

4. Wrapping functions with locally abstract type signatures (newtypes)

Apparently we need this to handle pattern-maching on GADTs.

See:
https://github.com/keigoi/ppx_deriving/blob/e18c11e4e56349d52dcd0acb5e203d6f555ba262/src_plugins/show/ppx_deriving_show.ml#L394-L399

(I agree that this is better to be done unconditionally (and to replace the current signature), as @gasche had mentioned.)


Related: #7

@gasche
Copy link
Contributor

gasche commented Oct 17, 2024

This looks like impressive work, but it's also not so clear what it does right now from the code. Some early comments:

  • When you decide to un-draft this PR, it would help if you could show a simple example of type and generated printing function in the PR description. This helps build intuition when reading the code.

  • From a distance I wonder if the logic could be simplified a bit if you allowed yourself to cut some corners, for example:

    • maybe you could generate GADT-like annotations unconditionally (whether the type is actually a GADT or not): let rec foo : 'a 'b 'c . .....
    • you have a clever check to decide which paramaters will be refined as GADTs, but maybe we could do something simpler that is directed by the user: if they use type ('a, _) t = ..., use GADT mode for the second parameter. This is different from what we do today, and I wonder if there are scary compatibility issues to think about, for now it's just a thought.

@keigoi keigoi marked this pull request as ready for review October 18, 2024 10:38
@keigoi
Copy link
Author

keigoi commented Oct 18, 2024

Hello @gasche, thank you for your prompt comments! I filled up the summary now, and any further comments/edits will be highly appreciated. Thanks!

@keigoi keigoi changed the title Deriving show for GADTs (wip) Deriving show for GADTs Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants