Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discriminators and internally tagged enums #498

Open
rseymour opened this issue Feb 1, 2024 · 9 comments
Open

Discriminators and internally tagged enums #498

rseymour opened this issue Feb 1, 2024 · 9 comments

Comments

@rseymour
Copy link

rseymour commented Feb 1, 2024

Hello! Thanks so much for typify.

I have a relatively simple issue to explain, with some code samples. I have some discriminated unions which simply need #[serde(tag = "my_type")] instead of #[serde(untagged)] but they keep coming up untagged. This leads to one item getting parsed as the other and vice versa, which isn't what I want. I can edit the generated code, but I'd really love to get typify to honor the discriminator.propertyName that pydantic produces in its jsonschema as the preferred tag.

Of course propertyName is some offshoot of jsonschema proper (OAI I suppose), but it solves a problem and I'd love to either support it directly or have a way to override the parameters of the serde field attribute programmatically?

In short:

      {
        "discriminator": {
          "mapping": {
            "Baz": "#/$defs/Baz",
            "FooBar": "#/$defs/FooBar"
          },
          "propertyName": "my_type"
        },
        "oneOf": [
          {
            "$ref": "#/$defs/FooBar"
          },
          {
            "$ref": "#/$defs/Baz"
          }
        ]
      }

produces an attribute of:
#[serde(untagged)]
when in my opinion (and experience because changing this works) it should produce:
#[serde(tag="my_type") based off of propertyName.

I put the runnable example code in a repo: https://github.com/rseymour/typify-tag-example/tree/main

I think I could fix this myself, but I haven't been able to correlate the code to the strings it outputs for annotations.

[addendum]
If there's a way to fix the jsonschema itself to produce internally tagged enums that's also an option.

@rseymour
Copy link
Author

rseymour commented Feb 2, 2024

For folks in this same situation we are working on a fix on the python (pydantic) side of the generation story for discriminated unions that makes this (a) less error prone in python (b) "just work" with typify and (c) completely roundtrippable json with no build.rs hacks. I'll post more in this issue for posterity then maybe even blog about it since it sparks joy to see two languages getting along so nicely.

@rseymour
Copy link
Author

rseymour commented Feb 2, 2024

Finally found the vital piece of the puzzle, which was, how is typify-impl choosing to ever use internally tagged:
Printed this out with a println!("{}", serde_json::to_string_pretty(&schema).unwrap());
on ~line 934 of typify-impl/src/enums.rs

  "title": "InternallyTaggedEnum",
  "oneOf": [
    {
      "type": "object",
      "required": [
        "tag"
      ],
      "properties": {
        "tag": {
          "type": "string",
          "enum": [
            "Alpha"
          ]
        }
      }
    },

This elucidated things quite a bit, this matches the comment, but I didn't quite believe the comment when I read it. For instance I could see internal tags being in the jsonschema (with the openapi extension) or with draft jsonschema 6 like:

        "tag": {
          "type": "string",
          "const": "Alpha"
        }

but in the case I'm fighting we are doing python trickery to make the __class__.__name__ dynamically into the tag, so I can probably coerce pydantic to output the enum style, and then if I get very ambitious try to expand the maybe_internally_tagged_enum fn to support the openAPI style. That said as is usually the case exploring good code, I see why it doesn't work the way I think it should, and I appreciate why.

@ahl
Copy link
Collaborator

ahl commented Feb 3, 2024

First--as you've observed--we don't support discriminator here. There are several reasons for this:

  1. It's not part of JSON Schema
  2. While this primary purpose of this crate is to serve OpenAPI SDK generation, the APIs we care most about don't contain discriminators
  3. OPINION discriminator is complicated and adds further redundancy without any significant benefit.

That said, I do hope to support discriminator, but I'd prefer to do this when we add support for multiple $schema variants so that consumers can indicate, say, that they're using https://spec.openapis.org/oas/3.1/dialect/base so discriminator is a valid construction.

You raise another point: we look for singleton enumerated values, but not const. We should do that. We should also be permissive about the absence of the type. (See #499)

This leads to one item getting parsed as the other and vice versa

I see. For what it's worth, I believe that if my_type were required that would not occur. It's worth noting that typify only considers required fields when deciding if an enum is internally tagged (or adjacently tagged).

It this the type you ideally want to see?

#[serde(tag = "my_type")]
pub enum FbzItem {
    FooBar(FooBar),
    Baz(Baz),
}

I'm not sure what it means for serde if there's a conflict between the tag property and the struct... but I guess in this case it would probably work out.

@rseymour
Copy link
Author

rseymour commented Feb 3, 2024

Yes, the real question is how to get that (tag = "my_type") and how do we define where the string my_type comes from in the jsonschema.

#[serde(tag = "my_type")]
pub enum FbzItem {
    FooBar(FooBar),
    Baz(Baz),
}

I'm trying your PR now, still working on tweaking the jsonschema to make it go.

@rseymour
Copy link
Author

rseymour commented Feb 3, 2024

I just realized there's one impedance mismatch between what I'm doing and what the new typify code is doing. I want a union type so I can do [ FbzItem ] but to do that I am getting a list with anyOf, not an enum with oneOf. Subtle difference, but I think it may be preventing the code you changed from even being called. I'll keep exploring, thanks again!

{
  "$defs": {
    "EventA": {
      "properties": { "event_type": { "const": "EventA" } },
      "required": [ "event_type" ],
      "title": "EventA",
      "type": "object"
    },
    "EventB": {
      "properties": { "event_type": { "const": "EventB" } },
      "required": [ "event_type" ],
      "title": "EventB",
      "type": "object"
    }
  },
  "properties": {
    "events": {
      "items": {
        "anyOf": [
          { "$ref": "#/$defs/EventA" },
          { "$ref": "#/$defs/EventB" }
        ]
      },
      "title": "Events",
      "type": "array"
    }
  },
  "required": [
    "events"
  ],
  "title": "EventsContainer",
  "type": "object"
}

so here I'm defining the container and then the items in it. the anyOf does the work of a proper rust enum, but it's not a python enum or really a jsonschema enum. But it does come out as untagged with the PR #499 with cargo typify unless I'm doing something wrong.

I may be barking up the wrong tree and the same code path is being hit, I will explore more deeply, likely adding something like this to a unit test.

@colmanhumphrey
Copy link

I'm not sure if this is fully related, but along these lines, I'm also having trouble making the tag work because the enum arm names don't match. They're not matching the mapping names, nor are they matching the title field in the schema.json. I'm probably missing something simple, but they seem to be generated by the underlying feature names minus the first word, which is common to all arms.

That is, something like this:

"features": {
      "type": "array",
      "items": {
        "discriminator": {
          "propertyName": "type",
          "mapping": {
            "X": "#/definitions/FeatureTypeX",
            "Y": "#/definitions/FeatureTypeY"
          }
        },
        "oneOf": [
          {
            "title": "X",
            "$ref": "#/definitions/FeatureTypeX"
          },
          {
            "title": "Y",
            "$ref": "#/definitions/FeatureTypeY"
          },
        ]
      }
    },

Produces:

#[serde(untagged)]
pub enum FieldFeaturesItem{
    TypeX(FeatureTypeX),
    TypeY(FeatureTypeY)
}

instead of say:

#[serde(untagged)]
pub enum FieldFeaturesItem{
    X(FeatureTypeX),
    Y(FeatureTypeY)
}

@ahl
Copy link
Collaborator

ahl commented May 22, 2024

I just realized there's one impedance mismatch between what I'm doing and what the new typify code is doing. I want a union type so I can do [ FbzItem ] but to do that I am getting a list with anyOf, not an enum with oneOf.

In this situation, oneOf and anyOf are equivalent, in particular because EventA and EventB are mutually exclusive: there is no JSON value that can satisfy them both simultaneously. The distinction between anyOf and oneOf is that a value that satisfied more than one subschema of the oneOf would fail to validate!

@ahl
Copy link
Collaborator

ahl commented May 22, 2024

@colmanhumphrey I'm seeing this from main:

#[serde(untagged)]
pub enum FeatureItem {
    X(FeatureTypeX),
    Y(FeatureTypeY),
}

I recall some fixes in this domain, but... not specifically what might have led to the behavior change. In any case: mission accomplished?

@colmanhumphrey
Copy link

@colmanhumphrey I'm seeing this from main:

#[serde(untagged)]
pub enum FeatureItem {
    X(FeatureTypeX),
    Y(FeatureTypeY),
}

I recall some fixes in this domain, but... not specifically what might have led to the behavior change. In any case: mission accomplished?

Oh great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants