Improve handling of internally tagged enum with newtypes #347

Closed
wants to merge 13 commits into from

@bralax bralax commented Oct 12, 2024

Right now, schemars' behavior for internally tagged enums, while valid, is lossy and generates schemas that I have found cause problems in practice.

Effectively, it always generates a schema that inlines subschemas, creating potentially deeply nested schemas. For example, the following Rust code:

use schemars::{generate::SchemaSettings, schema_for, JsonSchema};

#[derive(JsonSchema)]
#[serde(tag = "type")]
enum Root {
    A(SubEnum1),
    B(SubEnum2),
}

#[derive(JsonSchema)]
#[serde(tag = "other_type")]
enum SubEnum1 {
    C(SubSchema1),
    D(SubSchema2),
}

#[derive(JsonSchema)]
#[serde(tag = "other_type")]
enum SubEnum2 {
    E(SubSchema3),
    F(SubSchema4),
    G(String),
    H { id: String },
}

#[derive(JsonSchema)]
struct SubSchema1 {
    w: String,
}

#[derive(JsonSchema)]
struct SubSchema2 {
    x: String,
}

#[derive(JsonSchema)]
struct SubSchema3 {
    y: String,
}

#[derive(JsonSchema)]
struct SubSchema4 {
    z: String,
}

Currently generates the schema:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Root",
  "oneOf": [
    {
      "type": "object",
      "properties": {
        "type": {
          "type": "string",
          "const": "A"
        }
      },
      "oneOf": [
        {
          "type": "object",
          "properties": {
            "other_type": {
              "type": "string",
              "const": "C"
            },
            "w": {
              "type": "string"
            }
          },
          "required": [
            "other_type",
            "w"
          ]
        },
        {
          "type": "object",
          "properties": {
            "other_type": {
              "type": "string",
              "const": "D"
            },
            "x": {
              "type": "string"
            }
          },
          "required": [
            "other_type",
            "x"
          ]
        }
      ],
      "required": [
        "type"
      ]
    },
    {
      "type": "object",
      "properties": {
        "type": {
          "type": "string",
          "const": "B"
        }
      },
      "oneOf": [
        {
          "type": "object",
          "properties": {
            "other_type": {
              "type": "string",
              "const": "E"
            },
            "y": {
              "type": "string"
            }
          },
          "required": [
            "other_type",
            "y"
          ]
        },
        {
          "type": "object",
          "properties": {
            "other_type": {
              "type": "string",
              "const": "F"
            },
            "z": {
              "type": "string"
            }
          },
          "required": [
            "other_type",
            "z"
          ]
        },
        {
          "type": "object",
          "properties": {
            "other_type": {
              "type": "string",
              "const": "G"
            }
          },
          "required": [
            "other_type"
          ]
        },
        {
          "type": "object",
          "properties": {
            "id": {
              "type": "string"
            },
            "other_type": {
              "type": "string",
              "const": "H"
            }
          },
          "required": [
            "other_type",
            "id"
          ]
        }
      ],
      "required": [
        "type"
      ]
    }
  ]
}

In this schema, none of the types except Root are listed explicitly, and it is impossible to access any of the subschemas directly. Especially when using JSON Schema to share these models with another language, a deeply nested schema like this is hard to reuse.

This PR updates the logic for internally tagged enum newtype variants to respect the inline_subschemas setting. Rather than always inlining the subschema, it generates a schema that looks like this:

{
  "allOf": [{
     "$ref": "{path to subschema}"
  }],
  "properties": {
     "{tagField}": {
        "type": "string",
        "const": "{tagValue}"
     }
  },
  "required": [
    "{tagField}"
  ]
}

This new schema uses JSON Schema's allOf keyword to say that the actual type is the subschema's type plus the tag field. The resulting schema is functionally equivalent, but it references the existing type rather than inlining it.
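As a rough illustration of the template above, the following sketch fills in the three placeholders ({tagField}, {tagValue}, and the path to the subschema) using only the standard library. The function name `tagged_newtype_schema` is hypothetical and not part of schemars, and the sketch assumes its inputs need no JSON string escaping.

```rust
/// Builds the wrapper schema for one newtype variant of an internally tagged
/// enum as compact JSON text. Hypothetical helper for illustration only:
/// it is not the schemars implementation and does no JSON escaping.
fn tagged_newtype_schema(tag_field: &str, tag_value: &str, def_name: &str) -> String {
    // `{{` and `}}` are literal braces; `{name}` captures the argument.
    // The raw string uses `##` delimiters because the text contains `"#`.
    format!(
        r##"{{"allOf":[{{"$ref":"#/$defs/{def_name}"}}],"properties":{{"{tag_field}":{{"type":"string","const":"{tag_value}"}}}},"required":["{tag_field}"]}}"##
    )
}
```

Calling `tagged_newtype_schema("type", "A", "SubEnum1")` would produce the same shape as the first entry of the Root `oneOf` in the new output, minus the `"type": "object"` line.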

For the sample above, the new resulting schema is:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Root",
  "oneOf": [
    {
      "type": "object",
      "properties": {
        "type": {
          "type": "string",
          "const": "A"
        }
      },
      "allOf": [
        {
          "$ref": "#/$defs/SubEnum1"
        }
      ],
      "required": [
        "type"
      ]
    },
    {
      "type": "object",
      "properties": {
        "type": {
          "type": "string",
          "const": "B"
        }
      },
      "allOf": [
        {
          "$ref": "#/$defs/SubEnum2"
        }
      ],
      "required": [
        "type"
      ]
    }
  ],
  "$defs": {
    "SubEnum1": {
      "oneOf": [
        {
          "type": "object",
          "properties": {
            "other_type": {
              "type": "string",
              "const": "C"
            }
          },
          "allOf": [
            {
              "$ref": "#/$defs/SubSchema1"
            }
          ],
          "required": [
            "other_type"
          ]
        },
        {
          "type": "object",
          "properties": {
            "other_type": {
              "type": "string",
              "const": "D"
            }
          },
          "allOf": [
            {
              "$ref": "#/$defs/SubSchema2"
            }
          ],
          "required": [
            "other_type"
          ]
        }
      ]
    },
    "SubEnum2": {
      "oneOf": [
        {
          "type": "object",
          "properties": {
            "other_type": {
              "type": "string",
              "const": "E"
            }
          },
          "allOf": [
            {
              "$ref": "#/$defs/SubSchema3"
            }
          ],
          "required": [
            "other_type"
          ]
        },
        {
          "type": "object",
          "properties": {
            "other_type": {
              "type": "string",
              "const": "F"
            }
          },
          "allOf": [
            {
              "$ref": "#/$defs/SubSchema4"
            }
          ],
          "required": [
            "other_type"
          ]
        },
        {
          "type": "object",
          "properties": {
            "other_type": {
              "type": "string",
              "const": "G"
            }
          },
          "allOf": [
            {
              "type": "string"
            }
          ],
          "required": [
            "other_type"
          ]
        },
        {
          "type": "object",
          "properties": {
            "id": {
              "type": "string"
            },
            "other_type": {
              "type": "string",
              "const": "H"
            }
          },
          "required": [
            "other_type",
            "id"
          ]
        }
      ]
    },
    "SubSchema1": {
      "type": "object",
      "properties": {
        "w": {
          "type": "string"
        }
      },
      "required": [
        "w"
      ]
    },
    "SubSchema2": {
      "type": "object",
      "properties": {
        "x": {
          "type": "string"
        }
      },
      "required": [
        "x"
      ]
    },
    "SubSchema3": {
      "type": "object",
      "properties": {
        "y": {
          "type": "string"
        }
      },
      "required": [
        "y"
      ]
    },
    "SubSchema4": {
      "type": "object",
      "properties": {
        "z": {
          "type": "string"
        }
      },
      "required": [
        "z"
      ]
    }
  }
}

This resulting schema includes all the subtypes, and the enum schemas can simply reference the other definitions.

@bralax
Author

bralax commented Oct 13, 2024

@GREsau This Pull request should be ready for you to review. Let me know if you have any questions or concerns.

@GREsau
Owner

GREsau commented Nov 25, 2024

Thanks for the PR, although it has a few issues. It's unnecessary to nest the $ref schema in an allOf (in draft 2019-09+), and you can easily cause unsatisfiable schemas if the inner type has deny_unknown_fields.

But I have merged a separate change #355 which applies a similar change, and it's now released in v1.0.0-alpha.16 🙂
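The unsatisfiable-schema concern in the review can be sketched with a deliberately tiny validator. It assumes that deny_unknown_fields on the inner type is emitted as `additionalProperties: false`, and it relies on the JSON Schema rule that `additionalProperties` only considers the `properties` declared in the same schema object. All names here (`ObjectSchema`, `validates`, `inner`, `tagged_variant`, `obj`) are hypothetical; this is a model of the interaction, not schemars code or a full validator.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Minimal model of an object schema: `properties` (optionally pinned to a
/// string `const`), `required`, `additionalProperties: false`, and `allOf`.
struct ObjectSchema {
    properties: BTreeMap<String, Option<String>>,
    required: BTreeSet<String>,
    deny_unknown: bool,
    all_of: Vec<ObjectSchema>,
}

/// Instances are flat objects with string values, which is enough here.
type Instance = BTreeMap<String, String>;

fn validates(schema: &ObjectSchema, instance: &Instance) -> bool {
    // Every required property must be present.
    let has_required = schema.required.iter().all(|k| instance.contains_key(k));
    // A pinned property must equal its constant when present.
    let consts_match = schema.properties.iter().all(|(name, pinned)| {
        match (instance.get(name), pinned) {
            (Some(actual), Some(wanted)) => actual == wanted,
            _ => true,
        }
    });
    // Unknown-property check sees only this schema object's own `properties`.
    let no_unknown = !schema.deny_unknown
        || instance.keys().all(|k| schema.properties.contains_key(k));
    has_required
        && consts_match
        && no_unknown
        && schema.all_of.iter().all(|s| validates(s, instance))
}

/// Stand-in for SubSchema1 { w: String }, optionally denying unknown fields.
fn inner(deny_unknown: bool) -> ObjectSchema {
    ObjectSchema {
        properties: BTreeMap::from([("w".to_string(), None)]),
        required: BTreeSet::from(["w".to_string()]),
        deny_unknown,
        all_of: Vec::new(),
    }
}

/// Stand-in for variant C: tag `other_type` = "C" plus `allOf` [inner].
fn tagged_variant(inner: ObjectSchema) -> ObjectSchema {
    ObjectSchema {
        properties: BTreeMap::from([("other_type".to_string(), Some("C".to_string()))]),
        required: BTreeSet::from(["other_type".to_string()]),
        deny_unknown: false,
        all_of: vec![inner],
    }
}

fn obj(pairs: &[(&str, &str)]) -> Instance {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}
```

In this model, `{"other_type": "C", "w": "hi"}` validates when the inner schema allows unknown properties. Once the inner schema denies them, the tag itself counts as unknown to the inner schema, while the outer schema still requires it, so no instance can satisfy both.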

@GREsau GREsau closed this Nov 25, 2024