Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

patternProperties? #522

Open
staehle opened this issue Mar 6, 2024 · 3 comments
Open

patternProperties? #522

staehle opened this issue Mar 6, 2024 · 3 comments

Comments

@staehle
Copy link

staehle commented Mar 6, 2024

Hello:

Cool project! I started implementing it into a sort of repo-management-type software I'm working on. However, typify didn't quite do what I expected, and looking into it, doesn't seem to support JSON Schema's patternProperties. I'm using YAML, where for example, I have a file like this:

variables:
  BRANCH_GROUP_A: main
  BRANCH_GROUP_B: feature/beta
  SOMETHINGELSE: "123"
repos:
  thing:
    branch: hotfix/BUG-1234
    commit: HEAD
  some-code:
    branch: ${BRANCH_GROUP_A}
    commit: HEAD
  doesnt-matter:
    branch: ${BRANCH_GROUP_A}
    commit: 7f9329dc0741a4eeea209497f4e03513454e7606
  meta-star:
    branch: ${BRANCH_GROUP_B}
    commit: da9063bdfbe130f424ba487f167da68e0ce90e7d

I could argue for the repos section to convert that to a list instead of an object and add an extra "name: xyz" field, but that just adds more boilerplate, but variables definitely doesn't make much sense to do anything other than the current format.

So the JSON schema looks like this:

{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/snapshot.json",
    "title": "snapshot",
    "description": "Schema for SNAPSHOT",
    "type": "object",
    "properties": {
        "variables": {
            "type": "object",
            "patternProperties": {
                "^[A-Z_]+$": {
                    "$comment": "Variable names uppercase and underscores only, value any string",
                    "type": "string"
                }
            },
            "additionalProperties": false,
            "minProperties": 1,
            "uniqueItems": true
        },
        "repos": {
            "type": "object",
            "patternProperties": {
                "^[a-z_-]+$": {
                    "$comment": "Repo names lowercase only, object types",
                    "type": "object",
                    "properties": {
                        "branch": { "type": "string" },
                        "commit": { "type": "string" }
                    },
                    "required": ["branch", "commit"]
                }
            },
            "additionalProperties": false,
            "minProperties": 1,
            "uniqueItems": true
        }
    },
    "required": ["repos"],
    "additionalProperties": false
}

I integrated typify with build.rs, but get the same results with cargo typify ./snapshot.json, small example (which I believe is correct?):

let str_snapshot_content = std::fs::read_to_string("./test_snapshot.yaml").unwrap();
let test = serde_yaml::from_str::<snapshot::Snapshot>(&str_snapshot_content).unwrap();

And panicking at the second line, as I now expect, since it doesn't know what to look for:

thread 'main' panicked at src/main.rs:68:72:
called `Result::unwrap()` on an `Err` value: Error("variables: unknown field `BRANCH_GROUP_A`, there are no fields", line: 2, column: 3)

Also, various unused variable warnings in the generated code file:

warning: unused variable: `value`
   --> <snip>/out/gen_schema_types.rs:254:17
    |
254 |                 value: SnapshotRepos,
    |                 ^^^^^ help: if this is intentional, prefix it with an underscore: `_value`
    |
    = note: `#[warn(unused_variables)]` on by default

Anyway, I didn't see an existing issue for this, so figured I'd open one! Thanks!

@ahl
Copy link
Collaborator

ahl commented Mar 6, 2024

You're right: patternProperties aren't consumed at all. Here are the generated types:

#[derive(Clone, Debug, Deserialize, Serialize)]
#[serde(deny_unknown_fields)]
pub struct Snapshot {
    pub repos: SnapshotRepos,
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub variables: Option<SnapshotVariables>,
}

#[derive(Clone, Debug, Deserialize, Serialize)]
#[serde(deny_unknown_fields)]
pub struct SnapshotRepos {}

#[derive(Clone, Debug, Deserialize, Serialize)]
#[serde(deny_unknown_fields)]
pub struct SnapshotVariables {}

The Rust type you do want is a bit tricky: a HashMap or BTreeMap with keys of a particular string type (i.e. with the regex constraints you specified) and at least one value. (Note that your use of uniqueItems is, I think, spurious as that only applies to array type values). This is related to #456.

I think we might want something like this:

pub struct SnapshotRepos(BTreeMap<SnapshotReposKey, SnapshotReposValue>); // note not pub inner

impl<'de> Deserialize<'de> for SnapshotRepos {
    // custom deserializer that enforces minProperties = 1
}

impl TryFrom<BTreeMap<SnapshotReposKey, SnapshotReposValue>> for SnapshotRepos {
    // enforce minProperties = 1
}

// easy...
impl From<SnapshotRepos> for BTreeMap<SnapshotReposKey, SnapshotReposValue> {
    fn from(value: SnapshotRepos) -> Self {
        value.0
    }
}

pub struct SnapshotReposKey(String); // like other constrained string types

// pretty simple type
pub struct SnapshotReposValue {
    branch: String,
    commit: String
}

This isn't very hard to implement, but I'd like to make sure we have some reasonable approach for dealing with constrained types in a generic fashion. If someone is interested, please follow up and I'll do some thinking about what might make this consistent with the direction we want to take here.

@staehle
Copy link
Author

staehle commented Mar 7, 2024

Hey thanks for the quick response! That sounds entirely reasonable.

Quick side notes:

  • On the topic of HashMap vs BTreeMap: Not sure if you're aware, but as I was researching existing examples I did notice getsentry doing something like this to hot-replace the map type. Personally for my usage, I'm impartial as my data won't be large enough to be really affected by the choice, but perhaps some method of customization could be useful? Not sure how feasible that is.
  • uniqueItems was definitely from some overzealous copy/pasting! JSON Schema does just ignore it if it's not with an array type though.

@ahl
Copy link
Collaborator

ahl commented Mar 8, 2024

On the topic of HashMap vs BTreeMap: Not sure if you're aware, but as I was researching existing examples I did notice getsentry doing something like this to hot-replace the map type. Personally for my usage, I'm impartial as my data won't be large enough to be really affected by the choice, but perhaps some method of customization could be useful? Not sure how feasible that is.

Interesting! I had recently been checking out Sentry; was not aware that they were typify users. Maybe there should be some setting to use one or the other. I'm not fully sure why I might have used HashMap...

uniqueItems was definitely from some overzealous copy/pasting! JSON Schema does just ignore it if it's not with an array type though.

Right; not a big deal, just a heads up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants