-
-
Notifications
You must be signed in to change notification settings - Fork 263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
$id
updates
#1537
base: main
Are you sure you want to change the base?
$id
updates
#1537
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Most of the changes are simple one-character edits, and the rest read very well to me
jsonschema-core.md
Outdated
Due to the potential break in functionality described above, the behavior for | ||
using JSON Pointer fragments that point to or cross a resource boundary is | ||
undefined. Schema authors SHOULD NOT rely on such IRIs, as using them may | ||
reduce interoperability. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stating here that this behavior is undefined seems to me to contradict the above "an embedded schema resource and its subschemas can be identified by JSON Pointer fragments relative to either its own canonical IRI, or relative to any containing resource's IRI".
This also leaves [^8]
with no reference to it.
And, minor, but two spaces after a period isn't consistent with the rest of the document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also leaves [^8] with no reference to it.
I'll add that back in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was discussed that we need to make pointers that cross boundaries an undefined behavior rather than explicitly allowing implementations to support it. Of course, implementations can still support it if they want, but the spec should push schema authors a bit more forcefully toward using the proper $id
instead.
I think the critical point here is "an embedded schema resource and its subschemas". It's arguable that an embedded schema resource isn't strictly a subschema since it can be refactored out without any change in behavior. If we need to clarify that, I'm happy to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I get making this undefined is the main point of this PR (I edited my comment a bit to clarify that, had it slightly wrong initially ... I'm not sure this should be undefined but if that is the way it is to be, okay).
To me "an embedded schema resource ... can be identified by JSON Pointer fragments ... relative to any containing resource's IRI" reads as defined behavior and not consistent with making it undefined some paragraphs later, and needs some change. I see your point on ambiguity whether "subschema" includes embedded schemas but I don't see language present that actually calls embedded schemas subschemas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read that paragraph as informative, not prescriptive. It's setting up the reasoning behind the requirement.
Embedded schemas can be referenced this way, but because reasons it's a bad idea to actually do it, so we're gonna say probably don't. We're also letting implementations decide whether they want to support it, no pressure either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for tackling these issues!
jsonschema-core.md
Outdated
If present, the value for this keyword MUST be a string, and MUST represent a | ||
valid [IRI-reference](#rfc3987). This IRI-reference SHOULD be normalized, and | ||
MUST resolve to an [absolute-IRI](#rfc3987) (without a fragment). | ||
valid [IRI reference](#rfc3987). This IRI reference SHOULD be normalized per RFC | ||
3987, section 5.3, and MUST resolve to an [absolute IRI](#rfc3987) (without a | ||
fragment). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the normalization requirement is a requirement on the schema author and should be avoided. I think this should be a requirement on an implementation's internal schema registry. The implementation should accept an IRI that isn't normalized, but SHOULD normalize as needed to retrieve schemas.
So, I like the improvement to the wording, but I think this whole requirement needs to go elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll have a think about this. I don't disagree, but I'll need to figure out where to put such a requirement. This section does warrant a note to authors that implementations will be performing this normalization; probably a link to wherever this ends up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving it in place and changing to
If present, the value for this keyword MUST be a string, and MUST represent a valid IRI reference. When processing this IRI reference, implementations SHOULD normalize it per RFC 3987, section 5.3, so that it resolves to an absolute IRI (without a fragment).
seems to work. It puts the requirement on the implementation, but it also lets the user (if they're even reading this) know what they can expect from the implementation.
I do fear, though, that this gives license to users to put IRIs with fragments in $id
knowing that the implementation will simply ignore it. (It's almost like the output needs the ability to report a warning as well as an error. "Hey, uh, there's a fragment on that IRI, guy. You sure you want to do that? Probably don't.") I think we do have or planned to have some language that disallowed fragments in $id
, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think leaving it in place works because $id
isn't the only way a schema might be assigned an IRI. For example, implementations often also allow users to assign IRIs to schemas when they're registered. The normalization requirement should also apply to those user provided IRIs and therefore needs to be somewhere where it doesn't only apply to $id
.
One place I think it could work is in "Base IRI, Anchors, and Dereferencing" above. Here's a suggestion.
To differentiate between schemas in a vast ecosystem, schemas are identified by absolute IRIs (without a fragment).
Schemas can embed references to other schemas by specifying their IRI. When implementations dereference these references, they SHOULD use the normalization procedures defined in RFC 3987, section 5.3 when comparing URIs.
Several keywords can accept an IRI reference. The schema's identifier serves as the base IRI for resolving relative references.
Then I'd remove everything about normalization and fragments from this paragraph. If we want, we could include an informal warning that fragments are ignored because schemas are identified by absolute IRIs, but it shouldn't technically be necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do need information on valid values for $id
, though. I feel that's what at least the first sentence does. I'll play with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I was only suggesting removing the part about normalization and fragments. The part about it being an IRI reference needs to stay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved the normalization up to the previous section as suggested. I had to play with the wording a bit.
Co-authored-by: Jason Desrosiers <[email protected]>
…son-schema-spec into gregsdennis/id-updates
@notEthan @jdesrosiers I've almost completely reworked the I moved the syntactic definition of the keyword up to the top sentence next to its purpose, and I reworked how it talks about defining a resource, needing to be resolved against the current base IRI, and then providing a base IRI for child resources. This relationship is necessarily recursive (the |
What kind of change does this PR introduce?
Cleans up some language around
$id
.Issue & Discussion References
Summary
Just some clean-up and clarification that we've previously identified.
I decided to leave this in for now. We can discuss and change separately if needed.
Does this PR introduce a breaking change?
No