`$id` updates #1537

gregsdennis · 2024-10-01T00:32:23Z

What kind of change does this PR introduce?

Cleans up some language around $id.

Issue & Discussion References

Related to "Clean-up of stable features and removal of unstable features" (#1444)

Summary

Just some clean-up and clarification that we've previously identified.

Schema document roots SHOULD have an $id (maybe relax this? ...) - from #1444 (comment)

I decided to leave this in for now. We can discuss and change separately if needed.

Does this PR introduce a breaking change?

No

jviotti

Nice! Most of the changes are simple one-character edits, and the rest read very well to me.

notEthan · 2024-10-01T20:05:32Z

jsonschema-core.md

+Due to the potential break in functionality described above, the behavior for
+using JSON Pointer fragments that point to or cross a resource boundary is
+undefined. Schema authors SHOULD NOT rely on such IRIs, as using them may
+reduce interoperability.


Stating here that this behavior is undefined seems to contradict the earlier statement that "an embedded schema resource and its subschemas can be identified by JSON Pointer fragments relative to either its own canonical IRI, or relative to any containing resource's IRI".

This also leaves [^8] with no reference to it.

And, minor, but two spaces after a period isn't consistent with the rest of the document.

This also leaves [^8] with no reference to it.

I'll add that back in.

We previously discussed that pointers crossing resource boundaries need to be undefined behavior, rather than something the spec explicitly allows implementations to support. Of course, implementations can still support it if they want, but the spec should push schema authors a bit more forcefully toward using the proper $id instead.

I think the critical point here is "an embedded schema resource and its subschemas". It's arguable that an embedded schema resource isn't strictly a subschema since it can be refactored out without any change in behavior. If we need to clarify that, I'm happy to.

Right, I get that making this undefined is the main point of this PR. (I edited my comment a bit to clarify that; I had it slightly wrong initially. I'm not sure this should be undefined, but if that is the way it is to be, okay.)

To me, "an embedded schema resource ... can be identified by JSON Pointer fragments ... relative to any containing resource's IRI" reads as defined behavior, which is inconsistent with making it undefined a few paragraphs later, so it needs some change. I see your point about the ambiguity over whether "subschema" includes embedded schemas, but I don't see any language that actually calls embedded schemas subschemas.

I read that paragraph as informative, not prescriptive. It's setting up the reasoning behind the requirement.

Embedded schemas can be referenced this way, but because reasons it's a bad idea to actually do it, so we're gonna say probably don't. We're also letting implementations decide whether they want to support it, no pressure either way.
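For readers following along, the situation under discussion can be sketched in Python. This is only an illustration under stated assumptions: the schema, its `https://example.com/...` IRIs, and the helper names `resolve_pointer` and `crosses_resource_boundary` are hypothetical and come from neither the spec nor any implementation. The boundary check is also naive, since it treats any object containing a `$id` key as a resource root.

```python
# Hypothetical schema document; the IRIs are made up for illustration.
schema = {
    "$id": "https://example.com/root",
    "$defs": {
        "item": {
            "$id": "https://example.com/item",  # embedded schema resource
            "properties": {"name": {"type": "string"}},
        }
    },
}


def _tokens(pointer):
    """Split an RFC 6901 JSON Pointer into unescaped reference tokens."""
    return [t.replace("~1", "/").replace("~0", "~") for t in pointer.split("/")[1:]]


def resolve_pointer(doc, pointer):
    """Follow a JSON Pointer through nested objects (arrays are not handled)."""
    node = doc
    for token in _tokens(pointer):
        node = node[token]
    return node


def crosses_resource_boundary(doc, pointer):
    """Return True if the pointer points to or passes through a nested "$id".

    Naive assumption: any object with a "$id" key is a resource root.
    """
    node = doc
    for token in _tokens(pointer):
        node = node[token]
        if isinstance(node, dict) and "$id" in node:
            return True
    return False


# Relative to the containing resource, the pointer crosses into the embedded one.
crossing = crosses_resource_boundary(schema, "/$defs/item/properties/name")

# Relative to the embedded resource's own IRI, the same subschema is reached
# without crossing any boundary.
embedded = schema["$defs"]["item"]
same_target = resolve_pointer(embedded, "/properties/name")
```

Both `https://example.com/root#/$defs/item/properties/name` and `https://example.com/item#/properties/name` name the same subschema here; the proposed text leaves the first form undefined and steers authors toward the second.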

jdesrosiers

Thanks for tackling these issues!

jdesrosiers · 2024-10-01T22:40:17Z

jsonschema-core.md

 If present, the value for this keyword MUST be a string, and MUST represent a
-valid [IRI-reference](#rfc3987). This IRI-reference SHOULD be normalized, and
-MUST resolve to an [absolute-IRI](#rfc3987) (without a fragment).
+valid [IRI reference](#rfc3987). This IRI reference SHOULD be normalized per RFC
+3987, section 5.3, and MUST resolve to an [absolute IRI](#rfc3987) (without a
+fragment).


I think the normalization requirement is a requirement on the schema author and should be avoided. I think this should be a requirement on an implementation's internal schema registry. The implementation should accept an IRI that isn't normalized, but SHOULD normalize as needed to retrieve schemas.

So, I like the improvement to the wording, but I think this whole requirement needs to go elsewhere.

I'll have a think about this. I don't disagree, but I'll need to figure out where to put such a requirement. This section does warrant a note to authors that implementations will be performing this normalization; probably a link to wherever this ends up.

Leaving it in place and changing to

If present, the value for this keyword MUST be a string, and MUST represent a valid IRI reference. When processing this IRI reference, implementations SHOULD normalize it per RFC 3987, section 5.3, so that it resolves to an absolute IRI (without a fragment).

seems to work. It puts the requirement on the implementation, but it also lets the user (if they're even reading this) know what they can expect from the implementation.

I do fear, though, that this gives license to users to put IRIs with fragments in $id knowing that the implementation will simply ignore it. (It's almost like the output needs the ability to report a warning as well as an error. "Hey, uh, there's a fragment on that IRI, guy. You sure you want to do that? Probably don't.") I think we do have or planned to have some language that disallowed fragments in $id, though.
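To make the warning idea concrete, here is a minimal sketch in Python. The `lint_id` helper and its message wording are hypothetical; nothing like it is defined by the spec or its output formats, and whether a fragment should be a warning or an error is exactly the open question above.

```python
def lint_id(value):
    """Return a list of human-readable warnings for a "$id" value.

    Hypothetical helper: it flags fragments without rejecting the schema.
    """
    warnings = []
    if not isinstance(value, str):
        warnings.append('"$id" must be a string')
        return warnings
    if "#" in value:
        # Everything after the first "#" is the fragment, even if it is empty.
        base, _, fragment = value.partition("#")
        warnings.append(
            f'"$id" contains a fragment ({fragment!r}); '
            f"the schema would be identified as {base!r}"
        )
    return warnings


with_fragment = lint_id("https://example.com/schema#foo")
without_fragment = lint_id("https://example.com/schema")
```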

I don't think leaving it in place works because $id isn't the only way a schema might be assigned an IRI. For example, implementations often also allow users to assign IRIs to schemas when they're registered. The normalization requirement should also apply to those user-provided IRIs and therefore needs to be somewhere where it doesn't only apply to $id.

One place I think it could work is in "Base IRI, Anchors, and Dereferencing" above. Here's a suggestion.

To differentiate between schemas in a vast ecosystem, schemas are identified by absolute IRIs (without a fragment).

Schemas can embed references to other schemas by specifying their IRI. When implementations dereference these references, they SHOULD use the normalization procedures defined in RFC 3987, section 5.3 when comparing IRIs.

Several keywords can accept an IRI reference. The schema's identifier serves as the base IRI for resolving relative references.

Then I'd remove everything about normalization and fragments from this paragraph. If we want, we could include an informal warning that fragments are ignored because schemas are identified by absolute IRIs, but it shouldn't technically be necessary.
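As a rough illustration of putting normalization in the registry rather than on the schema author, here is a sketch in Python. It rests on several assumptions: `normalized_identifier`, `remove_dot_segments`, and `SchemaRegistry` are hypothetical names, and only a subset of the normalization steps is covered (case normalization of scheme and host, dot-segment removal, default ports and empty paths for http/https). Percent-encoding and character normalization are left out.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def remove_dot_segments(path):
    """Remove "." and ".." segments from an absolute path."""
    if not path.startswith("/"):
        return path  # only absolute paths are handled in this sketch
    out = []
    segments = path.split("/")
    for seg in segments:
        if seg == ".":
            continue
        if seg == "..":
            if len(out) > 1:  # never pop the leading empty (root) segment
                out.pop()
            continue
        out.append(seg)
    if segments[-1] in (".", ".."):
        out.append("")  # keep the trailing slash
    return "/".join(out)


def normalized_identifier(iri):
    """Normalize an absolute IRI and drop its fragment, for use as a lookup key."""
    parts = urlsplit(iri)
    scheme = parts.scheme.lower()
    netloc = parts.netloc
    if parts.hostname:
        host = parts.hostname  # urlsplit already lowercases the host
        if ":" in host:
            host = f"[{host}]"  # restore brackets around an IPv6 literal
        netloc = host
        if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
            netloc += f":{parts.port}"
        if "@" in parts.netloc:
            netloc = parts.netloc.rpartition("@")[0] + "@" + netloc
    path = remove_dot_segments(parts.path)
    if scheme in DEFAULT_PORTS and netloc and path == "":
        path = "/"
    # The fragment is dropped: schemas are identified by absolute IRIs.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


class SchemaRegistry:
    """Hypothetical registry that normalizes on both registration and lookup."""

    def __init__(self):
        self._schemas = {}

    def register(self, iri, schema):
        self._schemas[normalized_identifier(iri)] = schema

    def lookup(self, iri):
        return self._schemas.get(normalized_identifier(iri))


registry = SchemaRegistry()
registry.register("https://example.com/a/./b", {"type": "string"})
found = registry.lookup("HTTPS://EXAMPLE.com:443/a/c/../b#ignored")
```

Because the same function runs on `$id` values, user-provided registration IRIs, and reference targets, a schema author who writes a non-normalized IRI still gets a match.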

We do need information on valid values for $id, though. I feel that's what at least the first sentence does. I'll play with it.

Right, I was only suggesting removing the part about normalization and fragments. The part about it being an IRI reference needs to stay.

I've moved the normalization up to the previous section as suggested. I had to play with the wording a bit.

gregsdennis · 2024-10-11T19:35:27Z

@notEthan @jdesrosiers I've almost completely reworked the $id section. There were some redundancies in it that didn't sit right with me.

I moved the syntactic definition of the keyword up to the top sentence next to its purpose, and I reworked how it talks about defining a resource, needing to be resolved against the current base IRI, and then providing a base IRI for child resources.

This relationship is necessarily recursive (the $id value needs a base IRI but then also serves as a base IRI), so it's quite difficult to state simply.
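The recursion can be sketched in a few lines of Python. This is an illustration only: `collect_resources` is a hypothetical helper, the `https://example.com/schemas/` IRIs are invented, and the walk naively visits every nested value rather than only schema-valued keyword locations.

```python
from urllib.parse import urljoin


def collect_resources(schema, base_iri, found=None):
    """Map each resolved "$id" to the schema object that declares it.

    Each "$id" is resolved against the current base IRI, and the result
    becomes the base IRI for everything nested inside that resource.
    """
    if found is None:
        found = {}
    if isinstance(schema, dict):
        if isinstance(schema.get("$id"), str):
            base_iri = urljoin(base_iri, schema["$id"])
            found[base_iri] = schema
        for value in schema.values():
            collect_resources(value, base_iri, found)
    elif isinstance(schema, list):
        for item in schema:
            collect_resources(item, base_iri, found)
    return found


# Hypothetical document with three nested resources and relative "$id" values.
document = {
    "$id": "root",
    "$defs": {
        "item": {
            "$id": "item",
            "$defs": {"leaf": {"$id": "nested/leaf"}},
        }
    },
}

resources = collect_resources(document, "https://example.com/schemas/")
```

Here `root` resolves against the retrieval IRI, `item` resolves against the resolved root IRI, and `nested/leaf` resolves against the resolved IRI of `item`.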

gregsdennis added 5 commits October 1, 2024 10:44

replace 'IRI' ABNF symbol references with plain language

10fa342

resolves #1349 - add explicit pointer to IRI normalization process

a260c77

pointers across resource boundary is undefined

9af8b27

remove paragraphs about ref-ing into unknown keywords

430bb83

fix line wrapping

d41808d

gregsdennis requested a review from a team October 1, 2024 00:33

jviotti approved these changes Oct 1, 2024

notEthan reviewed Oct 1, 2024

notEthan reviewed Oct 1, 2024

gregsdennis added this to the stable-release milestone Oct 2, 2024

add comment ref back; update appendix format per PR discussion

76505b9

jdesrosiers requested changes Oct 2, 2024

gregsdennis and others added 4 commits October 3, 2024 08:24

Apply suggestions from code review

4315260

Co-authored-by: Jason Desrosiers

Update jsonschema-core.md

17db555

reorganize the $id section and remove redundancies

86647d1

Merge branch 'gregsdennis/id-updates' of github.com:json-schema-org/j…

8268e40

…son-schema-spec into gregsdennis/id-updates

gregsdennis added 2 commits October 12, 2024 08:45

some more clarification on when implementations should normalize IRIs

46dd739

more clarity

94bdd23

gregsdennis requested review from notEthan, jviotti and jdesrosiers October 11, 2024 19:50

gregsdennis self-assigned this Oct 12, 2024

add that the base IRI is also a base for nested resources

3f547ac
