Add provisional metadata ser in named field pos #544
Regarding the collision problem: Binding the meta to the struct name as well as the field name significantly reduces the risk of collision, but does not eliminate it. Since the type isn't accessible during serialization, however, the only way to fully eliminate the collision problem is to not rely on type binding in the first place. One alternative might be to supply the meta as a structured hierarchy. This requires the serializer to keep track of which field(s) are currently entered.
fn main() {
    #[derive(Serialize)]
    struct A {
        a: usize,
        b: inner::A,
        c: usize,
    }
    mod inner {
        use serde_derive::Serialize;
        #[derive(Serialize)]
        pub struct A {
            pub a: usize,
        }
    }
    let mut config = PrettyConfig::default();
    {
        let meta = &mut config.meta;
        {
            let field = meta.field_mut_or_default("a");
            field.set_meta("field a");
        }
        {
            let field = meta.field_mut_or_default("b");
            field.set_meta("field b");
            field.set_inner({
                let mut fields = Fields::new();
                fields.field_mut_or_default("a").set_meta("inner field a");
                fields
            });
        }
        {
            let field = meta.field_mut_or_default("c");
            field.set_meta("field c");
        }
    }
    let value = A {
        a: 0,
        b: inner::A { a: 0 },
        c: 3,
    };
    let s = ron::ser::to_string_pretty(&value, config).unwrap();
    println!("{s}");
}
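To make the "keep track of which field is entered" idea concrete, here is a minimal std-only sketch. It is not the PR's implementation: the `Value` enum and the `write` function are hypothetical stand-ins for a real serializer, and `Fields`/`Field` only mimic the names used in the snippet above. The point is that descending into a named field also descends one level in the metadata tree, so the two `a` fields no longer collide.

```rust
use std::collections::HashMap;

/// Metadata for one named field: an optional doc string plus the metadata
/// for named fields nested inside that field's value. (Hypothetical type.)
#[derive(Default)]
struct Field {
    meta: Option<String>,
    inner: Fields,
}

/// Metadata for a set of named fields, keyed by field name. (Hypothetical type.)
#[derive(Default)]
struct Fields(HashMap<String, Field>);

impl Fields {
    fn field_mut_or_default(&mut self, name: &str) -> &mut Field {
        self.0.entry(name.to_string()).or_default()
    }
}

impl Field {
    fn set_meta(&mut self, meta: &str) -> &mut Self {
        self.meta = Some(meta.to_string());
        self
    }
}

/// Toy value model standing in for what a real serializer would visit.
enum Value {
    Int(i64),
    Struct(Vec<(&'static str, Value)>),
}

/// Writes `value`; `fields` is the metadata level for the struct currently
/// being written, if there is one.
fn write(value: &Value, fields: Option<&Fields>, indent: usize, out: &mut String) {
    match value {
        Value::Int(i) => out.push_str(&i.to_string()),
        Value::Struct(members) => {
            out.push_str("(\n");
            let pad = "    ".repeat(indent + 1);
            for (name, member) in members {
                let field = fields.and_then(|f| f.0.get(*name));
                if let Some(doc) = field.and_then(|f| f.meta.as_deref()) {
                    out.push_str(&format!("{pad}/// {doc}\n"));
                }
                out.push_str(&format!("{pad}{name}: "));
                // Entering the field: descend one level in the metadata tree.
                write(member, field.map(|f| &f.inner), indent + 1, out);
                out.push_str(",\n");
            }
            out.push_str(&"    ".repeat(indent));
            out.push(')');
        }
    }
}

fn render_example() -> String {
    let mut meta = Fields::default();
    meta.field_mut_or_default("a").set_meta("field a");
    let b = meta.field_mut_or_default("b");
    b.set_meta("field b");
    b.inner.field_mut_or_default("a").set_meta("inner field a");
    meta.field_mut_or_default("c").set_meta("field c");

    let value = Value::Struct(vec![
        ("a", Value::Int(0)),
        ("b", Value::Struct(vec![("a", Value::Int(0))])),
        ("c", Value::Int(3)),
    ]);
    let mut out = String::new();
    write(&value, Some(&meta), 0, &mut out);
    out
}

fn main() {
    println!("{}", render_example());
}
```

In this sketch the outer `a` is annotated with "field a" and the `a` inside `b` with "inner field a", because the lookup for the inner field happens in `b`'s own metadata level rather than in a flat, name-keyed map.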
|
Alright, I think that should clear up the collision problem. Support for more positions can be added later seamlessly. Going with |
Thank you @voidentente for your PR and the extensive motivation! I absolutely agree with the motivation for adding attributes in general and doc comments as a first step towards them.
What I'm unsure about is the focus on type-based docs instead of value-based docs. As you very thoroughly lay out, there are almost no uniquely type-based places for docs to appear; struct-like fields are the exception, and most other places feel more like they would be used for value docs. Furthermore, the API for type-based docs feels a bit brittle, as there is a possibility of name clashes, or the structure of the type names needs to be separately encoded again.
What I would prefer is a value-based docs API (later generalised to attributes) that is more general and can then be used to build a type-based API on top of it (probably not inside RON but in another crate). In particular, I would suggest that, stylistically, doc comments generally go on the line above the value, and that we add a special case for fields so that they are serialised as follows:
(
    /// my value comment
    a: 42,
)
What I think might work well is an API similar to
Types that would always like to serialise the same doc comments (type-based) could then update their serialisation code to always serialise the same comment. Your proposed API of taking an existing data structure and serialising it with separate type comments could be built by wrapping the serialiser and injecting metadata structs whenever a matching struct is serialised. What are your thoughts? |
Well, I expressly wanted to keep this PR as narrowly scoped as possible. It's based entirely on top of the comment syntax, because I'm unsure how idiomatic deserialization of metadata would be. If metadata can be deserialized, it isn't just for human readability. At that point, the difference between metadata and normal data becomes muddy.
You bring up other attribute kinds, which would require a new syntax and thus support by the deserializer. I'm hesitant about this, because it would be a breaking change (unless some piggybacking happens, like
The primary motivation (at least for me here) is serializing documentation. There's no need to be capable of deserializing metadata in order to update it; the user should be able to dictate what metadata to put where, regardless of the previous state of the document. Since this PR should be entirely non-breaking, it'd be a good first step towards adding support. It wouldn't be forwards-compatible with an extended format, but that's fine if this part of the API is marked as unstable. I'm not knowledgeable enough about the serde ecosystem to turn this into a fully-featured, integrated, stable API from the get-go.
Type position is usually a subset of field position, which is why I chose to neglect it. Serializing in a way that differentiates it from field position metadata would hurt readability.
/// type position
Type (
    /// field position
    a:
        /// type position
        0,
    /// field position
    opt:
        /// type position
        Some(13),
    /// field position
    inner:
        /// type position
        Other {
            /// field position
            value:
                /// type position
                (),
        }
)
The collision problem should be entirely resolved by using hierarchy. The serializer internally keeps track of which field is entered, which allows it to differentiate fields with the same name based on context. The (field position) metadata for the above RON is currently expressed like this:
|
I appreciate the minimal initial approach and the focus on doc comments as a first step! And it's true that treating metadata as data in the serde model might not be suited for all use cases, where your proposed more ad-hoc API would work very well. Your proposed API is thus growing on me and I think it could be supported long-term, even if its internal implementation might at some point switch to a proper attribute system.
RON already has attributes, though they're currently only used to enable features and are only allowed at the top of the document. This could simply be expanded, and just like in Rust, doc comments would be supported as /// and #[doc = ""]. |
What this PR would still need is tests to ensure full coverage and to test how field docs can be added e.g. to a struct nested in an enum inside a vec |
Oh, I didn't know about that! I'll stick with the current plan now, but extending the deserializer later might be worth looking into. I'll add some tests, let me know of any API nitpicks if you have them :) |
TL;DR: The hierarchy is not absolute, but relative to entered named fields. The test file asserts that addressing fields of a struct nested in an enum inside a vec is structurally equivalent to addressing a struct. |
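As a rough illustration of "relative to entered named fields", the std-only sketch below walks a toy value and collects the docs that would be emitted. Everything in it (`Fields`, `Value`, `collect_docs`) is hypothetical and only models the idea, not the PR's code: sequences and enum variants are not named fields, so they do not add a level to the hierarchy, and a struct wrapped in a variant inside a vec is addressed exactly like the bare struct.

```rust
use std::collections::HashMap;

/// Hypothetical metadata tree: per named field, an optional doc string and
/// the metadata for named fields nested inside that field's value.
#[derive(Default)]
struct Fields(HashMap<&'static str, (Option<&'static str>, Fields)>);

/// Toy value model standing in for what a serializer would visit.
enum Value {
    Unit,
    Seq(Vec<Value>),
    Variant(Box<Value>), // enum variant wrapping a payload (name omitted)
    Struct(Vec<(&'static str, Value)>),
}

/// Collects the doc strings a serializer would emit, in visiting order.
fn collect_docs(value: &Value, fields: &Fields, out: &mut Vec<&'static str>) {
    match value {
        Value::Unit => {}
        // Not named fields: the contents see the same metadata level.
        Value::Seq(items) => {
            for item in items {
                collect_docs(item, fields, out);
            }
        }
        Value::Variant(payload) => collect_docs(payload, fields, out),
        Value::Struct(members) => {
            for (name, member) in members {
                // Fields without metadata are skipped: nothing below them
                // can carry docs in this model.
                if let Some((doc, inner)) = fields.0.get(name) {
                    if let Some(doc) = *doc {
                        out.push(doc);
                    }
                    // Entering a named field moves one level down.
                    collect_docs(member, inner, out);
                }
            }
        }
    }
}

fn person() -> Value {
    Value::Struct(vec![("name", Value::Unit), ("age", Value::Unit)])
}

fn docs_for(value: &Value, fields: &Fields) -> Vec<&'static str> {
    let mut out = Vec::new();
    collect_docs(value, fields, &mut out);
    out
}

fn demo() -> (Vec<&'static str>, Vec<&'static str>) {
    let mut meta = Fields::default();
    meta.0.insert("name", (Some("the person's name"), Fields::default()));
    let bare = person();
    let nested = Value::Seq(vec![Value::Variant(Box::new(person()))]);
    (docs_for(&bare, &meta), docs_for(&nested, &meta))
}

fn main() {
    let (bare, nested) = demo();
    println!("{bare:?}\n{nested:?}");
}
```

One consequence of this model, stated here only for the sketch and not as a claim about the PR: two different structs that sit at the same level without a named field between them would look up their fields in the same metadata level.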
API-wise, I think the meta field on PrettyConfig should be of a new Meta type, which then exposes a fields() method, so that we could add support for different accessors later. Could there be a test for a tuple (Person, Pet) and how the name clashes are handled there? |
@juntyr Any more thoughts on this? |
Sorry for the delay in the review, things have been hard |
The path traversal should now be constant time. I'm not super happy with the public API, but it's prone to change anyway. |
I'll also throw in that |
Thank you @voidentente for your continuous work on this PR - just a few more nits and I'm happy to land this |
I moved the PR to the |
I just have two minor nits, but everything else looks great :)
LGTM
Thank you @voidentente for your work on this feature! |
Motive
It would be useful to be able to serialize additional metadata, i.e. data that is not scoped to the data representation itself (specifically, comments/documentation regarding the implications of a field/its possible values/valid range/etc).
Combined with crates like documented, this would (for example) allow for auto-documented configuration files.
This PR aims to provide very lightweight (provisional) support for serializing metadata in named field position.
Serialization
Metadata should be provided to the serializer through PrettyConfig, since it's targeted at human readability.
The only relevant serialization call is <Compound as ser::SerializeStruct>::serialize_field.
Why not in type position?
Distinguishing field position metadata from type position metadata is awkward. It's possible, but awkward:
Why not in unit structs?
Unit structs are values:
Without type position, since there are no fields, there's nothing to do.
Why not in tuple structs?
Tuple structs have unnamed fields, which would make it appear as if values had metadata:
Why not in newtype structs?
Same as with tuple structs.
Why not in variants?
Variants are values, and suffer from the same problem as type position:
Deserialization
Metadata is skipped during deserialization. This is given by using the comment syntax.
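The reasoning can be shown with a toy, std-only parser. This is not RON's parser; `parse_fields` is a hypothetical helper that only understands `name: integer,` lines. It mimics one property of RON, namely that `//` line comments are ignored, and `///` is just a line comment whose text happens to begin with another slash. A document with doc comments therefore parses to the same data as one without.

```rust
/// Toy parser for `name: integer,` lines, one field per line.
/// Hypothetical helper for illustration only.
fn parse_fields(input: &str) -> Vec<(String, i64)> {
    input
        .lines()
        .map(str::trim)
        // Any `//` line comment is skipped; `///` is just a special case.
        .filter(|line| !line.is_empty() && !line.starts_with("//"))
        .filter_map(|line| {
            // Lines without a colon, such as `(` and `)`, are ignored.
            let (name, value) = line.trim_end_matches(',').split_once(':')?;
            let value = value.trim().parse::<i64>().ok()?;
            Some((name.trim().to_string(), value))
        })
        .collect()
}

fn main() {
    let documented = "(\n    /// field a\n    a: 0,\n    /// field c\n    c: 3,\n)";
    let plain = "(\n    a: 0,\n    c: 3,\n)";
    // The doc comments do not change the parsed data.
    assert_eq!(parse_fields(documented), parse_fields(plain));
    println!("{:?}", parse_fields(documented));
}
```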
Open Questions
Where should the metadata live during serialization? In Serializer? PrettyConfig? Should it be skipped when serializing PrettyConfig?
What to use to mark metadata? To keep backwards compatibility, either // or /// is suitable, with /// additionally offering forwards compatibility to distinguish metadata from plain comments.
How to avoid overlaps? If two structs have fields that share a name, there's no way to differentiate them. Additional data would have to be tracked to distinguish these namespaces. Because this might be surprising, I'd say this should stay a draft until this has a solution.
Example of an overlap:
Example Usage