Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: encoding/json: dynamic field omission #52803

Closed
smikulcik opened this issue May 9, 2022 · 24 comments
Closed

proposal: encoding/json: dynamic field omission #52803

smikulcik opened this issue May 9, 2022 · 24 comments

Comments

@smikulcik
Copy link

smikulcik commented May 9, 2022

When structuring json types, it would be very helpful to allow the custom types define if the value is to be omitted. This is similar to a javascript value may set to "undefined".

Just as we allow custom types to implement "MarshalJSON" to provide a custom value: ("null", multivariate, etc). So too should we allow these custom types to be set to "Omitted" as well.

This proposal may supersede other highly desired omitnil proposals as one may leverage this abstraction to accomplish the same goal.

Proposal

Update encoding/json to check if a struct field implements an Omittable interface where Omittable is defined as follows:

type Omittable struct{
  Omitted() (bool, error)
}

If a fieldType implements this interface, omit the filed if the response of Omitted() is true.

Example (Omittable String)

type Foo struct{
  Field1 OmittableString `json:"field_1"`
}

type OmittableString struct {
  IsSet bool
  Value string
}
func (o OmittableString) UnmarshalJSON(input []byte) error {o.IsSet=true; return json.Unmarshal(input, o.Value)}
func (o OmittableString) MarshalJSON() ([]byte, error) {return json.Marshal(o.Value)}
func (o OmittableString) Omitted() (bool, error) {return !o.IsSet, nil}

x := &Foo{}

_ = json.Unmarshal([]byte(`{}`), x)
// x.Field1.IsSet = false
out, _ = json.Marshal(x)
// {}

_ = json.Unmarshal([]byte(`{"field_1": ""}`), x)
// x.Field1.IsSet = true
// x.Field1.Value = ""
out, _ = json.Marshal(x)
// {"field_1": ""}

_ = json.Unmarshal([]byte(`{"field_1": "foo"}`), x)
// x.Field1.IsSet = true
// x.Field1.Value = "foo"
out, _ = json.Marshal(x)
// {"field_1": "foo"}

Example (Omit-nil)

type Foo struct{
  Field1 OmitNilStruct `json:"field_1"`
}

type OmitNilStruct struct {
  Value *MyStruct
}
func (o OmitNilStruct) UnmarshalJSON(input []byte) error {return json.Unmarshal(input, o.Value)}
func (o OmitNilStruct) MarshalJSON() ([]byte, error) {return json.Marshal(o.Value)}
func (o OmitNilStruct) Omitted() (bool, error) {return o.Value == nil, nil}

type MyStruct struct {}

x := &Foo{}

_ = json.Unmarshal([]byte(`{}`), x)
// x.Field1.Value = nil
out, _ = json.Marshal(x)
// {}

_ = json.Unmarshal([]byte(`{"field_1": {}}`), x)
// x.Field1.Value = MyStruct{}
out, _ = json.Marshal(x)
// {"field_1": ""}

Compatibility

This may cause issues with existing custom types that happen to implement this new method. Though I would guess that the number of uses with such a pattern is low.

Implementations

I've implemented this locally and it seems to work fine. If approved I can submit a PR.

Related Proposals

@gopherbot gopherbot added this to the Proposal milestone May 9, 2022
@seankhliao
Copy link
Member

see also #50480

@smikulcik
Copy link
Author

I like what's proposed in #50480, but the implementation was considerably easier for this proposal as we can leverage the code next to omitempty.

https://cs.opensource.google/go/go/+/refs/tags/go1.18.1:src/encoding/json/encode.go;l=750

@rittneje
Copy link

Personally, I think it might make more sense to add a method like IsEmpty() bool. That way, you can still use the omitempty tag on the struct field, and decide on a case-by-case basis whether it should omit, and it keeps it consistent with how "normal" types get omitted.

The other nice thing is it side-steps the issue of what happens when you directly pass your custom type to json.Marshal. For instance, referencing the above sample code, what does json.Marshal(OmittableString{}) do?

@smikulcik
Copy link
Author

@rittneje Thanks for the feedback.

IMHO, the concept of a property being "empty" seems to be a flawed abstraction. True, we do have existing patterns around "emptiness" (omitempty etc.). Here I'm making that claim that the model that we use to describe a json document should more clearly expose the concept of "property omission" to the object model.

Though, to directly answer your question, I would expect json.Marshal(OmittableString{}) to follow the MarshalJSON method and return "" as it is not being composed with an object or an array and as such cannot be omitted.

@smikulcik
Copy link
Author

smikulcik commented May 10, 2022

I see "helpers" like omitempty and omitnil as shorthands for a policy around omit or not to omit.

We should have a way to describe a policy around property omission without making assumptions around policy as property omission is a core part of the json data structure.

With this policy-based view in mind, we would then be able to phrase the policy of omitempty within a pattern like this:

function (x Foo) Omitted() (bool, error){
  switch reflect.TypeOf(x){
    case reflect.String:
      return x == ""
    case reflect.Int:
      return x == 0
    case reflect.Ptr:
      return x == nil
    ...
    }
}

@rsc
Copy link
Contributor

rsc commented May 11, 2022

/cc @dsnet

@rsc
Copy link
Contributor

rsc commented May 11, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@dsnet
Copy link
Member

dsnet commented May 11, 2022

This feels like a duplicate of #45669.

A prototype implementation of #45669 supports an omitzero option that omits a Go struct field if either:

  • the field value is the zero Go value (per reflect.Value.IsZero) OR
  • the field value implements IsZero() bool and it reports true.

This definition is simple to implement and easy to explain.

@joeshaw
Copy link
Contributor

joeshaw commented May 11, 2022

I implemented a prototype of the approach similar to #45669 for #11939 in CL 13977, though it applies to omitempty rather than a new omitzero tag. (And there's no reason why IsZeroer needs to be exported.)

Another option oft-mentioned in other proposals would be to add a sentinel error value to indicate that a value should be omitted from the MarshalJSON method, though this would presumably also have to be opt-in with a new struct tag to avoid inadvertent potential backward compatibility issues with types like time.Time.

@hherman1
Copy link

Is the benefit over #50480 just an easier implementation?

@dsnet
Copy link
Member

dsnet commented May 11, 2022

The difficulty lies not with the implementation, which is rather straightforward. The difficulty is in the details and making sure it interacts well with the overall ecosystem.

The MarshalJSON method is frequently called directly by user code where users expect to receive a valid JSON representation of the value. This fact alone suggests that:

are not acceptable approaches.

Also, we can't change the meaning of omitempty (at least for the current "encoding/json" package), so anything that tries to modify the semantics of that is also off the table.

@joeshaw
Copy link
Contributor

joeshaw commented May 12, 2022

Potential issues with using reflect.Value.IsZero on a struct type:

  • Nonzero unexported fields will cause it to return false
  • It returns true only if it is recursively true for all fields of the struct. This could be confusing if a struct member is not reflect.Value.IsZero but it does implement IsZero() bool itself and would return true from that. (Playground example)

encoding/json has its own isEmptyValue function which performs most of the same checks (and predates reflect.Value.IsZero) but most notably does not recurse into structs.

@dsnet
Copy link
Member

dsnet commented May 12, 2022

@joeshaw: Thanks. Those are legitimate concerns, but what you're suggesting is getting increasingly more complicated. The more complicated the rules, the harder it is to use correctly.

If we go with a more complicated definition of what "zero" means, anything about being "exported" should not appear in the definition. It seems the intent is not so much that we should ignore unexported fields, but rather that we should ignore fields that have no JSON serialization. It would be odd to ignore an unexported field, but consider the zero-ness of a field marked as json:"-". But even that isn't enough. What does it mean when a field implements MarshalJSON? What does zero mean there?

Using reflect.Value.IsZero has the major advantage that it is well defined and simple. It is good enough for most simple structs (e.g., netip.Addr). To work around concerns where it is insufficient, that is why we also check the bool IsZero() bool, which allows the type's author to define what zero actually means (e.g., time.Time.IsZero).

@icholy
Copy link

icholy commented May 16, 2022

The MarshalJSON method is frequently called directly by user code where users expect to receive a valid JSON representation of the value.

What if, instead of a sentinel error, there was a sentinel value.

func (x T) MarshalJSON() ([]byte, error) {
	return json.Empty, nil
}

Where Empty is:

var Empty = []byte("null")

@dsnet
Copy link
Member

dsnet commented May 16, 2022

That's an interesting idea, but it assumes that the output of MarshalJSON is never mutated by the caller. Unfortunately, I have definitely seen code depend on that property before.

@seankhliao
Copy link
Member

I'm not sure why sentinel errors aren't acceptable? Callers should be handling errors, and any old caller code used with new MarshalJSON code should just surface it (fail) as another error, which can prompt people to update caller code.

@hherman1
Copy link

hherman1 commented May 16, 2022

@dsnet could it be good that the caller is able to mutate and thus block (if they so choose) this behavior?

@joeshaw
Copy link
Contributor

joeshaw commented May 16, 2022

@seankhliao

I'm not sure why sentinel errors aren't acceptable? Callers should be handling errors, and any old caller code used with new MarshalJSON code should just surface it (fail) as another error, which can prompt people to update caller code.

Returning a sentinel error value here indicates behavior, not an error condition. Existing code should not treat it as an error, but as it has no concept of the sentinel value it will, and this will break existing behavior.

If it was only the encoding/json package that called MarshalJSON, then the change in behavior could be added at the same time as the new sentinel value. Unfortunately other code can (and does) call MarshalJSON and this change would break nearly all of that code.

@icholy
Copy link

icholy commented May 16, 2022

In that case, it seems like a new interface is the best option:

func (x T) EmptyJSON() bool { return true }

@rsc
Copy link
Contributor

rsc commented May 25, 2022

Does anyone think this isn't a duplicate of #45669, which is in turn on hold for #22480?
I'd be happy to take #45669 off hold if people want to try to converge there.

@rsc
Copy link
Contributor

rsc commented Jun 1, 2022

This proposal is a duplicate of a previously discussed proposal, as noted above,
and there is no significant new information to justify reopening the discussion.
The issue has therefore been declined as a duplicate.
— rsc for the proposal review group

@rsc rsc closed this as completed Jun 1, 2022
@smikulcik
Copy link
Author

@rsc I see this as an alternative to the omitzero proposal. Instead of baking the omission policy into the struct tag, this proposal allows us to customize the policy on if we want to omit a field. It also lends itself towards the JSON-native "omission" notion instead of the go-specific "zero-value" or the go-json-specific notion of "empty-value".

@dsnet
Copy link
Member

dsnet commented Jun 1, 2022

There are quite a number of "omit this field" proposals. #11939 proposes special-casing an IsZero() bool method, which I think is a superior proposal than this proposal since it a much simpler interface than Omitted() (bool, error).

@smikulcik
Copy link
Author

@dsnet Yeah the IsZero() bool is pretty good too. I added the error param in my proposal to go along with the "everything may fail" idiom. I'd be open to simplifying down to Omitted() bool if it makes a difference.

@golang golang locked and limited conversation to collaborators Jun 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants