Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

node/bindnode: reduce verifyCompatibility() calls, consider caching some type info #416

Open
rvagg opened this issue May 19, 2022 · 2 comments
Assignees
Labels
P2 Medium: Good to have, but can wait until someone steps up

Comments

@rvagg
Copy link
Member

rvagg commented May 19, 2022

Currently we run verifyCompatibility() on every call to Prototype() and Wrap() (unless it's inferring a schema for you), and it's fairly expensive. Unfortunately we currently can't cache this because we have no place to cache it in.

But, we could consider something like having a bindnode version of schema.Type that you get from a Prototype().Type(). So, a workflow where you do an initial Prototype() and then reuse the resultant Type() for each further Wrap() call. If Wrap() finds a bindnode version of schema.Type it could then assume that verifyCompatibility() has already been performed if the Go type is the same.

Further, we could take the opportunity to cache information about various properties of the Go type(s), such as field names and whether something is a pointer or not. Anything that's a little expensive to do at Wrap() and assemble (on the Prototype). Even things like fieldNameFromSchema() calls could be cached because they're showing up in profiles as a concern.

@rvagg rvagg added the P2 Medium: Good to have, but can wait until someone steps up label May 19, 2022
@mvdan
Copy link
Contributor

mvdan commented May 19, 2022

This is briefly mentioned in a TODO:

// Plus, one map per call means we don't reuse work.

I'll add some context, as it doesn't seem like I wrote it down anywhere. We can globally cache any information with reflect.Type as a key, and that is what other encoding libraries do already. For instance, encoding/json prepares efficient marshaler and unmarshaler funcs for each struct type, and stores them globally to be quickly reused later. This is fine because the set of Go types is finite - they are loaded into the runtime forever, so it's already "leaky" to keep creating new distinct Go types forever. Using a global map[reflect.Type]T does not make the problem worse.

That kind of thing can certainly be done for inferSchema, as the input is just a reflect.Type. It's harder for verifyCompatibility and inferGoType, because schema.Type is now part of the input, and there isn't a finite set of IPLD schema types.

I should also note that I don't think this is a problem for Prototype. That API is meant to be used once upfront, and then the prototype is reused, by e.g. storing it as a global variable or struct field. I can't see why we would ever need to call the API with the same inputs again.

Wrap is different because its entire purpose is to call it with different Go values as needed, so repeated calls with the same Go type and IPLD schema type are natural. Akin to your suggestion, I had thought of an alternative API like:

func WrapPrototype(ptrVal interface{}, prototype schema.TypedPrototype) schema.TypedNode

It would require the prorotype to have been created by its Prototype API, and so it would skip the verifyCompatibility step entirely. It would also not support any form of inferrence - the purpose of it being that you've done that work already, if needed.

@rvagg
Copy link
Member Author

rvagg commented May 20, 2022

func WrapPrototype(ptrVal interface{}, prototype schema.TypedPrototype) schema.TypedNode

👍 I was considering proposing basically this as well, for this reason. But the only excuse for needing the entire prototype is that you want to ensure the first-pass is already done. But that does seem reasonable considering the cost savings.

Thanks for the input!

@rvagg rvagg moved this to 🗄️ Backlog in IPLD team's weekly tracker Jul 19, 2022
@rvagg rvagg moved this from 🗄️ Backlog to 🥞 Todo in IPLD team's weekly tracker Aug 2, 2022
@rvagg rvagg self-assigned this Aug 2, 2022
@rvagg rvagg moved this from 🥞 Todo to 🗄️ Backlog in IPLD team's weekly tracker Nov 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 Medium: Good to have, but can wait until someone steps up
Projects
Status: 🗄️ Backlog
Development

No branches or pull requests

2 participants