Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dag-cbor: add ability to validate input without fully deserializing #158

Open
Tracked by #1048
raulk opened this issue Nov 4, 2022 · 9 comments · May be fixed by #159
Open
Tracked by #1048

dag-cbor: add ability to validate input without fully deserializing #158

raulk opened this issue Nov 4, 2022 · 9 comments · May be fixed by #159

Comments

@raulk
Copy link
Member

raulk commented Nov 4, 2022

To implement filecoin-project/FIPs#483 securely in the FVM, we need to validate that entry values are well-formed DAG-CBOR payloads. All we need is to perform syntatical validation without incurring in any serde costs/overheads.

@vmx
Copy link
Member

vmx commented Nov 7, 2022

Based on https://github.com/filecoin-project/FIPs/pull/483/files#r1013446172 comment, it sounds being more than syntactical validation, e.g. key ordering. I guess using the Serde code path wouldn't add much overhead, if the deserialized values are dropped (I assume something like that is possible, I haven't tried).

If a custom validation function is added, it should be benchmark against a Serde version, to see if the performance justifies having a separate code with potentially introduces bugs.

@raulk
Copy link
Member Author

raulk commented Nov 7, 2022

@vmx How would a serde path work here? This is a dynamic data structure whose schema we don't know.

@vmx
Copy link
Member

vmx commented Nov 7, 2022

@raulk: You can deserialize to an Ipld enum. See https://github.com/ipld/serde_ipld_dagcbor/blob/ea9b594421a47ac431627781a65d641ff54a3f2b/tests/de.rs#L88-L98 for a full example.

@raulk
Copy link
Member Author

raulk commented Nov 7, 2022

Yes I know, but validating the whole input would imply deserialising into ipld::Ipld, which uses owned data, so it's not zero-copy syntactical validation?

@raulk
Copy link
Member Author

raulk commented Nov 7, 2022

Here's a PR: #159 I'm working on removing the recursion. I have some ideas.

@vmx
Copy link
Member

vmx commented Nov 10, 2022

Yes I know, but validating the whole input would imply deserialising into ipld::Ipld, which uses owned data, so it's not zero-copy syntactical validation?

Correct. Though I think zero-copy should be possible with Serde, it's just not implemented in serde_ipld_dagcbor.

@raulk
Copy link
Member Author

raulk commented Nov 10, 2022

@Stebalien hinted that deserializing into IgnoredAny may do the trick here.

@Stebalien
Copy link
Collaborator

Unfortunately, we've found that that doesn't quite work as upstream (cbor4ii) doesn't validate minimality.

@vmx
Copy link
Member

vmx commented Nov 10, 2022

as upstream (cbor4ii) doesn't validate minimality.

We already kind of patch upstream in serde_ipld_dag_cbor, could it be integrated there? I'd surely be interested in pushing as much as possible upstream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants