Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding symmetrical serialization to eds, row and sample #495

Conversation

citizen-stig
Copy link
Contributor

Follow up of #493

@citizen-stig
Copy link
Contributor Author

Tagging @zvolin for visibility

type Error = Error;

fn try_from(raw: RawExtendedDataSquare) -> std::result::Result<Self, Self::Error> {
ExtendedDataSquare::from_raw(raw, AppVersion::V2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can assume any particular AppVersion here, and this is likely the reason why there is no Deserialize derived here nor the try_from. I don't have any idea if we could do better there than just require using ExtendedDataSquare::from_raw

Copy link
Member

@zvolin zvolin Jan 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the rpc we work this around by requiring whole ExtendedHeader instead of just height when requesting EDS (and other share related stuff). We made an exception for the RowNamespaceData as requesting it for parity shares through celestia-node rpc is not supported

Copy link
Member

@zvolin zvolin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking through the PR, I'm not sure if there is a way to support this correctly. It's a problem we're facing porting celestia to rust that the structures here aren't self contained and often require additional information to make sure everything is sound. See celestiaorg/celestia-node#3977 for EDS and celestiaorg/CIPs#128 for shwap types.

I'm happy to revisit this if you find a way to make it correct, but I'm afraid it's not possible with current state as types are not self contained. With current way celestia operates, the only correct (and binary) encoding of those is through .encode and .decode, passing extra infos through corresponding Id types

Comment on lines +174 to +177
let mut shares = Vec::with_capacity(raw.shares_half.len());
for shr in raw.shares_half {
shares.push(Share::try_from(shr)?);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't correct and again I'm not sure if we can do better than Row::from_raw. Row is a structure which holds the whole row from EDS, but in it's serialized form, it drops a half which needs to be reconstructed using leopard RS. However with just a plain shares_half, it is not possible to know which side needs to be reconstructed.

type Error = Error;

fn try_from(raw: RawSample) -> std::result::Result<Self, Self::Error> {
let share = raw.share.ok_or(Error::MissingShares)?.try_into()?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this won't work for parity shares, and there is again not much information to judge if share should be treated as parity or not

@citizen-stig
Copy link
Contributor Author

After looking through the PR, I'm not sure if there is a way to support this correctly. It's a problem we're facing porting celestia to rust that the structures here aren't self contained and often require additional information to make sure everything is sound. See celestiaorg/celestia-node#3977 for EDS and celestiaorg/CIPs#128 for shwap types.

I'm happy to revisit this if you find a way to make it correct, but I'm afraid it's not possible with current state as types are not self contained. With current way celestia operates, the only correct (and binary) encoding of those is through .encode and .decode, passing extra infos through corresponding Id types

Ok, I see the issue with hard coding particular version of app into the deserialization.

But what can be workaround here? We rely on support of serde to serialize/deserialize those things in binary format.

If .encode is used in reality anyway, is it possible to remove #[serde(into = "RawExtendedDataSquare", try_from = "RawExtendedDataSquare")] so at least simmetrical binary serialization/deserialization works?

I guess we can use .encode and .decode in out types, if we manually implement serde derives. I will need to look into it. Thank you for advices!

@zvolin
Copy link
Member

zvolin commented Jan 2, 2025

If .encode is used in reality anyway, is it possible to remove #[serde(into = "RawExtendedDataSquare", try_from = "RawExtendedDataSquare")] so at least simmetrical binary serialization/deserialization works?

There is no symmetry in case of EDS because Deserialize is not implemented for it at all. I think that'd be the correct way also for Sample and Row, to just remove the Deserialize from them. They all serialize into Raw format, and then can be deserialized into Raw again but requiring additional info to reconstruct original type using from_raw.

If you need serde, I think you could wrap them in a struct containing also additional information required to make the deserialization correct

@zvolin
Copy link
Member

zvolin commented Jan 2, 2025

We don't like this too really... and we would love to have all the types self contained. It's mostly an idiomacy differences between go and rust, where in Go they happily serialize / deserialize everything and validate constrains where needed, and in Rust the idiomatic way is to not allow constructing invalid state at all

@citizen-stig
Copy link
Contributor Author

If .encode is used in reality anyway, is it possible to remove #[serde(into = "RawExtendedDataSquare", try_from = "RawExtendedDataSquare")] so at least simmetrical binary serialization/deserialization works?

There is no symmetry in case of EDS because Deserialize is not implemented for it at all. I think that'd be the correct way also for Sample and Row, to just remove the Deserialize from them. They all serialize into Raw format, and then can be deserialized into Raw again but requiring additional info to reconstruct original type using from_raw.

If you need serde, I think you could wrap them in a struct containing also additional information required to make the deserialization correct

Good point about EDS, missed that it does not have Deserialize.
And totally agree on removing Deserialize from Row and Sample.

I will try to go that route and encapsulate all needed information in custom deserialization for types that are used in our adapter

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants