com.hedera.pbj.runtime.Codec has a method, measureRecord(), that measures how many bytes a record would take when serialized. This method is used in the Virtual Mega Map prototype (hashgraph/hedera-services#17007) to estimate the in-memory size of the virtual node cache, which is needed to flush data to disk at the right times.
There are a couple of issues with measureRecord(), though:
Performance. These methods are slow.
The size includes protobuf tags, lengths, and so on, none of which exist in memory.
Varint-encoded ints/longs may take fewer or more bytes when serialized than in memory.
Fields with default values are not taken into consideration, yet they still take some bytes in memory.
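To make the varint point concrete, here is a minimal sketch of how many bytes the protobuf wire format uses for an unsigned varint. The class and method names are hypothetical and are not part of PBJ. A primitive long always occupies 8 bytes in memory, while its serialized size ranges from 1 to 10 bytes depending on the value.

```java
// Hypothetical helper for illustration only; not a PBJ API.
final class VarintSizeDemo {

    /** Returns the number of bytes protobuf needs to encode the value as an unsigned varint. */
    static int varintSize(long value) {
        int size = 1;
        // Each varint byte carries 7 payload bits; keep going while higher bits remain.
        while ((value & ~0x7FL) != 0) {
            value >>>= 7;
            size++;
        }
        return size;
    }
}
```

For example, varintSize(1) is 1 and varintSize(-1L) is 10, so a serialized-size measurement can either undercount or overcount the 8 bytes the same long takes in memory.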
All of this makes the method poorly suited to estimating the in-memory size of the virtual node cache. This ticket is to provide a new method in the Codec interface for this purpose. It doesn't have to be very precise, but it has to be fast.

For example, it's really hard to know how many bytes a String field uses in memory, but length() * 2 is a fast and very conservative estimate. Bytes is easy: 16 + byte array length. Boxed booleans/integers seem to be 16 bytes, boxed longs are 24 bytes, and so on. Some research is needed here to find good memory estimates for all field types. The focus should be on speed rather than precision.
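The per-type figures above could be combined roughly as in the following sketch. Everything here is an assumption for illustration: the class names, the method names, the sample fields, the use of byte[] as a stand-in for Bytes, and the choice to count null fields as zero. The constants are the tentative numbers from this ticket, not measured values, and this is not the proposed Codec signature.

```java
// Hypothetical sketch; names and the null-handling policy are assumptions, not PBJ APIs.
final class MemoryEstimates {
    // Tentative figures quoted in this ticket; they still need research.
    static final int BOXED_INT_BYTES = 16;
    static final int BOXED_LONG_BYTES = 24;
    static final int BYTES_OVERHEAD = 16;

    static int ofString(String s) {
        // Conservative: assume two bytes per char, ignore object headers.
        return s == null ? 0 : s.length() * 2;
    }

    static int ofBytes(byte[] b) {
        // byte[] stands in for the Bytes type here: fixed overhead plus payload length.
        return b == null ? 0 : BYTES_OVERHEAD + b.length;
    }

    static int ofBoxedLong(Long v) {
        return v == null ? 0 : BOXED_LONG_BYTES;
    }

    static int ofBoxedInt(Integer v) {
        return v == null ? 0 : BOXED_INT_BYTES;
    }
}

// Hypothetical model object with one field of each kind.
final class SampleRecord {
    final String memo;
    final byte[] key;
    final Long balance;
    final Integer flags;

    SampleRecord(String memo, byte[] key, Long balance, Integer flags) {
        this.memo = memo;
        this.key = key;
        this.balance = balance;
        this.flags = flags;
    }

    /** Sums constant-time per-field estimates; no serialization or traversal of encoded data. */
    int estimateMemorySize() {
        return MemoryEstimates.ofString(memo)
                + MemoryEstimates.ofBytes(key)
                + MemoryEstimates.ofBoxedLong(balance)
                + MemoryEstimates.ofBoxedInt(flags);
    }
}
```

With a 4-character memo, a 10-byte key, and non-null balance and flags, this yields 8 + 26 + 24 + 16 = 74 bytes. Each term is a constant or a single length lookup, which matches the goal of speed over precision.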