[FRAME core] Clawback PoV Weights For Dispatchables #209
@paritytech/sdk-node it would be good to get some input here on how proof recording currently works on the client side, and what else can be done to assist this. For example, how expensive is it to know the accurate proof size as we are building the block? Can the client measure and report back the actual post-dispatch proof size?
We can't realistically enable proof size benchmarks as they currently are without a big regression in block size. This is a major blocking issue.
Relative "cheap": https://github.com/paritytech/substrate/blob/5b253f8757de20cd95f274eb35531a5b0a683987/primitives/trie/src/recorder.rs#L155-L157 However, we currently don't take the compressed size into account. |
We could start with the compacted proof; that should already bring advantages in cases where keys are accessed repeatedly, and I think it is quite cheap to compute.
I think the better and much easier solution is to come up with some kind of factor between the storage proof and the compressed storage proof. @cheme was always arguing that the compression doesn't bring that much, as we have binary data that can not be compressed much further. So maybe we should collect all the factors of storage proof vs. compressed proof in benchmarking, and then take the worst factor to calculate the compressed storage proof size based on the storage proof size returned by the host function.
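As a rough illustration of that idea (a sketch only; the function and parameter names are made up here, not an existing API), the runtime could scale the uncompressed proof size reported by the host function by the worst compression ratio observed during benchmarking:

```rust
/// Estimate the compressed proof size from the uncompressed size reported by
/// the host function, using the worst (largest) compressed/uncompressed ratio
/// seen across all benchmarks, expressed in percent.
fn estimated_compressed_proof_size(uncompressed_size: u64, worst_ratio_percent: u64) -> u64 {
	uncompressed_size.saturating_mul(worst_ratio_percent) / 100
}
```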
Some of the data can be compressed well, such as … But I think the main contributor to the big size is the difference between …
This is not really relevant to what I said?
Yeah for sure, but when you access some balance, there are probably more trie nodes contributing to the storage proof size than the actual balance. Even if we only allow reading 5 MiB of "uncompressed data" it should be good enough, assuming that the compression ratio is somewhere between 1-2 on average.
This is also not estimating the compacted proof (I did try to have a primitive for the compacted proof; it means recording the parent-child relation too in order to reduce the estimated size. It was not that simple or cheap, so I dropped the branch). As for compression, it does not compress the trie nodes much (they are mostly crypto hashes), but the values in the proof can be compressed (not all values of course, it depends on the chain). But there might be something to do by compressing the values only for estimation (at least a proof format could be designed with two files: one with trie material and one with indexed values, so that we only compress the second one).
I would propose that we check the current storage proof size after applying an extrinsic. Then we would return the unused (worst_case - actual) proof size, just as we return the unused ref time. This should be implemented in the following way: we introduce a custom host function into Cumulus:
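A rough sketch of what such a host function could look like, assuming the usual `sp_runtime_interface`/`sp_externalities` machinery; the names `ProofSizeProvider`, `ProofSizeExt` and `storage_proof_size` are placeholders, not the final API:

```rust
use sp_externalities::ExternalitiesExt;

/// Node-side handle that knows the currently recorded proof size,
/// e.g. a reference to the proof recorder.
pub trait ProofSizeProvider: Send + 'static {
	fn current_proof_size(&self) -> u64;
}

sp_externalities::decl_extension! {
	/// Externalities extension giving the runtime access to the recorded proof size.
	pub struct ProofSizeExt(Box<dyn ProofSizeProvider>);
}

#[sp_runtime_interface::runtime_interface]
pub trait StorageProofSize {
	/// Returns the number of bytes recorded in the storage proof so far,
	/// or 0 if proof recording is not enabled on this node.
	fn storage_proof_size(&mut self) -> u64 {
		self.extension::<ProofSizeExt>()
			.map(|ext| ext.current_proof_size())
			.unwrap_or(0)
	}
}
```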
In Substrate this will mean that we will introduce a new … While the host function isn't used by Polkadot itself, we should still create some RFC for adding it. There are four different "stages" in which this host function will be called.

**Block building**
When we are building a block, we are already collecting the storage proof. So, we will just need to register the extension and it should work.

**Block import**
Currently we don't collect a storage proof while importing a block, as we don't need it. However, we would need to start doing this to have … Another, more complex solution would be to alter FRAME to have some new hook …

**Block validation**
Depending on how we implement the solution for block import, block validation would do it similarly. If we enable storage proof recording during block import, it would require adding some new field to … If we choose the other solution as outlined above, block validation will not require any kind of special handling and will just work™.

**Offchain calls**
For any kind of offchain call that maybe is hitting …

We now only need to discuss which block import solution to implement. I would propose @skunert as the one implementing the changes to make the host function work. The actual FRAME integration should then be relatively easy.
I thought the compacted proof is just de-duplicating node entries from the proof? Or is that already happening?
Is there an easy way to keep host functions extensible?
The issue states multiple situations where this PoV clawback would be good. I assume this host function is transaction-aware and could just be called within something like …
We don't have duplicated nodes by default. This is by design. Compaction removes an intermediate node when we have all the children available to recreate it.
I'm not really convinced that we need this. Especially if we added a new enum variant, we should also bump the host function version. Stuff like compaction could probably be implemented using some kind of iterative algorithm. However, we would then still need to compress the compacted result, which can not be done iteratively, as a node in between could have been removed by compaction.
The storage recorder is transactional, but this doesn't work the way you expect. We can only discard a layer if the extrinsic isn't part of the block. The moment the transaction is part of the block, we need to record all the accessed nodes and can not throw away anything. Otherwise the validation will not work.
It should work, but I don't see the reason? You can just do it after …
That makes sense. I think in general you didn't understand what I meant by these "stages". These are seen from the node side; the runtime doesn't care and can call the host function as it likes.
I did try to implement it inline (when writing a new node we pass its parent reference; then we can remove one hash size if the parent in the proof still has this hash unaccessed). Actually, right now I realize that every time you include a new trie node in the proof node set, it has a parent, so if it is its first insert you can say that we remove one hash from the size of the compact proof. So I think it is fairly OK to do (I just realized that, so maybe there is a caveat): the only cost is that the insert in the proof recorder needs to be able to check whether the node is already present (so I think we are fine here with the existing node hashmap of the current recorder). The algorithm is simply: on insert, if the node did not exist, remove ~32 bytes from the proof size (except for the very first node, the root). Edit: actually, it can also be calculated simply from the number of nodes currently recorded: subtract (nb_nodes_recorded - 1) * 32 from the current proof size.
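A minimal sketch of that estimate (illustrative only, not an existing API): every recorded node except the root lets the compact proof drop one 32-byte child hash from its parent.

```rust
/// Estimate the compact proof size from the size and node count of the
/// currently recorded (non-compact) proof.
fn estimated_compact_proof_size(recorded_proof_size: usize, recorded_node_count: usize) -> usize {
	const HASH_LEN: usize = 32; // size of one trie node hash in bytes
	recorded_proof_size.saturating_sub(recorded_node_count.saturating_sub(1) * HASH_LEN)
}
```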
Ah yea, that was a stupid example 🙈
Okay. I was only concerned that it would not update the PoV estimation immediately after a storage access in the runtime, but it seems that it does.
So it's not just extrinsics, but also tasks and messages. Jacob correctly mentioned that we don't need to correct-down the weight tracker on every operation (extrinsic/task/message), but only when we believe we have reached some limit. At this point, and only here, should we correct-down to the actual amount of weight used and recheck against the limit properly. In the case of the scheduler, to which a chain might allocate 80% of block space, the first five tasks might use, say, 75%, with the sixth task estimated at 10%. It would only be at this point that the weight is corrected-down to, say, 45% and the sixth task executed. The block builder will need to be properly aware, but that's not really the concern of FRAME. From the FRAME team's point of view, the host function is really all that is needed.
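A rough sketch of that "correct down only at the limit" pattern (all names here are made up for illustration; `actual_proof_size` stands in for the proposed host function):

```rust
/// Run benchmarked tasks against a proof-size limit, only querying the real
/// recorded proof size once the estimates would exceed the limit.
fn run_tasks(limit: u64, tasks: Vec<(u64, Box<dyn Fn()>)>, actual_proof_size: impl Fn() -> u64) {
	let start = actual_proof_size();
	let mut estimated_used: u64 = 0;
	for (estimate, task) in tasks {
		if estimated_used + estimate > limit {
			// Correct the tracker down to what was really consumed so far...
			estimated_used = actual_proof_size().saturating_sub(start);
			// ...and re-check the next task against the limit with the corrected value.
			if estimated_used + estimate > limit {
				break;
			}
		}
		task();
		estimated_used += estimate;
	}
}
```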
@bkchr I presume "Block import" is what happens in the collator node, whereas "Block validation" is what happens in the validator node? Seems sensible to go the …
Yeah, sorry! I should have been more clear. By block import I mean when any node imports this block into its database. By block validation I mean the relay chain running `validate_block`.
What do you mean by the …?
We don't support distributing any data alongside the block, at least not currently. In the database we store the parts of the block separately, aka the …
I briefly talked with @gavofyork. We came to the following conclusion/idea: …
@bkchr we also briefly discussed the possibility of (re)evaluating the storage proof size during execution on later validation/execution, by marking the parts of the proof which get used as they get entered into a cache. This would obviate the need for this extra inherent phase and the new requirements of … Is this viable?
The idea was that in … The only question would be what to do with …
Once we have this, we can do some kind of an "auto-refund" scheme for all dispatchables. This will help increase block utilization and slightly reduce fees for end users.
The pallet-contracts team is looking into this issue, as decreasing …
One idea discussed in this thread is to use systematic over-estimation, with strategies like imposing a max PoV weight of 1/10th of a block's capacity, ensuring we can fill up to 90% of the available block space (the intuition being that the space wasted by the final over-estimate is then bounded by 10% of the block). Would love to hear your thoughts on this.
This would not include the trie nodes, and thus we would underestimate the size of the PoV.
This would require letting the node abort the execution if any storage access would go over the maximum PoV weight.
Isn't counting key + value size an over-estimation, since the trie path is repeated in each key?
Can't we just exclude the transaction when this happens?
No. Trie nodes consist of up to 16 * 32-byte hashes, and from the runtime you don't know how many of them sit between the root and your value.
It is about the block producer, who would find out about this while building a block, and not beforehand as it is right now. You need to abort the transaction, but the block producer would still need to get paid for this, as they have done the work.
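To illustrate the trie-node point above with some rough, assumed numbers (an upper-bound sketch, not a real API):

```rust
/// Worst-case proof bytes pulled in by a single storage read: up to `depth`
/// branch nodes, each carrying up to 16 child hashes of 32 bytes, plus the
/// value itself. None of this is visible from key + value sizes alone.
fn worst_case_read_proof_size(depth: u64, value_len: u64) -> u64 {
	const MAX_BRANCH_NODE_HASHES: u64 = 16 * 32;
	depth * MAX_BRANCH_NODE_HASHES + value_len
}
```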
Makes sense, thanks for your answers. I think the important thing here is being able to correct down the weight after each extrinsic.
I started looking into this issue some time ago, and my current draft mostly follows what has been sketched earlier in this issue. I plan to start opening PRs for this soon. Outline of the current draft: …
That is indeed going to be very useful. Having recently done many benchmarks, I think some worst-case scenarios (especially system.account or locks...) are too far from reality (due to the worst-case estimation of the trie branches).
Following up on the pending PR, as it's expected to fix srlabs_findings/issues/276: @skunert please let me know when you have the PR and its expected ETA for merging. Thanks in advance!
The PR is available here: #1462. I am currently waiting on some more reviews (I know some are in progress). However, I sadly can not access the issue you linked, so I am not sure what it is about.
I would like to discuss the reclaiming mechanism on the runtime side. I think there are two main scenarios worth discussing.

**Extrinsics**
When we apply extrinsics, we can just check the proof size before and after the application. The diff between the values tells us how much proof size was actually consumed. The difference between the consumed proof size and the benchmarked one can be reclaimed. I have a simple …

**Hooks**
For reclaiming in hooks the situation is a bit more complicated.
One way to allow easy manual reclaiming would be to provide a cumulus-supplied …

**Alternatives**
In an ideal world, we could just set the current weight to the proof size directly, without the diff-calculation steps. However, this would require a change in how we handle … These are my thoughts about it, but I am not too experienced with FRAME, so happy to hear further opinions.
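A minimal sketch of the extrinsic-side diffing described above (all names are illustrative; `measure` stands in for the new host function):

```rust
/// Apply an extrinsic and return how much of its benchmarked proof size can be
/// reclaimed, based on the proof size recorded before and after application.
fn reclaimable_proof_size(
	benchmarked: u64,
	measure: impl Fn() -> u64,
	apply_extrinsic: impl FnOnce(),
) -> u64 {
	let before = measure();
	apply_extrinsic();
	let consumed = measure().saturating_sub(before);
	// Only reclaim when the benchmark over-estimated; never go negative.
	benchmarked.saturating_sub(consumed)
}
```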
Agree with that. In a world with clawback PoV, I would probably adjust the logic and add an extra check where I would call the client API to get the adjusted PoV, before trying one more time to make progress and eventually breaking from the loop. Alternatively, if the client call is pretty cheap, we could also just call it every time we mutate the remaining_weight, to get the PoV component adjusted to its real value.

polkadot-sdk/substrate/frame/contracts/src/lib.rs, lines 410 to 427 in 951bcce
I think for the hooks we can, for now, let the implementer do the metering. The big user-facing part is transactions.
This is fine and I don't see the problem. In this case, "manually reclaiming" would just mean calling the new HostFn after each loop iteration. For example:

```rust
fn on_idle(limit: Weight) -> Weight {
	for item in work {
		item();
		// `new_host_call` stands in for the proposed host function returning
		// the currently recorded proof size.
		if new_host_call() > limit.proof_size() {
			break;
		}
	}
	// Return whatever we estimate.
}
```

The current … With … For the signed extension: I think it must be in a specific order, so that the Weight SE deduces the correct amount?
Okay, so you guys agree that manual reclaiming for the hooks is the way to go.
Yeah, something like that, but since we are getting the remaining block weight passed into `on_idle`:

```rust
fn on_idle(limit: Weight) -> Weight {
	// Naming TBD: `ProofAwareWeightMeter` internally stores the current storage
	// proof size at construction time by calling the host function.
	let mut meter = ProofAwareWeightMeter::new(limit);
	for item in work {
		// Some checking logic that the item actually fits into the remaining
		// weight, based on its benchmarked weight.
		// ...
		item();
		// This corrects the consumed storage weight on the meter downwards,
		// based on the diff of the proof size now vs. at initialization time.
		meter.refund_proof_size();
	}
	meter.consumed()
}
```
What scenarios do you have in mind? You mean other extensions already modify the proof weight and it could lead to wrong results? That was not on my radar until now.
From what I have seen so far, I think it's attractive to have this in a SE. It stays nicely modular and uses the same mechanism as the …
This PR provides the infrastructure for the pov-reclaim mechanism discussed in #209. The goal is to provide the current proof size to the runtime so it can be used to reclaim storage weight.

## New Host Function
- A new host function is provided [here](https://github.com/skunert/polkadot-sdk/blob/5b317fda3be205f4136f10d4490387ccd4f9765d/cumulus/primitives/pov-reclaim/src/lib.rs#L23). It returns the current proof size to the runtime. If recording is not enabled, it returns 0.

## Implementation Overview
- Implement an option to enable proof recording during import in the client. This is currently enabled for `polkadot-parachain`, `parachain-template` and the cumulus test node.
- Make the proof recorder ready for no-std. It was previously only enabled for std environments, but we need to record the proof size in `validate_block` too.
- Provide a recorder implementation that only records the size of incoming nodes and does not store the nodes themselves.
- Fix benchmarks that were broken by async backing changes.
- Provide a new externalities extension that is registered by default if proof recording is enabled.
- I think we should discuss the naming; pov-reclaim was more intuitive to me, but we could also go with clawback like in the issue.

## Impact of proof recording during import
With proof recording: 6.3058 Kelem/s
Without proof recording: 6.3427 Kelem/s

The measured impact on importing performance is quite low on my machine using the block import benchmark. With proof recording I am seeing a performance hit of 0.585%.

---------

Co-authored-by: command-bot <>
Co-authored-by: Davide Galassi <[email protected]>
Co-authored-by: Bastian Köcher <[email protected]>
The PoV benchmarking aims to estimate the worst-case scenario, which is unlikely to occur on average.
Overly large over-estimations hurt block utilization. There is currently no way to retroactively measure the actually used proof size.
Having such a retroactive measurement in place would enable a multi-iteration approach: alternating between dispatch and assessment phases to reach maximum block utilization.
It would therefore be nice to have a way to measure the weight of a batch of dispatchables after their execution. For example in: …
Basically, in every situation where we need to fit a batch of dispatchables into a weight limit and assume that the pre- and post-dispatch PoV weight differ greatly.