Update database benchmarks to take trie cache into account #6131
Sorry, I did not see the ping. The code is here: …
You can use it from the polkadot or polkadot-omni-node binary: …
Do you mostly need this for Polkadot+Kusama AssetHub or something else as well?
Yes, it is needed only for AssetHub. But it would be good to also have those weights in polkadot-sdk so we can test on Westend AssetHub. I would assume we add some flag for this. This flag would be the same as …
Would be awesome if the runtime somehow could advise the client which of those flags to set. But I guess this is not in the cards.
I would break this issue into several parts:
POC: …
I’m not entirely sure I understand your point. If you mean setting …, I believe there’s an alternative approach where the runtime determines the node’s behavior, rather than the other way around. For example, the runtime specifies … Am I missing something here, or does this approach make sense?
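One way the "runtime specifies its requirements" idea could look, sketched as a runtime API. The trait and method are invented for illustration; no such API exists in polkadot-sdk:

```rust
// Hypothetical runtime API letting the client query the runtime's node
// requirements at startup, and error out (or warn) if they are not met.
sp_api::decl_runtime_apis! {
    pub trait NodeRequirements {
        /// Minimum trie cache size, in bytes, that the node must be started
        /// with for the runtime's storage weights to be sound.
        fn minimum_trie_cache_size() -> u64;
    }
}
```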
This is why I wrote that you are only supposed to set this value when you have made sure your nodes all start with said CLI switch. It is a convention, just like all the rest of the weights (you need to have a certain hardware spec).
Yes, this would be a good safeguard. But I wouldn't set it automatically. Rather, I would error out if the CLI switch is not set, or emit a warning that needs to be overridden. It is still a heuristic. Regarding your plan: I would expect to see "changing the benchmarks to measure in-memory access" somewhere. We also need to find out whether the trie cache is write-back or write-through. If it is write-through, the cache is probably too slow. Benchmarking will reveal that.
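For readers unfamiliar with the distinction: a write-through cache forwards every write to the backing store synchronously, while a write-back cache buffers writes in memory and flushes them later. A minimal sketch with toy types (not the actual trie cache API):

```rust
use std::collections::HashMap;

/// Toy backing store standing in for RocksDB (for illustration only).
#[derive(Default)]
struct Disk {
    data: HashMap<Vec<u8>, Vec<u8>>,
    sync_writes: usize, // how many writes hit "disk" synchronously
}

#[derive(Default)]
struct Cache {
    entries: HashMap<Vec<u8>, Vec<u8>>,
    dirty: Vec<(Vec<u8>, Vec<u8>)>, // pending writes (write-back only)
}

impl Cache {
    /// Write-through: the caller pays the disk latency on every write.
    fn write_through(&mut self, disk: &mut Disk, k: Vec<u8>, v: Vec<u8>) {
        self.entries.insert(k.clone(), v.clone());
        disk.data.insert(k, v);
        disk.sync_writes += 1;
    }

    /// Write-back: the write only touches memory; disk I/O happens at flush.
    fn write_back(&mut self, k: Vec<u8>, v: Vec<u8>) {
        self.entries.insert(k.clone(), v.clone());
        self.dirty.push((k, v));
    }

    /// Flush pending write-back entries, e.g. on commit or eviction.
    fn flush(&mut self, disk: &mut Disk) {
        for (k, v) in self.dirty.drain(..) {
            disk.data.insert(k, v);
        }
    }
}
```

In the write-back case a benchmark would only see memory latency on writes, with the disk cost deferred to the flush; this is why benchmarking should reveal which policy the trie cache uses.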
4. …
5. Investigate whether the trie cache is write-back or write-through
Ok, let's say we release a new polkadot-fellows runtime for AssetHubPolkadot/AssetHubKusama/Westend with benchmarked …
Could you please elaborate on this, specifically on "when you made sure your nodes all start with the said CLI switch" and "your nodes"? How would you achieve or ensure this for AssetHubPolkadot/AssetHubKusama? I mean, anyone can run a node/collator with a different node version and/or without the …
The on-disk format shouldn't be relevant, as it will not hit RocksDB synchronously (assuming a write-back cache). Maybe …
Yeah, it is a good idea to inspect how the trie cache scales with size.
How would you safeguard against somebody running their node on a slow computer? You can't. It is exactly the same thing here. We are essentially raising the hardware requirements. So we communicate this new requirement, and any safeguard we can add is nice, but not required. We then update the runtime after enough people have upgraded their node (but at least all collators).
Light clients and validator nodes are not affected because they are stateless. Collators and full nodes are affected.
@athei I played a bit with …
Notes/Hints:
no trie cache: …
default trie cache (67108864 bytes = 64 MiB): …
1 GB trie cache: …
This does not look very promising. Cached vs. disk access should show a much bigger difference. Probably those benchmarks are not measuring the right thing. Have you looked into the benchmarks that Basti wrote for the trie cache?
As of right now, the weight for database access is hard-coded to 25µs and 100µs for read and write, respectively. Those numbers are way too high given that we have a trie cache now.
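For context, these hard-coded defaults live in frame_support's weight constants; roughly (paraphrased, exact paths and definitions may differ by version):

```rust
use frame_support::weights::{constants::WEIGHT_REF_TIME_PER_NANOS, RuntimeDbWeight};

// 25 µs per read, 100 µs per write: the defaults this issue argues are far
// too pessimistic once a sufficiently large trie cache is required.
#[allow(non_upper_case_globals)]
pub const RocksDbWeight: RuntimeDbWeight = RuntimeDbWeight {
    read: 25_000 * WEIGHT_REF_TIME_PER_NANOS,
    write: 100_000 * WEIGHT_REF_TIME_PER_NANOS,
};
```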
We should write a benchmark that measures storage access with a warm cache, meaning that every access in the benchmark hits the cache. This is okay to assume, as we will require the cache to be larger than the state size.
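A minimal sketch of that warm-cache measurement pattern, generic over any keyed store (the trie cache itself is not modeled here): populate, do one untimed warm-up pass, then time a second pass in which every access is a hit by construction.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Measure per-read latency once every key is guaranteed to be cached.
fn bench_warm_reads(num_keys: usize) -> Duration {
    let mut store: HashMap<u64, Vec<u8>> = HashMap::new();
    for k in 0..num_keys as u64 {
        store.insert(k, vec![0u8; 32]);
    }

    // Warm-up pass: against a real backend, this is what pulls the trie
    // nodes into the cache before any measurement starts.
    for k in 0..num_keys as u64 {
        std::hint::black_box(store.get(&k));
    }

    // Timed pass: every access hits the (warm) cache.
    let start = Instant::now();
    for k in 0..num_keys as u64 {
        std::hint::black_box(store.get(&k));
    }
    start.elapsed() / num_keys as u32
}

fn main() {
    println!("avg warm read: {:?}", bench_warm_reads(100_000));
}
```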
The results of that benchmark can be used for chains where we require collators to set a large --trie-cache-size. For example, AssetHub.
@ggwpez Do we have a storage access benchmark already in the codebase somewhere?
We also need to add code that reads the whole database on node startup. This makes sure that all the state is in RAM.
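A sketch of that startup warm-up pass (the iterator is hypothetical; in practice this would walk the state column of the node's database):

```rust
/// Hypothetical startup warm-up: iterate every (key, value) pair once so
/// that subsequent trie reads are served from memory rather than disk.
fn warm_up_state(state_iter: impl Iterator<Item = (Vec<u8>, Vec<u8>)>) -> (usize, usize) {
    let (mut keys, mut bytes) = (0usize, 0usize);
    for (k, v) in state_iter {
        // Reading the entry is the point: it pulls the data into the cache.
        // We just tally some stats for logging.
        keys += 1;
        bytes += k.len() + v.len();
    }
    (keys, bytes)
}
```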