backend: add experimental defrag txn limit flag #15511
base: main
Conversation
Force-pushed from ffdf4c3 to e9b810b.
Thanks for proposing this idea @sjdot. The maintainers can likely give some insight into the original intention for the limit; just in terms of the code, I have one suggestion below.

It sounds to me like a batch limit, so I suggest the new variables are called DefragBatchLimit and ExperimentalDefragBatchLimit. That makes it clear we are dealing with a batch limit rather than an overall limit.

Finally, you mention that the performance of defrag can be influenced with this flag; are you able to provide some benchmarks when you're ready for wider review? Thanks!

/cc @cenkalti
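To make the naming suggestion concrete, here is a minimal sketch of what the renamed field might look like (the struct and its placement are assumptions for illustration, not actual etcd config):

```go
package config

// Sketch only: the struct and field placement are assumptions,
// not etcd's real server configuration.
type ServerConfig struct {
	// ExperimentalDefragBatchLimit bounds the number of keys copied per
	// transaction during defragmentation: a per-batch limit, not a cap
	// on the defrag as a whole.
	ExperimentalDefragBatchLimit int
}
```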
Thanks @ahrtr, sounds good. Do you anticipate your change would appear in a 3.5.x release, or would this be further down the line? Would it make sense to have this in place before then so it can be passed in when the refactor is done?
Firstly, it depends on whether the change to bbolt.Compact is backward compatible. I think yes, but the interface may be a little ugly. Secondly, building on that first point: technically speaking, it's OK to backport the change to 3.5.x, but it's a feature, and usually we don't backport features. Please share your test results on the performance comparison.
Here is some synthetic benchmarking I ran through. Methodology was:
Let me know if there is a better/standard way to do this.
Key size varies between different workloads. Does it make more sense to set the limit as a transaction size value instead of a key count, similar to the bbolt.Compact implementation?
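For reference, bbolt's Compact helper already expresses its batching limit in bytes rather than keys; if I recall correctly, its signature looks like this (with txMaxSize == 0 disabling the intermediate commits):

```go
// go.etcd.io/bbolt: txMaxSize caps the bytes copied per transaction.
func Compact(dst, src *bolt.DB, txMaxSize int64) error
```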
Thanks @sjdot for sharing the test data. A smaller peak RSS on 1K (the flag value) compared to 10K aligns with my understanding, but the smaller duration on 1K is interesting; it may be related to the key size (50kb in your example). I agree that we should set a transaction size instead of a key count. I think we can then get consistent test results, which should be independent of the key size.
Ok @ahrtr @cenkalti, I made a branch that uses the size as the limit instead of the number of keys, and it looks like it's more consistent regardless of the key size/count distribution. I tested two different 1GB DBs, one made up of 20k 50kb keys as per the above tests and another made up of 2k 500kb keys. I kept the size limit and the key count limit the same in both test cases, and the size limit is clearly more consistent for performance here. It seems it's just the number of bytes to get through, rather than the number of keys, that influences the performance. Tests below:
After both etcd-io/bbolt#422 (comment) and #15470 are resolved, we can continue to work on this PR. But I suggest setting the defragLimit to a maximum transaction size in bytes instead of a number of keys, something like below:
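(The snippet below is a hedged sketch of that idea, not the author's original code: it copies one bucket while committing whenever the current batch exceeds a byte cap. The names defragCopy and limitBytes are illustrative.)

```go
package backend

import (
	bolt "go.etcd.io/bbolt"
)

// defragCopy copies every key/value pair of one bucket from src into dst,
// committing and reopening the destination transaction whenever the bytes
// written in the current batch would exceed limitBytes. Sketch only; this
// is not etcd's actual defragdb implementation.
func defragCopy(src, dst *bolt.DB, bucket []byte, limitBytes int) error {
	stx, err := src.Begin(false) // read-only view of the source DB
	if err != nil {
		return err
	}
	defer stx.Rollback()

	dtx, err := dst.Begin(true)
	if err != nil {
		return err
	}
	db, err := dtx.CreateBucketIfNotExists(bucket)
	if err != nil {
		dtx.Rollback()
		return err
	}
	db.FillPercent = 0.9 // sequential writes pack pages more densely

	sb := stx.Bucket(bucket)
	if sb == nil {
		return dtx.Commit()
	}

	batch := 0
	if err := sb.ForEach(func(k, v []byte) error {
		if batch > 0 && batch+len(k)+len(v) > limitBytes {
			// Batch is full: commit and start a fresh transaction so
			// peak memory stays bounded by roughly limitBytes.
			if err := dtx.Commit(); err != nil {
				return err
			}
			if dtx, err = dst.Begin(true); err != nil {
				return err
			}
			db = dtx.Bucket(bucket)
			db.FillPercent = 0.9
			batch = 0
		}
		batch += len(k) + len(v)
		return db.Put(k, v)
	}); err != nil {
		if dtx != nil {
			dtx.Rollback()
		}
		return err
	}
	return dtx.Commit()
}
```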
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
I've been looking into defrag performance and noticed there is a hardcoded value defragLimit that controls the frequency at which transactions are committed during a defrag (currently set to 10k). It seems this value has been present since the original defrag implementation, and I couldn't find any docs or comments in the PR that state why 10k is used.

Through some brief synthetic testing I've observed that both memory usage and defrag run time can be influenced by changing this value, so I figured it might be useful to make it configurable via a CLI flag.

Is this reasonable, or is there a reason why this should stay at 10k?
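As a rough illustration of the proposal, here is a standalone sketch of wiring such a flag (the flag name experimental-defrag-batch-limit follows the naming suggested above and is an assumption, not an actual etcd flag):

```go
package main

import (
	"flag"
	"fmt"
)

// Sketch only: the flag name follows the reviewer's naming suggestion
// and is not a real etcd flag.
var defragBatchLimit = flag.Int(
	"experimental-defrag-batch-limit",
	10000, // today's hardcoded default in the defrag loop
	"maximum number of keys copied per transaction during defrag",
)

func main() {
	flag.Parse()
	// In etcd the parsed value would be threaded through the backend
	// config and used in place of the hardcoded defragLimit constant.
	fmt.Println("defrag batch limit:", *defragBatchLimit)
}
```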