
server: update defrag logic to use bolt.Compact() #15470

Open. Wants to merge 1 commit into base: main.
Conversation

@cenkalti (Member) commented Mar 14, 2023

They are pretty much the same implementation: iterate over the buckets, iterate over the keys, and copy them from the old db to the new db.

One difference is that the etcd implementation was setting FillPercent to 0.9, while the boltdb implementation sets it to 1.0. After this change, the defrag duration on my database dropped from 10.5 to 9 seconds, probably because of this difference.

The other difference is that the boltdb implementation also handles nested buckets. etcd does not have any nested buckets at the moment, but it's nice to have that covered.

With this change, the defrag limit changes from 10,000 keys to 10 MB of data, because the bolt.Compact() API expects a byte size.
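The copy loop being reused can be sketched in plain Go. This is a simplified in-memory model, not the real bbolt API: the Bucket type and its fields are invented for illustration. It only shows the shape of what bolt.Compact() does, as described above: walk buckets recursively, copy keys, and roll the transaction over once txMaxSize bytes have been written.

```go
package main

import "fmt"

// Bucket is a toy stand-in for a boltdb bucket: leaf key/values plus
// nested child buckets. This is NOT the real bbolt type.
type Bucket struct {
	kv     map[string][]byte
	nested map[string]*Bucket
}

// compact copies src into dst recursively, tracking bytes written so that a
// real implementation could commit and reopen the transaction once `written`
// exceeds txMaxSize (10 MB in the PR, replacing the old 10,000-key limit).
func compact(dst, src *Bucket, txMaxSize int64, written *int64) {
	for k, v := range src.kv {
		dst.kv[k] = append([]byte(nil), v...)
		*written += int64(len(k) + len(v))
		if *written >= txMaxSize {
			// real code: commit the current tx and begin a new one
			*written = 0
		}
	}
	for name, child := range src.nested {
		sub := &Bucket{kv: map[string][]byte{}, nested: map[string]*Bucket{}}
		dst.nested[name] = sub
		compact(sub, child, txMaxSize, written)
	}
}

func main() {
	src := &Bucket{
		kv: map[string][]byte{"rev1": []byte("a")},
		nested: map[string]*Bucket{
			"lease": {kv: map[string][]byte{"id1": []byte("b")}, nested: map[string]*Bucket{}},
		},
	}
	dst := &Bucket{kv: map[string][]byte{}, nested: map[string]*Bucket{}}
	var written int64
	compact(dst, src, 10*1024*1024, &written)
	fmt.Println(len(dst.kv), len(dst.nested["lease"].kv)) // 1 1
}
```

The recursion into nested buckets is the "nice to have" coverage mentioned above; etcd's own buckets are currently all top-level.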

@cenkalti cenkalti marked this pull request as draft March 14, 2023 01:06
@cenkalti cenkalti marked this pull request as ready for review March 14, 2023 02:35
@ahrtr (Member) commented Mar 14, 2023

I support reusing bolt.Compact, but not now. The big difference for now is the FillPercent: it's 0.9 in etcd, but 1.0 in boltDB. Filling the entire page (setting FillPercent to 1) gives the best compaction, but on the other hand it may also cause a page to be split on any single K/V insertion.

We should keep the FillPercent consistent in etcd, namely 0.9. On the other hand, we should consider improving boltDB to let users configure compaction options (e.g. FillPercent).
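The tradeoff described above can be made concrete with back-of-the-envelope arithmetic. The page capacity below is an invented number for illustration, not boltdb's real page layout; the point is only that FillPercent = 1.0 minimizes the page count at the cost of zero headroom, so any later insertion into a full page forces a split.

```go
package main

import (
	"fmt"
	"math"
)

// pagesNeeded returns how many leaf pages hold n items when each page is
// filled to `fill` of its capacity. A capacity of 100 items per page is an
// illustrative assumption, not boltdb's actual page size.
func pagesNeeded(n, capacity int, fill float64) int {
	perPage := int(float64(capacity) * fill)
	return int(math.Ceil(float64(n) / float64(perPage)))
}

func main() {
	const items, capacity = 1_000_000, 100
	full := pagesNeeded(items, capacity, 1.0)  // best compaction
	loose := pagesNeeded(items, capacity, 0.9) // ~11% more pages
	fmt.Println(full, loose) // 10000 11112

	// With fill 1.0 every page is full, so inserting a single key into any
	// page splits it; with fill 0.9 each page keeps ~10% free slots first.
	fmt.Println(capacity - int(float64(capacity)*0.9)) // 10
}
```

This is why 1.0 wins for a pure append-only workload (only the last page ever receives inserts) but 0.9 is safer when keys are updated in place.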

@cenkalti (Member, Author)

My understanding is that etcd uses revision numbers as keys in boltdb. Revisions are always incrementing, so etcd only inserts into the latest page. Am I missing something here?

@chaochn47 (Member)

Cenk, that may be true for the key bucket etcd writes into. But the lease and auth buckets are updated in place, which breaks the append-only assumption. 0.9 sounds reasonable for a mostly (but not entirely) append-only workload.

@cenkalti (Member, Author)

We should keep the FillPercent consistent in etcd, namely 0.9.

@ahrtr I don't agree with this. When I compact the database, I want the best compaction. The reason I compact is to reduce the database size so that it won't reach the backend quota limit soon. Because defrag is a blocking operation that takes many seconds, I prefer to do it as infrequently as possible. This is especially important for large databases (10 GB). Changing it from 0.9 to 1.0 would save an additional 1 GB. If my argument is correct, insertions into the keys bucket will never cause a page split.

@chaochn47 Usually it is the keys bucket that takes up most of the space. I wouldn't expect the auth and lease buckets to grow so big that they would cause a performance issue on insert due to page splits.
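The 1 GB figure claimed above follows from simple arithmetic, under the simplifying assumption that the file is dominated by leaf pages packed at the given fill factor (branch pages and other overhead ignored):

```go
package main

import "fmt"

// compactedSize estimates the file size needed to hold `data` bytes of live
// data when pages are packed at the given fill factor. This deliberately
// ignores branch pages and per-page overhead; it is a rough model only.
func compactedSize(data, fill float64) float64 {
	return data / fill
}

func main() {
	// A 10 GB file packed at 0.9 holds roughly 9 GB of live data.
	const liveData = 9.0 // GB
	at09 := compactedSize(liveData, 0.9) // ~10.0 GB file
	at10 := compactedSize(liveData, 1.0) // ~9.0 GB file
	fmt.Printf("saving: %.1f GB\n", at09-at10) // saving: 1.0 GB
}
```

So for a file near the 10 GB scale mentioned in the discussion, repacking at 1.0 instead of 0.9 plausibly reclaims on the order of 1 GB, though only a real measurement (as requested below) can confirm it.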

@ahrtr (Member) commented Mar 15, 2023

Revisions are always incrementing so etcd only inserts to the latest page.

It seems true for the key bucket.

@ahrtr I don't agree with this. When I compact the database, I want the best compaction.

If we need the best compaction (setting FillPercent to 1.0), why don't we set it to 1.0 in the first place? Does it still make sense to be consistent in all places? In that case, does it make sense to explicitly set a const FillPercent = 1.0 in one place? @ptabor any comment?

Changing it from 0.9 to 1.0 would save an additional 1 GB

Have you verified this? Please share your test results.

@ahrtr (Member) commented Apr 26, 2023

If we need the best compaction (setting FillPercent to 1.0), why don't we set it to 1.0 in the first place? Does it still make sense to be consistent in all places? In that case, does it make sense to explicitly set a const FillPercent = 1.0 in one place?

We can do this in a separate PR after etcd-io/bbolt#422 (comment) is resolved.

Changing it from 0.9 to 1.0 would save an additional 1 GB

Have you verified this? Please share your test results.

Do you have any test results to share?

@cenkalti (Member, Author)

I will test this and share the results here.

@serathius (Member)

cc @ptabor

stale bot commented Aug 12, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 12, 2023
@stale stale bot closed this Sep 17, 2023
@ahrtr ahrtr reopened this Sep 17, 2023
@ahrtr ahrtr removed the stale label Sep 17, 2023
stale bot commented Mar 17, 2024 (same automated stale notice as above)

@stale stale bot added the stale label Mar 17, 2024
@ahrtr (Member) commented May 20, 2024

My understanding is that etcd uses revision numbers as keys in boltdb. Revisions are always incrementing, so etcd only inserts into the latest page. Am I missing something here?

The revisions are indeed always incrementing, but when more child pages are allocated or added, new items (pointing to those child pages) are added to the branch (non-leaf) nodes. So the statement "etcd only inserts to the latest page" isn't correct.

@k8s-ci-robot
PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
@cenkalti: The following tests failed. Say /retest to rerun all failed tests, or /retest-required to rerun all mandatory failed tests:

Test name                  Commit   Required  Rerun command
pull-etcd-unit-test-amd64  70a4ce2  true      /test pull-etcd-unit-test-amd64
pull-etcd-unit-test-arm64  70a4ce2  true      /test pull-etcd-unit-test-arm64

Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


5 participants