Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: ensure topo calls use --topo_read_concurrency #17275

Closed
timvaillancourt opened this issue Nov 22, 2024 · 4 comments · Fixed by #17276
Closed

Feature Request: ensure topo calls use --topo_read_concurrency #17275

timvaillancourt opened this issue Nov 22, 2024 · 4 comments · Fixed by #17276
Assignees

Comments

@timvaillancourt
Copy link
Contributor

timvaillancourt commented Nov 22, 2024

Feature Description

The --topo_read_concurrency flag is intended (based on it's help message and usage in the code) to limit how many reads a Vitess component does concurrently to the topo

Unfortunately in #14693 and #17071 (and at least 2 x more PRs for VTOrc to come) Vitess code is not always respecting this concurrency limit and is simply launching N x goroutines with sync.WaitGroups

Another issue is sometimes methods do respect the --topo_read_concurrency limit, but only local to the the method execution, and that code is called concurrently, significantly exceeding the limit

Instead of having the "callers" of go/vt/topo methods (internal or external) deal with concurrency limits all over the code, this feature request discusses making the concurrency limit built-in to all read-related struct methods of go/vt/topo's Server struct

I believe if this is done correctly the concurrency limit should be respected by all methods, including if they're stacked or called repeatedly/concurrently in other methods/funcs

Your thoughts appreciated 🙇

Use Case(s)

Regular vitess usage

@timvaillancourt
Copy link
Contributor Author

timvaillancourt commented Nov 22, 2024

A question for all: the default --topo_read_concurrency is 32

Do we want both local and global topo calls to use that same limit, or should each have their own semaphore == to --topo_read_concurrency (each has 32)? I'm leaning towards the later but we should make it clear

@deepthi
Copy link
Member

deepthi commented Nov 22, 2024

I'd lean towards each having their own semaphore. And yes, we should document it.

@timvaillancourt
Copy link
Contributor Author

@deepthi sounds good, this is implemented here: #17276

The "cell" topo reads share a single semaphore that defaults to 32. A drawback to this approach is if 1 x cell is really slow it may starve the semaphore for "cell" reads. One way to avoid this is to give each cell connection a new semaphore that is a subset of the total. I'm curious what your thoughts were?

@deepthi
Copy link
Member

deepthi commented Dec 8, 2024

Each cell should have its own limit so that different cells can be read in parallel. The intent of cell topo is that it is independent of global topo, and ideally each cell has its own topo server. I realize that people don't always deploy it that way, but we should not put a limit across cells.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants