-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor zoekt-webserver flags parsing #756
base: main
Are you sure you want to change the base?
Conversation
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Dmitry Gruzd.
|
879e443
to
dc2a60e
Compare
We require contributors to sign our Contributor License Agreement (CLA), and we don't have yours on file. In order for us to review and merge your code, please sign CLA to get yourself added. Sourcegraph teammates should refer to Accepting contributions for guidance. |
dc2a60e
to
829dc48
Compare
We require contributors to sign our Contributor License Agreement (CLA), and we don't have yours on file. In order for us to review and merge your code, please sign CLA to get yourself added. Sourcegraph teammates should refer to Accepting contributions for guidance. |
@DylanGriffith Could I ask you to review this PR and assign it to a maintainer if you're happy? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already the web
package which exists for this purpose, I suspect a bit too much has grown into the zoekt-webserver
command? What key things do you need at GitLab that we should maybe move to the web package instead?
@keegancsmith Thank you for taking a look!
That's my impression as well. I'd like to make the code responsible for the command itself much more compact and reusable.
I'd say it's almost all the code from Overall, these are our current draft plans:
Please let me know what you think 🤝 |
Ok great. Let me tackle this command a bit, it's been on the back of my mind for a while to clean this up since it has grown so much. We may end up in a place where you don't need to introduce your new package. Is your main goal to have all the same flags? Do you actually expect at gitlab for people to change most of those values? If it is like the sourcegraph use case, the flags we specify are pretty much static.
Cool, we have thought about this in the past as well to simplify the release pipeline.
Not exactly sure if that is something we should have in upstream. I'd recommend if you need that you use the web.Server handler and wrap it to add in the authorization checks in your own binary.
FYI we have code to do this in Sourcegraph but its quite tied to how we do service discovery/telemetry. Additionally the license for it right now is not permissively licensed, but going back in history to a time when it was may be a viable way to copy-paste :) At the end of the day the code is quite straightforward to do TBH, you can probably copy-paste the code used to aggregate across shards inside of the shards pkg.
I'm interested to learn more about what problems you are observing here. Do you have examples of repositories with large number of shards and have found an issue around deletion? At the end of the day though we have thought a bit about introducing more synchronization between indexer and webserver, but have always just decided to keep it simple. |
Sounds great, thank you!
We'll probably have our own flags. I've opened this PR to add some value by the refactoring (options struct) as well as start this discussion.
I'll take a look 🤝
It started with this incident https://gitlab.com/gitlab-com/gl-infra/production/-/issues/17770. We had to reallocate a lot of repositories from one Zoekt node to another. That generated a ton of deletions. Each deletion needed to scan through more than 500K index files ( We have some additional details like cloud profiling graphs from the time of the incident in this MR's description https://gitlab.com/gitlab-org/gitlab-zoekt-indexer/-/merge_requests/173. This MR should alleviate the pain, so in general I'd say that this problem is going to have lower priority after this fix is merged. I'm pretty sure this problem only affects |
This PR is a small flag parsing refactoring to start extracting functions from
cmd/zoekt-webserver/main.go
to a newwebserver
package.The underlying idea here is to make the creation of webserver-like binaries easier. This is something we're considering doing at GitLab for the Zoekt integration. That way we can reduce the amount of duplicate code between the binaries.
Please let us know what do you think. Thank you!