ctags: monitor symbol analysis and report stuck documents #678
Conversation
This adds a monitor which reports the progress of symbol analysis every minute. Additionally, if a document is taking too long to analyse (10s), we report it. At first this is just reporting via stdlog. However, once we are comfortable with the thresholds we can likely also include a way to kill analysis for a file.

Test Plan: Adjusted monitorReportStatus to 1s, then indexed the sourcegraph repo and inspected the output:

```
$ go run ./cmd/zoekt-git-index -require_ctags ../sourcegraph/
2023/11/03 16:03:10 attempting to index 14533 total files
2023/11/03 16:03:13 DEBUG: symbol analysis still running for shard statistics: duration=1s symbols=15805 bytes=44288971
2023/11/03 16:03:14 DEBUG: symbol analysis still running for shard statistics: duration=2s symbols=26189 bytes=51564417
2023/11/03 16:03:15 DEBUG: symbol analysis still running for shard statistics: duration=3s symbols=55613 bytes=64748084
2023/11/03 16:03:16 DEBUG: symbol analysis still running for shard statistics: duration=4s symbols=86557 bytes=93771404
2023/11/03 16:03:17 DEBUG: symbol analysis still running for shard statistics: duration=5s symbols=125352 bytes=116319453
2023/11/03 16:03:18 symbol analysis finished for shard statistics: duration=5s symbols=142951 bytes=129180023
2023/11/03 16:03:22 finished shard github.com%2Fsourcegraph%2Fsourcegraph_v16.00000.zoekt: 283983298 index bytes (overhead 2.8), 14533 files processed
```

I then added a random sleep for a minute in a file to see the stuck reporting:

```
$ go run ./cmd/zoekt-git-index -require_ctags ../sourcegraph/
2023/11/03 16:14:57 attempting to index 14533 total files
2023/11/03 16:15:15 WARN: symbol analysis for README.md (3485 bytes) has been running for 14s
2023/11/03 16:15:25 WARN: symbol analysis for README.md (3485 bytes) has been running for 24s
2023/11/03 16:15:45 WARN: symbol analysis for README.md (3485 bytes) has been running for 44s
2023/11/03 16:16:00 DEBUG: symbol analysis still running for shard statistics: duration=1m0s symbols=958 bytes=624329
2023/11/03 16:16:00 symbol analysis for README.md (size 3485 bytes) is done and found 4 symbols
2023/11/03 16:16:06 symbol analysis finished for shard statistics: duration=1m5s symbols=142951 bytes=129180023
2023/11/03 16:16:10 finished shard github.com%2Fsourcegraph%2Fsourcegraph_v16.00000.zoekt: 283983299 index bytes (overhead 2.8), 14533 files processed
```
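To make the shape of the approach concrete, below is a minimal, self-contained sketch of a ticker-based monitor along these lines. It is not the code in this PR: all names (monitor, BeginDocument, EndDocument, run) and the exact log format are made up for illustration.

```go
package main

import (
	"log"
	"sync"
	"time"
)

// monitor tracks in-flight documents so a background goroutine can report
// overall progress and flag documents whose symbol analysis looks stuck.
type monitor struct {
	mu           sync.Mutex
	start        time.Time
	totalSymbols int
	totalSize    int
	inFlight     map[string]time.Time // document name -> analysis start time
}

func newMonitor() *monitor {
	return &monitor{start: time.Now(), inFlight: map[string]time.Time{}}
}

func (m *monitor) BeginDocument(name string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.inFlight[name] = time.Now()
}

func (m *monitor) EndDocument(name string, symbols, size int) {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.inFlight, name)
	m.totalSymbols += symbols
	m.totalSize += size
}

// run logs shard-level progress every statusEvery and warns about any
// document that has been running longer than stuckThreshold. It returns
// when done is closed.
func (m *monitor) run(done <-chan struct{}, statusEvery, stuckThreshold time.Duration) {
	statusTicker := time.NewTicker(statusEvery)
	stuckTicker := time.NewTicker(stuckThreshold)
	defer statusTicker.Stop()
	defer stuckTicker.Stop()

	for {
		select {
		case <-done:
			return
		case now := <-statusTicker.C:
			m.mu.Lock()
			log.Printf("DEBUG: symbol analysis still running: duration=%v symbols=%d bytes=%d",
				now.Sub(m.start).Truncate(time.Second), m.totalSymbols, m.totalSize)
			m.mu.Unlock()
		case now := <-stuckTicker.C:
			m.mu.Lock()
			for name, started := range m.inFlight {
				if d := now.Sub(started); d > stuckThreshold {
					log.Printf("WARN: symbol analysis for %s has been running for %v",
						name, d.Truncate(time.Second))
				}
			}
			m.mu.Unlock()
		}
	}
}

func main() {
	m := newMonitor()
	done := make(chan struct{})
	go m.run(done, time.Minute, 10*time.Second)

	// In the indexer this would wrap the symbol analysis of each document:
	m.BeginDocument("README.md")
	// ... run ctags on the document ...
	m.EndDocument("README.md", 4, 3485)

	close(done)
}
```

The point of this shape is that per-document bookkeeping stays cheap (a map update under a mutex), while all reporting happens in a single background goroutine.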
This is great! It will be super useful for debugging slow reindexes.
I noticed we have a lockedParser struct, which has some conceptual overlap. For example, it uses a timer to enforce a parse timeout of 1 minute. I wonder if it'd make sense to fold some of this logic into that wrapper struct.
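For reference, a parse-timeout wrapper of the sort described here could look roughly like the sketch below. This is not zoekt's actual lockedParser: the parser interface and timeoutParser type are invented for illustration, and a real wrapper would also need to clean up the underlying ctags process on timeout.

```go
package sketch

import (
	"fmt"
	"sync"
	"time"
)

// parser is a stand-in for whatever performs the actual ctags call.
type parser interface {
	Parse(name string, content []byte) ([]string, error)
}

// timeoutParser serializes Parse calls with a mutex and bounds each call
// with a timeout, roughly the behaviour described in the comment above.
type timeoutParser struct {
	mu      sync.Mutex
	inner   parser
	timeout time.Duration
}

func (tp *timeoutParser) Parse(name string, content []byte) ([]string, error) {
	tp.mu.Lock()
	defer tp.mu.Unlock()

	type result struct {
		syms []string
		err  error
	}
	ch := make(chan result, 1) // buffered so the goroutine never blocks

	go func() {
		syms, err := tp.inner.Parse(name, content)
		ch <- result{syms, err}
	}()

	select {
	case r := <-ch:
		return r.syms, r.err
	case <-time.After(tp.timeout):
		// A real wrapper would also kill the underlying ctags process;
		// this sketch only reports the timeout.
		return nil, fmt.Errorf("parsing %s timed out after %v", name, tp.timeout)
	}
}
```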
```go
case <-statusTicker.C:
	now := time.Now()
	m.mu.Lock()
	log.Printf("DEBUG: symbol analysis still running for shard statistics: duration=%v symbols=%d bytes=%d", now.Sub(m.start).Truncate(time.Second), m.totalSymbols, m.totalSize)
```
I guess we don't have the concept of log levels; we just use these informal prefixes?
We have our structured logger, but it isn't used in our zoekt-git-index binary. Given this output is only consumed by humans (at the moment), I stuck with the builtin logger. But zoekt should unify its logging; I would suggest using the new stdlib structured logger with an adapter for sourcegraph/log.
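As a rough illustration of that suggestion (not part of this PR), a minimal log/slog handler that forwards records to another logger could look like the sketch below. Here it forwards to the stdlib log package as a stand-in; an adapter for sourcegraph/log would forward to that library instead, and its API is not shown here.

```go
package main

import (
	"context"
	"log"
	"log/slog"
)

// printfHandler is a hypothetical slog.Handler that forwards records to the
// standard library logger. A real adapter would forward to sourcegraph/log.
type printfHandler struct {
	attrs []slog.Attr
}

func (h *printfHandler) Enabled(context.Context, slog.Level) bool { return true }

func (h *printfHandler) Handle(_ context.Context, r slog.Record) error {
	line := r.Level.String() + ": " + r.Message
	emit := func(a slog.Attr) bool {
		line += " " + a.Key + "=" + a.Value.String()
		return true
	}
	for _, a := range h.attrs {
		emit(a)
	}
	r.Attrs(emit)
	log.Print(line)
	return nil
}

func (h *printfHandler) WithAttrs(attrs []slog.Attr) slog.Handler {
	return &printfHandler{attrs: append(append([]slog.Attr{}, h.attrs...), attrs...)}
}

// WithGroup is required by the interface; groups are ignored in this sketch.
func (h *printfHandler) WithGroup(string) slog.Handler { return h }

func main() {
	logger := slog.New(&printfHandler{})
	logger.Info("symbol analysis finished", "duration", "5s", "symbols", 142951)
}
```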
Nice, it does indeed make sense to use these together somehow. Funnily enough, it looks like I implemented lockedParser and forgot it exists. What is surprising to me is that it errors out the whole build if it comes across parsing that lasts longer than a minute, and yet we see what look like stuck builds. Maybe the logic is broken somehow? Either way, the logic here gives overall status updates which should help us understand. I think I will leave this independent for now, since it needs a view of the time taken to parse the symbols of a full shard, while the lockedParser seems to be created once per process (which I will file a follow-up issue about).
Here is the issue about lockedParser: https://github.com/sourcegraph/sourcegraph/issues/58112