shards: only trigger rescan on .zoekt files changing #801

keegancsmith · 2024-08-02T09:32:35Z

Any write to the index dir triggered a scan. On busy instances this means we are constantly rescanning, which leads to watch being over-represented in CPU profiles. The events are normally writes to our temporary files. By only considering events for .zoekt files (which is what scan reads) we avoid the constant scan calls.

As a fail-safe we also re-scan every minute in case we miss an event. There is error handling around this, but I thought it would be more reliable to call scan every once in a while.
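The two mechanisms described above (filter events by file extension, plus a periodic fail-safe scan) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `isShardEvent`, `watch`, `countScans`, and the sample file names are hypothetical, and events are modelled as plain path strings on a channel rather than real file system notifications.

```go
package main

import (
	"fmt"
	"path/filepath"
	"time"
)

// isShardEvent reports whether a changed path looks like an index shard,
// i.e. its extension is ".zoekt". Writes to temporary files are ignored.
// (Hypothetical helper; the real watcher may match names differently.)
func isShardEvent(path string) bool {
	return filepath.Ext(path) == ".zoekt"
}

// watch calls scan when a relevant event arrives or when tick fires.
// It returns once events is closed. A nil tick disables the fail-safe.
func watch(events <-chan string, tick <-chan time.Time, scan func()) {
	for {
		select {
		case path, ok := <-events:
			if !ok {
				return
			}
			if isShardEvent(path) {
				scan()
			}
		case <-tick:
			// Fail-safe: rescan periodically in case an event was missed.
			scan()
		}
	}
}

// countScans replays paths through watch without a ticker and
// returns how many scans were triggered.
func countScans(paths []string) int {
	events := make(chan string, len(paths))
	for _, p := range paths {
		events <- p
	}
	close(events)

	n := 0
	watch(events, nil, func() { n++ })
	return n
}

func main() {
	// Hypothetical file names: only the second one ends in ".zoekt".
	paths := []string{"a.zoekt.tmp", "a.zoekt", "notes.txt"}
	fmt.Println(countScans(paths)) // prints 1
}
```

In a long-running process, `tick` would be the channel of a `time.NewTicker(time.Minute)`, so a scan still happens about once a minute even if no matching event is seen.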

Note: this doesn't represent significant CPU use, but it does muddy the CPU profiler output, so this change makes it easier to understand trends in our continuous CPU profiling.

Test Plan: CI


stefanhengl · 2024-08-02T09:48:04Z

shards/watcher.go

+			}
+		}
+
+		ticker := time.NewTicker(time.Minute)


That seems fairly frequent for a fail-safe? Isn't this roughly in the order of magnitude we scanned before this PR? I might be totally off though ;-)

We are always scanning right now on dotcom, i.e. as soon as one scan is done another starts. The scanning doesn't take long (~50ms?), but effectively we run for { scan() }. So once a minute seems good?
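As a rough check of the orders of magnitude in this reply, the sketch below computes how many back-to-back scans fit into one minute. It takes the ~50ms figure from the comment at face value (it is an estimate, not a measurement), and `backToBackScansPerMinute` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// backToBackScansPerMinute returns how many scans fit into one minute
// when each scan takes scanMillis milliseconds and the next one starts
// immediately. scanMillis must be positive.
func backToBackScansPerMinute(scanMillis int) int {
	scan := time.Duration(scanMillis) * time.Millisecond
	return int(time.Minute / scan)
}

func main() {
	// With the estimated ~50ms per scan from the comment above.
	fmt.Println(backToBackScansPerMinute(50)) // prints 1200
}
```

Under that assumption, a once-a-minute fail-safe is about three orders of magnitude less frequent than continuous scanning.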

keegancsmith requested a review from a team August 2, 2024 09:32

cla-bot bot added the cla-signed label Aug 2, 2024

keegancsmith force-pushed the k/use-events branch from 17274a6 to 0a3142e Compare August 2, 2024 09:36

stefanhengl approved these changes Aug 2, 2024


keegancsmith merged commit acacc5e into main Aug 2, 2024
9 checks passed

keegancsmith deleted the k/use-events branch August 2, 2024 10:00
