zoekt: 40% of CPU cycles spent on regexp.String #61462
The above graph indicates that on March 12th we went from this being 0% to 60% of CPU. Checking our deployment logs, that was the first time we deployed a version compiled with go1.22, so this is likely a regression related to that. Here are the commits that changed in that deployment as well; I don't see anything that would change how we construct a regexp matchtree:
Alright, the root cause is almost certainly this change: golang/go@98c9f27. It introduces the use of unicode.SimpleFold over all runes in your literal. As mentioned in the commit description, String is supposed to be used only for debugging. However, we need it so we can convert a syntax.Regexp into a regexp.Regexp. It has always bothered me that we went via .String, so I will investigate whether we can do something else.
@keegancsmith What impact would this have on the end user? Does it affect user-facing performance, or is it just using too much CPU overall?
This has a noticeable impact, though searches are not quite twice as slow, given that IO is often the bottleneck and CPU usage is bursty. For example, I just checked our continuous performance monitoring, and you can see a regression in the graphs from when this was deployed to dotcom: https://ui.honeycomb.io/sourcegraph/datasets/search/result/bLDGcUQRjgh
Here is a temporary fix, which just uses a version of regexp.String with the slowdown reverted: sourcegraph/zoekt#753. Bigger change: We rely on
Now that we control the fork I started adding things I wish we could do:
I want to finish up this change next week. But before I do that some other simplifications to zoekt:
I am fairly confident the workaround in sourcegraph/zoekt#753 will solve this problem in the short term. My goal today related to this work is to:
The goal is that we don't ship this regression in the next release, which is being cut tomorrow. This is why I am also marking this as a release blocker.
https://console.cloud.google.com/profiler;timespan=4h;end=2024-03-27T19:47:49.196Z/zoekt-webserver/cpu;history=percentage,show:github%5C.com%2Fsourcegraph%2Fzoekt%5C.%5C~28%5C*indexData%5C~29%5C.List$?referrer=search&project=sourcegraph-dev
https://sourcegraph.slack.com/archives/C05RHQLMH2M/p1711569390014179
/cc @sourcegraph/search-platform