Skip to content
This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

zoekt: tech debt in use of syntax.Regexp.String #61524

Open
keegancsmith opened this issue Apr 2, 2024 · 0 comments
Open

zoekt: tech debt in use of syntax.Regexp.String #61524

keegancsmith opened this issue Apr 2, 2024 · 0 comments
Labels
team/search-platform Issues owned by the search platform team tech-debt Technical Debt

Comments

@keegancsmith
Copy link
Member

keegancsmith commented Apr 2, 2024

To solve https://github.com/sourcegraph/sourcegraph/issues/61462 we introduced a vendored version of regexp.String

This issue is to ensure we move to a better fix. Some work that was mentioned in that issue:

Introduce IR between query.Q and zoekt.MatchTree

There is actually a lot of duplicated work done for each shard to execute a query. The use of regexp.String was just the most egregious of them after its regression. But there are plenty of other parts were we duplicate work.

Remove syntax.Regexp from public zoekt API

It is exposed in query.Q directly. This was likely done since in the MatchTree construction we want to be able to visit the syntax.Regexp to construct substring matchtrees.

A part of this is also removing our support for gob and net/rpc. This introduces lots of uses of syntax.Regexp which make it hard to interact.

Propose upstream API to convert syntax.Regexp into regexp.Regexp

When stdlib slowed down .String they said it was only for debug purposes.
However, that is not the case since there are uses of programmatically
building a syntax.Regexp or mutating one before passing it onto
regexp.Compile.

I've already experimented with this API in a fork of the stdlib and it makes sense. However, there are some downsides to it since regexp.Regexp.String and the Split function both want the original string.

Fork grafana/regexp

We rely on grafana/regexp but it hasn't been updated in 2 years. So I:

Now that we control the fork I started adding things I wish we could do:

/cc @sourcegraph/search-platform

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
team/search-platform Issues owned by the search platform team tech-debt Technical Debt
Projects
None yet
Development

No branches or pull requests

2 participants