-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ranking: standardize ctags kind names before scoring #674
Conversation
type SymbolKind uint8 | ||
|
||
const ( | ||
Accessor SymbolKind = iota |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Working on this PR got me thinking ... it'd be nice to just use SCIP as the exchange format instead of ctags output, and have a tool to map universal-ctags onto SCIP. That would get us closer to an actual spec, unlike the ctags output which has inconsistent naming and an unknown universe of values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100% agree with this. I think this was discussed when scip-ctags was added and that was seen as an part of the end goal.
@@ -815,6 +815,16 @@ func TestScoring(t *testing.T) { | |||
t.Fatal(err) | |||
} | |||
|
|||
examplePython, err := os.ReadFile("./testdata/example.py") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a follow-up, I'll try to pull in SCIP ctags so we can run the same tests using that binary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice
type SymbolKind uint8 | ||
|
||
const ( | ||
Accessor SymbolKind = iota |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100% agree with this. I think this was discussed when scip-ctags was added and that was seen as an part of the end goal.
SCIP ctags can output different kind names than universal-ctags (for example `typeAlias` instead of `talias`). This change makes sure we handle different names for the same kind. To do so, it refactors the logic so we first match strings to standard kinds, then decide how these are scored for each language. That way, you don't need to remember to cover all the possible kind names each time you adjust scoring for a new language. Also added basic tests for Ruby and Python to ensure we don't accidentally change the scoring.
To implement `select:symbol.enum` filters, we look at each symbol's ctags 'kind' and check if it matches the filter value `enum`. We accidentally didn't include 'enum' in this match logic, so all these symbols were filtered away. This PR fixes that, and adds a few improvements: * Use a shared map between `symbol.LSPKind` and `symbol.SelectKind`, to avoid drift between these two conversions. * Audit the ctags mapping from [sourcegraph/zoekt#674](sourcegraph/zoekt#674) and add other missing kinds (besides enum) Closes SPLF-178
SCIP ctags can output different kind names than universal-ctags (for example
typeAlias
instead oftalias
). This change makes sure we handle differentnames for the same kind. To do so, it refactors the logic so we first match
strings to standard kinds, then decide how these are scored for each language.
That way, you don't need to remember to cover all the possible kind names each
time you adjust scoring for a new language.
Also added basic tests for Ruby and Python to ensure we don't accidentally
change the scoring.
Relates to https://github.com/sourcegraph/sourcegraph/issues/57659