☂️ Search: solidify content-based language filtering #60341
Comments
I'd appreciate your thoughts on the new behavior for
Downsides of new behavior:
For me the trade-off is acceptable. It feels like a natural mental model for each file to have a single language. And C++ users can always put together a special context like
There were 14 files in the results that I'm seeing right now. I've annotated the files for which we could improve the results.
Agreed, if go-enry checked for `extern "C"`, that would help fix the results for those files.
I agree, it's a big improvement compared to what we have, and we can iterate on further improvements down the line.
We made a nice round of improvements and should feel comfortable recommending the feature to customers who have issues with our current lang filters. However, more work is required to really "complete" the feature and enable it by default. I filed https://github.com/sourcegraph/sourcegraph/issues/60676 to track that work.
Currently, we translate `lang` filters to filters on file extensions. Because multiple languages may share the same extension, this can lead to errors. For example, `lang:matlab` also matches Objective-C files, since they also end in `.m`.

We have a feature `search-content-based-lang-detection` that instead matches against the actual language of the file, as determined by go-enry. This issue tracks work to solidify the feature so we're comfortable recommending it to customers.

`-lang` work correctly

Note: this feature defaults to off; more work is needed to enable it by default (https://github.com/sourcegraph/sourcegraph/issues/60676).
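To make the difference concrete, here is a minimal sketch of the two matching strategies. It is only an illustration, not how the search backend or go-enry is implemented: the extension table, the function names, and the Objective-C token check are all hypothetical stand-ins for a real content-based detector.

```python
import os

# Hypothetical, deliberately tiny extension table (an assumption for this
# sketch); real language tables are far larger.
EXTENSION_LANGUAGES = {
    ".m": ["MATLAB", "Objective-C"],
    ".go": ["Go"],
}


def matches_by_extension(path, lang):
    """Extension-based filter: true if the extension *could* be `lang`."""
    ext = os.path.splitext(path)[1].lower()
    candidates = EXTENSION_LANGUAGES.get(ext, [])
    return lang.lower() in (c.lower() for c in candidates)


def detect_language(path, content):
    """Toy stand-in for a content-based detector such as go-enry.

    Picks exactly one language per file. The token check below is an
    assumed heuristic for illustration only.
    """
    ext = os.path.splitext(path)[1].lower()
    candidates = EXTENSION_LANGUAGES.get(ext, [])
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    # Ambiguous extension (.m here): look at the content to break the tie.
    objc_tokens = ("@interface", "@implementation", "#import")
    if any(token in content for token in objc_tokens):
        return "Objective-C"
    return "MATLAB"


def matches_by_content(path, content, lang):
    """Content-based filter: true only if the detected language is `lang`."""
    detected = detect_language(path, content)
    return detected is not None and detected.lower() == lang.lower()
```

With this sketch, an Objective-C file named `Foo.m` matches `lang:matlab` under the extension-based filter but not under the content-based one, which is the class of false positive the feature is meant to remove.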
/cc @sourcegraph/search-platform