Implement exact-boundary match type #3967
Could you please describe this in more detail? It's not clear to me what the problem is here. Also I suspect that work of '-suffix in I would like to propose a few alternatives:
The second option would also allow quicker enabling of exact-boundary-match. Regarding partial boundary matches, what would it look like? I guess it would be natural if a quote enabled both exactness and the word boundary check for the end of the term where it is placed. Everything else looks and works great. I did some brief manual testing of it.
I was worried about match type being changed multiple times against the user's will.
The transition can surprise the user, although if the final query is
This is just a draft, no documentation, no tests yet.
The first option is probably the least confusing one, but the users will experience unintended mode changes in
I don't know if this is really needed, but it could be a good option. Though, if implemented, it may become a precedent for other requests for regexp-like modifiers. 😉 BTW, I noticed that some people asked for a case-sensitivity toggle. I would like to have that as well (e.g. the way it is done in vim with the modifiers "\C" and "\c"). Regarding your implementation of the word boundary check (based on bonus): words can no longer include an underscore. As far as I understand, this is uncommon (see https://stackoverflow.com/questions/1324676/what-is-a-word-boundary-in-regex), and it is also not what I would expect.
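For reference, the regex convention mentioned in this comment can be checked with Go's standard `regexp` package, where `\b` is an ASCII word boundary and `\w` includes the underscore. This is only a sketch of the regex behavior, not of fzf's bonus-based check; the helper name `matchesWord` and the sample strings are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// wordRe matches "win" only when both ends sit on a regex word boundary.
// In Go's regexp syntax, \b is an ASCII word boundary and \w is [0-9A-Za-z_].
var wordRe = regexp.MustCompile(`\bwin\b`)

// matchesWord (hypothetical helper) reports whether s contains "win"
// as a whole word in the regex sense.
func matchesWord(s string) bool {
	return wordRe.MatchString(s)
}

func main() {
	// Sample inputs invented for this illustration.
	for _, s := range []string{"win id", "win-id", "win_id", "winner"} {
		fmt.Printf("%-10q %v\n", s, matchesWord(s))
	}
}
```

Under this convention `win_id` is a single word, so a whole-word search for `win` skips it, while `win-id` and `win id` match.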
Right. I don't want that.
Thanks for pointing it out. But I don't know, expectations are subjective. When I see Expectations also depend on context. If a project of your interest uses the That said, I'm not going to make the decision right away; I'm going to take some time to use it myself to see which way I like better.
After experimenting with this for a while, one thing I can agree on is that
I understand your point and partially agree with it, but this approach does not solve the problem for me (yes, most of the projects I work with use the
This does not help. Consider the following examples, which I think are quite common (at least, I see them very often, and they were one of the reasons I requested this feature):
The query you suggested will not show them at all, although these results may be very important if I search for all places where the term 'win' is used. Use of logical OR (i.e. Could you please consider making the list of characters that are treated as part of words configurable? For example, storing it in an environment variable would allow different setups for projects based on different languages. I believe it would satisfy most people. If not, is there a proper and simple way of hard-coding it in a local copy of the fzf repository (I mean something better than what I implemented in my fork)? Also, I noticed something weird in the behavior of the current logic: using the above examples as the input data, query
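The request for a configurable set of word characters could look roughly like the sketch below. It is not fzf code: `isWordChar`, `boundaryMatch`, and the `extra` parameter are hypothetical names, matching is case-sensitive for simplicity, and how the setting would be supplied (for example through an environment variable, as suggested above) is left out.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// isWordChar reports whether r counts as part of a word. Letters and digits
// always do; extra is a hypothetical user setting listing additional runes.
func isWordChar(r rune, extra string) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r) || strings.ContainsRune(extra, r)
}

// boundaryMatch reports whether term occurs in text with a non-word rune,
// or the start/end of text, on both sides of the occurrence.
func boundaryMatch(text, term, extra string) bool {
	if term == "" {
		return false
	}
	for from := 0; from <= len(text)-len(term); {
		idx := strings.Index(text[from:], term)
		if idx < 0 {
			return false
		}
		start := from + idx
		end := start + len(term)
		leftOK := start == 0
		if !leftOK {
			r, _ := utf8.DecodeLastRuneInString(text[:start])
			leftOK = !isWordChar(r, extra)
		}
		rightOK := end == len(text)
		if !rightOK {
			r, _ := utf8.DecodeRuneInString(text[end:])
			rightOK = !isWordChar(r, extra)
		}
		if leftOK && rightOK {
			return true
		}
		from = start + 1 // keep scanning for a later occurrence
	}
	return false
}

func main() {
	// Sample inputs invented for this illustration.
	for _, s := range []string{"win_id = 1", "close win", "winner"} {
		fmt.Printf("%-14q underscore splits: %-5v underscore is word char: %v\n",
			s, boundaryMatch(s, "win", ""), boundaryMatch(s, "win", "_"))
	}
}
```

With an empty `extra`, the underscore splits words and `win_id` matches; passing `"_"` makes it part of the word, which is closer to what grep-like tools do.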
Have you tried getting used to the
Yes, but typing a regexp every time is inconvenient, so it is now my backup option. Besides, I would like to work the same way with any source of data, not only with the output of grep-like tools. And inventing various wrappers and workarounds for different situations is not what anyone wants to do with fzf.
This contradicts the behavior of fzf's extended search mode, doesn't it? I mean that switches like the ^-prefix hide irrelevant results.
You can find examples in https://github.com/junegunn/fzf/blob/master/ADVANCED.md where you can toggle between two modes.
Anything you have in mind in particular?
My point is that it's not the main focus of fzf. Yes, you can filter out some stuff using basic prefix, suffix patterns, but that's about it. They are most of the time incapable of completely removing irrelevant matches, and fzf was never designed to do that. If that was the point, fzf would have supported proper regular expressions for precise filtering. If you need it, there are other interactive filter programs that support it, such as peco, but fzf chose not to. Instead, fzf has focused on improving the scoring mechanism to make more relevant matches appear first. It's an interactive filter program unlike traditional
That is not expected. Let me look into it.
I believe that the only purpose of scoring is to show more suitable results first, but all results should be relevant matches. And I don't understand why you say that fzf was not designed to remove irrelevant matches completely. Based on my experience with its modern version, this is what it does most of the time, even if one uses just the fuzzy matching mode. Otherwise we would see all lines of the source data every time; they would just be sorted in a certain way. So, what we are really discussing here is the set of instruments for filtering out irrelevant input data. I see that you personally are satisfied with the current set and are trying to avoid extending it further. OK, you are the author and you decide how to develop and support the tool. I'm not going to continue debating this. Regarding the suggestion to use regexps (with other tools) for more precise matching, that approach does not look attractive. I really like how it is done in fzf's extended mode: you just enter words or their parts that you remember or think of, and then you can make just a few very quick and simple modifications to reduce the number of results if needed. Constructing regular expressions is not something I want to go back to.
It is clear to me that scoring helps to present the most relevant (to the intention) matches at the top of the resulting list, but it has nothing to do with filtering out irrelevant (to the query) parts of the input data. If I understand it correctly, you confirmed that the list of results always meets the query criteria. Thus, scoring cannot make it acceptable to include results that are irrelevant to the query. According to the current behavior of fzf, they should simply be hidden. When I requested this feature, I had not anticipated that "word boundary" might have a different meaning for different projects and people. Later, your changes and our discussion raised this problem. While it is quite fine for you to have underscore as a word delimiter, for me results like Now the question is how to deal with the term "word boundary". If you state in the documentation that it is a specific version of boundary matching, that will probably be formally OK, but it will not help people like me. I'm testing the following local modification to suit my needs:
I disagree. I mentioned above how a user with a different expectation can still benefit from it because matches around If we decide not to see
Yes, but this makes the behaviour a bit inconsistent and confusing, because treating underscore as a possible part of words is more common. At least, grep and ag do not see it as a word splitter. Similarly, people complain about "exact match" not being fully exact (i.e. quoting does not enable case matching, which is not obvious from the documentation). So, I suggest stating these peculiarities in the man page explicitly.
For my part, I cannot agree with this. 😄 A larger number of users would be served better if a greater number of their needs were satisfied. An option, a setting, or a configuration/environment variable would make it possible to provide a different behaviour for those who need it without affecting others. You could develop it so that it would not consider underscore a word splitter by default. Excuse me, please, but declining to add it by pointing to possible complexity is precisely a cop-out. I don't think that an additional option would break the compromise between the complexity and the quality of the tool. Look at vim, which is much more popular: it is complex, but its number of features and its flexibility are its power.
Well, it's a philosophical difference in software product design so we'll have to agree to disagree, but thanks for sharing your thoughts.
Thank you for your efforts! 👍
Only requiring '-suffix in --exact mode is confusing and not straightforward.

Requiring '-prefix in --exact mode means that the users can experience unintended mode switches while typing. e.g.

  'it   -> fuzzy (many results)
  'it'  -> boundary (few results)
  'it's -> fuzzy (many results)

However, user who intends to input a boundary query should not be interested in the intermediate results, and the number of matches decreases as she types, so it should be okay.

On the other hand, user who does intend to type "it's" will be surprised by the sudden decrease of the match count, but eventually get the right result.
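The mode switches listed in this commit message can be reproduced with a small classifier. This is a minimal sketch under stated assumptions, not fzf's actual parser: `classifyExactMode` is a hypothetical name, terms without a leading quote are assumed to stay exact in --exact mode, and the handling of very short terms such as `''` is a guess.

```go
package main

import (
	"fmt"
	"strings"
)

// classifyExactMode (hypothetical) returns the match type a single term
// would get in --exact mode under the rules sketched in the commit message:
// a quote on both ends means boundary, a leading quote alone flips the term
// to fuzzy, and anything else stays exact.
func classifyExactMode(term string) string {
	switch {
	case len(term) > 2 && strings.HasPrefix(term, "'") && strings.HasSuffix(term, "'"):
		return "boundary"
	case strings.HasPrefix(term, "'"):
		return "fuzzy"
	default:
		return "exact"
	}
}

func main() {
	// The query as it grows while the user types "'it's".
	for _, q := range []string{"'it", "'it'", "'it's"} {
		fmt.Printf("%-6s -> %s\n", q, classifyExactMode(q))
	}
}
```

Typing one more character flips the term from fuzzy to boundary and back again, which is the intermediate surprise the message describes.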
Close #3963

This implements the exact-boundary match type, which finds exact matches for a search term where both ends are at word boundaries.

Exact-boundary term should start with ' and end with '.

"TERM" because I often use such patterns to search for string literals in the source code.
'TERM' before, so we're not breaking a use case here.
--exact mode, ' prefix enables fuzzy matching, so exact-boundary term should be in TERM' form.
it's.