Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Making NO_INTERVALS to be used by clients of Lucene #13385

Merged
merged 3 commits into from
May 20, 2024

Conversation

expani
Copy link
Contributor

@expani expani commented May 20, 2024

Description

A bug was reported in OpenSearch which caused Interval Queries containing sub-query in rules to fail with Sub-iterators of ConjunctionDISI are not on the same document if an no_match query was generated after going through the search analyser.

A similar issue was found in Lucene a few years back and was fixed by this PR

I am proposing to make NO_INTERVALS present in Lucene as public so it can be used by it's clients like OpenSearch rather than creating a clone of the same.

@jainankitk
Copy link
Contributor

Makes sense for IntervalBuilder clients to use the existing IntervalsSource implementation. Also, safe to make public given the variable is static final. Approved!

@romseygeek romseygeek self-assigned this May 20, 2024
@romseygeek
Copy link
Contributor

LGTM. Can you add an entry to CHANGES.txt?

@expani
Copy link
Contributor Author

expani commented May 20, 2024

@romseygeek Should I add only under Lucene 10.0.0 or other versions as well ?

@romseygeek
Copy link
Contributor

We can backport this to 9x so add the entry under the latest 9x version.

@expani
Copy link
Contributor Author

expani commented May 20, 2024

@jainankitk @romseygeek Requesting an approval.

@@ -362,6 +362,8 @@ Other

* GITHUB#13077: Add public getter for SynonymQuery#field (Andrey Bozhko)

* GITHUB#13385: Make NO_INTERVALS source as public to be used by Lucene clients instead of creating clones themselves.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to add your name here, so that we can assign credit? Github account name is fine if that's your preference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@romseygeek romseygeek merged commit 22d50be into apache:main May 20, 2024
3 checks passed
romseygeek pushed a commit that referenced this pull request May 20, 2024
This is generally useful for clients building their own Intervals
implementations.
Copy link
Contributor

@uschindler uschindler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I have one addition I stumbled over serveral times. There's no "official" way to get an IntervalSource that matches no documents/intervals (I have this sometimes in a query parser for Solr to match complex patent restrictions). This class is pkg private:
https://github.com/expani/lucene/blob/c1c70de765cc7ec67762b40846c7be1e1375eb1c/lucene/queries/src/java/org/apache/lucene/queries/intervals/NoMatchIntervalsSource.java

To get an instance there's a workaround, but its undocumented: public static IntervalsSource Intervals#atLeast(int minShouldMatch, IntervalsSource... sources). You can misuse it by passing 1 and an empty array: Intervals.atLeast(1)

I wonder how this differs from this one. The implementation there has no iterator, while this one here has.

I'd suggest to merge those implementations and instead of exposing a constant to end user add another static method like Intervals#noMatch("reason").

Unfortunately this is already merged, but IntervalBuilder looks like wrong class for that.

@uschindler
Copy link
Contributor

Should I open a new issue, or should we revert this one and give a better solution?

@romseygeek
Copy link
Contributor

Let's open a new issue and we can discuss a bit more about when you need such a thing and where it should live.

@uschindler
Copy link
Contributor

OK. Just for brevity: How does the one made public here differs from the already existing class (except that one returns null iterator while the other one returns an empty iterator)?

@uschindler
Copy link
Contributor

See #13388

@romseygeek
Copy link
Contributor

From a quick look we can indeed replace everything with NoMatchIntervalIterator. I think there are some other places that we can make use of the null shortcut as well - for example, DisjunctionIntervalsSource.create() can return null if its input collection is empty.

@uschindler uschindler added this to the 9.11.0 milestone May 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants