Add support for certain non-trivial Stream searching #34
By non-trivial I mean searching which uses some kind of "contextual" condition in addition to a matching predicate.
The feature currently offered is to find a certain element, which is traditionally done using .filter(..), but with the additional constraint that the element must not be followed by a certain other element.
The usage can look something like this:
(see "Example - analyzing events" below for a more complete example)
API design
The search is realized with a Collector which is built by first expressing the subject element you are interested in, using DiggCollectors.find(Predicate). The returned object offers the API to compose the full compound search condition.
Currently, and introduced in this PR, the returned object offers methods to specify that the element you search for cannot be followed by a certain other element. In addition, you specify whether you want the first or the last encountered element, should there be multiple applicable elements which are not followed by any "cancelling" element.
This should allow for further extending the API with other search cases. For example it should be possible to extract a sub-collection of successive events given the initial subject predicate, and then specifying the condition for when to end the extraction.
It follows that the use cases for this typically require Streams which are ordered and sequential. This must be ensured by the developer, or you may get unpredictable results. One example of such a source of elements is a chronological stream of events, where you are interested in finding an event, but only if it has not been "voided" by a certain following event.
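To make the "not followed by" semantics concrete, here is a minimal sketch of such a collector written against the plain JDK only. It is not the DiggCollectors implementation: the class NotFollowedBySketch and the methods firstNotFollowedBy and lastNotFollowedBy are hypothetical names chosen for illustration, and the sketch assumes an ordered, sequential stream without null elements.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.Collector;

// Hypothetical sketch, not the DiggCollectors implementation.
final class NotFollowedBySketch {

    /** Mutable accumulator holding the current candidate; null means "none". */
    private static final class Holder<T> {
        T candidate;
    }

    /** First subject element that is not followed by a cancelling element. */
    static <T> Collector<T, ?, Optional<T>> firstNotFollowedBy(
            Predicate<? super T> subject, Predicate<? super T> canceller) {
        return search(subject, canceller, true);
    }

    /** Last subject element that is not followed by a cancelling element. */
    static <T> Collector<T, ?, Optional<T>> lastNotFollowedBy(
            Predicate<? super T> subject, Predicate<? super T> canceller) {
        return search(subject, canceller, false);
    }

    private static <T> Collector<T, ?, Optional<T>> search(
            Predicate<? super T> subject, Predicate<? super T> canceller, boolean keepFirst) {
        return Collector.<T, Holder<T>, Optional<T>>of(
                Holder::new,
                (holder, element) -> {
                    if (canceller.test(element)) {
                        // A cancelling element voids whatever was found so far.
                        holder.candidate = null;
                    } else if (subject.test(element) && (holder.candidate == null || !keepFirst)) {
                        // Keep the first match, or keep replacing it to end up with the last.
                        holder.candidate = element;
                    }
                },
                (left, right) -> {
                    // Partial results cannot be merged without knowing their order.
                    throw new UnsupportedOperationException("sequential streams only");
                },
                holder -> Optional.ofNullable(holder.candidate));
    }

    public static void main(String[] args) {
        Predicate<String> isFailure = s -> s.startsWith("failure");
        Predicate<String> isRestored = s -> s.equals("restored");
        List<String> events = List.of("failure-1", "failure-2", "restored", "failure-3", "failure-4");

        // Prints Optional[failure-3]
        System.out.println(events.stream().collect(firstNotFollowedBy(isFailure, isRestored)));
        // Prints Optional[failure-4]
        System.out.println(events.stream().collect(lastNotFollowedBy(isFailure, isRestored)));
    }
}
```

The combiner throws on purpose: it is only invoked for parallel streams, where the sketch could not tell which partial result came first, so failing loudly seems preferable to returning an unpredictable result.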
Tradeoffs
The existing standard collectors offered by the JDK and in widespread use (e.g. .toList(), .groupingBy(..), etc.) generally communicate that the entire stream will be traversed to produce a result. A collector has no facilities to send a "halting signal", i.e. to decide at some point that the result is complete and prevent any further elements from being offered to the collector for processing/collecting. It can of course choose to discard any offered element at any point.
Currently, I consider the API to communicate that the entirety of the Stream must be traversed, because of the "any following element" condition, which may affect the result. But an API such as this may evolve in a way that suggests that once a result is obtained, the rest of the Stream is discarded and not processed, which is not the case. As with all uses of collectors, the Stream pipeline should be set up to pass on only the relevant elements, and it must be finite.
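The lack of a halting signal can be observed with the JDK alone. The sketch below counts how many elements are pulled through a pipeline by a short-circuiting terminal operation (findFirst) compared to a collector. The class FullTraversalDemo and its method names are made up for this illustration, and the counts assume a sequential stream.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, for illustration only.
final class FullTraversalDemo {

    /** Counts the elements visited when a short-circuiting terminal operation is used. */
    static int visitedByFindFirst() {
        AtomicInteger visited = new AtomicInteger();
        Stream.of(1, 2, 3, 4, 5)
                .peek(n -> visited.incrementAndGet())
                .filter(n -> n == 2)
                .findFirst(); // stops pulling elements once a match is found
        return visited.get();
    }

    /** Counts the elements visited when the result is produced by a collector. */
    static int visitedByCollector() {
        AtomicInteger visited = new AtomicInteger();
        Stream.of(1, 2, 3, 4, 5)
                .peek(n -> visited.incrementAndGet())
                .filter(n -> n == 2)
                .collect(Collectors.toList()); // cannot signal that it is done
        return visited.get();
    }

    public static void main(String[] args) {
        System.out.println(visitedByFindFirst()); // prints 2
        System.out.println(visitedByCollector()); // prints 5
    }
}
```

Both pipelines yield the element 2, but only findFirst stops early. Any collector, including a search collector of the kind described here, is offered every element the pipeline lets through.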
This feature would probably be a better fit as a Gatherer, currently in preview in the not-yet-released OpenJDK 22, as it would allow termination of the Stream traversal. Having it as an intermediate operation could also enable more flexibility for further processing after the find operation.
Example - analyzing events
Given this model of events:
And given that the following events happened, in chronological order:
The following tests show resolving the first applicable failure event and the last applicable failure event, both of which happened after the restoring, which was the third event.
Lastly, we append yet another restored-event, which voids all the contained failures.