-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix flaky test MoreExpressionIT.testSpecialValueVariable in concurrent search path #11088
Conversation
Gradle Check (Jenkins) Run Completed with:
|
Compatibility status:Checks if related components are compatible with change f1b39cd Incompatible componentsIncompatible components: [https://github.com/opensearch-project/performance-analyzer.git] Skipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git] |
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #11088 +/- ##
============================================
+ Coverage 71.15% 71.22% +0.06%
- Complexity 58783 58829 +46
============================================
Files 4885 4885
Lines 277199 277205 +6
Branches 40285 40286 +1
============================================
+ Hits 197247 197440 +193
+ Misses 63448 63258 -190
- Partials 16504 16507 +3 ☔ View full report in Codecov by Sentry. |
I think the non-blocking is not really what I meant (and it does not apply here to be fair), the
The usage of |
@reta @jed326 Thanks for the discussion here. Sorry for my delayed response (recovering from cold). So the other option which you are mentioning was also explored but I didn't found a good way to handle it. The crux of the problem is this special ValueSource used in this test for
Regarding the inefficiency of the |
...ession/src/main/java/org/opensearch/script/expression/ReplaceableConstDoubleValueSource.java
Outdated
Show resolved
Hide resolved
@sohami thanks for the explanation! Originally I was thinking since Taking a step back though I do see the bigger picture in this
and I agree with the conclusion that this specific type of |
…t search path Signed-off-by: Sorabh Hamirwasia <[email protected]>
…eConstDoubleValueSource Signed-off-by: Sorabh Hamirwasia <[email protected]>
❕ Gradle check result for f1b39cd: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Thanks a lot for exploring different routes, @sohami
Aha, I would argue this is not what this change is doing: why don't we make |
@reta Thread aware is better than leaf aware since it has to initialize and create |
@reta Checking back to see if we can get this merged in ? |
@sohami thread is a low level implementation detail here where leaf is not - it has very precise meaning and place in the algorithm (concurrent search). |
Your call folks, I have not changed my opinion about this change (== patching), but if you feel it is way to go, I won't object |
@reta I understand about the low level implementation detail but that is something we have considered in other places too for concurrent search flow that a slice is mapped to a thread. Using that if there is scope of improvement which in this case is by re-using the Value object for a slice thread then I feel it should still be fine. The class name also reflects the intent of per thread for this specific use case. |
@sohami I don't like it either, I think we took shortcuts and now taking even more shortcuts using previous ones as justification. |
@reta I am not sure what you are referring to as shortcut. It is an invariant which we are using for concurrent flow. We tried to use the approach which we can come up with and think is the best known at any given time and considering performance impact in mind as well. If you feel there are better ways to implement, please let us know, we can definitely evaluate and take it up. For this specific change, I have cited 2 reasons on why I am pushing for threadId based implementation compared to leaf based ValueSource which IMO is fair. Would like to hear thoughts from others as well @andrross @jed326 so that we can find the best path forward here. |
This is definitely the crux of the issue. The concurrent map + caching by thread ID is tying the data structure implementation to the low level details of how search is being executed by the platform. I agree with @reta that changing the data structures to be thread safe using well defined domain constructs (e.g. leaf) is more maintainable and extensible. @sohami makes good points about the performance considerations (caching per-thread creates fewer instances and doesn't change the way non-concurrent search works), though having concrete benchmarks would help show how significant those points are in practice. I feel the same way as @reta here...this solution isn't perfect but we generally want progress not perfection so I won't block this. The bigger concern is that if the "one slice is always executed on one thread" invariant continues to be baked into many places throughout the code then we may end up in a bad spot down the road (like, for example, we find a way to use virtual threads that breaks that invariant). |
Today this "one slice is always executed on one thread" invariant is essentially enforced by the Maybe in general it's better to not to rely on the thread id itself, especially in cases where there are data structures that shouldn't be re-used in the same thread across thread executions, but in the specific concurrent search case because these data structures are specific to a single request/query and we have the slice -> thread invariant I don't think this usage poses a problem. We have done similar in other cases for concurrent search (for example #10352) but in this specific case it seems like using the thread id is actually an optimization over the per-leaf data structure and perhaps more importantly doesn't change how the existing non-concurrent search path functions. Given that, in my opinion the solution provided in this PR looks like the best way forward for now. |
@andrross Thanks for your inputs.
Caching by
I don't think changes like this will show much difference while measuring the performance via benchmarks. It is about avoiding creation of unnecessary objects. In theory, comparing 2 approaches, I was seeing if we can optimize then why not. For keeping code maintainable, we are adding the reasons in java doc as to why threadId is chosen.
I don't see that we can break this invariant, as it is fundamental to all the collectors implementation. At a slice level there are different collector instances which are created and all the data structures being used inside the collector implementation are not thread safe. I will keep it as is for now as there are no strong objections. Thanks a lot for all the inputs and discussion. |
…t search path (#11088) * Fix flaky test MoreExpressionIT.testSpecialValueVariable in concurrent search path Signed-off-by: Sorabh Hamirwasia <[email protected]> * Rename ReplaceableConstDoubleValueSource class to PerThreadReplaceableConstDoubleValueSource Signed-off-by: Sorabh Hamirwasia <[email protected]> --------- Signed-off-by: Sorabh Hamirwasia <[email protected]> (cherry picked from commit f4b25d6) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…t search path (opensearch-project#11088) * Fix flaky test MoreExpressionIT.testSpecialValueVariable in concurrent search path Signed-off-by: Sorabh Hamirwasia <[email protected]> * Rename ReplaceableConstDoubleValueSource class to PerThreadReplaceableConstDoubleValueSource Signed-off-by: Sorabh Hamirwasia <[email protected]> --------- Signed-off-by: Sorabh Hamirwasia <[email protected]>
…t search path (#11088) (#11309) * Fix flaky test MoreExpressionIT.testSpecialValueVariable in concurrent search path * Rename ReplaceableConstDoubleValueSource class to PerThreadReplaceableConstDoubleValueSource --------- Signed-off-by: Sorabh Hamirwasia <[email protected]>
…t search path (opensearch-project#11088) * Fix flaky test MoreExpressionIT.testSpecialValueVariable in concurrent search path Signed-off-by: Sorabh Hamirwasia <[email protected]> * Rename ReplaceableConstDoubleValueSource class to PerThreadReplaceableConstDoubleValueSource Signed-off-by: Sorabh Hamirwasia <[email protected]> --------- Signed-off-by: Sorabh Hamirwasia <[email protected]>
…t search path (opensearch-project#11088) * Fix flaky test MoreExpressionIT.testSpecialValueVariable in concurrent search path Signed-off-by: Sorabh Hamirwasia <[email protected]> * Rename ReplaceableConstDoubleValueSource class to PerThreadReplaceableConstDoubleValueSource Signed-off-by: Sorabh Hamirwasia <[email protected]> --------- Signed-off-by: Sorabh Hamirwasia <[email protected]>
…t search path (opensearch-project#11088) * Fix flaky test MoreExpressionIT.testSpecialValueVariable in concurrent search path Signed-off-by: Sorabh Hamirwasia <[email protected]> * Rename ReplaceableConstDoubleValueSource class to PerThreadReplaceableConstDoubleValueSource Signed-off-by: Sorabh Hamirwasia <[email protected]> --------- Signed-off-by: Sorabh Hamirwasia <[email protected]> Signed-off-by: Shivansh Arora <[email protected]>
Description
fix the flaky test
org.opensearch.script.expression.MoreExpressionIT.testSpecialValueVariable {p0={"search.concurrent_segment_search.enabled":"true"}}
Related Issues
Resolves #10079
Check List
[ ]
New functionality has been documented.Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)Commit changes are listed out in CHANGELOG.md file (See: Changelog)Public documentation issue/PR createdBy submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.