-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TermsAggregatorTests flaky test fix #13567
Conversation
❌ Gradle check result for a358230: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Sandesh Kumar <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #13567 +/- ##
============================================
+ Coverage 71.42% 71.60% +0.18%
- Complexity 59978 61094 +1116
============================================
Files 4985 5052 +67
Lines 282275 287126 +4851
Branches 40946 41604 +658
============================================
+ Hits 201603 205594 +3991
- Misses 63999 64595 +596
- Partials 16673 16937 +264 ☔ View full report in Codecov by Sentry. |
Looks like SpotLess check failed. Can you take a look? |
While you are at it, could also add some info about the validation you did for the change? |
server/src/test/java/org/opensearch/search/aggregations/bucket/terms/TermsAggregatorTests.java
Show resolved
Hide resolved
Is the test dependent on the number of segments? Is that the assertion failing in the original bug? |
If the test depends on a specific number of segments then can we [1] add an assertion for the segment count and [2] force merge to the correct number of segments within the test? Not sure about the viability of [2] since I see a test path for deleted docs. |
[1] No its not feasible to add assertion of segment count - I could not find a utility to do so in our codebase. Introducing such utility may not be within the scope of this change. Added more details in description now. Tests are not failing locally (most likely running in silo). But looking at failures, it looks like if we can force the documents are added in a single call, it will result in a single segment like the one in DateHistogramTest given we already have no merge policy in place during index creation. The failures in flaky tests reported in different CI calls can only occur when there are multiple segments so this is more of a calculated change that we can do to solve the flakiness with no side-effects. |
Can you try to run the test locally until failure with and without this change? And report back the number of successful runs without failure? |
So I have iterated through the test 250 times in a loop for both scenarios (with & without my change) with 0 failures in each case. So still no luck reproducing it in present way. |
Can you explicitly issue |
Hey @jainankitk, thanks for suggesting this. In the original code, I added Meaning cases where there will 2 segments + deleted document case and optimization will kick in for only 1 of the 2 segments and not both - test will fail with the same CI exception. So this validates my suspect of multiple segments being the issue. |
@sandeshkr419 - Thanks for confirming the issue. Approved! |
Since you have an |
I think I missed where I can check the number of segments using
The cases which are failing are when one of the segment can use optimized flow (where |
Signed-off-by: Sandesh Kumar <[email protected]> (cherry picked from commit d26cd46) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@msfroh I see this is merged. Would you recommend reverting this back and have a multiple segment case as we discussed offline. Tbh, I like it in present state since its much more atomic as the improvements were on segment level and we are forcing the index creation with a single segment. If required, we can have a multiple segment case separately independent of the tests in this PR. |
Signed-off-by: Sandesh Kumar <[email protected]>
As per offline discussion with @msfroh - will keep this change and will add a separate test to cover multi-segment scenario. |
(cherry picked from commit d26cd46) Signed-off-by: Sandesh Kumar <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Signed-off-by: Sandesh Kumar <[email protected]>
Description
Adding documents in a single chunk in unit test to avoid multiple segments in the unit test.
The following assertion was failing:
The number of collect() calls is used to check whether the optimization has kicked in or not. The tests written with tight assertions assuminga single segment.
In local development, running this test in isolation, it is not feasible to get the test failing as it might be creating a single segment. However, in gradle CI (see linked issue) and running the test with other tests, it may happen that multiple segments are created (multiple commits after individual document addition), which is the only explanation why the collect() count assertion is failing case.
So inspired from https://github.com/opensearch-project/OpenSearch/blob/main/server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/DateHistogramAggregatorTests.java#L1655-L1668 editing the test to add all documents in a single call to avoid committing changes to index and avoid multiple segments.
Did not add changelog as this is a flaky test fix.
Related Issues
Resolves #12640
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.