Use Data Stream for Reporting storage #176022
Conversation
## Summary

In previous code, Reporting fetched a list of pending reports and compared the status of each result to a stored record of pending reports. The flakiness occurred because sometimes the results don't contain a record for a very "fresh" report, since the refresh interval hadn't ticked and the stored change wasn't yet reflected in search results. With [switching Reporting to use a data stream](#176022), there was new flakiness in serverless tests. The area of flakiness was in getting Reporting to show the "Job completed" toast notification. The main difference with data streams is a lower index refresh interval, which is even lower for serverless.

This PR fixes the poller to loop through the list of stored pending reports and compare each to the result from the search. If a stored report isn't present in the search, we consider it to still be in the pending state.

Other changes:
- Improve the class responsible for managing updates to the stored list of pending reports, so that each operation performs atomically and goes through a queue to mitigate the chance of race conditions.
- Update function names and variable names for readability.
- Remove the unused `removeJob` function.
- Move helper code into private methods of the `StreamHandler` class and simplify `x-pack/plugins/reporting/public/plugin.ts`.

## Release note

Fixed an issue in Reporting with consistently showing the toast message for completed report jobs.

---------

Co-authored-by: Kibana Machine <[email protected]>
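As an illustration of the poller change described in the summary above, here is a hypothetical sketch (not the actual Kibana source): stored pending reports are compared against the search response, and any report missing from the response is treated as still pending. The `Report` shape and the `reconcilePendingReports` name are assumptions made for the example.

```ts
// Hypothetical sketch of the poller comparison described above; names and
// shapes are illustrative, not the actual Kibana implementation.
interface Report {
  id: string;
  status: 'pending' | 'processing' | 'completed' | 'failed';
}

function reconcilePendingReports(
  storedPendingIds: string[],
  searchResults: Report[]
): { completed: Report[]; stillPending: string[] } {
  const byId = new Map(searchResults.map((r) => [r.id, r]));
  const completed: Report[] = [];
  const stillPending: string[] = [];

  for (const id of storedPendingIds) {
    const found = byId.get(id);
    if (!found || found.status === 'pending' || found.status === 'processing') {
      // A report missing from the search response is treated as still pending:
      // the index refresh may simply not have happened yet.
      stillPending.push(id);
    } else {
      completed.push(found);
    }
  }
  return { completed, stillPending };
}
```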
Pinging @elastic/appex-sharedux (Team:SharedUX)
Code review only LGTM 👍
Needs a follow-up: the data stream configuration set in ES must have a data stream lifecycle policy.
Do we need to wait for that first before merging this PR?
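For reference on that follow-up, a data stream lifecycle can be configured through the data stream lifecycle API (available in Elasticsearch 8.11+). The sketch below is illustrative only, using the Elasticsearch JS client with an assumed data stream pattern and retention period rather than whatever Reporting will actually ship.

```ts
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Illustrative sketch: the data stream pattern and retention value are
// assumptions, not the settings Reporting actually uses.
async function configureReportingLifecycle() {
  await client.indices.putDataLifecycle({
    name: '.kibana-reporting*',
    data_retention: '90d', // delete backing data older than 90 days
  });
}
```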
```ts
expect(job.payload.title).equal('A Saved Search With a DATE FILTER');

// wait for index refresh
await sleep(3000);
```
I wonder how risky this is in terms of introducing flakiness. Wouldn't it be better to pass a param like `postJobCSV({ refreshIndex: true })` and avoid the `sleep` call?
Good point!
- Passing many test runs with the flaky test runner is definitely a requirement for this PR.
- Calling for a refresh on the index whenever a job is posted seems expensive enough to merit second thoughts.

The amount of time to sleep should be derived from the refresh interval for data streams, which I think is 3 seconds (this integration test is for stateful). However, in serverless it is 15 seconds. I will confirm these things in a sync with @elastic/es-data-management.
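As a side note on deriving the delay from the refresh interval rather than hard-coding it, a test could read the effective `index.refresh_interval` from the index settings. This is a hypothetical helper with an illustrative index name, not code from this PR.

```ts
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Hypothetical helper: read the effective refresh interval for an index so a
// test delay can be derived from it instead of hard-coding a sleep value.
async function getRefreshInterval(index: string): Promise<string> {
  const response = await client.indices.getSettings({
    index,
    include_defaults: true,
  });
  const entry = Object.values(response)[0];
  return (
    entry?.settings?.index?.refresh_interval ??
    entry?.defaults?.index?.refresh_interval ??
    '1s' // Elasticsearch's default when nothing overrides it
  );
}
```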
> Calling for a refresh on the index whenever a job is posted seems expensive enough to merit second thoughts

I meant to only set `refreshIndex: true` for API integration tests, to ensure consistency and not rely on a time value that can change at any moment.
(disclaimer about not really understanding the typescript code)
Could this use the `wait_for` option when indexing things, so that you only wait for as long as is necessary for the refresh to happen naturally? I.e.: https://www.elastic.co/guide/en/elasticsearch/reference/8.12/docs-refresh.html#docs-refresh
@sebelga I think I understand now. But I'm very hesitant to add a flag in the API that would only serve the purpose of tests.
@dakrone I will try your suggestion to use `wait_for`, and see if that allows for removing or lowering the sleep call.
If the sleep is needed, my goal will be to make sure it is set to a calculated amount of time. It should be low enough to simply wait for ES around a well-understood race condition; it shouldn't be perceived as something to address flakiness.
> But, I'm very hesitant to add a flag in the API that would only serve the purpose of tests

I see. My point on this is that we shouldn't have to test ES behaviour (in this case, that it does refresh the index "at some point") and add some timeout for it. We should assume ES will do it. It's like mocking a `debounce` in Jest: I know it will eventually execute the debounced function.
I know it's a trade-off, and I understand where you are coming from, but I think it is worth it if (1) it prevents flakiness and (2) it reduces the time to execute the tests. Just my 2 cents 😊
I used the suggestion to change to `refresh=wait_for` in the operations that index new documents and update documents, and it seems to have resolved this.
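For context, here is a minimal sketch of what `refresh=wait_for` looks like on index and update operations with the Elasticsearch JS client; the index name, document shape, and function name are illustrative, not the actual Reporting storage code.

```ts
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Illustrative index name and document; not the actual Reporting code.
async function createAndCompleteReport() {
  // Index a new report document and wait for the next refresh before returning,
  // so an immediate search will see it.
  const created = await client.index({
    index: 'reports',
    document: { jobtype: 'csv_searchsource', status: 'pending' },
    refresh: 'wait_for',
  });

  // Update the document the same way, so status changes are also visible to
  // the poller's next search without an arbitrary sleep.
  await client.update({
    index: 'reports',
    id: created._id,
    doc: { status: 'completed' },
    refresh: 'wait_for',
  });
}
```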
```ts
 * This method is automatically called on the Stack Management > Reporting page, by the `` API for users with
 * privilege to manage ILM, to notify them when attention is needed to update the policy for any reason.
 */
public async checkIlmMigrationStatus(): Promise<IlmPolicyMigrationStatus> {
```
Relocated this code from `x-pack/plugins/reporting/server/lib/deprecations/check_ilm_migration_status.ts`
```ts
  return IlmPolicyManager.create({ client });
}

private async createIndex(indexName: string) {
```
No need to create the index or manually apply index settings in Reporting! That is handled in ES now.
/ci
Flaky Test Runner Stats
🎉 All tests passed! - kibana-flaky-test-suite-runner#5999
[✅] x-pack/test/functional/apps/discover/config.ts: 33/33 tests passed.
Data Discovery changes LGTM 👍
Great work!
💚 Build Succeeded
I tested locally in both on prem and serverless. LGTM!!
Summary
Closes #161608
Release Note
Reporting internal storage has been changed from using regular indices to a data stream configuration for a more efficient sharding strategy. This change is not expected to have any impact on users.
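For readers unfamiliar with data streams, the sketch below shows roughly how an index template marks matching indices as a data stream, using the Elasticsearch JS client. The template name, pattern, and mappings are illustrative assumptions; the real Reporting template is managed by Elasticsearch itself.

```ts
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Illustrative template only; not the built-in template ES ships for Reporting.
async function putReportingLikeTemplate() {
  await client.indices.putIndexTemplate({
    name: 'reporting-example',
    index_patterns: ['reporting-example*'],
    data_stream: {}, // indices matching the pattern are created as a data stream
    template: {
      mappings: {
        properties: {
          '@timestamp': { type: 'date' }, // required for data stream documents
          status: { type: 'keyword' },
          jobtype: { type: 'keyword' },
        },
      },
    },
  });
}
```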
Screenshots
Upgrade test (manual process)
Using a report generated before this change, and a report generated after "upgrading":
Even though the two reports are in different types of storage, they are still managed by the same policy:
Looking at the details of the policy shows how the different types of storage are used:
Log lines
Initial startup in clean environment
Kibana restart with ES running continuously
Checklist
See https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/5302 (internal link)