Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify ESQL specific operations using new ESQL runner #484

Merged
merged 2 commits into from
Oct 20, 2023

Conversation

craigtaverner
Copy link
Contributor

Also removed timeout since it was previously only used for _search queries which often took much longer. And made the pragma body a one-liner for easier reading.

@gbanasiak gbanasiak self-requested a review October 19, 2023 10:51
Copy link
Contributor

@gbanasiak gbanasiak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rewrite LGTM. Do we have a rough idea what is the duration of each request?

@craigtaverner
Copy link
Contributor Author

In my local testing of these queries on a 5% ingest using --track-params="ingest_percentage:5" I saw query times up to 5-6 seconds. If the performance is linear, which I would expect, then I presume we'll see 100-120s, right at the timeout threshold we're removing.

@gbanasiak gbanasiak self-requested a review October 20, 2023 09:41
The only new query remaining is esql_stats_enrich_control which runs as fast as all previous enrich queries, presumably because it groups by an existing indexed field.
All the rest of the stats_enrich queries group by an enriched field, which presumably prevents some kind of optimization from being used, because they take literally 100x the time to execute.

Locally the control take about 30ms, while the others take between 3s and 9s.
@craigtaverner
Copy link
Contributor Author

Additional local testing on 5% ingest shows that all existing enrich queries take something close to 30ms on my laptop, and the new esql_stats_enrich_control also takes that time. But the last 7 queries, the remaining esql_stats_enrich_* queries, all take between 3000 and 9000ms, 100x to 300x longer. The key difference between the control (fast) and the rest (slow) is that the control groups by an existing indexed field, while all the rest group by an enriched field. It would seem that this must prevent an optimization from kicking it.

So I've now removed all 7 slow queries from the schedule. I'll discuss with the ESQL team how best to handle this. One option is to bring back just one slow query, give it a very low iteration rate, and very long timeout. If this performance problem can be fixed, then we can bring back the remaining 6.

Copy link
Contributor

@gbanasiak gbanasiak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! LGTM.

@gbanasiak gbanasiak merged commit cbdff9b into elastic:master Oct 20, 2023
11 checks passed
@gbanasiak gbanasiak deleted the esql_simplify branch October 20, 2023 13:06
inqueue pushed a commit to inqueue/rally-tracks that referenced this pull request Dec 6, 2023
* Simplify ESQL specific operations using new ESQL runner

* Removed new stats_enrich queries that take long and cause timeouts

The only new query remaining is esql_stats_enrich_control which runs as fast as all previous enrich queries, presumably because it groups by an existing indexed field.
All the rest of the stats_enrich queries group by an enriched field, which presumably prevents some kind of optimization from being used, because they take literally 100x the time to execute.

Locally the control take about 30ms, while the others take between 3s and 9s.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants