
[BUG] Streaming job still exists even though the query has a syntax error #116

Closed
penghuo opened this issue Oct 30, 2023 · 2 comments
Labels
0.1.1 bug Something isn't working

Comments

penghuo (Collaborator) commented Oct 30, 2023

  • Failed query:
    CREATE INDEX mys3_default_http_logs ON mys3.default.http_logs (status) WITH (auto_refresh = true),query_execution_result_mys3
  • Error log:
    23/10/30 23:44:56 ERROR FlintJob: Fail to write result, cause: Checkpoint location is mandatory for incremental refresh if spark.flint.index.checkpoint.mandatory enabled
    java.lang.IllegalStateException: Checkpoint location is mandatory for incremental refresh if spark.flint.index.checkpoint.mandatory enabled
penghuo added the labels bug (Something isn't working), untriaged, and 0.1.1 on Oct 30, 2023
dai-chen (Collaborator) commented:

Investigated with @kaituo and found the root cause may be https://github.com/opensearch-project/opensearch-spark/blob/main/spark-sql-application/src/main/scala/org/apache/spark/sql/FlintJob.scala#L87

Currently, regardless of whether a streaming job was created (it may not be, due to a syntax or semantic error during creation), we make the main thread wait for any streaming job to terminate. If no streaming job was ever started, this can cause the main thread to wait forever.
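A minimal sketch of the problematic control flow described above (names like `writeResult` and `reportError` are illustrative placeholders, not the actual FlintJob code):

```scala
import org.apache.spark.sql.SparkSession

val spark: SparkSession = SparkSession.builder().getOrCreate()

try {
  // May throw a ParseException / AnalysisException on a bad query,
  // in which case no streaming query is ever registered.
  val data = spark.sql(query)
  writeResult(data) // hypothetical helper
} catch {
  case e: Exception => reportError(e) // hypothetical helper
}

// Reached even when spark.sql(query) threw: with zero active
// streaming queries, this call blocks the main thread forever.
spark.streams.awaitAnyTermination()
```

`StreamingQueryManager.awaitAnyTermination()` waits until any active streaming query terminates; when the set of active queries is empty, there is nothing to terminate and the call never returns, which matches the hang described here.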

kaituo added a commit to kaituo/opensearch-spark that referenced this issue Nov 14, 2023
This commit tackles the issue detailed in opensearch-project#116. Previously, the main thread waited indefinitely for streaming job termination regardless of whether a streaming job was successfully initiated, so a syntax or semantic error during creation left the main thread hanging forever, waiting for a streaming job that had never started.

This update introduces a change: spark.streams.awaitAnyTermination() is now invoked only if the streaming query is executed without encountering any exceptions.

Testing Performed:
* Replicated the original issue to confirm its presence.
* Applied the fix and verified that the issue of indefinite waiting has been successfully resolved.

Signed-off-by: Kaituo Li <[email protected]>
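The fix described in the commit message can be sketched as follows (a simplified illustration, not the exact diff; `writeResult` and `reportError` are hypothetical helpers):

```scala
import org.apache.spark.sql.SparkSession

val spark: SparkSession = SparkSession.builder().getOrCreate()

// Track whether the streaming query started without exceptions.
var streamingStarted = false

try {
  val data = spark.sql(query)
  writeResult(data) // hypothetical helper
  streamingStarted = true
} catch {
  case e: Exception => reportError(e) // hypothetical helper
}

// Only block on streaming termination if a query actually started,
// so a failed query lets the main thread exit cleanly.
if (streamingStarted) {
  spark.streams.awaitAnyTermination()
}
```

Guarding `awaitAnyTermination()` behind a success flag ensures the main thread blocks only when there is an active streaming query whose termination it can actually observe.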
kaituo added a commit that referenced this issue Nov 15, 2023
(Same commit message, testing notes, and sign-off as above.)
kaituo (Collaborator) commented Nov 16, 2023

Resolved in the PR referenced above.

@kaituo kaituo closed this as completed Nov 16, 2023