[Refactor] Decouple Flint index monitor from Flint Spark API layer #435

Open
dai-chen opened this issue Jul 18, 2024 · 0 comments
Labels
maintenance Code refactoring

Is your feature request related to a problem?

Currently, FlintSparkIndexMonitor lives in the Flint Spark integration module and is started automatically by the Flint refresh and recover APIs. This causes the following problems:

  1. Monitoring logic is coupled with the Flint Spark API.
  2. The Flint Spark application code has to bypass Flint SQL and call the awaitMonitor API directly: https://github.com/opensearch-project/opensearch-spark/blob/main/spark-sql-application/src/main/scala/org/apache/spark/sql/JobOperator.scala#L96
  3. The index monitor cannot reuse the error handling code in the application. See: Store error message for streaming job execution in Flint metadata log #433 (comment)

What solution would you like?

I propose moving the index monitor to the Flint Spark application layer. The application code should control the start and stop of the index monitor. This would:

  1. Decouple the monitoring logic from the Flint Spark API.
  2. Allow for more flexible and explicit control of index monitoring within the application code.

Open questions: whether the benefits justify the effort and the associated risks, and how the application code should decide to start monitoring after a Flint index creation statement completes.
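
To make the proposal concrete, here is a rough sketch of what application-owned monitoring could look like. Everything in it is hypothetical: `IndexMonitor`, `RecordingMonitor`, and `JobRunner` are illustrative names, not the existing FlintSparkIndexMonitor API, and the assumption that statement execution returns the name of the index to monitor is only one possible answer to the open question above.

```scala
import scala.collection.mutable
import scala.util.control.NonFatal

// Hypothetical interface for illustration; NOT the actual FlintSparkIndexMonitor API.
trait IndexMonitor {
  def start(indexName: String): Unit
  def stop(indexName: String): Unit
}

// In-memory stand-in that only records calls, so the sketch runs without Spark.
class RecordingMonitor extends IndexMonitor {
  val events: mutable.Buffer[String] = mutable.Buffer.empty[String]
  override def start(indexName: String): Unit = events += s"start:$indexName"
  override def stop(indexName: String): Unit = events += s"stop:$indexName"
}

// Application-layer runner: it owns both the monitor lifecycle and error handling,
// so failures go through the same `onError` path as the rest of the application.
class JobRunner(monitor: IndexMonitor, onError: Throwable => Unit) {

  // Assumption: `execute` runs one statement and returns the name of the index
  // that needs monitoring, or None if the statement did not create one.
  def run(execute: => Option[String]): Unit = {
    var monitored: Option[String] = None
    try {
      monitored = execute
      monitored.foreach(monitor.start)
      // A real application would wait for the streaming job to finish here.
    } catch {
      case NonFatal(e) => onError(e)
    } finally {
      // Stop only what this run started.
      monitored.foreach(monitor.stop)
    }
  }
}
```

With this shape, the Flint Spark API no longer needs to know about monitoring, and the application decides explicitly when monitoring starts and stops. The trade-off is that every application entry point has to remember to wire the monitor in.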

What alternatives have you considered?

  • Keeping the FlintSparkIndexMonitor in the integration module but refactoring the awaitMonitor call mechanism to reduce complexity.
  • Creating a separate monitoring service that can be invoked independently by the Flint Spark application and Flint SQL.

Do you have any additional context?

N/A
