
Fix recover index bug when Flint data index is deleted accidentally #241

Conversation

@dai-chen (Collaborator) commented Feb 1, 2024

Description

This PR addresses two related issues:

  1. Quick fix: the recover index API now cleans up the metadata log entry if the Flint data index is gone. This prevents the index from being stuck in the refreshing state and avoids infinite retries via the recover index API.
  2. Added a check to prevent FlintJob from hanging when the recover statement launches no streaming job.
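The two fixes above can be sketched as follows. This is a minimal Python sketch of the control flow only (the actual implementation is Scala in FlintSpark/FlintJob); `data_index_exists`, `delete_log_entry`, and `start_refresh_job` are hypothetical helpers, not names from the PR:

```python
def recover_index(index_name, data_index_exists, delete_log_entry, start_refresh_job):
    """Sketch of the fixed recover-index flow (hypothetical helper callables).

    Fix 1: if the Flint data index was deleted, remove the stale metadata
    log entry instead of restarting refresh, so the index does not stay
    stuck in the 'refreshing' state and recover attempts do not loop forever.
    Fix 2: report whether a streaming job was actually launched, so the
    caller (FlintJob) only awaits termination when one is running.
    """
    if not data_index_exists(index_name):
        delete_log_entry(index_name)   # clean up orphaned metadata log entry
        return False                   # no streaming job launched
    start_refresh_job(index_name)      # normal path: restart streaming refresh
    return True                        # caller may await termination
```

A caller would then only block on `awaitTermination`-style logic when the function returns `True`, which is what prevents the FlintJob hang described in item 2.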

Documentation

Updated user manual: https://github.com/dai-chen/opensearch-spark/blob/fix-recover-index-for-index-data-deleted/docs/index.md#index-job-management

TODO

  1. Integration test with FlintJob [checking with @kaituo whether an IT can be added]
  2. [BUG] Gracefully terminate index refresh job when Flint index deleted accidentally #244

Testing

Manually tested to confirm the fix. First, replicate the problematic scenario as outlined below:

CREATE SKIPPING INDEX ON stream.lineitem_tiny
(l_shipdate VALUE_SET)
WITH ( auto_refresh = true );

# Delete Flint data index
DELETE flint_myglue_stream_lineitem_tiny_skipping_index

# Check index state metadata log
GET .query_execution_request_myglue/_search

# Response hit (excerpt):
      {
        "_index": ".query_execution_request_myglue",
        "_id": "ZmxpbnRfbXlnbHVlX3N0cmVhbV9saW5laXRlbV90aW55X3NraXBwaW5nX2luZGV4",
        "_score": 1,
        "_source": {
          "version": "1.0",
          "latestId": "ZmxpbnRfbXlnbHVlX3N0cmVhbV9saW5laXRlbV90aW55X3NraXBwaW5nX2luZGV4",
          "type": "flintindexstate",
          "state": "refreshing",
          "applicationId": "unknown",
          "jobId": "unknown",
          "dataSourceName": "myglue",
          "jobStartTime": 1706916312007,
          "lastUpdateTime": 1706916452474,
          "error": ""
        }
      }
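An orphaned entry like the one above is recognizable from its `_source`: the state is still `refreshing` even though the data index behind it is gone. A hedged Python sketch of that check (the function name `is_orphaned_log_entry` is mine, not from the PR; whether the data index exists is passed in as a flag):

```python
def is_orphaned_log_entry(source: dict, data_index_exists: bool) -> bool:
    """Return True if a metadata log entry refers to a deleted Flint data index.

    An entry is orphaned when it is a Flint index-state document still in
    the 'refreshing' state while the data index no longer exists.
    """
    return (
        source.get("type") == "flintindexstate"
        and source.get("state") == "refreshing"
        and not data_index_exists
    )

# The _source shown above, reduced to the relevant fields:
entry = {"type": "flintindexstate", "state": "refreshing", "dataSourceName": "myglue"}
```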

Now verify the enhanced recover index API:

spark-sql> RECOVER INDEX JOB flint_myglue_stream_lineitem_tiny_skipping_index;
24/02/02 23:48:31 WARN FlintSpark: Cleaning up metadata log as index data has been deleted

# The metadata log is gone
GET .query_execution_request_myglue/_search
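This final verification can be automated by asserting that the search returns no hits. A small Python sketch over a parsed `_search` response (the helper name is mine; it handles the `hits.total` object form returned by recent OpenSearch versions as well as a bare integer):

```python
def log_entry_deleted(search_response: dict) -> bool:
    """True if the metadata log index contains no matching documents."""
    total = search_response.get("hits", {}).get("total", 0)
    if isinstance(total, dict):
        # OpenSearch returns total as {"value": N, "relation": "eq"}
        total = total.get("value", 0)
    return total == 0
```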

Issues Resolved

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dai-chen dai-chen added bug Something isn't working 0.2 backport 0.1 labels Feb 1, 2024
@dai-chen dai-chen self-assigned this Feb 1, 2024
@dai-chen dai-chen marked this pull request as ready for review February 2, 2024 23:50
Signed-off-by: Chen Dai <[email protected]>
@dai-chen dai-chen changed the title Fix recover index bug when index data is deleted Fix recover index bug when Flint data index is deleted accidentally Feb 3, 2024
@vmmusings (Member) commented:

@dai-chen RECOVER INDEX JOB flint_myglue_stream_lineitem_tiny_skipping_index;
What is the output of this command? Will it be the same in all cases?

@dai-chen (Collaborator, Author) commented Feb 7, 2024

@dai-chen RECOVER INDEX JOB flint_myglue_stream_lineitem_tiny_skipping_index; What is the output of this command? Will it be the same in all the cases.

We follow Spark DDL conventions and return an empty result on success for all Flint DDL statements.

@dai-chen dai-chen merged commit f4744ab into opensearch-project:main Feb 7, 2024
4 checks passed
@opensearch-trigger-bot commented:
The backport to 0.1 failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/opensearch-spark/backport-0.1 0.1
# Navigate to the new working tree
pushd ../.worktrees/opensearch-spark/backport-0.1
# Create a new branch
git switch --create backport/backport-241-to-0.1
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 f4744abf38caeb4f22758f112a32c8881842efad
# Push it to GitHub
git push --set-upstream origin backport/backport-241-to-0.1
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/opensearch-spark/backport-0.1

Then, create a pull request where the base branch is 0.1 and the compare/head branch is backport/backport-241-to-0.1.

dai-chen added a commit to dai-chen/opensearch-spark that referenced this pull request Feb 7, 2024
…pensearch-project#241)

* Clean up metadata log in recover index API

Signed-off-by: Chen Dai <[email protected]>

* Await termination only if there is streaming job running

Signed-off-by: Chen Dai <[email protected]>

* Update user manual

Signed-off-by: Chen Dai <[email protected]>

---------

Signed-off-by: Chen Dai <[email protected]>
@dai-chen dai-chen deleted the fix-recover-index-for-index-data-deleted branch February 7, 2024 23:48
penghuo pushed a commit that referenced this pull request Feb 14, 2024
… accidentally (#247)

* Fix recover index bug when Flint data index is deleted accidentally (#241)

* Clean up metadata log in recover index API

Signed-off-by: Chen Dai <[email protected]>

* Await termination only if there is streaming job running

Signed-off-by: Chen Dai <[email protected]>

* Update user manual

Signed-off-by: Chen Dai <[email protected]>

---------

Signed-off-by: Chen Dai <[email protected]>

* Cherry pick vacuum index changes

Signed-off-by: Chen Dai <[email protected]>

---------

Signed-off-by: Chen Dai <[email protected]>