[BUG] Materialized View / Covering Index gets stuck in refreshing state #764

Closed
A-Gray-Cat opened this issue Oct 10, 2024 · 11 comments
Labels
bug Something isn't working Core:MV

Comments

@A-Gray-Cat

A-Gray-Cat commented Oct 10, 2024

What is the bug?
After creating an auto-refresh materialized view or covering index, the acceleration stays in the Refreshing state indefinitely and no data is ingested.

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Create a materialized view / covering index with auto_refresh = true.
  2. Go to Data sources -> click the Flint data source -> click Accelerations to monitor the state of the acceleration that was just created.

What is the expected behavior?
The status should reflect the actual state: failed, succeeded, or genuinely still refreshing the index.

What is your host/environment?

  • OS: [e.g. iOS]
  • Version 2.13
  • Plugins


A-Gray-Cat added the bug and untriaged labels on Oct 10, 2024
@YANG-DB
Member

YANG-DB commented Oct 10, 2024

@penghuo @noCharger can you please take a look?

@noCharger
Collaborator

What is the problem, and what is the expected behavior? Both components in this issue are identical.

@A-Gray-Cat
Author

@noCharger Updated the statement. Sorry, I thought "expected behavior" meant the behavior expected when reproducing this issue 😅

@dai-chen
Collaborator

Could you provide more info, such as your CREATE MV/CI statement or the Spark log? The context provided so far makes troubleshooting impossible.

@A-Gray-Cat
Author

A-Gray-Cat commented Oct 16, 2024

@dai-chen

Sure thing. Sorry, I should've included that in the first place. I didn't include a specific example because this issue happens regardless of what's inside the MV/CI statement. In addition, manual refresh works fine for the same statement.

CREATE MATERIALIZED VIEW last_7day_ct_mv AS
    SELECT time_dt,
        actor,
        accountid,
        region,
        src_endpoint,
        api,
        http_request,
        is_mfa,
        class_uid,
        resources
     FROM amazon_security_lake_glue_db_us_east_1.amazon_security_lake_table_us_east_1_cloud_trail_mgmt_2_0
     WHERE time_dt BETWEEN CURRENT_TIMESTAMP - INTERVAL '15' MINUTE AND CURRENT_TIMESTAMP
        AND accountid in ('{account id}')
        AND region = '{region}' 
WITH ( auto_refresh = true, refresh_interval = '15 Minute', checkpoint_location = 's3://{bucket name}/AWSLogs/checkpoint/')
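
For comparison, the manual-refresh path mentioned above can be sketched roughly as follows. This is only a sketch, assuming Flint's `REFRESH MATERIALIZED VIEW` statement; the view name `test_mv_manual_refresh` is hypothetical, and the `{account id}` and `{region}` placeholders are left unfilled as in the statement above.

```sql
-- Hypothetical view name; same SELECT as the auto-refresh statement above.
CREATE MATERIALIZED VIEW test_mv_manual_refresh AS
    SELECT time_dt,
        actor,
        accountid,
        region,
        src_endpoint,
        api,
        http_request,
        is_mfa,
        class_uid,
        resources
     FROM amazon_security_lake_glue_db_us_east_1.amazon_security_lake_table_us_east_1_cloud_trail_mgmt_2_0
     WHERE time_dt BETWEEN CURRENT_TIMESTAMP - INTERVAL '15' MINUTE AND CURRENT_TIMESTAMP
        AND accountid in ('{account id}')
        AND region = '{region}'
-- auto_refresh = false means nothing runs until an explicit REFRESH is issued.
WITH ( auto_refresh = false );

-- Trigger one refresh by hand.
REFRESH MATERIALIZED VIEW test_mv_manual_refresh;
```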

@dai-chen
Collaborator

dai-chen commented Oct 16, 2024

@A-Gray-Cat Thanks for the info!

QQ: is amazon_security_lake_table_us_east_1_cloud_trail_mgmt_2_0 an Iceberg table? FYI, Iceberg support has not been fully tested or released yet.

Meanwhile, could you quickly check the following:

  1. Make sure the checkpoint folder is empty and not reused by a different Flint index.
  2. Remove the WHERE clause and see if it is the root cause. I saw you created [BUG] timestamp comparison using "INTERVAL" literal doesn't work against ingested data #763; did you get it to work here?
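
The first check could be done by dropping the view and recreating it with a checkpoint folder that no other index has used. A minimal sketch, assuming Flint's `DROP MATERIALIZED VIEW` statement; the `test_mv_auto_refresh/` subfolder is a hypothetical name chosen for illustration, and the `WHERE` clause is omitted here only to keep the example short.

```sql
-- Remove the existing view before recreating it.
DROP MATERIALIZED VIEW test_mv_auto_refresh;

-- Recreate it with a dedicated checkpoint folder.
-- Assumption: the per-view subfolder below is hypothetical and must be empty.
CREATE MATERIALIZED VIEW test_mv_auto_refresh AS
    SELECT time_dt,
        actor,
        accountid,
        region
     FROM amazon_security_lake_glue_db_us_east_1.amazon_security_lake_table_us_east_1_cloud_trail_mgmt_2_0
WITH ( auto_refresh = true, refresh_interval = '15 Minute', checkpoint_location = 's3://{bucket name}/AWSLogs/checkpoint/test_mv_auto_refresh/');
```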

@A-Gray-Cat
Author

Thanks for your prompt response, @dai-chen.

Yes, that's an Iceberg table, and I suspect that's probably one of the root causes.

Good reminder that the issue you linked could also affect this. I tried different statements and found that the INTERVAL literal was the cause of the issue. When I replaced it with the timestamp() function, the first run of auto-refresh went through. Interestingly though, manual refresh worked fine with the INTERVAL literal.

Another observation: the same statement (see below), used to create both a manual-refresh and an auto-refresh materialized view, ingested different numbers of documents. After the first run, the auto-refresh materialized view has not added any more documents, and it's stuck in the refreshing state again. I'm wondering if there's a built-in deduplication mechanism at play here.

One question: if I omit the WHERE clause on the timestamp field, will the materialized view ingest all the data from the table every 15 minutes?

CREATE MATERIALIZED VIEW test_mv_auto_refresh AS
    SELECT time_dt,
        actor,
        accountid,
        region,
        src_endpoint,
        api,
        http_request,
        is_mfa,
        class_uid,
        resources
     FROM amazon_security_lake_glue_db_us_east_1.amazon_security_lake_table_us_east_1_cloud_trail_mgmt_2_0
     WHERE time_dt > timestamp('2024-10-15 00:00:00')
        AND accountid in ('{account id}')
        AND region = 'us-east-1' 
WITH ( auto_refresh = true, refresh_interval = '15 Minute', checkpoint_location = 's3://{bucket}/AWSLogs/checkpoint/')

@dai-chen
Collaborator

dai-chen commented Oct 16, 2024

@A-Gray-Cat I see. Just to confirm my understanding:

  1. Creating an MV using INTERVAL ends up with a query error. (If so, I need to reproduce it.)
  2. Creating an auto-refresh MV using the timestamp function succeeds in refreshing once.
    Q1. Was there no new data coming into the MV even after 15 minutes (your refresh interval)?
    Q2. Is the issue gone if we remove time_dt > timestamp('2024-10-15 00:00:00')? (Just to confirm the root cause.)
  3. Creating a manual-refresh MV succeeds with either the INTERVAL literal or the timestamp function.

For your question: yes, the MV will start from the very beginning of the source dataset if we remove the timestamp filtering condition.

Thanks!

@A-Gray-Cat
Author

@dai-chen

  1. Yes, but the MV refresh is not shown as failed.
  2. Removing the timestamp constraints completely can make the returned results exceed the size limit, so I'm not sure that's a good way to test it.
  3. Yes. Manual refresh works.

@A-Gray-Cat
Author

Quick update, @dai-chen:

I created an auto-refresh MV today, and it kept refreshing as expected. Even the INTERVAL literal worked somehow.

The statement I used there tries to retrieve all the logs from the past 1 day, but it doesn't go back far enough to collect all of them. I will open another issue for that.

I think we can close this one since it somehow got resolved. I'm not sure if you already implemented a fix for this. Thanks again for the help!

@dai-chen
Collaborator

@A-Gray-Cat Great to hear it’s finally working! :) Feel free to track the new issue separately. Thanks!
