[Bug] Re-process feature #878

Open
msenechal opened this issue Nov 14, 2024 · 6 comments

@msenechal
Collaborator

The re-processing feature does not seem to work: the file stays stuck in the "Reprocess" status.

Steps to reproduce:

  1. Clear the schema config in Graph Enhancement
  2. Load any document
  3. Go to Graph Enhancement, pull the schema, and make some modifications
  4. Click Reprocess
    The file stays in the "Reprocess" status indefinitely

Backend logs:
It appears to get stuck here:
[INFO]{'api_name': 'retry_processing', 'db_url': 'neo4j+s://a77ed0fa.databases.neo4j.io:7687', 'userName': 'neo4j', 'database': 'neo4j', 'file_name': 'ms.pdf', 'retry_condition': 'delete_entities_and_start_from_beginning', 'logging_time': '2024-11-14 12:17:48 UTC'}
2024-11-14 12:17:49,039 - <src.entities.source_node.sourceNode object at 0x39effed10>
Base Param value 1 : {'props': {'fileName': 'ms.pdf', 'status': 'Reprocess', 'nodeCount': 0, 'relationshipCount': 0, 'is_cancelled': False, 'processed_chunk': 0, 'retry_condition': 'delete_entities_and_start_from_beginning'}}
2024-11-14 12:17:49,039 - Update source node properties
INFO: 127.0.0.1:50138 - "POST /retry_processing HTTP/1.1" 200 OK

Nothing is logged after this last line and nothing happens; the entities are not regenerated.

@msenechal msenechal added the bug Something isn't working label Nov 14, 2024
@karanchellani
Collaborator

Hi @msenechal, this seems to be an intermittent issue. Can you please check with some more examples?

@msenechal
Collaborator Author

Hi, it doesn't look intermittent on my side; it always happens, on any document type (see the attached screenshot). I tried different PDFs, Wikipedia, URLs, YouTube, etc.
I tried one re-process at a time, waiting in between, and you can see in the logs that nothing happens after "Update source node properties":
image

Logs for trying the re-process on 4 files, with delays between re-process:

backend | 2024-11-21 11:09:07,768 - <src.entities.source_node.sourceNode object at 0xfffeffe38c40>
backend | 2024-11-21 11:09:07,769 - Update source node properties
backend | 2024-11-21 11:09:25,025 - <src.entities.source_node.sourceNode object at 0xfffeffc1e6b0>
backend | 2024-11-21 11:09:25,025 - Update source node properties
backend | 2024-11-21 11:13:30,225 - <src.entities.source_node.sourceNode object at 0xfffefb3882e0>
backend | 2024-11-21 11:13:30,225 - Update source node properties
backend | 2024-11-21 12:13:49,858 - <src.entities.source_node.sourceNode object at 0xfffeffe3a200>
backend | 2024-11-21 12:13:49,860 - Update source node properties

@karanchellani
Collaborator

"Reprocess" is a state, after this you need to click on Generate Graph Button to process the files. May be we need to change the label to "Ready to Reprocess" to avoid the confusion.

@msenechal
Collaborator Author

Ahhhh yeah, let's either change it to "Ready to Reprocess" or run the processing automatically, because right now it is not intuitive for users to click Generate Graph when they have already clicked Reprocess.

@kartikpersistent
Collaborator

@jexp
Which would be better: changing the state to "Ready to Reprocess", or reprocessing immediately when Save is clicked?

@jexp
Contributor

jexp commented Nov 22, 2024

Change the status name, as you might want to change the model and schema and then reprocess multiple files.
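
A rough sketch of why keeping the marking step separate can be useful: several files are marked first, then a single generate pass picks them all up with whatever model and schema are selected at that point. All function names, the "Completed" label, the model name, the schema labels, and the extra file names are hypothetical; only `ms.pdf` and the proposed "Ready to Reprocess" label come from this thread.

```python
READY = "Ready to Reprocess"  # label proposed in this thread
COMPLETED = "Completed"       # hypothetical label for a finished file


def mark_for_reprocess(statuses: dict[str, str], file_names: list[str]) -> None:
    """Mark files only; nothing is processed yet."""
    for name in file_names:
        statuses[name] = READY


def generate_graph(statuses: dict[str, str], model: str, schema: list[str]) -> list[str]:
    """Process every marked file in one pass with the current model and schema."""
    processed = []
    for name, status in statuses.items():
        if status == READY:
            # ... extraction with `model` and `schema` would run here ...
            statuses[name] = COMPLETED
            processed.append(name)
    return processed


statuses = {"ms.pdf": COMPLETED, "notes.pdf": COMPLETED, "report.pdf": COMPLETED}
mark_for_reprocess(statuses, ["ms.pdf", "report.pdf"])
# The user can still change the model or schema here, before anything runs.
processed = generate_graph(statuses, model="model-a", schema=["Person", "Company"])
```

With immediate reprocessing on save, each file would start with whatever settings were active at that moment, so a batch like this would need one run per change.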
