BUG: Datasets disappear from the status page before they are completed #3846

Open
Rosencrantz opened this issue Aug 12, 2024 · 0 comments
Labels
bug Things that should work, but don’t

Describe the bug
When uploading a large dataset whose large raw documents each produce only a small number of tasks, there is a high chance of a gap during which the dataset has no pending tasks: the existing tasks have finished and no new tasks have started. A dataset with no pending tasks is treated as completed and is removed from the status page, which resets all of its metrics. Once the next document has been uploaded, the dataset is added to the status page again, but it looks as though this is a new 'job'.

Expected behavior
Ideally, a dataset that is in progress remains on the status page until all of its documents have been uploaded. Adding a delay before removal may be enough to deal with 80% of these cases, namely those where the next document is ready shortly (less than a minute?) after the previous set of tasks has completed.
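
To make the suggested delay concrete, here is a minimal sketch of a grace period before removal. It is not Aleph's actual code: `DatasetStatus`, its fields, and `GRACE_PERIOD_SECONDS` are hypothetical names, and the 60-second value is only taken from the "less than a minute?" guess above.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: 60 seconds, based on the "less than a minute?" guess in this issue.
GRACE_PERIOD_SECONDS = 60.0


@dataclass
class DatasetStatus:
    """Hypothetical per-dataset bookkeeping; not Aleph's real data model."""

    pending: int = 0
    finished: int = 0
    idle_since: Optional[float] = None  # time at which pending last dropped to zero

    def task_added(self) -> None:
        self.pending += 1
        self.idle_since = None  # new work arrived, so the dataset is active again

    def task_done(self, now: float) -> None:
        self.pending -= 1
        self.finished += 1
        if self.pending == 0:
            self.idle_since = now  # start the grace period

    def is_visible(self, now: float) -> bool:
        """Keep the dataset listed while tasks are pending or the grace period runs."""
        if self.pending > 0:
            return True
        if self.idle_since is None:
            return False  # the dataset never had any work
        return now - self.idle_since < GRACE_PERIOD_SECONDS
```

With this approach, a dataset whose next document arrives within the grace period never leaves the status page, so counters such as `finished` carry over instead of being reset. It would not help when the gap between documents is longer than the grace period, which is the remaining share of cases mentioned above.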

Aleph version
All

@Rosencrantz added the bug and triage labels and removed the triage label on Aug 12, 2024
@catileptic catileptic self-assigned this Aug 12, 2024