Problem: Difficult to tell when tasks fail in successful workflows #987

Open
fiver-watson opened this issue Jul 9, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@fiver-watson
Contributor

Description

Right now, a Preservation action workflow can complete successfully even though some of its Archivematica tasks have failed - for example, the task that parses a premis.xml file included in a PIP and writes it to the AIP METS file.

Because Archivematica outputs so many tasks, a single error can easily be overlooked in the long list (generally over 100) of tasks in an AIP creation workflow. There is also no indication at the Preservation action header level that any of the tasks below have failed, so archivists can easily miss critical information when evaluating the success of an AIP creation workflow.

This is compounded by the fact that the Package Statuses legend, an expandable section on the Packages browse page, currently defines the "Done" status as:

The current workflow or task has completed without errors.

(emphasis on "without errors" added).

Combined, this means that potentially business-critical preservation tasks can fail without the user ever noticing, and the user may continue to preserve content despite undesirable errors in the AIP creation process.

We should clarify the wording of the Done status, and better highlight when there are non-critical errors in the tasks of a given workflow, so archivists can decide for themselves whether this invalidates the AIP or can be safely ignored.

To reproduce

  1. Use enduro main branch no later than commit 5763d35ebf1df28a5a827411509e39c6549aa5b1
  2. Run a Vecteur SIP or AIP through
  3. Inspect the results - check that the status is "Done"
  4. Expand the preservation actions, scroll down to around task 33, and notice that the "Load PREMIS events from metadata/premis.xml" task has failed
  5. Return to the Packages browse page, expand the Statuses legend, and read the definition of the "Done" status

Resulting error

  • There is a task error buried in the many tasks of the AIP creation workflow
  • It is easy to miss amid all the other tasks
  • There is no indication of this failure elsewhere on the Package details page
  • The Statuses legend defines the Done status as a workflow that completes "without errors", yet it has been applied to a workflow that included an error

Expected behavior

  • The "Done" status definition should be updated to remove the "without errors" condition
  • A task error should be easier to find amid many other successful tasks
  • An indication that one or more errors have occurred in a workflow that otherwise completed successfully should be visible on the Package details page, without needing to expand the relevant workflow tasks and find the specific error

Additional context and proposed resolution

I propose the following changes:

  • Revise the Done status definition
  • Add a new ⚠️ status that can be combined with other statuses (e.g. Done, In Progress, Pending) to indicate that a task has errored but the workflow continues
  • When a task errors, highlight the entire task row in the preservation actions in red, similar to how Archivematica microservices appear when they error
  • Use the new ⚠️ status icon wherever relevant, i.e. wherever the Done status would normally be shown
  • On hover, show a count of task errors in the relevant workflow
  • Include a count of task errors in the summary information included below a Preservation action's header
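
To make the proposal above more concrete, here is a minimal sketch of how the ⚠️ indicator and the task error count could be derived from a workflow's tasks. This is not Enduro's actual data model or code: the `Status`, `Task`, and `PreservationAction` types, the function names, and the "Error" task status are all hypothetical and are used only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List

# Warning sign followed by the emoji variation selector.
WARNING = "\u26a0\ufe0f"


class Status(Enum):
    # Hypothetical status names, loosely based on those mentioned in this issue.
    PENDING = "Pending"
    IN_PROGRESS = "In Progress"
    DONE = "Done"
    ERROR = "Error"


@dataclass
class Task:
    name: str
    status: Status


@dataclass
class PreservationAction:
    name: str
    status: Status
    tasks: List[Task] = field(default_factory=list)


def count_task_errors(action: PreservationAction) -> int:
    """Count the tasks that errored, regardless of the workflow's own status."""
    return sum(1 for task in action.tasks if task.status is Status.ERROR)


def display_status(action: PreservationAction) -> str:
    """Append the warning icon when a workflow that did not fail has task errors."""
    if action.status is not Status.ERROR and count_task_errors(action) > 0:
        return f"{action.status.value} {WARNING}"
    return action.status.value


def hover_text(action: PreservationAction) -> str:
    """Text for the hover state and the summary below the action header."""
    n = count_task_errors(action)
    return f"{n} task error" if n == 1 else f"{n} task errors"


# Example: a Create AIP workflow that finished, but with one failed task.
action = PreservationAction(
    name="Create AIP",
    status=Status.DONE,
    tasks=[
        Task("Load PREMIS events from metadata/premis.xml", Status.ERROR),
        Task("Another task", Status.DONE),  # placeholder task name
    ],
)
print(display_status(action))
print(hover_text(action))
```

Keeping the warning as a modifier derived from the tasks, rather than as a separate workflow status, means it can be combined with Done, In Progress, or Pending without changing what those statuses mean.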

Here are the proposed changes to the Status legend and the definitions. You can also see an example of the new ⚠️ status being included on an otherwise successful status in the table below the legend:

enduro-pkg-statuses-legend-w-warning

Here is a package details page, showing a Create AIP preservation action that includes an error:

enduro-pres-actions-w-warning

Here is the same page, when hovering over the ⚠️ icon:

enduro-pres-actions-w-warning-hover

@fiver-watson fiver-watson added the bug Something isn't working label Jul 9, 2024
Projects
Status: 🛠 Refining
Development

No branches or pull requests

1 participant