Currently, OCR workflows must be installed into Production in advance by placing the `ocrd process` script files into `./kitodo/data/ocr_workflows` with a `.sh` suffix, and then configuring them in the project settings (thereby tying them to new processes).
But what if a user wants to use a different OCR workflow for some processes in a project, or change the workflow for existing processes (because it failed, did not run through, or produced poor results)?
For now, one would need to edit the file `ocr-workflow.sh` in the process directory, and re-trigger the OCR script. (OCR processing itself is already incremental, so the workflow will then continue to build whatever is still necessary or out-of-date.) But that is tedious and requires access to the file system (Manager share).
The user experience could be much better if we made workflows configurable on the web pages of the Monitor. Crucially, we should allow editing and re-running OCR workflows:
- create a volume for `kitodo/data/ocr_workflows` to be shared by Production, Manager and Monitor
- add an endpoint (and reference it on the index page) for listing existing workflows
- make workflows editable (in a simple text form field, perhaps with syntax highlighting), and create a new version when saving
- in the workspace view, make workspaces multi-selectable and add an action button for (re-)processing with a selectable workflow
- in the job view, add an action button for re-processing with a selectable workflow
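
As a rough illustration of the listing step, here is a minimal sketch of the lookup such an endpoint could wrap. It assumes only what is stated above (workflows are `.sh` files in the shared `ocr_workflows` directory); the function name `list_workflows` is hypothetical and nothing here reflects the Monitor's actual code.

```python
from pathlib import Path


def list_workflows(directory):
    """Return the sorted names of all *.sh files directly inside directory.

    Hypothetical helper: the directory would be the shared
    kitodo/data/ocr_workflows volume as seen from the Monitor.
    """
    root = Path(directory)
    if not root.is_dir():
        # A missing or unmounted volume yields an empty list instead of an error.
        return []
    return sorted(p.name for p in root.glob("*.sh") if p.is_file())
```

A web handler would only need to serialize this list; the same names could feed the workflow selector of the (re-)processing action buttons.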
So if a task cannot be finished because the OCR workflow failed (which in the future could also mean that it did not meet the configured quality threshold), one would manually trigger said re-processing.
We could even provide a null workflow that will always fail and therefore force you to choose your custom workflow dynamically (per-process).
Saved workflows could also be version-controlled. The workflows should have a free-form description, but their file name should be a hash of their (non-comment, non-whitespace) content.
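
To make the naming idea concrete, here is a minimal sketch of deriving the file name from the content. Several details are assumptions, not decisions made in this issue: SHA-256 as the hash, treating only full-line `#` comments as comments (trailing comments are not handled), and dropping all whitespace, which means two scripts differing only in spacing inside a line get the same name. The function name `workflow_file_name` is hypothetical.

```python
import hashlib


def workflow_file_name(script_text: str, suffix: str = ".sh") -> str:
    """Derive a file name from the non-comment, non-whitespace content."""
    significant = []
    for line in script_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and full-line comments (this also skips a shebang).
        if not stripped or stripped.startswith("#"):
            continue
        # Drop all whitespace so that re-indenting does not change the hash.
        significant.append("".join(stripped.split()))
    # Assumption: SHA-256; any stable digest would serve the same purpose.
    digest = hashlib.sha256("\n".join(significant).encode("utf-8")).hexdigest()
    return digest + suffix
```

With this scheme, saving a workflow after editing only its comments or description would map to the same file, while any change to the processing steps would produce a new version.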
Also, the Manager should collect statistics about all workflows (which ones ran how often and with what success or quality level), so the Monitor can show them.
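
A minimal sketch of the kind of aggregation this implies, assuming the Manager can produce one `(workflow_name, succeeded)` record per finished job. That record shape, the `WorkflowStats` class and `collect_stats` are all hypothetical; a quality level could be added as another field once it is defined.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class WorkflowStats:
    runs: int = 0
    successes: int = 0

    @property
    def success_rate(self) -> float:
        # Avoid division by zero for workflows that never ran.
        return self.successes / self.runs if self.runs else 0.0


def collect_stats(job_records):
    """Aggregate (workflow_name, succeeded) pairs into per-workflow counters."""
    stats = defaultdict(WorkflowStats)
    for workflow, succeeded in job_records:
        entry = stats[workflow]
        entry.runs += 1
        if succeeded:
            entry.successes += 1
    return dict(stats)
```

The Monitor could then render one row per workflow with its run count and success rate.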