Change queue job fails on missing content file #70

Open
geoffroy-noel-ddh opened this issue Sep 10, 2024 · 3 comments
Labels: bug (Something isn't working), SHOULD

Comments

@geoffroy-noel-ddh
Member

geoffroy-noel-ddh commented Sep 10, 2024

The change-queue job running on GitHub is failing because of a missing file:

jeff@j3470:~/src/prj/crossreads/tools$ node run-change-queue.mjs 
../annotations/http-sicily-classics-ox-ac-uk-inscription-isic001473-isic001473-jpg.json
../annotations/https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001447-isic001447-jpg.json
file:///home/jeff/src/prj/crossreads/tools/run-change-queue.mjs:34
      for (let annotation of content) {

jeff@j3470:~/src/prj/crossreads/tools$ grep -rin 'api-dts' ../annotations/change-queue.json 
20:          "file": "https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001447-isic001447-jpg.json"
33:          "file": "https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001408-isic001408-jpg.json"
37:          "file": "https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001408-isic001408-jpg.json"
41:          "file": "https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic030002-isic001408-jpg.json"
49:          "file": "https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic030002-isic001408-jpg.json"
61:          "file": "https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001445-isic001445-jpg.json"
65:          "file": "https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001464-isic001464-jpg.json"
69:          "file": "https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001445-isic001445-jpg.json"
73:          "file": "https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001471-isic001471-jpg.json"
77:          "file": "https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001471-isic001471-jpg.json"
81:          "file": "https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001471-isic001471-jpg.json"

11 of the 36 changes in the queue use a different filename format, one containing -api-dts. Why?

    {
      "annotations": [
        {
          "id": "https://crossreads.web.ox.ac.uk/annotations/ac784b4f-1569-4a9b-8f69-30935a3a1964",
          "file": "http-sicily-classics-ox-ac-uk-inscription-isic001473-isic001473-jpg.json"
        }
      ],
      "tags": [
        "one-mid-bar"
      ],
      "creator": "https://api.github.com/users/simonastoyanova",
      "created": "2024-09-05T07:24:28.844Z"
    },
    {
      "annotations": [
        {
          "id": "https://crossreads.web.ox.ac.uk/annotations/b2aab90c-c28c-4dc2-96e0-b066d6d97028",
          "file": "https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001447-isic001447-jpg.json"
        }
      ],
      "tags": [
        "one-mid-bar"
      ],
      "creator": "https://api.github.com/users/simonastoyanova",
      "created": "2024-09-05T07:25:00.040Z"
    },
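The same count can be taken from the parsed queue rather than from grep. This is a minimal sketch, assuming the changes are available as an array of objects shaped like the two entries above; `groupFileRefs` and its `marker` parameter are hypothetical names, not part of the project's tools.

```javascript
// Hypothetical helper: split the file references in a list of changes
// according to whether the file name contains a marker substring.
function groupFileRefs(changes, marker = '-api-dts-') {
  const groups = { withMarker: [], withoutMarker: [] };
  for (const change of changes) {
    // Each change lists one or more { id, file } references.
    for (const ref of change.annotations ?? []) {
      const key = String(ref.file).includes(marker) ? 'withMarker' : 'withoutMarker';
      groups[key].push(ref.file);
    }
  }
  return groups;
}

// The two changes quoted above, reduced to the fields the helper reads.
const sampleChanges = [
  {
    annotations: [
      { file: 'http-sicily-classics-ox-ac-uk-inscription-isic001473-isic001473-jpg.json' },
    ],
  },
  {
    annotations: [
      { file: 'https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001447-isic001447-jpg.json' },
    ],
  },
];

const sampleGroups = groupFileRefs(sampleChanges);
console.log(sampleGroups.withMarker.length, sampleGroups.withoutMarker.length);
```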

That annotation exists in this file instead:

jeff@j3470:~/src/prj/crossreads/tools$ l ../annotations/*1447*
-rw-rw-r-- 1 jeff jeff 20K Apr 10 00:21 ../annotations/http-sicily-classics-ox-ac-uk-inscription-isic001447-isic001447-jpg.json
  • https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001447-isic001447-jpg.json (WRONG)
  • http-sicily-classics-ox-ac-uk-inscription-isic001447-isic001447-jpg.json (CORRECT)

Annotation in the correct file:

  {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "body": [
      {
        "type": "TextualBody",
        "purpose": "describing",
        "format": "application/json",
        "value": {
          "script": "greek-1",
          "components": {
            "apex": {
              "features": [
                "rounded"
              ]
            },
            "bottom-bar": {
              "features": [
                "curved",
                "sans-serif",
                "diagonal",
                "touching",
                "below-baseline"
              ]
            },
            "middle-bar": {
              "features": [
                "diagonal",
                "sans-serif",
                "touching",
                "curved"
              ]
            },
            "top-bar": {
              "features": [
                "sans-serif",
                "curved",
                "diagonal",
                "touching"
              ]
            }
          },
          "tags": null,
          "character": "Σ"
        }
      }
    ],
    "target": [
      {
        "source": "https://apheleia.classics.ox.ac.uk/iipsrv/iipsrv.fcgi?IIIF=/inscription_images/ISic001447/ISic001447_tiled.tif",
        "selector": {
          "type": "FragmentSelector",
          "conformsTo": "http://www.w3.org/TR/media-frags/",
          "value": "xywh=pixel:212.23928833007812,111.04105377197266,292.1890869140625,164.00289154052734"
        }
      },
      {
        "source": "https://crossreads.web.ox.ac.uk/api/dts/documents?id=ISic001447",
        "selector": {
          "type": "XPathSelector",
          "value": "//*[@xml:id='BsΤAe']",
          "refinedBy": {
            "type": "TextPositionSelector",
            "start": 0,
            "end": 1
          }
        }
      }
    ],
    "id": "https://crossreads.web.ox.ac.uk/annotations/b2aab90c-c28c-4dc2-96e0-b066d6d97028",
    "generator": "https://github.com/kingsdigitallab/crossreads#2023-09-01-00",
    "creator": "https://api.github.com/users/simonastoyanova",
    "created": "2023-11-22T17:11:44.697Z",
    "modifiedBy": "https://api.github.com/users/simonastoyanova",
    "modified": "2023-11-22T17:13:03.988Z"
  },

It looks like the search page built the file reference in the change queue from the textual source rather than the image source. Most likely this comes from the wrong assumption that the second target is always the image.
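If target order does turn out to be the cause, one way to avoid depending on it is to select targets by selector type. The sketch below assumes that image targets always carry a `FragmentSelector` and text targets an `XPathSelector`, as in the annotation above; the helper names are hypothetical and not taken from the crossreads code.

```javascript
// Hypothetical helper: find a target by its selector type, not its position.
function findTargetBySelectorType(annotation, selectorType) {
  const targets = Array.isArray(annotation.target)
    ? annotation.target
    : [annotation.target];
  const match = targets.find(
    (t) => t && t.selector && t.selector.type === selectorType
  );
  return match ?? null;
}

// Assumption: the image target is the one with a FragmentSelector.
function getImageSource(annotation) {
  return findTargetBySelectorType(annotation, 'FragmentSelector')?.source ?? null;
}

// Assumption: the text target is the one with an XPathSelector.
function getTextSource(annotation) {
  return findTargetBySelectorType(annotation, 'XPathSelector')?.source ?? null;
}

// Targets copied from the annotation above, reduced to the relevant fields.
const sampleAnnotation = {
  target: [
    {
      source: 'https://apheleia.classics.ox.ac.uk/iipsrv/iipsrv.fcgi?IIIF=/inscription_images/ISic001447/ISic001447_tiled.tif',
      selector: { type: 'FragmentSelector' },
    },
    {
      source: 'https://crossreads.web.ox.ac.uk/api/dts/documents?id=ISic001447',
      selector: { type: 'XPathSelector' },
    },
  ],
};

console.log(getImageSource(sampleAnnotation));
console.log(getTextSource(sampleAnnotation));
```

Both lookups return the same result if the two targets are swapped, which is the point of the design.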

@geoffroy-noel-ddh geoffroy-noel-ddh added the bug Something isn't working label Sep 10, 2024
@geoffroy-noel-ddh geoffroy-noel-ddh self-assigned this Sep 10, 2024
@geoffroy-noel-ddh
Member Author

geoffroy-noel-ddh commented Sep 10, 2024

The search page reconstructs the filename using getAnnotationFileNameFromItem(item), which builds it from the doc and img paths in the annotation.

That method is wrong. It should be the same as the one used by the annotator, which builds the filename from the DTS member id and the image file name in the TEI.

  • Q1. How can we get that input at that point in the search?
  • Q2. Why does the wrong method work in some cases and not in others?
  • Q3. How can we fix the paths in the change queue?

Answer 2

At some point we switched the reference to the text from a link on the sicily domain to the DTS request path for that document. The old reference pattern and the DTS object ID share the same substring, which is why the wrong method still works with older annotations.

$ grep -rin 'source":' annotations/ | grep -v 'IIIF'

annotations/http-sicily-classics-ox-ac-uk-inscription-isic000176-isic000176-jpg.json:51:        "source": "http://sicily.classics.ox.ac.uk/inscription/ISic000176.xml",
annotations/http-sicily-classics-ox-ac-uk-inscription-isic000186-isic000186-jpg.json:61:        "source": "http://sicily.classics.ox.ac.uk/inscription/ISic000186.xml",
annotations/http-sicily-classics-ox-ac-uk-inscription-isic020300-isic020300-jpg.json:36:        "source": "https://crossreads.web.ox.ac.uk/api/dts/documents?id=ISic020300",
annotations/http-sicily-classics-ox-ac-uk-inscription-isic020300-isic020300-jpg.json:108:        "source": "https://crossreads.web.ox.ac.uk/api/dts/documents?id=ISic020300",
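To illustrate why the two source patterns lead to different file names, here is a rough sketch. It is not the project's getAnnotationFileNameFromItem; `slugify` and `buildFileName` are hypothetical, and the rule they apply (drop a trailing .xml, lowercase, turn every run of non-alphanumeric characters into a hyphen) is only an assumption that happens to reproduce the file names seen in this issue.

```javascript
// Hypothetical slug rule: lowercase, non-alphanumeric runs become '-'.
function slugify(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

// Hypothetical reconstruction of a file name from a text reference
// and an image file name.
function buildFileName(docRef, imageName) {
  const doc = docRef.replace(/\.xml$/i, ''); // old-style links end in .xml
  return `${slugify(doc)}-${slugify(imageName)}.json`;
}

// Old-style text source: yields the name of the file that exists on disk.
const fromOldSource = buildFileName(
  'http://sicily.classics.ox.ac.uk/inscription/ISic001447.xml',
  'ISic001447.jpg'
);

// New-style DTS source: yields the invalid name found in the change queue.
const fromDtsSource = buildFileName(
  'https://crossreads.web.ox.ac.uk/api/dts/documents?id=ISic001447',
  'ISic001447.jpg'
);

console.log(fromOldSource);
console.log(fromDtsSource);
```

Under that assumption, any annotation whose text target uses the DTS request path produces an -api-dts- file name, while older annotations produce the expected one.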

Answer 3

See the fix below: I've made the change queue script tolerant of invalid references, so there is no need to update the change queue itself.
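One possible shape for such tolerance is sketched below. This is not the actual fix: it assumes the annotation files have already been loaded into a plain object keyed by file name, and `resolveAnnotationFile` and `resolveChange` are hypothetical names. When the referenced file is missing or does not contain the annotation, the sketch falls back to looking the annotation id up in the other files, and skips the reference if nothing matches.

```javascript
// files: plain object mapping a file name to its array of annotations
// (a stand-in for the contents of the annotations folder).
function resolveAnnotationFile(ref, files) {
  const contains = (name) =>
    Array.isArray(files[name]) && files[name].some((a) => a.id === ref.id);
  // Trust the reference when it points at a file holding the annotation.
  if (contains(ref.file)) return ref.file;
  // Otherwise fall back to a lookup by annotation id across all files.
  return Object.keys(files).find(contains) ?? null;
}

// Resolve every reference in a change; unresolved ones are reported
// instead of making the whole job fail.
function resolveChange(change, files) {
  const resolved = [];
  const skipped = [];
  for (const ref of change.annotations ?? []) {
    const file = resolveAnnotationFile(ref, files);
    if (file === null) skipped.push(ref);
    else resolved.push({ id: ref.id, file });
  }
  return { resolved, skipped };
}

const annotationId =
  'https://crossreads.web.ox.ac.uk/annotations/b2aab90c-c28c-4dc2-96e0-b066d6d97028';
const loadedFiles = {
  'http-sicily-classics-ox-ac-uk-inscription-isic001447-isic001447-jpg.json': [
    { id: annotationId },
  ],
};
const queuedChange = {
  annotations: [
    {
      id: annotationId,
      file: 'https-crossreads-web-ox-ac-uk-api-dts-documents-id-isic001447-isic001447-jpg.json',
    },
  ],
  tags: ['one-mid-bar'],
};

const outcome = resolveChange(queuedChange, loadedFiles);
console.log(outcome.resolved, outcome.skipped);
```

Because annotation ids are unique, the lookup by id cannot send a tag to the wrong annotation even when the file name in the queue is invalid.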

Answer 1

Resolution for that is still pending.

@simonastoyanova
Collaborator

@geoffroy-noel-ddh Interesting, the two files 1447 and 1473 were annotated a while back. Are they the only files that throw up this issue? The majority of the archaic texts were done at the same time. Let me know if you need me to test anything.

@geoffroy-noel-ddh
Member Author

geoffroy-noel-ddh commented Sep 16, 2024

Hi Simona, the change queue (i.e. your bulk tag edits from the search page) has now been processed successfully. The error shouldn't occur any more, as I have made the automated script tolerant of invalid references. I'll soon fix the search so it no longer produces those invalid references. (Although invalid, they still uniquely identify the right inscription and annotation; only the file name format is wrong, so there is no risk of tags going to the wrong place.)

If you want to help, just check that the change queue message says 'no change(s) pending' (see below) at least once a day and the tags you've applied are indeed reflected in the annotator (just check a few). If you notice anything wrong, please let me know. Thank you.

image
