Interlinear alignment off in .eaf stories #14

maksymilian-dabkowski · 2020-06-05T17:24:26Z

sciepsilon · 2020-09-26T02:29:59Z

The story shown above is "12 de diciembre", viewable here. The problem seems consistent throughout that story.

It might be a general problem with how LingView handles ELAN files in preprocessing/preprocess_eaf.js, but the demo ELAN file is just fine. Maybe those two files were processed by slightly different versions of LingView, although their most recent Build & Deploy ran at about the same time ("last month") for both of them.

When I open the "12 de diciembre" file in ELAN, it looks correct, with the transcription spanning several shorter annotations. The translation is in a "symbolic association" (untimed, 1-to-1) relationship with the transcription, and the morphemes and their glosses are in a "symbolic subdivision" (untimed, many-to-1) relationship with the transcription.

We actually have very few examples of ELAN files with morpheme breakdowns in LingView right now. At this moment, the Yucatec Maya site uses only ELAN files and very few of them have morpheme breakdowns; most just have a transcription and free translation. The Cofan site uses only FLEx files.

sciepsilon · 2020-10-23T06:54:33Z

Waait a minute... the ELAN screenshot (above) shows that there are actually two non-divided copies of "Tene' in k'aaba'e' Maricruz Kuyoc." and one non-divided Spanish translation. This is exactly what is displayed by LingView. So I think there's no LingView bug here; the fix is just to reorder or delete tiers from the ELAN file so that it looks better on the LingView site.

elisharf · 2020-11-16T17:54:03Z

Should we close this issue, or should we leave it open to perhaps add some code in preprocess_eaf.js to detect and fix issues in ELAN files that might cause formatting problems in LingView?

E.g. do you think we should make sure that aligned tiers are right next to each other?

sciepsilon · 2020-11-16T20:10:11Z

We should fix the ELAN file and then close this issue. You can reorder or hide tiers in ELAN by right-clicking on the tier name, if I remember correctly. When you save, those changes will be recorded in the .pfsx file that LingView reads to determine tier order.

We can also explore changing LingView's default tier ordering, but I'm not sure how much that would help in this case. Feel free to explore it if you think there's something here or if it seems like a fun way to get familiar with the codebase.

Duplicate tiers are a large part of why this story looks "wrong", but I wouldn't want LingView to overzealously or mysteriously hide duplicates.

Putting aligned tiers next to each other would be a very nice default, but unless there's a formal relationship between the tiers that says they're aligned, it could be hard for LingView to detect aligned tiers. Some kind of best-effort "check the first three timestamps and assume from there" might actually work well enough, especially since the user can override the ordering by including a .pfsx file. I'm sure exhaustive checking is possible, but it would likely make the preprocessing step slower, and we'll have to weigh the benefit against the cost.
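To make the "check the first three timestamps" idea concrete, here is a rough sketch. It assumes each tier is available as an array of `{ start, end }` objects in milliseconds; that shape and the name `looksAligned` are hypothetical and are not taken from preprocess_eaf.js.

```javascript
// Hypothetical tier shape: an array of { start, end } objects (milliseconds).
// This is an assumed format for illustration, not LingView's internal one.

// Best-effort check: compare only the first `sampleSize` annotations of two
// tiers and treat the tiers as aligned if those timestamps all match.
function looksAligned(tierA, tierB, sampleSize = 3) {
  const n = Math.min(sampleSize, tierA.length, tierB.length);
  if (n === 0) return false; // an empty tier gives us nothing to compare
  for (let i = 0; i < n; i++) {
    if (tierA[i].start !== tierB[i].start || tierA[i].end !== tierB[i].end) {
      return false;
    }
  }
  return true;
}

// Example: the two tiers agree on the first three annotations only.
const transcription = [
  { start: 0, end: 100 },
  { start: 100, end: 250 },
  { start: 250, end: 400 },
  { start: 400, end: 500 },
];
const translation = [
  { start: 0, end: 100 },
  { start: 100, end: 250 },
  { start: 250, end: 400 },
  { start: 400, end: 999 },
];

const guess = looksAligned(transcription, translation);        // samples 3
const stricter = looksAligned(transcription, translation, 4);  // samples 4
```

The example also shows the trade-off: sampling three annotations guesses "aligned" even though the fourth differs, while a larger sample catches the mismatch at the cost of more comparisons. A wrong guess would only affect the default ordering, since a .pfsx file can still override it.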

We might also be able to prettify the default ordering just by looking at how many subdivisions there are (which LingView stores as the "num_slots" property), although I'm not sure what the right rule would be. Most slots to fewest? No, that puts the morphemes tier above the words tier. Fewest to most? No, that puts free glosses at the top, when they ideally should go at the bottom.
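A quick sketch of why neither plain sort works. The tier objects, names, and slot counts below are made up for illustration; only the "num_slots" property name comes from LingView, and where exactly it lives in the preprocessed data is an assumption here.

```javascript
// Hypothetical tiers for one sentence. The slot counts are invented;
// only the property name "num_slots" is taken from LingView.
const tiers = [
  { name: "words", num_slots: 4 },
  { name: "morphemes", num_slots: 7 },
  { name: "free gloss", num_slots: 1 },
];

// Copy before sorting so the original order is left untouched.
const byMostSlots = [...tiers]
  .sort((a, b) => b.num_slots - a.num_slots)
  .map((t) => t.name);

const byFewestSlots = [...tiers]
  .sort((a, b) => a.num_slots - b.num_slots)
  .map((t) => t.name);

console.log(byMostSlots);   // morphemes come out above words
console.log(byFewestSlots); // free gloss comes out at the top
```

Both results are monotonic in num_slots, while the desired order (words, then morphemes, then free gloss) is not, so a rule based on slot count alone would probably need some extra signal, such as the tier relationships defined in the ELAN file.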

maksymilian-dabkowski added the bug label Jun 5, 2020

maksymilian-dabkowski assigned jackwalke Jun 29, 2020

sciepsilon unassigned jackwalke Sep 26, 2020

sciepsilon added the <2 weeks label Sep 26, 2020

sciepsilon added <1 day and removed <2 weeks labels Oct 23, 2020

elisharf self-assigned this Nov 16, 2020

sciepsilon unassigned elisharf Dec 13, 2020
