Top-level reading order #23

Open
wants to merge 15 commits into base: master

Commits on Aug 7, 2024

  1. 1175ba2
  2. improve docstring

    bertsky committed Aug 7, 2024
    93d766c
  3. 23e7248
  4. 5be42bb

Commits on Aug 9, 2024

  1. 4111b3a

Commits on Aug 10, 2024

  1. 9aec1e0
  2. f456777
  3. 3f22fdb
  4. dangling lines/words: construct dummy regions/lines earlier, copy some attributes (but not @Custom); ensure readingorder idrefs are correct (also for dummy text and image region)

    bertsky committed Aug 10, 2024
    2f2366e
  5. 72e9e1d
  6. CI: disable py37 tests

    bertsky committed Aug 10, 2024
    cddb067

Commits on Aug 18, 2024

  1. 5424b05

Commits on Aug 21, 2024

  1. handle truly recursive regions…

    - `TextractLayout`: init with additional list
      of top-level block dict stored in `child_regions`:
      - just ID→dict (for LAYOUT_ types during build)
      - or ID→block (for TextractTable, KEY, SELECTION etc.
         already instantiated)
    - add `parent_layout` reference
    
    - `TextractTable`: add `parent_layout` reference
    - `TextractKey`: add `parent_layout` reference
    - `TextractValue`: add `parent_layout` reference
    
    - build `layouts` after all other blocks are built
    - replace recursive `child_regions` / `parent_layout`
      after all `layouts` are built; remove recursive
      instances from top level to avoid duplication
      in ReadingOrder or PcGts instantiation
    
    - ReadingOrder: extend recursive case (OrderedGroup)
      of `LAYOUT_FIGURE` with `LINE` children to `LAYOUT_*`
      with any `child_regions`
    
    - instantiation of PcGts types for `layouts`
      and `tables`: refactor as (inline) function
      for recursion and
      - try to re-use code
      - add assertions around known types of recursion
        (to be revisited with better documentation from AWS
        or more data examples)
    bertsky committed Aug 21, 2024
    86df1be
  2. 18b6fd1
  3. 4eb96ab
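
For readers of commit 86df1be ("handle truly recursive regions"), the two-pass linking it describes might look roughly like the sketch below. This is not the PR's code: the `Layout` class and `build_layouts` function are hypothetical stand-ins for `TextractLayout` and its builder. The sketch only assumes the usual Textract block shape (`Id`, `BlockType`, and `Relationships` entries of type `CHILD` carrying `Ids`). It builds all `LAYOUT_*` blocks first, then fills `child_regions` / `parent_layout`, and finally drops nested instances from the top level so they are not emitted twice.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Layout:
    """Hypothetical stand-in for the PR's TextractLayout."""
    id: str
    block_type: str
    child_ids: List[str]
    # ID -> nested layout instance, filled in during the second pass
    child_regions: Dict[str, "Layout"] = field(default_factory=dict)
    parent_layout: Optional["Layout"] = None


def build_layouts(blocks: List[dict]) -> List[Layout]:
    """Return only the top-level layouts, with nested ones linked in."""
    # Pass 1: instantiate every LAYOUT_* block, remembering its child IDs.
    layouts: Dict[str, Layout] = {}
    for block in blocks:
        if not block["BlockType"].startswith("LAYOUT_"):
            continue
        child_ids = [
            child_id
            for rel in block.get("Relationships", [])
            if rel["Type"] == "CHILD"
            for child_id in rel["Ids"]
        ]
        layouts[block["Id"]] = Layout(block["Id"], block["BlockType"], child_ids)

    # Pass 2: once all layouts exist, resolve child IDs that point at
    # other layouts and record the back-reference.
    nested = set()
    for layout in layouts.values():
        for child_id in layout.child_ids:
            child = layouts.get(child_id)
            if child is None:
                continue  # not a layout (e.g. a LINE); handled elsewhere
            # Assumption of this sketch: a region has at most one parent.
            assert child.parent_layout is None, "region has two parents"
            child.parent_layout = layout
            layout.child_regions[child_id] = child
            nested.add(child_id)

    # Nested layouts are reachable via their parent, so remove them from
    # the top level to avoid duplicates in the reading order.
    return [layout for layout in layouts.values() if layout.id not in nested]
```

With such a structure, a reading-order builder could emit an ordered group for any top-level layout whose `child_regions` is non-empty and a plain region reference otherwise, which matches the generalisation from `LAYOUT_FIGURE` to `LAYOUT_*` mentioned in the commit message.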