Hello.
I'm trying out marker-pdf and I noticed that it sometimes classifies title levels incorrectly.
Here is how page 63 looks:
That title is classified as level 1. Shouldn't it be level 0? It is the title of one of the main sections of the document, as shown on the contents page of this PDF:
Meanwhile, the same title appears on page 25 like this:
And that one is classified as level 0. It is a link to the page 63 title, so I'm not sure whether that is what confuses the classifier.
Also, the OCR somehow detects a second instance of the same title on page 25 and marks it as level 1, which is odd because the title appears only once on that page.
Here is another example of odd title detection:
Here is a page:
The OCR correctly detects this title as a level 0 title: {'title': '7. Institutional stakeholders and their roles', 'level': 0, 'page': 62}
However, there is one more instance of that title reported on page 0, this time as level 1, along with a LOT of titles with empty text (also on page 0) that don't exist there at all.
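To make cases like this easier to spot, here is a minimal sketch that scans a list of title entries for empty titles and for titles reported at more than one level. It assumes the entries are plain dicts shaped like the one above; `find_toc_anomalies` is a hypothetical helper written for this report, not part of marker-pdf, and the sample list is illustrative data modeled on the description, not real output.

```python
from collections import defaultdict


def find_toc_anomalies(toc):
    """Return (empty_entries, conflicting) for a list of title dicts.

    Hypothetical helper: assumes each entry has 'title', 'level', 'page' keys.
    """
    # Entries whose title is missing, None, or only whitespace.
    empty_entries = [e for e in toc if not (e.get("title") or "").strip()]

    # Collect every level reported for each non-empty title.
    levels_by_title = defaultdict(set)
    for entry in toc:
        title = (entry.get("title") or "").strip()
        if title:
            levels_by_title[title].add(entry["level"])

    # Keep only titles that were reported at more than one level.
    conflicting = {
        title: sorted(levels)
        for title, levels in levels_by_title.items()
        if len(levels) > 1
    }
    return empty_entries, conflicting


# Illustrative sample modeled on the behavior described above.
toc = [
    {"title": "7. Institutional stakeholders and their roles", "level": 0, "page": 62},
    {"title": "7. Institutional stakeholders and their roles", "level": 1, "page": 0},
    {"title": "", "level": 1, "page": 0},
]

empty_entries, conflicting = find_toc_anomalies(toc)
print(f"{len(empty_entries)} empty title(s)")
print(conflicting)
```

This only flags the symptoms (empty entries and the same title at different levels); it does not say which level is the correct one.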