Hi,
I'm not sure how you are doing text extraction, but I just saw an article in IEEE ComputingEdge that cited your tool. If you have any interest in Apache Tika, we provide a functional Python library that you could leverage. Does `pdfminer` also do the text extraction part?
The benefit of Tika is that it supports text extraction from 1400+ formats.
Cheers,
Chris