import os
from unstructured.partition.pdf import partition_pdf

elements = partition_pdf(
    filename=pdf_file,
    # Unstructured Helpers
    strategy="hi_res",
    infer_table_structure=True,
    model_name="yolox",
    languages=["eng"],  # this line can be deleted and the same error pops up
)
I am trying to run unstructured in Google Colab by following the instructions from https://colab.research.google.com/drive/177-Tb6CJ0eFf9bZOEjbqb8xzR1IETpd-#scrollTo=huxQF-koB_8t
However, I am getting an "OCRAgentTesseract() takes no arguments" error. The code used is provided below.
!apt-get -qq install poppler-utils tesseract-ocr
%pip install -q --user --upgrade pillow
%pip install -q unstructured["all-docs"]==0.12.5
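A quick check that may be useful after the install step: the sketch below prints the package versions the Colab runtime actually resolves, which can differ from the pinned ones if the runtime was not restarted after installing. The helper name `installed_version` is made up for illustration, and checking `unstructured-inference` alongside `unstructured` is an assumption about where a mismatch could come from, not a confirmed cause.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Package names to inspect; adjust the list as needed.
for name in ("unstructured", "unstructured-inference", "pillow"):
    print(name, installed_version(name))
```

If the printed version of unstructured is not 0.12.5, the runtime is likely still using a different installation than the one pinned above.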
import os
from unstructured.partition.pdf import partition_pdf
elements = partition_pdf(
    filename=pdf_file,
)
/usr/local/lib/python3.10/dist-packages/unstructured/partition/utils/ocr_models/ocr_interface.py in get_instance(ocr_agent_module, language)
     47     module_name, class_name = ocr_agent_module.rsplit(".", 1)
     48     if module_name in OCR_AGENT_MODULES_WHITELIST:
---> 49         module = importlib.import_module(module_name)
     50         loaded_class = getattr(module, class_name)
     51         return loaded_class()
TypeError: OCRAgentTesseract() takes no arguments
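For context, the traceback shows a class being loaded from a dotted string and then instantiated. The sketch below imitates that pattern with standard-library pieces only, and reproduces the same kind of TypeError by passing an argument to a class whose constructor accepts none. It is not the library's code: `ALLOWED_MODULES`, `load_class`, `NoArgAgent`, and `try_instantiate` are hypothetical names, and whether a caller/class signature mismatch is what happens here is an assumption.

```python
import importlib

# Hypothetical whitelist standing in for OCR_AGENT_MODULES_WHITELIST.
ALLOWED_MODULES = {"collections"}


def load_class(dotted_path):
    """Import 'module.ClassName' given as a string and return the class."""
    module_name, class_name = dotted_path.rsplit(".", 1)
    if module_name not in ALLOWED_MODULES:
        raise ValueError(f"module {module_name!r} is not whitelisted")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


class NoArgAgent:
    """Hypothetical agent class that defines no constructor parameters."""


def try_instantiate(cls, *args):
    """Return the new instance, or the TypeError message if the call fails."""
    try:
        return cls(*args)
    except TypeError as exc:
        return str(exc)


# Loading by dotted path works for a whitelisted module.
counter_cls = load_class("collections.Counter")
print(counter_cls("aab"))

# Passing an argument to a class that takes none fails in the same way
# as the error reported above.
print(try_instantiate(NoArgAgent, "eng"))
```

The second call yields a message of the form "<ClassName>() takes no arguments", which is what Python raises when the caller passes an argument that the class's constructor does not accept.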