build(deps): Remove unstructured.paddlepaddle fork #3506

Merged (5 commits) on Aug 9, 2024
1 change: 0 additions & 1 deletion .github/workflows/ci.yml
@@ -72,7 +72,6 @@ jobs:
 - name: Install all doc and test dependencies
 run: |
 make install-ci
-make install-paddleocr
 make install-all-ingest
 make check-licenses

1 change: 0 additions & 1 deletion Dockerfile
@@ -17,7 +17,6 @@ RUN chown -R notebook-user:notebook-user /app && \
 USER notebook-user

 RUN find requirements/ -type f -name "*.txt" -exec pip3.11 install --no-cache-dir --user -r '{}' ';' && \
-pip3.11 install unstructured.paddlepaddle && \
 python3.11 -c "from unstructured.nlp.tokenize import download_nltk_packages; download_nltk_packages()" && \
 python3.11 -c "from unstructured.partition.model_init import initialize; initialize()" && \
 python3.11 -c "from unstructured_inference.models.tables import UnstructuredTableTransformerModel; model = UnstructuredTableTransformerModel(); model.initialize('microsoft/table-transformer-structure-recognition')"

4 changes: 0 additions & 4 deletions Makefile
@@ -277,10 +277,6 @@ install-local-inference: install install-all-docs
 install-pandoc:
 ARCH=${ARCH} ./scripts/install-pandoc.sh

-.PHONY: install-paddleocr
-install-paddleocr:
-ARCH=${ARCH} ./scripts/install-paddleocr.sh
-
 ## pip-compile: compiles all base/dev/test requirements
 .PHONY: pip-compile
 pip-compile:

1 change: 1 addition & 0 deletions requirements/extra-paddleocr.in
@@ -1,4 +1,5 @@
 -c ./deps/constraints.txt
 -c base.txt

+paddlepaddle==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
 unstructured.paddleocr==2.8.0.1

53 changes: 52 additions & 1 deletion requirements/extra-paddleocr.txt
@@ -4,6 +4,13 @@
 #
 # pip-compile ./extra-paddleocr.in
 #
+anyio==3.7.1
+# via
+# -c ././deps/constraints.txt
+# -c ./base.txt
+# httpx
+astor==0.8.1
+# via paddlepaddle
 attrdict==2.0.1
 # via unstructured-paddleocr
 cachetools==5.4.0
@@ -12,6 +19,8 @@ certifi==2024.7.4
 # via
 # -c ././deps/constraints.txt
 # -c ./base.txt
+# httpcore
+# httpx
 # requests
 charset-normalizer==3.3.2
 # via
@@ -27,13 +36,33 @@ cycler==0.12.1
 # via matplotlib
 cython==3.0.11
 # via unstructured-paddleocr
+decorator==5.1.1
+# via paddlepaddle
 et-xmlfile==1.1.0
 # via openpyxl
+exceptiongroup==1.2.2
+# via
+# -c ./base.txt
+# anyio
 fonttools==4.53.1
 # via matplotlib
+h11==0.14.0
+# via
+# -c ./base.txt
+# httpcore
+httpcore==1.0.5
+# via
+# -c ./base.txt
+# httpx
+httpx==0.27.0
+# via
+# -c ./base.txt
+# paddlepaddle
 idna==3.7
 # via
 # -c ./base.txt
+# anyio
+# httpx
 # requests
 imageio==2.34.2
 # via
@@ -61,7 +90,9 @@ matplotlib==3.7.2
 more-itertools==10.4.0
 # via cssutils
 networkx==3.2.1
-# via scikit-image
+# via
+# paddlepaddle
+# scikit-image
 numpy==1.26.4
 # via
 # -c ./base.txt
@@ -71,6 +102,8 @@ numpy==1.26.4
 # matplotlib
 # opencv-contrib-python
 # opencv-python
+# opt-einsum
+# paddlepaddle
 # scikit-image
 # scipy
 # shapely
@@ -87,25 +120,34 @@ opencv-python==4.8.0.76
 # unstructured-paddleocr
 openpyxl==3.1.5
 # via unstructured-paddleocr
+opt-einsum==3.3.0
+# via paddlepaddle
 packaging==23.2
 # via
 # -c ././deps/constraints.txt
 # -c ./base.txt
 # lazy-loader
 # matplotlib
 # scikit-image
+paddlepaddle==3.0.0b1
+# via -r ./extra-paddleocr.in
 pdf2image==1.17.0
 # via unstructured-paddleocr
 pillow==10.4.0
 # via
 # imageio
 # imgaug
 # matplotlib
+# paddlepaddle
 # pdf2image
 # scikit-image
 # unstructured-paddleocr
 premailer==3.10.0
 # via unstructured-paddleocr
+protobuf==4.23.4
+# via
+# -c ././deps/constraints.txt
+# paddlepaddle
 pyclipper==1.3.0.post5
 # via unstructured-paddleocr
 pyparsing==3.0.9
@@ -146,12 +188,21 @@ six==1.16.0
 # attrdict
 # imgaug
 # python-dateutil
+sniffio==1.3.1
+# via
+# -c ./base.txt
+# anyio
+# httpx
 tifffile==2024.7.24
 # via scikit-image
 tqdm==4.66.5
 # via
 # -c ./base.txt
 # unstructured-paddleocr
+typing-extensions==4.12.2
+# via
+# -c ./base.txt
+# paddlepaddle
 unstructured-paddleocr==2.8.0.1
 # via -r ./extra-paddleocr.in
 urllib3==1.26.19

9 changes: 0 additions & 9 deletions scripts/install-paddleocr.sh

This file was deleted.
