Commit

Merge branch 'refs/heads/main' into feature/zh_adaptation
# Conflicts:
#	CHANGELOG.md
JIAQIA committed Aug 15, 2024
2 parents 6dc2aed + 9b778e2 commit 7c0adf7
Showing 6 changed files with 23 additions and 10 deletions.
18 changes: 16 additions & 2 deletions CHANGELOG.md
@@ -1,8 +1,22 @@
-## 0.15.2-dev1
+## 0.15.4

 ### Enhancements

-* Fix Compatibility Issue with Chinese Text in Document Parsing
+### Features
+
+### Fixes
+
+* **Resolve an installation error with `pytesseract>=0.3.12` that occurred during `pip install unstructured[pdf]==0.15.3`.**
+
+## 0.15.3
+
+### Enhancements
+
+### Features
+
+### Fixes
+
+* **Remove the custom index URL from `extra-paddleocr.in` to resolve the error in the `setup.py` configuration.**

 ## 0.15.2

3 changes: 1 addition & 2 deletions requirements/deps/constraints.txt
@@ -22,8 +22,7 @@ Office365-REST-Python-Client<2.4.3
 # unstructured-inference to be upgraded when unstructured library is upgraded
 # https://github.com/Unstructured-IO/unstructured/issues/1458
 # unstructured-inference
-# use the known compatible version of weaviate and pytesseract
-pytesseract @ git+https://github.com/madmaze/[email protected]
+# use the known compatible version of weaviate
 weaviate-client>3.25.0
 # TODO: Pinned in transformers package, remove when that gets updated
 tokenizers>=0.19,<0.20
2 changes: 1 addition & 1 deletion requirements/extra-paddleocr.in
@@ -1,5 +1,5 @@
 -c ./deps/constraints.txt
 -c base.txt

-paddlepaddle==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
+paddlepaddle==3.0.0b1
 unstructured.paddleocr==2.8.0.1
4 changes: 3 additions & 1 deletion requirements/extra-pdf-image.in
@@ -12,4 +12,6 @@ effdet
 # Do not move to constraints.in, otherwise unstructured-inference will not be upgraded
 # when unstructured library is.
 unstructured-inference==0.7.36
-pytesseract>=0.3.12
+# NOTE(christine): Pinned to a specific version of pytesseract from the GitHub repository.
+# Remove this pin and switch to the latest version from PyPI once version 0.3.13 or newer is officially released.
+pytesseract @ git+https://github.com/madmaze/[email protected]
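
The hunk above swaps a version specifier (`pytesseract>=0.3.12`) for a direct reference of the form `name @ url`. As a rough illustration of the difference, the sketch below classifies requirement lines using only the standard library. It is not how pip or pip-compile parse these files; the function name `classify_requirement`, the regular expressions, and the `example.invalid` URL are all hypothetical and cover only the simple line shapes seen in this commit.

```python
import re
from typing import Tuple

# Simplified project-name pattern (an assumption, not the full PEP 508 grammar).
NAME = r"[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?"
# "name @ url" is a direct reference to a specific source location.
DIRECT_REF = re.compile(rf"^({NAME})\s*@\s*(\S+)$")
# "name<op>version[,<op>version...]" is resolved against a package index.
VERSIONED = re.compile(rf"^({NAME})\s*((?:===|==|~=|!=|<=|>=|<|>).+)$")


def classify_requirement(line: str) -> Tuple[str, str, str]:
    """Return (kind, name, detail) for one requirements-file line."""
    text = line.strip()
    if not text or text.startswith("#"):
        return ("comment", "", "")
    if text.startswith("-"):
        # Options such as "-c base.txt" or "-r other.in".
        return ("option", "", text)
    match = DIRECT_REF.match(text)
    if match:
        return ("direct", match.group(1), match.group(2))
    match = VERSIONED.match(text)
    if match:
        return ("versioned", match.group(1), match.group(2).strip())
    return ("unpinned", text, "")


if __name__ == "__main__":
    samples = [
        "pytesseract>=0.3.12",
        # Hypothetical URL and revision, for illustration only.
        "pytesseract @ git+https://example.invalid/pytesseract.git@abc123",
        "-c base.txt",
        "effdet",
    ]
    for sample in samples:
        print(classify_requirement(sample))
```

A direct reference bypasses index resolution for that package, which is why the accompanying NOTE asks for the pin to be removed once a suitable PyPI release exists.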
4 changes: 1 addition & 3 deletions requirements/extra-pdf-image.txt
@@ -202,9 +202,7 @@ pypdf==4.3.1
 pypdfium2==4.30.0
     # via pdfplumber
 pytesseract @ git+https://github.com/madmaze/[email protected]
-    # via
-    #   -c ././deps/constraints.txt
-    #   -r ./extra-pdf-image.in
+    # via -r ./extra-pdf-image.in
 python-dateutil==2.9.0.post0
     # via
     #   -c ./base.txt
2 changes: 1 addition & 1 deletion unstructured/__version__.py
@@ -1 +1 @@
-__version__ = "0.15.2"  # pragma: no cover
+__version__ = "0.15.4"  # pragma: no cover
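
This merge had a conflict in `CHANGELOG.md` and also bumps `__version__`, so the two can drift apart. The following is a minimal sketch of a consistency check, assuming the changelog's newest release is always its first `## <version>` heading. The helper names `latest_changelog_version` and `versions_agree` are hypothetical and are not part of the repository.

```python
import re
from typing import Optional

# Matches a second-level heading such as "## 0.15.4" but not "### Fixes".
RELEASE_HEADING = re.compile(r"^##\s+(\S+)\s*$", re.MULTILINE)


def latest_changelog_version(changelog_text: str) -> Optional[str]:
    """Return the version named by the first '## ...' heading, if any."""
    match = RELEASE_HEADING.search(changelog_text)
    return match.group(1) if match else None


def versions_agree(changelog_text: str, package_version: str) -> bool:
    """True when the newest changelog entry matches the package version."""
    return latest_changelog_version(changelog_text) == package_version


if __name__ == "__main__":
    changelog = "## 0.15.4\n\n### Enhancements\n\n### Fixes\n\n## 0.15.3\n"
    print(versions_agree(changelog, "0.15.4"))
    print(versions_agree(changelog, "0.15.2"))
```

A check like this could run in CI after a merge to catch a changelog heading that no longer matches the version string.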
