extract-image-pdf

Extract images from a PDF and upload them to IBM Cloud Object Storage (COS).

Credentials you need to provide

COS_APIKEY
COS_INSTANCE_CRN
endpoint_url_private (e.g. s3.private.jp-tok.cloud-object-storage.appdomain.cloud)
endpoint_url_public (e.g. s3.jp-tok.cloud-object-storage.appdomain.cloud)
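
The four values above can be checked before the app starts. The following is a minimal sketch, assuming the credentials are supplied as environment variables with exactly the names listed; the project may load them differently (for example from a config file). `CosSettings` and `load_cos_settings` are hypothetical names used only for illustration.

```python
import os
from dataclasses import dataclass

# Names taken from the credentials list above.
REQUIRED_KEYS = (
    "COS_APIKEY",
    "COS_INSTANCE_CRN",
    "endpoint_url_private",
    "endpoint_url_public",
)


@dataclass(frozen=True)
class CosSettings:
    """Hypothetical container for the four COS settings."""
    api_key: str
    instance_crn: str
    endpoint_private: str
    endpoint_public: str


def _with_scheme(endpoint: str) -> str:
    """Prefix a bare host name with https:// so it can be used as a URL."""
    endpoint = endpoint.strip()
    if endpoint.startswith(("http://", "https://")):
        return endpoint
    return "https://" + endpoint


def load_cos_settings(env=None) -> CosSettings:
    """Read the settings from a mapping (os.environ by default).

    Assumption: values come from environment variables. Raises ValueError
    naming every missing or empty key.
    """
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
    if missing:
        raise ValueError("Missing COS settings: " + ", ".join(missing))
    return CosSettings(
        api_key=env["COS_APIKEY"].strip(),
        instance_crn=env["COS_INSTANCE_CRN"].strip(),
        endpoint_private=_with_scheme(env["endpoint_url_private"]),
        endpoint_public=_with_scheme(env["endpoint_url_public"]),
    )
```

Calling `load_cos_settings()` with no arguments reads from the current environment, so a missing value is reported up front rather than on the first upload.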

Note: This project needs three COS buckets:

Staging (to store the uploaded PDF files)
Parsing (to store the images extracted from the PDFs)
Knowledge (to store txt files that contain the text along with the URLs of the corresponding images)
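
To illustrate how the three buckets relate, here is a rough sketch of building the content of a Knowledge txt file that pairs extracted text with the URLs of images stored in the Parsing bucket. The bucket names, the `Image URL:` line format, and the helper names are all assumptions for illustration; the project's actual file layout may differ. The URL uses the path-style form `https://<endpoint>/<bucket>/<key>`, and whether it is reachable depends on the bucket's access settings.

```python
from urllib.parse import quote

# Hypothetical bucket names; replace with the names of your own buckets.
BUCKETS = {
    "staging": "my-staging-bucket",      # uploaded PDF files
    "parsing": "my-parsing-bucket",      # images extracted from the PDFs
    "knowledge": "my-knowledge-bucket",  # txt files with text and image URLs
}


def object_url(endpoint: str, bucket: str, key: str) -> str:
    """Build a path-style object URL: https://<endpoint>/<bucket>/<key>."""
    # Accept the endpoint with or without a scheme or trailing slash.
    host = endpoint.split("://", 1)[-1].rstrip("/")
    # Percent-encode the key; "/" is kept so nested keys stay readable.
    return f"https://{host}/{bucket}/{quote(key)}"


def build_knowledge_text(page_text, image_keys, endpoint,
                         parsing_bucket=BUCKETS["parsing"]):
    """Return txt content: the text followed by one line per image URL.

    The "Image URL:" prefix is an assumed format, not the project's own.
    """
    lines = [page_text.strip()]
    for key in image_keys:
        lines.append("Image URL: " + object_url(endpoint, parsing_bucket, key))
    return "\n".join(lines) + "\n"
```

For example, `build_knowledge_text("Figure 1 shows the pump.", ["report/page1_img0.png"], "s3.jp-tok.cloud-object-storage.appdomain.cloud")` yields the text followed by a single `Image URL:` line pointing into the Parsing bucket.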

python -m venv genai

source genai/bin/activate

python -m pip install -r requirements.txt

streamlit run Main_Page.py

If you run it correctly, the Streamlit app will start and show its output in your browser.
