Extract images from PDF files and upload them to IBM Cloud Object Storage (COS)
- COS_APIKEY
- COS_INSTANCE_CRN
- endpoint_url_private (e.g. s3.private.jp-tok.cloud-object-storage.appdomain.cloud)
- endpoint_url_public (e.g. s3.jp-tok.cloud-object-storage.appdomain.cloud)
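As a rough illustration of how these four settings fit together, the sketch below reads them from environment variables and passes them to the IBM COS SDK for Python (`ibm_boto3`, installed as `ibm-cos-sdk`). It is an assumption that the project reads the values from the environment and uses this SDK; the helper names `load_cos_settings` and `make_cos_client` are hypothetical and not taken from the project.

```python
import os

# Variable names as listed above; reading them from the environment is an assumption.
REQUIRED_VARS = ("COS_APIKEY", "COS_INSTANCE_CRN")


def load_cos_settings(env=os.environ, use_private=False):
    """Collect the COS settings from a mapping of environment variables."""
    endpoint_key = "endpoint_url_private" if use_private else "endpoint_url_public"
    missing = [name for name in REQUIRED_VARS + (endpoint_key,) if not env.get(name)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))

    endpoint = env[endpoint_key]
    # The example endpoints above have no scheme, so prepend https:// when needed.
    if not endpoint.startswith(("http://", "https://")):
        endpoint = "https://" + endpoint

    return {
        "api_key": env["COS_APIKEY"],
        "instance_crn": env["COS_INSTANCE_CRN"],
        "endpoint_url": endpoint,
    }


def make_cos_client(settings):
    """Build an S3-compatible COS client (hypothetical helper)."""
    # Imported lazily so load_cos_settings works without the SDK installed.
    import ibm_boto3
    from ibm_botocore.client import Config

    return ibm_boto3.client(
        "s3",
        ibm_api_key_id=settings["api_key"],
        ibm_service_instance_id=settings["instance_crn"],
        config=Config(signature_version="oauth"),
        endpoint_url=settings["endpoint_url"],
    )
```

The private endpoint is generally reachable only from inside IBM Cloud, so `use_private=False` is the safer default when running the app on a local machine.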
Note: This project requires 3 COS buckets:
- Staging (to store the uploaded PDF files)
- Parsing (to store the images extracted from the PDFs)
- Knowledge (to store txt files that contain the text along with the URLs of the corresponding images)
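To make the relationship between the Parsing and Knowledge buckets more concrete, here is a minimal sketch of how an image URL could be built and paired with text in a txt file. The function names, the path-style URL form, and the `Image:` line layout are all assumptions for illustration; the project's actual file format may differ, and the URL only resolves for a reader who has access to the object.

```python
from urllib.parse import quote


def object_url(endpoint, bucket, key):
    """Build a path-style URL for an object in a COS bucket (hypothetical helper)."""
    # Accept endpoints with or without a scheme, as in the examples above.
    host = endpoint.split("://", 1)[-1].rstrip("/")
    return f"https://{host}/{quote(bucket)}/{quote(key)}"


def knowledge_text(page_text, image_urls):
    """Combine extracted text with the URLs of its images (assumed layout)."""
    lines = [page_text.strip()]
    lines += [f"Image: {url}" for url in image_urls]
    return "\n".join(lines) + "\n"


# Example: one image stored in a bucket named "parsing" (bucket name is made up).
url = object_url(
    "s3.jp-tok.cloud-object-storage.appdomain.cloud", "parsing", "report/page 1.png"
)
record = knowledge_text("Quarterly revenue chart.", [url])
```

The resulting string is what would be written as a txt object into the Knowledge bucket, so that a later retrieval step can return both the text and a link to the matching image.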
- clone the project to your local machine
- create a Python virtual environment
python -m venv genai
- activate the virtual environment
source genai/bin/activate
- install the requirements
python -m pip install -r requirements.txt
- run the Streamlit app
streamlit run Main_Page.py
If everything runs correctly, the Streamlit app will show output like this: