Replies: 3 comments 1 reply
-
I would like to ask whether the following projects can edit a PDF's content in detail, and how they relate to this project.
pdfminer.six: fully open source.
PyMuPDF: has some table detection functionality. Please check its license.
This is a good project!
-
The first step of a RAG pipeline is splitting up the document into "semantic" chunks (parts of the doc that are talking about the same thing). This library is primarily aimed at that use case, and embeddings are a nice way of doing this.
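To make the idea concrete, here is a minimal sketch of embedding-based chunking. It is not this library's API: `embed`, `cosine`, `semantic_chunks`, and the `threshold` value are all hypothetical names and settings chosen for illustration, and the bag-of-words "embedding" is only a stand-in for a real embedding model so the example runs without extra dependencies.

```python
import math
import re
from collections import Counter


def embed(sentence: str) -> Counter:
    """Toy embedding: word counts. A stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(sentences, threshold=0.3):
    """Group consecutive sentences; start a new chunk when the similarity
    to the previous sentence falls below `threshold` (an assumed value)."""
    chunks = []
    for sentence in sentences:
        if chunks and cosine(embed(chunks[-1][-1]), embed(sentence)) >= threshold:
            chunks[-1].append(sentence)
        else:
            chunks.append([sentence])
    return chunks


sentences = [
    "Cats are small pets.",
    "Small cats are quiet pets.",
    "Invoices list the amount due.",
    "The amount due is on the invoices.",
]
chunks = semantic_chunks(sentences)
for chunk in chunks:
    print(chunk)
```

With a real embedding model, only `embed` and `cosine` would need to change; the grouping loop stays the same.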
-
About the semantic processing example:
Could you please tell me whether it can be combined with the code below to form an "answer every question" PDF analysis assistant?
-
Description
When I installed and ran the code according to the example, I easily obtained the text content of the PDF. This is a very convenient project!
What puzzles me is that the developer also provided sample code for OpenAI. Does this mean that OpenAI can be used to generate summaries of the PDF content, or to analyze the theme of the content?