🚀 Roadmap #1

Open
3 of 11 tasks
Filimoa opened this issue Mar 27, 2024 · 10 comments

Comments

@Filimoa
Owner

Filimoa commented Mar 27, 2024

Description

This is a tentative roadmap; I will update it as things evolve.

Roadmap

High Priority:

  • Implement unitable
  • Enable OCR support
  • Different embedding providers [in-progress]
  • Better table detection
  • LlamaIndex integration

Long Term:

  • Create a Docker image with FastAPI for non-Python users
  • Add support for ImageElements
  • More automated eval suite
  • Better OCR provider
  • Speed up parsing. Because of the way we construct TextSpan, parsing can be quite slow, especially on documents with many tables
  • Add embed_text property, useful on tables where embedding the contents performs poorly

Filimoa pinned this issue Mar 27, 2024

@shekhars-li

Hey @Filimoa do you plan to add support for unitable anytime soon? Seems like the doc mentions it but the notebook does not have an example for it. Thanks for creating this project.

@Filimoa
Owner Author

Filimoa commented Mar 31, 2024

> Hey @Filimoa do you plan to add support for unitable anytime soon? Seems like the doc mentions it but the notebook does not have an example for it. Thanks for creating this project.

As soon as the pre-trained weights are released, I'll be adding it. I talked with ShengYun earlier this week, and it sounds like they'll be released ASAP.

@shekhars-li

@Filimoa Looks like pretrained weights are available now! :)

@Filimoa
Owner Author

Filimoa commented Apr 4, 2024

In progress! Should be merged in by the end of the week.

@Filimoa
Owner Author

Filimoa commented Apr 5, 2024

Just merged - try it out. It requires downloading weights, which you can read about here. We need to find a better model for table detection, but this performs incredibly well otherwise.

@Ulipenitz

Hey @Filimoa! Really great project!!
Have you thought about using open-source models for the semantic processing?
You can find even better embedding models here: https://huggingface.co/spaces/mteb/leaderboard
This one is especially promising (only 0.67 GB and better than text-embedding-3-large): https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1
There are also ONNX models that run pretty fast on CPUs.

@Filimoa
Owner Author

Filimoa commented Apr 8, 2024

Added to the roadmap! Will ship very soon @Ulipenitz

@cthompson-insight

Would be great to support Azure OpenAI as well.

@zishengwu

Hey @Filimoa! Have you tried PaddleOCR? In my experience, it performs well for layout analysis and table recognition.

@faileon

faileon commented Nov 13, 2024

Hello, I was playing with this library for the very first time today. It is very good, but I am missing image extraction. I usually need to work with images in some manner, for example sending them to a multimodal LLM for description or similar. But for the PDFs I tried, I am not getting any images back. Is that what the "Add support for ImageElements" item will bring? Thanks.
