Add Drag-and-Drop Document Upload, Web Scraping, and Alternative Model Support #3797
Replies: 5 comments 7 replies
-
Isn't web scraping already supported with WebDriver? Isn't that half of the reason that this repo exists?
-
Other models to add would be OPT.
-
This could be the way to get any file or folder. Technically speaking, it works like this: when you upload a file, the text is extracted and split into chunks by a chunking algorithm, and each chunk is sent to the OpenAI embeddings API to get a vector embedding (basically a long sequence of numbers) for it. These embeddings are then stored in a vector database like Pinecone. When a question comes in, it is also converted to an embedding vector, and that vector is used to query the vector database for the closest matches in the multi-dimensional vector space; those matches end up being the most relevant context chunk(s).
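Here is a minimal sketch of that flow, assuming the pre-1.0 `openai` Python client and `pinecone-client` 2.x; the index name, embedding model, and helper names (`embed`, `index_chunks`, `most_relevant_chunks`) are placeholders I made up for illustration, and the exact response shapes depend on the client versions you install.

```python
# Minimal sketch of the upload -> embed -> store -> query flow described above.
# Assumes the pre-1.0 `openai` client and `pinecone-client` 2.x; names are placeholders.
import os
import openai
import pinecone

openai.api_key = os.environ["OPENAI_API_KEY"]
pinecone.init(api_key=os.environ["PINECONE_API_KEY"], environment=os.environ["PINECONE_ENV"])
index = pinecone.Index("autogpt-docs")  # hypothetical index name

EMBED_MODEL = "text-embedding-ada-002"

def embed(texts: list[str]) -> list[list[float]]:
    """Call the OpenAI embeddings API for a batch of text chunks."""
    resp = openai.Embedding.create(model=EMBED_MODEL, input=texts)
    return [item["embedding"] for item in resp["data"]]

def index_chunks(doc_id: str, chunks: list[str]) -> None:
    """Store one vector per chunk, keeping the chunk text as metadata."""
    vectors = embed(chunks)
    index.upsert(vectors=[
        (f"{doc_id}-{i}", vec, {"text": chunk})
        for i, (chunk, vec) in enumerate(zip(chunks, vectors))
    ])

def most_relevant_chunks(question: str, top_k: int = 3) -> list[str]:
    """Embed the question and pull back the closest chunks from the index."""
    q_vec = embed([question])[0]
    result = index.query(vector=q_vec, top_k=top_k, include_metadata=True)
    return [match["metadata"]["text"] for match in result["matches"]]
```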
-
Using ChatGPT I got this: a script that imports `os`, defines `extract_text(file_path)`, `process_file(file_path)`, and `process_folder(folder_path)`, and calls `pinecone.deinit()` when it is done, with usage notes for processing a single file and a whole folder. Only the outline of the code survived the formatting here.
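Since the function bodies did not survive, here is a hypothetical reconstruction of what that skeleton probably did, reusing the `index_chunks()` helper from the sketch earlier in the thread; the naive fixed-size chunking and plain-text file handling are my assumptions, not the original ChatGPT output.

```python
# Hypothetical reconstruction of the ChatGPT skeleton above; the original bodies
# were lost, so these are guesses. index_chunks() is the helper from the earlier sketch.
import os

def extract_text(file_path: str) -> str:
    """Read a plain-text file; PDF or Word files would need extra libraries (e.g. pypdf)."""
    with open(file_path, "r", encoding="utf-8", errors="ignore") as f:
        return f.read()

def process_file(file_path: str) -> None:
    """Extract the text, split it into chunks, and push the embeddings to the vector DB."""
    text = extract_text(file_path)
    chunks = [text[i:i + 1024] for i in range(0, len(text), 1024)]  # naive fixed-size chunking
    index_chunks(os.path.basename(file_path), chunks)

def process_folder(folder_path: str) -> None:
    """Walk a folder and process every file in it."""
    for root, _dirs, files in os.walk(folder_path):
        for name in files:
            process_file(os.path.join(root, name))

# To process a single file:
#   process_file("docs/notes.txt")
# To process a whole folder:
#   process_folder("docs/")
```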
-
Here is the code I got: it calls `load_dotenv()`, defines `chunk_text(text, chunk_size=1024)`, `extract_text(file_path)`, `process_file(file_path)`, and `process_folder(folder_path)`, and calls `pinecone.deinit()` at the end. Again, only the outline survived the formatting.
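The only new pieces in this second version are the `.env` setup and the chunking helper, so here is a sketch of just that part; the whitespace-aware splitting and the 1024-character default are my assumptions. Note that `pinecone.deinit()` is not needed with `pinecone-client` 2.x, where `pinecone.init()` alone is enough.

```python
# Sketch of the setup and chunking pieces from the skeleton above; chunk boundaries
# and environment variable names are assumptions rather than the original code.
import os
from dotenv import load_dotenv
import pinecone

load_dotenv()  # loads OPENAI_API_KEY, PINECONE_API_KEY, PINECONE_ENV from a .env file
pinecone.init(api_key=os.environ["PINECONE_API_KEY"], environment=os.environ["PINECONE_ENV"])

def chunk_text(text: str, chunk_size: int = 1024) -> list[str]:
    """Split text into chunks of roughly chunk_size characters, breaking on whitespace."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        if current and len(current) + len(word) + 1 > chunk_size:
            chunks.append(current)
            current = word
        else:
            current = f"{current} {word}".strip()
    if current:
        chunks.append(current)
    return chunks
```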
-
I would like to propose a few improvements for the Auto-GPT repository to enhance its functionality and usability:
Drag-and-drop document upload: Implement a drag-and-drop file input area in the front end, allowing users to easily upload documents to the web application. The uploaded document's content should be extracted and passed to the GPT model for processing. Consider using a library like Dropzone.js to facilitate this feature.
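As a rough illustration of the server side, here is a hypothetical Flask endpoint that accepts the dropped file and hands it to the indexing pipeline; Auto-GPT's actual web server and route names may differ, and `process_file()` is the helper sketched earlier in this thread.

```python
# Hypothetical Flask endpoint for receiving a dragged-and-dropped file;
# this only illustrates the flow, not the project's real server code.
import os
from flask import Flask, request, jsonify

app = Flask(__name__)
UPLOAD_DIR = "uploads"
os.makedirs(UPLOAD_DIR, exist_ok=True)

@app.route("/upload", methods=["POST"])
def upload():
    file = request.files.get("file")  # "file" is Dropzone.js's default field name
    if file is None or file.filename == "":
        return jsonify({"error": "no file provided"}), 400
    path = os.path.join(UPLOAD_DIR, file.filename)
    file.save(path)
    process_file(path)  # extract, chunk, embed, and index (see the sketches above)
    return jsonify({"status": "indexed", "file": file.filename})
```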
Web scraping integration: Add the capability to perform web scraping and pass the extracted content to the Auto-GPT model for processing. Create a separate Python module or script to handle web scraping and integrate it with the existing server code. Expose this functionality via an API or user interface. Libraries like Beautiful Soup or Scrapy can be used to implement this feature.
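A scraping module along those lines could be as small as the sketch below; the function name and the idea of feeding the result through the same chunking and indexing helpers are illustrative, not part of the existing codebase.

```python
# Minimal scraping sketch using requests + Beautiful Soup.
import requests
from bs4 import BeautifulSoup

def scrape_page(url: str) -> str:
    """Fetch a page and return its visible text."""
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    for tag in soup(["script", "style"]):  # drop non-visible content
        tag.decompose()
    return " ".join(soup.get_text(separator=" ").split())

# Example: scrape a page and index it like an uploaded document.
# text = scrape_page("https://example.com")
# index_chunks("example.com", chunk_text(text))
```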
Alternative model support: Allow users to use the application without an API key by replacing the GPT-3 model with another pre-trained language model that does not require one, such as GPT-2 or GPT-Neo from Hugging Face's Transformers library. Beyond that, explore integrating a different model like Vicuna, which would involve modifying the server code, updating the model loading and text generation functions, and making the necessary changes to the front end.
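For the no-API-key path, a local Hugging Face model can stand in for the OpenAI completion call; the model name and generation parameters below are only examples, and a larger model such as Vicuna would follow the same pattern with different weights.

```python
# Sketch of swapping in a local Hugging Face model so no API key is needed;
# model choice and generation settings are placeholders.
from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")

def generate_text(prompt: str, max_new_tokens: int = 200) -> str:
    """Local replacement for the hosted completion call."""
    out = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    return out[0]["generated_text"]
```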
These improvements would significantly enhance the user experience and functionality of the Auto-GPT web application.