This NLP project focuses on text mapping techniques, exploring various methodologies to handle and analyze textual data efficiently. With Jupyter notebooks for proofs of concept and a structured source code setup, it's designed for easy understanding and extensibility.
- Python 3.8 or above
- pip for managing Python packages
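The two prerequisites above can be checked from Python before installing anything. This is a minimal sketch, not part of the repository: the function name `check_prerequisites` is hypothetical, and it only reports whether the running interpreter is 3.8 or newer and whether the `pip` module is importable from it.

```python
import importlib.util
import sys


def check_prerequisites(min_version=(3, 8)):
    """Report whether the running interpreter satisfies the README prerequisites.

    Hypothetical helper for illustration; it is not shipped with the project.
    """
    return {
        # Compare the interpreter version against the minimum (3.8 by default).
        "python": sys.version_info >= min_version,
        # pip counts as available if its module can be found by this interpreter.
        "pip": importlib.util.find_spec("pip") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```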
- Clone this repository: `gh repo clone JaynouOliver/NLPTextMapping`
- Navigate to the project directory: `cd NLPTextMapping`
- Install the required Python packages: `pip install -r requirements.txt`
To explore the text mapping concepts:
- Navigate to the `Notebooks` directory.
- Open the desired notebook (e.g., `PoC - Text mapping.ipynb`) using Jupyter Notebook or JupyterLab.
- Run the cells sequentially to understand the workflow and outputs.
- Navigate to the `src` directory.
- Run the main application: `python main.py`
This project is structured for easy understanding and further development:
- Notebooks: For experimenting with text mapping concepts and visualizing results.
- src: Contains the core logic, split into modular components for ease of enhancement and maintenance.
  - `config.py`: Central configuration file.
  - `embedding.py`: Handles embedding generation and manipulation.
  - `main.py`: The starting point of the application.
  - `match.py`: Implements matching logic.
  - `pre_process.py`: Prepares data for processing.
  - `split_doc.py`: Splits documents for easier handling.
  - `utils.py`: Utility functions supporting various tasks.
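To make the division of labor between these modules concrete, here is a minimal sketch of how a pre-process, split, embed, match pipeline could fit together. It is an illustration under stated assumptions, not the project's actual code: the function names only mirror the module names, the signatures are hypothetical, and a plain bag-of-words vector with cosine similarity stands in for whatever embedding and matching logic `embedding.py` and `match.py` really use.

```python
import math
import re
from collections import Counter


def pre_process(text):
    """Lowercase the text and replace punctuation with spaces (assumed cleaning step)."""
    return re.sub(r"[^a-z0-9\s]", " ", text.lower())


def split_doc(text, chunk_size=8):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def match(query, chunks):
    """Return the (chunk, score) pair whose embedding is closest to the query."""
    if not chunks:
        raise ValueError("no chunks to match against")
    query_vec = embed(pre_process(query))
    scored = [(chunk, cosine(query_vec, embed(chunk))) for chunk in chunks]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    document = (
        "Cats purr when they are happy. Dogs bark at strangers loudly. "
        "Birds sing early every morning."
    )
    chunks = split_doc(pre_process(document), chunk_size=6)
    print(match("Why do dogs bark?", chunks))
```

Keeping each stage a small pure function, as sketched here, means a stage can be swapped out (for example, replacing the toy `embed` with a learned embedding model) without touching the others.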
Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request