Concord is a Python project that uses FastAPI, Neo4j, and BERTopic for text analysis. It provides a platform for analyzing and visualizing text data with machine-learning topic models.
- Python 3.12+
- Poetry for dependency management
- Docker and Docker Compose
- Git
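The prerequisites above can be checked with a short script before you continue. This is a minimal sketch, not part of Concord: the function name `check_prerequisites` is hypothetical, and the command names `poetry`, `docker`, and `git` are assumptions about how each tool appears on your PATH. It does not check Docker Compose separately.

```python
import shutil
import sys

# Assumed command names for the required tools; adjust for your system.
REQUIRED_TOOLS = ("poetry", "docker", "git")


def check_prerequisites(tools=REQUIRED_TOOLS, min_python=(3, 12)):
    """Return a dict mapping each requirement to True (met) or False."""
    # Compare only (major, minor) of the running interpreter.
    results = {"python": sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns None when the command is not on PATH.
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Run it with the interpreter you plan to use for the project, since the Python check reports on the interpreter executing the script.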
git clone https://github.com/boredlabsHQ/concord.git
cd concord
- Update Package Lists
sudo apt update
- Install Required Packages
sudo apt install -y software-properties-common curl git
- Install Python 3.12
Add the Deadsnakes PPA and install Python 3.12:
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install -y python3.12 python3.12-venv python3.12-dev
- Install Poetry
curl -sSL https://install.python-poetry.org | python3 -
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
- Install Docker and Docker Compose
sudo apt install -y docker.io docker-compose
sudo systemctl start docker
sudo systemctl enable docker
sudo usermod -aG docker $USER
Log out and log back in for the group changes to take effect.
- Install Project Dependencies
poetry install
poetry run pre-commit install
- Install Python 3.12
Download and install Python 3.12 from the official website. During installation, make sure to check the box "Add Python to PATH".
- Install Git
Download and install Git from the official website.
- Install Poetry
Open PowerShell and run:
(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | python -
Add Poetry to your PATH by adding the following line to your PowerShell profile:
$env:Path += ";$env:APPDATA\Python\Scripts"
- Install Docker Desktop
Download and install Docker Desktop from the official website. Ensure that it is running before proceeding.
- Install Project Dependencies
poetry install
poetry run pre-commit install
Create the nltk_data directory:
mkdir -p /YOUR-PATH/nltk_data
cd /YOUR-PATH/nltk_data
Open a Python shell and run the following commands:
import nltk
nltk.download('stopwords')
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('punkt_tab')
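Set this environment variable so that NLTK can find the data. Set it before running the download commands if you want NLTK to save the data in this directory: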
NLTK_DATA=/YOUR-PATH/nltk_data
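To confirm that the four resources ended up under `NLTK_DATA`, a check like the following may help. It is a sketch under stated assumptions: the helper name `missing_nltk_resources` is hypothetical, and the layout it expects (corpora under `corpora/`, tokenizer models under `tokenizers/`, each either unpacked or left as a `.zip`) is an assumption about how NLTK stores downloads and may differ between NLTK versions. It uses only the standard library, so it does not import NLTK itself.

```python
import os
from pathlib import Path

# Assumed category subdirectory for each resource downloaded above.
EXPECTED = {
    "stopwords": "corpora",
    "wordnet": "corpora",
    "punkt": "tokenizers",
    "punkt_tab": "tokenizers",
}


def missing_nltk_resources(data_dir):
    """Return a sorted list of expected resources not found in data_dir."""
    root = Path(data_dir)
    missing = []
    for name, category in EXPECTED.items():
        unpacked = root / category / name
        archive = root / category / f"{name}.zip"
        # Accept either an unpacked directory or the downloaded archive.
        if not (unpacked.is_dir() or archive.is_file()):
            missing.append(name)
    return sorted(missing)


if __name__ == "__main__":
    data_dir = os.environ.get("NLTK_DATA")
    if not data_dir:
        print("NLTK_DATA is not set")
    else:
        missing = missing_nltk_resources(data_dir)
        print("missing:", ", ".join(missing) if missing else "none")
```

If the script reports missing resources even though the downloads succeeded, NLTK probably saved them to its default location (typically a `nltk_data` directory in your home directory) instead of `/YOUR-PATH/nltk_data`.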
Set up a temporary Neo4j database:
docker-compose up -d
Note: On Windows, ensure Docker Desktop is running and has sufficient resources allocated.
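After starting the container, it can take a few seconds before Neo4j accepts connections. The sketch below polls a TCP port until it opens. The helper name `wait_for_port` is hypothetical, and the host and port in the example comment are assumptions: 7687 is Neo4j's default Bolt port, but the project's Compose file may map a different one.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=1.0):
    """Return True once host:port accepts a TCP connection, else False."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example (assumed host and port; check your Compose file):
#     wait_for_port("localhost", 7687)
```

An open port only shows that the container is listening; the database may still need a moment before it answers queries.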
poetry run pre-commit run -a
Install openapi-generator, then run:
openapi-generator-cli generate -c config.yml && \
rm -rf .flake8 docker-compose.yaml requirements.txt Dockerfile && \
poetry run pre-commit run -a