This project demonstrates the setup of continuous integration for a Python data science project using GitHub Actions CI/CD pipelines. It includes a Jupyter Notebook performing descriptive statistics with Pandas, test scripts for the notebook and source code, a shared library, and a Makefile that automates tasks such as formatting, linting, and testing. GitHub Actions runs these tasks, with badges reflecting their status.
- The notebook performs descriptive statistics using Pandas.
- It is tested using the `nbval` plugin for `pytest` to validate the correctness of its results.
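A minimal sketch of the kind of descriptive statistics the notebook computes, assuming a hypothetical CSV file (`data.csv`) and a numeric column named `value`; the actual dataset and columns used in the notebook may differ:

```python
import pandas as pd

# Hypothetical input file; the notebook's actual data source may differ.
df = pd.read_csv("data.csv")

# Summary statistics (count, mean, std, min, quartiles, max) for numeric columns.
print(df.describe())

# Targeted statistics on an assumed numeric column named "value".
print(df["value"].mean())
print(df["value"].median())
print(df["value"].std())
```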
- `test_script.py`: Contains tests for the standalone script.
- `test_lib.py`: Tests the shared code in `lib.py`.
- `lib.py`: Shared utility code used in both the notebook and the scripts.
- `main.py`: More advanced code built on top of `lib.py`.
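As an illustration of how the shared code might be organized, the sketch below shows small Pandas helpers that both the notebook and `main.py` could reuse; the function names `load_dataset` and `summary_stats` are hypothetical, not the project's actual API:

```python
"""Hypothetical sketch of shared utilities in lib.py."""
import pandas as pd


def load_dataset(path: str) -> pd.DataFrame:
    """Read a CSV file into a DataFrame."""
    return pd.read_csv(path)


def summary_stats(df: pd.DataFrame, column: str) -> dict:
    """Return basic descriptive statistics for one numeric column."""
    return {
        "mean": df[column].mean(),
        "median": df[column].median(),
        "std": df[column].std(),
    }
```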
The `Makefile` includes the following commands:
- Run all tests: Executes tests on the notebook, script, and library.
- Code formatting: Formats the Python code using black.
- Linting: Lints the code using Ruff.
- Install dependencies: Installs all required packages listed in `requirements.txt`.
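As an example of what the test target exercises, a test in `test_lib.py` might look like the sketch below (it assumes the hypothetical `summary_stats` helper from the `lib.py` sketch above). The notebook itself is typically checked by running `pytest` with the `--nbval` flag, though the exact commands are defined in the repository's Makefile.

```python
"""Hypothetical sketch of a test in test_lib.py."""
import pandas as pd

from lib import summary_stats  # assumes the lib.py sketch above


def test_summary_stats():
    df = pd.DataFrame({"value": [1.0, 2.0, 3.0]})
    stats = summary_stats(df, "value")
    assert stats["mean"] == 2.0
    assert stats["median"] == 2.0
```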
`requirements.txt` provides a pinned list of dependencies for the project to ensure consistent environments across systems.
Tasks from the Makefile are automated using GitHub Actions CI/CD pipelines. The GitHub workflows include jobs that:
- Run tests on the Jupyter Notebook, scripts, and library.
- Format the code using black.
- Lint the code using Ruff.
- Install the dependencies via `pip`.
git clone https://github.com/nogibjj/Ramil-Individual-Project-1.git
cd Ramil-Individual-Project-1
make install
- Run all tests: `make test_code`
- Format code with black: `make format`
- Lint code with Ruff: `make lint`