diff --git a/.coveragerc b/.coveragerc
index 38cee4d..bb474ce 100644
--- a/.coveragerc
+++ b/.coveragerc
@@ -2,3 +2,4 @@
 omit =
     test/*
     */__init__.py
+    */__main__.py
diff --git a/.github/ISSUE_TEMPLATE/bug_report_template.yml b/.github/ISSUE_TEMPLATE/bug_report_template.yml
new file mode 100644
index 0000000..7cc7169
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/bug_report_template.yml
@@ -0,0 +1,68 @@
+name: "\U0001F41E Bug report"
+description: Report a bug on UFOID
+labels: ["bug"]
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for taking the time to fill out this bug report! Before you get started, please [search to see](https://github.com/immobiliare/ufoid/issues) if an issue already exists for the bug you encountered
+  - type: textarea
+    id: bug-description
+    attributes:
+      label: Describe the bug
+      description: Please provide a concise description of what the bug is, in clear English. If you intend to submit a PR for this issue, tell us in the description.
+      placeholder: Bug description
+    validations:
+      required: true
+  - type: checkboxes
+    attributes:
+      label: Have you searched existing issues? 🔎
+      description: Please search to see if an issue already exists for the issue you encountered.
+      options:
+        - label: I have searched and found no existing issues
+          required: true
+  - type: textarea
+    id: reproduction
+    attributes:
+      label: Reproduction
+      description: Please provide a minimal example, with code, that can be run to reproduce the issue. Do NOT provide screenshots of code, or link to external repos or applications. Use ``` to format code blocks.
+      placeholder: Reproduction
+      value: |
+        ```console
+        python -m ufoid
+        ```
+    validations:
+      required: true
+  - type: textarea
+    id: screenshot
+    attributes:
+      label: Screenshot
+      description: If relevant, please include screenshot(s)
+  - type: textarea
+    id: logs
+    attributes:
+      label: Logs
+      description: "Please include the full stacktrace of the errors you get"
+      render: shell
+  - type: textarea
+    id: system-info
+    attributes:
+      label: System Info
+      description: Please ensure you are running the latest version of UFOID.
+      render: shell
+    validations:
+      required: true
+  - type: dropdown
+    id: severity
+    attributes:
+      label: Severity
+      description: Select the severity of this issue
+      options:
+        - I can work around it
+        - Blocking usage of UFOID
+    validations:
+      required: true
+  - type: markdown
+    attributes:
+      value: |
+        📌 Please ensure that you have filled all of the required sections above, and that the reproduction you have provided is [minimal, complete, and reproducible](https://stackoverflow.com/help/minimal-reproducible-example). Incomplete issues will be closed.
diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md
new file mode 100644
index 0000000..91df241
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,19 @@
+---
+name: ⚡ Feature request
+about: Suggest an improvement or new feature for UFOID
+title: ''
+labels: ''
+assignees: ''
+
+---
+- [ ] I have searched to see if a similar issue already exists.
+
+
+**Is your feature request related to a problem? Please describe.**
+A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+
+**Additional context**
+Add any other context or screenshots about the feature request here.
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
new file mode 100644
index 0000000..60f84bb
--- /dev/null
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,25 @@
+## Description
+
+Please include a concise summary, in clear English, of the changes in this pull request. If it closes an issue, please mention it here.
+
+Closes: #(issue)
+
+## 🎯 PRs Should Target Issues
+
+Before you create a PR, please check to see if there is [an existing issue](https://github.com/immobiliare/ufoid/issues) for this change. If not, please create an issue before you create this PR, unless the fix is very small.
+
+Not adhering to this guideline will result in the PR being closed.
+
+## Tests
+
+1. PRs will only be merged if tests pass on CI. To run the tests locally, please set up [your environment locally](https://github.com/immobiliare/vegeta/blob/main/CONTRIBUTING.md) and run the tests:
+   ```console
+   source venv/bin/activate
+   pytest --cov
+   ```
+
+2. You may need to run the linters:
+   ```console
+   source venv/bin/activate
+   pre-commit run --all-files
+   ```
diff --git a/.github/dependabot.yml b/.github/dependabot.yml
new file mode 100644
index 0000000..be70707
--- /dev/null
+++ b/.github/dependabot.yml
@@ -0,0 +1,12 @@
+version: 2
+updates:
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+    open-pull-requests-limit: 10
+  - package-ecosystem: "pip"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+    open-pull-requests-limit: 10
diff --git a/.github/example_images/ImmobiliareLabs_Logo_Negative.png b/.github/example_images/ImmobiliareLabs_Logo_Negative.png
new file mode 100644
index 0000000..015b15c
Binary files /dev/null and b/.github/example_images/ImmobiliareLabs_Logo_Negative.png differ
diff --git a/.github/example_images/benchmark_stability_collisions.png b/.github/example_images/benchmark_stability_collisions.png
new file mode 100644
index 0000000..8dcfe5c
Binary files /dev/null and b/.github/example_images/benchmark_stability_collisions.png differ
diff --git a/.github/example_images/benchmark_stability_precision.png b/.github/example_images/benchmark_stability_precision.png
new file mode 100644
index 0000000..78dd0bb
Binary files /dev/null and b/.github/example_images/benchmark_stability_precision.png differ
diff --git a/.github/example_images/benchmark_stability_precision_zoom.png b/.github/example_images/benchmark_stability_precision_zoom.png
new file mode 100644
index 0000000..37bb253
Binary files /dev/null and b/.github/example_images/benchmark_stability_precision_zoom.png differ
diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml
new file mode 100644
index 0000000..6a6daef
--- /dev/null
+++ b/.github/workflows/ci.yaml
@@ -0,0 +1,39 @@
+name: CI Workflow
+
+on:
+  push:
+    branches:
+      - '*'
+  pull_request:
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ['3.9', '3.10', '3.11', '3.12']
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+          cache: 'pip'
+
+      - name: Set up virtual environment
+        run: python -m venv venv
+
+      - name: Install dependencies
+        run: |
+          source venv/bin/activate
+          pip install --upgrade pip
+          pip install -r requirements.txt
+          pip install -r requirements-test.txt
+
+      - name: Run tests
+        run: |
+          source venv/bin/activate
+          pytest --cov
diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
new file mode 100644
index 0000000..d4a12d1
--- /dev/null
+++ b/.github/workflows/release.yml
@@ -0,0 +1,36 @@
+name: Create release
+
+on:
+  workflow_dispatch:
+
+jobs:
+  changelog:
+    name: Create changelog
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Conventional Changelog Action
+        id: changelog
+        uses: TriPSs/conventional-changelog-action@v5
+        with:
+          preset: conventionalcommits
+          github-token: ${{ secrets.GH_AIDE_TOKEN }}
+          git-user-name: ${{ github.actor }}
+          git-user-email: ${{ github.actor }}@users.noreply.github.com
+          tag-prefix: ''
+          input-file: 'CHANGELOG.md'
+          output-file: 'CHANGELOG.md'
+          release-count: 10000
+          skip-on-empty: false
+          skip-version-file: true
+
+      - name: Create Release
+        uses: softprops/action-gh-release@v2
+        if: ${{ steps.changelog.outputs.skipped == 'false' }}
+        with:
+          tag_name: ${{ steps.changelog.outputs.tag }}
+          body: ${{ steps.changelog.outputs.clean_changelog }}
+          token: ${{ secrets.GH_AIDE_TOKEN }}
diff --git a/.python-version b/.python-version
deleted file mode 100644
index 2009c7d..0000000
--- a/.python-version
+++ /dev/null
@@ -1 +0,0 @@
-3.9.2
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 71d3058..e5d11cb 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,144 +1,94 @@
-# Contributing
-
-This document describes how to work and contribute on this project.
-
-- [Contributing](#contributing)
-  - [1 How to clone the project](#1-how-to-clone-the-project)
-  - [2 Requirements](#2-requirements)
-  - [3 Configuration](#3-configuration)
-  - [4 Installation](#4-installation)
-    - [4.1 `pyenv` usage (skip if the installed python version is the same required by the project)](#41-pyenv-usage-skip-if-the-installed-python-version-is-the-same-required-by-the-project)
-    - [4.2 Create virtualenv and install requirements](#42-create-virtualenv-and-install-requirements)
-  - [5 Run procedure](#5-run-procedure)
-  - [6 Test procedure](#6-test-procedure)
-  - [7 CI Structure](#7-ci-structure)
-    - [7.1 Runners](#71-runners)
-    - [7.2 Variables](#72-variables)
-    - [7.3 Test](#73-test)
-    - [7.4 Badges](#74-badges)
-  - [8 Code style and conventions](#8-code-style-and-conventions)
-
-## 1 How to clone the project
-
-In order to get this project, execute the following command
-
-```console
-git clone https://github.com/immobiliare/ufoid.git
-```
-
-## 2 Requirements
-
-This project requires the following technologies:
-
-- Python 3.9.2 (as shown in the `.python-version` file)
-- PIP + packages in `requirements.txt`
-
-You can try to install all requirements using the following command
-
-```console
-make env
-```
-
-or if you have pyenv installed
-
-```console
-make env_pyenv
-```
-
-## 3 Configuration
-
-`ufoid/config/config.yaml` allows you to customize various aspects of the duplicate detection process. Here are some key parameters you can modify:
-
-- `num_processes`: Number of processes for parallel execution.
-- `chunk_length`: The length of each chunk for chunk-based processing. See below for more information.
-- `new_paths`: List of directory paths containing the new dataset for duplicate detection (automatically recursive).
-- `old_paths`: List of directory paths containing the old dataset for comparison with the new dataset (automatically recursive).
-- `check_with_itself`: Boolean flag to indicate whether to check for duplicates within the new dataset.
-- `check_with_old_data`: Boolean flag to indicate whether to check for duplicates between the new and old datasets.
-- `txt_output`: Boolean flag to indicate whether to save duplicate information to the output file.
-- `txt_output_file`: Path to the output file where duplicate information will be saved.
-- `delete_duplicates`: Boolean flag to indicate whether to delete duplicate images from the dataset.
-- `create_folder_with_no_duplicates`: Boolean flag to indicate whether to create a folder with non-duplicate images.
-- `new_folder`: Path to the folder where non-duplicate images will be stored.
-- `distance_threshold`: The distance threshold for considering images as duplicates. 10 is optimal for our use case, since it allows to get all the exact duplicate (also with some resilience to minor manipulations on images, while avoiding collisions. See https://docs.google.com/document/d/16DS-Z-SHKtmTzQikxCO4SwRJU0cHS_-TVkf9UAKYRZA/edit#heading=h.uybo5cpys4eefor an extensive study on this.
-
-
-## 4 Installation
-
-In order to guarantee the proper operation of the application, the recommended python version is 3.9.2.
-
-### 4.1 `pyenv` usage (skip if the installed python version is the same required by the project)
-
-If the default python version on the system is different from the recommended one, you can install the required version using the `pyenv` tool available [here](https://github.com/pyenv/pyenv#installation).
-Once `pyenv` is installed, you can install the required python version executing
-
-```console
-pyenv install 3.9.2
-```
-
-and you can activate the recommended version for the current shell by
-
-```console
-pyenv shell 3.9.2
-```
-
-### 4.2 Create virtualenv and install requirements
-
-In order to create a clean environment for the execution of the application, a new virtualenv should be created inside the current folder, using the command
-
-```console
-python3 -m venv venv
-```
-
-A new folder named `venv` will be created in `.`
-
-In order to activate the virtualenv, execute
-
-```console
-source venv/bin/activate
-```
-
-and install python requirements executing
-
-```console
-pip install -r requirements.txt
-```
-A different approach consists in using the Makefile by running from the project root the command
-
-```console
-make
-```
-
-This operation will:
-
-- create the venv;
-- update pip to the latest version;
-- install the requirements;
-- install the git hook.
-
-## 5 Run procedure
-
-Start the script using the following command:
-
-```console
-python -m ufoid
-```
-
-## 6 Test procedure
-
-Inside the project you can find functional tests created to verify the correctness of implemented features.
-It is possible to start the execution of all `pytest` tests (unit + integration) running:
-
-```console
-pytest
-```
-
-In order to see the coverage for unit tests, execute
-
-```console
-pytest test/unit --cov
-```
-
-## 7 Code style and conventions
-
-This project makes use of the standard Python coding style: PEP-8
+# Contributing to UFOID
+
+First off, thank you for considering contributing to UFOID! Your contributions help make this project better for everyone.
+
+## Table of Contents
+
+- [Getting Started](#getting-started)
+- [Submitting Issues](#submitting-issues)
+- [Pull Requests](#pull-requests)
+- [Coding Guidelines](#coding-guidelines)
+- [Running Tests and lint](#running-tests-and-lint)
+
+## Getting Started
+
+1. **Fork the repository**: Click the "Fork" button at the top right of the repository page.
+2. **Clone your fork**: Clone the forked repository to your local machine.
+   ```bash
+   git clone https://github.com//ufoid.git
+   cd ufoid
+   ```
+3. **Create a virtual environment and install dependencies**: Set up a virtual environment to install dependencies.
+   ```bash
+   make
+   source venv/bin/activate
+   ```
+
+## Submitting Issues
+
+If you encounter any bugs or have feature requests, please create an issue in the GitHub repository. When creating an issue, please include:
+
+- A clear and descriptive title.
+- A detailed description of the issue or request.
+- Steps to reproduce the issue (if applicable).
+- Any relevant logs, screenshots, or other information.
+
+## Pull Requests
+
+Before submitting a pull request, please ensure you have followed these steps:
+
+1. **Fork the repository** and **clone your fork** to your local machine.
+2. **Create a new branch** from the `main` branch.
+   ```bash
+   git checkout -b my-feature-branch
+   ```
+3. **Make your changes**: Implement your changes in the new branch.
+4. **Run tests**: Ensure all tests pass by running the test suite.
+5. **Commit your changes**: Write clear and concise commit messages.
+   ```bash
+   git add .
+   git commit -m "Add my new feature"
+   ```
+6. **Push to your fork**: Push your changes to your forked repository.
+   ```bash
+   git push origin my-feature-branch
+   ```
+7. **Open a pull request**: Go to the original repository and open a pull request. Please provide a detailed description of your changes and link any related issues.
+
+## Coding Guidelines
+
+To maintain code quality, please adhere to the following guidelines:
+
+- Follow the [PEP 8](https://www.python.org/dev/peps/pep-0008/) style guide for Python code.
+- Write clear and concise code with comments where necessary.
+- Keep the codebase DRY (Don't Repeat Yourself) and modular.
+- Ensure all new features are covered by tests.
+- Update documentation as needed when making changes.
+
+## Running Tests and lint
+
+Before submitting your pull request, ensure that all tests pass and that the code is linted:
+
+1. **Activate the virtual environment**:
+   ```bash
+   source venv/bin/activate
+   ```
+2. **Run the tests** using pytest:
+   ```bash
+   pytest --cov
+   ```
+3. **Run the linters** using pre-commit:
+   ```bash
+   pre-commit run --all-files
+   ```
+
+Thank you for your contributions!
+
+## Contact
+
+If you have any questions or need further assistance, feel free to reach out to the maintainers.
+
+## Contributors
+
+- @EgonFerri
+- @lorenzomassimiani