feat: File management for fetch-binary and fetch-science functions #17

Open
wants to merge 104 commits into
base: main
fe26c60
feat: add class to manage outputs
mfacchinelli Aug 6, 2024
231e0c2
refactor: version is an `int` not a `str`
mfacchinelli Aug 6, 2024
3741a1f
refactor: replace custom logic with `OutputManager`
mfacchinelli Aug 6, 2024
fdcd9ba
refactor: convert "pattern" methods to separate "provider" class
mfacchinelli Aug 7, 2024
8ca6ca3
build: single source app version
mfacchinelli Aug 7, 2024
b4087a2
style: replace `-` with `_` in file names
mfacchinelli Aug 7, 2024
da4c44b
task: add support for `prefix` and `level` in output metadata
mfacchinelli Aug 7, 2024
99773b5
feat: add more variables in database
mfacchinelli Aug 7, 2024
7ede655
feat!: use output manager to write to output and database
mfacchinelli Aug 7, 2024
76afe1e
dev: add `SQLALCHEMY_URL` to dev container environment
mfacchinelli Aug 7, 2024
f6927b1
test(fix): fix output manager and calibration tests
mfacchinelli Aug 7, 2024
3eb23f1
fix: do not export to database in tests
mfacchinelli Aug 7, 2024
332cb24
fix: HK and science tests
mfacchinelli Aug 7, 2024
de14a6f
fix: attempt to fix issue with database initialization
mfacchinelli Aug 7, 2024
fdb5882
build: add `pytest-mock` to dev dependencies
mfacchinelli Aug 8, 2024
8cbe6be
test: add tests for `FetchBinary` class
mfacchinelli Aug 8, 2024
527bcc6
test: add unit tests for `FetchScience`
mfacchinelli Aug 9, 2024
bc374d3
chore: update lock file
mfacchinelli Aug 9, 2024
c7c274f
task: add `hash` to database and make `path` a unique constraint
mfacchinelli Aug 9, 2024
e95ac5f
feat: update database handler to check for error cases
mfacchinelli Aug 9, 2024
87dce32
fix: remove logging of URL
mfacchinelli Aug 9, 2024
050d8b4
test: move `create_test_file` to utility and use `assert_not_called` …
mfacchinelli Aug 9, 2024
fa82965
test: add some tests for database output manager
mfacchinelli Aug 9, 2024
84770f3
test: add coverage for database error
mfacchinelli Aug 12, 2024
3ae9fc4
Merge remote-tracking branch 'origin/main' into feat/hk_manage
mfacchinelli Aug 12, 2024
739765b
build: change `imap-mag` version to initial version
mfacchinelli Aug 12, 2024
94831b2
test(fix): remove commented out text
mfacchinelli Aug 12, 2024
bddeea8
fix: temporarely use custom version of imap-data-access to avoid issu…
mfacchinelli Aug 12, 2024
3239efb
Revert "fix: temporarely use custom version of imap-data-access to av…
mfacchinelli Aug 12, 2024
164b61b
build: add coverage report to CI action
mfacchinelli Aug 12, 2024
7c9ebd4
test: skip failing test due to bug in `imap-data-access`
mfacchinelli Aug 12, 2024
9145268
build(fix): use version instead of branch
mfacchinelli Aug 12, 2024
67b1726
test(fix): name of classes under test in comment
mfacchinelli Aug 12, 2024
9540787
test: add more tests to increase coverage
mfacchinelli Aug 12, 2024
39a30a6
test(fix): actually use `convertToDatetime`
mfacchinelli Aug 12, 2024
65db0ff
fix: CI action failure, packaging failure, test failure
mfacchinelli Aug 16, 2024
afa22f2
fix: more fixes to permissions for action and packaging
mfacchinelli Aug 16, 2024
cffe685
Squashed commit of the following:
mfacchinelli Aug 16, 2024
f342eb1
task: fix src paths and docker multistage builds
alastairtree Aug 5, 2024
dc72096
task: tidy dockerfile
alastairtree Aug 5, 2024
4857812
Initial add of CI step to test on Windows
mhairifin Aug 8, 2024
43d8083
install patched wiremock and fix Path usage
mhairifin Aug 9, 2024
65ebba6
clone from https rather than ssh
mhairifin Aug 9, 2024
8dbd7da
fix: add yaml representer for a indowsPath
mhairifin Aug 12, 2024
455277f
fix: remove hashes from requirements so git package can be installed
mhairifin Aug 12, 2024
4e27257
install git in dockerfile
mhairifin Aug 12, 2024
3a40b76
Correct formatting
mhairifin Aug 12, 2024
39df053
update dependency on patched python wiremock to dev dependency
mhairifin Aug 12, 2024
df730b8
Update wiremock to patch it so no path manipulations necessary
mhairifin Aug 13, 2024
8b98e37
Correct code formatting
mhairifin Aug 13, 2024
d71e359
Skip tests that will fail on Windows runner
mhairifin Aug 13, 2024
e0eee44
Fix ruff errors
mhairifin Aug 13, 2024
4f922b8
Fix ruff errors
mhairifin Aug 13, 2024
4e8d20e
Running pre-commit
mhairifin Aug 13, 2024
1163a78
chore: fix line ending in pre-commit
alastairtree Aug 16, 2024
1739fe6
feat: add class to manage outputs
mfacchinelli Aug 6, 2024
fc771c3
refactor: version is an `int` not a `str`
mfacchinelli Aug 6, 2024
d1099d9
refactor: replace custom logic with `OutputManager`
mfacchinelli Aug 6, 2024
4d66bed
refactor: convert "pattern" methods to separate "provider" class
mfacchinelli Aug 7, 2024
620da38
build: single source app version
mfacchinelli Aug 7, 2024
1a75c22
style: replace `-` with `_` in file names
mfacchinelli Aug 7, 2024
cb4186c
task: add support for `prefix` and `level` in output metadata
mfacchinelli Aug 7, 2024
cbe7e83
feat: add more variables in database
mfacchinelli Aug 7, 2024
a69dd9b
feat!: use output manager to write to output and database
mfacchinelli Aug 7, 2024
5093519
dev: add `SQLALCHEMY_URL` to dev container environment
mfacchinelli Aug 7, 2024
5510d3a
test(fix): fix output manager and calibration tests
mfacchinelli Aug 7, 2024
28ce147
fix: do not export to database in tests
mfacchinelli Aug 7, 2024
9b353bc
fix: HK and science tests
mfacchinelli Aug 7, 2024
d780de8
fix: attempt to fix issue with database initialization
mfacchinelli Aug 7, 2024
f58a6a7
build: add `pytest-mock` to dev dependencies
mfacchinelli Aug 8, 2024
f7e9fd1
test: add tests for `FetchBinary` class
mfacchinelli Aug 8, 2024
1a6178a
test: add unit tests for `FetchScience`
mfacchinelli Aug 9, 2024
b8737aa
chore: update lock file
mfacchinelli Aug 9, 2024
8a9236d
task: add `hash` to database and make `path` a unique constraint
mfacchinelli Aug 9, 2024
56d8fc1
feat: update database handler to check for error cases
mfacchinelli Aug 9, 2024
a366cbc
fix: remove logging of URL
mfacchinelli Aug 9, 2024
97ff8f2
test: move `create_test_file` to utility and use `assert_not_called` …
mfacchinelli Aug 9, 2024
c22489b
test: add some tests for database output manager
mfacchinelli Aug 9, 2024
5dbf6dd
test: add coverage for database error
mfacchinelli Aug 12, 2024
490a515
build: change `imap-mag` version to initial version
mfacchinelli Aug 12, 2024
d4ac011
test(fix): remove commented out text
mfacchinelli Aug 12, 2024
e4b545d
fix: temporarely use custom version of imap-data-access to avoid issu…
mfacchinelli Aug 12, 2024
7114e0c
Revert "fix: temporarely use custom version of imap-data-access to av…
mfacchinelli Aug 12, 2024
f786a87
build: add coverage report to CI action
mfacchinelli Aug 12, 2024
e9694c9
test: skip failing test due to bug in `imap-data-access`
mfacchinelli Aug 12, 2024
3a82f36
build(fix): use version instead of branch
mfacchinelli Aug 12, 2024
6c368f4
test(fix): name of classes under test in comment
mfacchinelli Aug 12, 2024
01360fb
test: add more tests to increase coverage
mfacchinelli Aug 12, 2024
0a0a455
test(fix): actually use `convertToDatetime`
mfacchinelli Aug 12, 2024
91f0c34
fix: CI action failure, packaging failure, test failure
mfacchinelli Aug 16, 2024
ea59ccb
fix: more fixes to permissions for action and packaging
mfacchinelli Aug 16, 2024
693cd49
Squashed commit of the following:
mfacchinelli Aug 16, 2024
921768a
Merge branch 'feat/hk_manage' of https://github.com/ImperialCollegeLo…
mfacchinelli Aug 16, 2024
9c86b45
fix: merge issues
mfacchinelli Aug 16, 2024
36d8ecf
build(fix): use existing permissions
mfacchinelli Aug 16, 2024
6218f99
fix: install git on Docker image
mfacchinelli Aug 16, 2024
41decd0
try: use git under wine
mfacchinelli Aug 16, 2024
844213b
try: force Docker run action to finish
mfacchinelli Aug 16, 2024
a041109
try: more tests to make git work on Docker
mfacchinelli Aug 16, 2024
37fa416
fix: give up on git-based imap-data-access and skip failing test
mfacchinelli Aug 16, 2024
1bcb16a
fix: update to latest `imap-data-access` and unfilter test
mfacchinelli Aug 22, 2024
55e0838
fix: partially address Alastair's comments
mfacchinelli Aug 30, 2024
49a8d22
fix: avoid circular dependency
mfacchinelli Aug 30, 2024
003cf5a
fix: further reduce duplication with hash
mfacchinelli Aug 30, 2024
1 change: 1 addition & 0 deletions .devcontainer/devcontainer.json
@@ -34,6 +34,7 @@
     "WEBPODA_AUTH_CODE": "${localEnv:WEBPODA_AUTH_CODE}",
     "SDC_AUTH_CODE": "${localEnv:SDC_AUTH_CODE}",
     "IMAP_DATA_ACCESS_URL": "${localEnv:IMAP_DATA_ACCESS_URL}",
+    "SQLALCHEMY_URL": "${localEnv:SQLALCHEMY_URL}",
     // Define WireMock variables to connect Docker outside of Docker.
     "WIREMOCK_DIND": "1",
     "TESTCONTAINERS_HOST_OVERRIDE": "host.docker.internal"
48 changes: 48 additions & 0 deletions .github/workflows/ci.yml
@@ -16,11 +16,14 @@ on:
# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:

# write to checks/pull-request extra permission needed by 5monkeys/cobertura-action to post coverage stats
# write packages needed by docker image step
permissions:
id-token: write
contents: write
checks: write
packages: write
pull-requests: write

env:
PREFERED_PYTHON_VERSION: '3.12'
@@ -150,6 +153,13 @@ jobs:
           path: 'test-results.xml'
           reporter: java-junit

+      - name: Coverage Report
+        uses: 5monkeys/cobertura-action@v14
+        with:
+          report_name: Coverage Report (${{ matrix.python-versions }})
+          path: "coverage.xml"
+          minimum_coverage: 80
+
       - name: Create Release ${{github.ref_name}} & upload artifacts
         uses: softprops/action-gh-release@v2
         if: ${{ startsWith(github.ref, 'refs/tags/') }}
@@ -160,6 +170,44 @@ jobs:
           files: |
             dist/${{ env.PACKAGE_NAME }}_python${{matrix.python-versions}}_${{ env.PACKAGE_VERSION }}.zip

+  test_on_windows:
+    strategy:
+      matrix:
+        python-versions: ['3.10', '3.11', '3.12']
+        os: [windows-latest]
+    runs-on: ${{ matrix.os }}
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-versions }}
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install poetry
+          poetry install
+
+      - name: Run tests
+        run: poetry run pytest -s --cov-config=.coveragerc --cov=src --cov-append --cov-report=xml --cov-report term-missing --cov-report=html --junitxml=test-results.xml tests
+
+      - name: Upload Coverage report
+        uses: actions/upload-artifact@v4
+        if: matrix.python-versions == env.PREFERED_PYTHON_VERSION
+        with:
+          name: CoverageReport_${{ matrix.os }}_python${{matrix.python-versions}}_${{ env.PACKAGE_VERSION }}
+          path: htmlcov
+          if-no-files-found: error
+
+      - name: Test Report
+        uses: dorny/test-reporter@v1
+        if: success() || failure()
+        with:
+          name: Test Results (${{ matrix.os }}) (${{ matrix.python-versions }})
+          path: 'test-results.xml'
+          reporter: java-junit
+
+
   build_single_file_binary:
     strategy:
       matrix:
1 change: 1 addition & 0 deletions .gitignore
@@ -135,3 +135,4 @@ site/
 .work
 /output
 dev.env
+debug
2 changes: 1 addition & 1 deletion deploy/Dockerfile
@@ -11,7 +11,7 @@ ENV PYTHONDONTWRITEBYTECODE=1
 ENV PYTHONUNBUFFERED=1

 # Install the postgres client and any other compile time dependencies needed to build our app
-RUN apt-get update && apt-get install -y libpq-dev gcc
+RUN apt-get update && apt-get install -y libpq-dev gcc git

 # Creates a non-root user with an explicit UID and adds permission to access the /app folder
 # For more info, please refer to https://aka.ms/vscode-docker-python-configure-containers
2 changes: 1 addition & 1 deletion pack.sh
@@ -18,7 +18,7 @@ poetry build

 # output a requierments.txt file used by docker during the build
 poetry self add poetry-plugin-export
-poetry export --format=requirements.txt > dist/requirements.txt
+poetry export --without-hashes --format=requirements.txt > dist/requirements.txt

 # move the files into a folder with the python version
 mkdir -p dist/python$PYTHON_VERSION
409 changes: 245 additions & 164 deletions poetry.lock

Large diffs are not rendered by default.

6 changes: 4 additions & 2 deletions pyproject.toml
@@ -1,6 +1,7 @@
 [project]
 requires-python = ">=3.10"
 name = "imap-mag"
+version = "0.1.0"

 [tool.poetry]
 name = "imap-mag"
@@ -28,19 +29,20 @@ alembic = "^1.13.2"
 sqlalchemy-utils = "^0.41.2"
 requests = "^2.32.3"
 pandas = "^2.2.2"
-imap-data-access = "^0.7.0"
+imap-data-access = "^0.9.0"
 cdflib = "^1.3.1"
 psycopg = {extras = ["binary"], version = "^3.2.1"}

 [tool.poetry.group.dev.dependencies]
 pytest = "^8.3.1"
 pytest-cov = "^5.0.0"
+pytest-mock = "^3.14.0"
 pyinstaller = "^6.5.0"
 pre-commit = "^3.8.0"
 ruff = "^0.5.4"
-wiremock = "^2.6.1"
 docker = "^7.1.0"
 testcontainers = "^4.7.2"
+wiremock = {git = "https://github.com/ImperialCollegeLondon/python-wiremock.git", rev = "fix-test-containers-on-windows"}
Collaborator commented:

I just chased on the wiremock Slack to try to get the bugfix PR we need merged, but no luck so far.


[tool.poetry.scripts]
# can execute via poetry, e.g. `poetry run imap-mag hello world`
6 changes: 3 additions & 3 deletions run-docker.sh
@@ -20,9 +20,9 @@ if [ "$1" == "debug" ] || [ "$1" == "DEBUG" ] || [ "$1" == "-i" ]; then
$IMAGE_NAME
elif [ -z "$1" ]; then # no args passed
docker run --rm -it \
--env-file dev.env \
-v /mnt/imap-data:/data \
$IMAGE_NAME
--env-file dev.env \
-v /mnt/imap-data:/data \
$IMAGE_NAME
else
echo "Extra arguments: $@"
docker run --rm -it \
@@ -0,0 +1,52 @@
+"""Added version, hash, date and software version columns
+
+Revision ID: 669111c45c37
+Revises: d0457f3e98c8
+Create Date: 2024-08-09 13:35:21.578940
+
+"""
+
+from datetime import datetime
+
+import sqlalchemy as sa
+from alembic import op
+
+# revision identifiers, used by Alembic.
+revision = "669111c45c37"
+down_revision = "d0457f3e98c8"
+branch_labels = None
+depends_on = None
+
+
+def upgrade() -> None:
+    # ### commands auto generated by Alembic - please adjust! ###
+    op.add_column(
+        "files", sa.Column("version", sa.Integer(), nullable=False, default=0)
+    )
+    op.add_column(
+        "files", sa.Column("hash", sa.String(length=64), nullable=False, default="")
+    )
+    op.add_column(
+        "files",
+        sa.Column(
+            "date", sa.DateTime(), nullable=False, default=datetime.fromtimestamp(0)
+        ),
+    )
+    op.add_column(
+        "files",
+        sa.Column(
+            "software_version", sa.String(length=16), nullable=False, default="0.0.0"
+        ),
+    )
+    op.create_unique_constraint(None, "files", ["path"])
+    # ### end Alembic commands ###
+
+
+def downgrade() -> None:
+    # ### commands auto generated by Alembic - please adjust! ###
+    op.drop_constraint(None, "files", type_="unique")
+    op.drop_column("files", "software_version")
+    op.drop_column("files", "date")
+    op.drop_column("files", "hash")
+    op.drop_column("files", "version")
+    # ### end Alembic commands ###
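
One point that may be worth checking in a migration like this: `default=` on `sa.Column` is a client-side default that SQLAlchemy applies at INSERT time, and it is not emitted in the `ALTER TABLE` DDL. The `NOT NULL` columns are therefore added without a database-level default for rows already in `files`. The sketch below illustrates the general behaviour using the standard-library `sqlite3` module purely as a stand-in (the project itself targets PostgreSQL via `psycopg`); the table contents are made up for illustration.

```python
import sqlite3

# In-memory stand-in for the "files" table before the migration runs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT, path TEXT)")
conn.execute("INSERT INTO files (name, path) VALUES ('a.cdf', '/data/a.cdf')")


def try_add_column(ddl: str) -> bool:
    """Return True if the ALTER TABLE statement succeeds."""
    try:
        conn.execute(ddl)
        return True
    except sqlite3.OperationalError:
        return False


# No database-level default: there is nothing to put in the existing row.
without_default = try_add_column(
    "ALTER TABLE files ADD COLUMN version INTEGER NOT NULL"
)

# A database-level DEFAULT back-fills the existing row.
with_default = try_add_column(
    "ALTER TABLE files ADD COLUMN version INTEGER NOT NULL DEFAULT 0"
)

backfilled = conn.execute("SELECT version FROM files").fetchone()[0]
```

In Alembic the database-level form corresponds to passing `server_default=` to `sa.Column`. Whether that matters here depends on whether `files` already holds rows when the migration runs.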
11 changes: 9 additions & 2 deletions src/imap_db/model.py
@@ -1,4 +1,6 @@
-from sqlalchemy import String
+from datetime import datetime
+
+from sqlalchemy import DateTime, Integer, String
 from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


@@ -8,9 +10,14 @@ class Base(DeclarativeBase):

 class File(Base):
     __tablename__ = "files"
+
     id: Mapped[int] = mapped_column(primary_key=True)
     name: Mapped[str] = mapped_column(String(128))
-    path: Mapped[str] = mapped_column(String(256))
+    path: Mapped[str] = mapped_column(String(256), unique=True)
+    version: Mapped[int] = mapped_column(Integer())
+    hash: Mapped[str] = mapped_column(String(64))
+    date: Mapped[datetime] = mapped_column(DateTime())
+    software_version: Mapped[str] = mapped_column(String(16))

     def __repr__(self) -> str:
         return f"<File {self.id} (name={self.name}, path={self.path})>"
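
With `unique=True` on `path`, the database rejects a second row with the same path instead of silently storing a duplicate. A minimal sketch of that behaviour follows, again using `sqlite3` as a stand-in. The hand-written table only mirrors the columns above and is not how the project creates its schema, and the file path and `insert` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE files (
        id INTEGER PRIMARY KEY,
        name VARCHAR(128) NOT NULL,
        path VARCHAR(256) NOT NULL UNIQUE,
        version INTEGER NOT NULL,
        hash VARCHAR(64) NOT NULL,
        date TIMESTAMP NOT NULL,
        software_version VARCHAR(16) NOT NULL
    )
    """
)


def insert(path: str, version: int) -> bool:
    """Insert a row; return False if the unique constraint rejects it."""
    name = path.rsplit("/", 1)[-1]
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO files "
                "(name, path, version, hash, date, software_version) "
                "VALUES (?, ?, ?, ?, ?, ?)",
                (name, path, version, "0" * 64, "2024-08-09 00:00:00", "0.1.0"),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = insert("/data/hk/file_v001.pkts", 1)      # hypothetical path
duplicate = insert("/data/hk/file_v001.pkts", 1)  # same path again
row_count = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
```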
86 changes: 84 additions & 2 deletions src/imap_mag/DB.py
@@ -1,11 +1,34 @@
+import abc
+import logging
 import os
+from pathlib import Path

+import typer
 from imap_db.model import File
 from sqlalchemy import create_engine
 from sqlalchemy.orm import sessionmaker

+from imap_mag import __version__
+from imap_mag.outputManager import IFileMetadataProvider, IOutputManager, generate_hash
+
+
+class IDatabase(abc.ABC):
+    """Interface for database manager."""
+
+    def insert_file(self, file: File) -> None:
+        """Insert a file into the database."""
+        self.insert_files([file])
+        pass
+
+    @abc.abstractmethod
+    def insert_files(self, files: list[File]) -> None:
+        """Insert a list of files into the database."""
+        pass
+
+
+class Database(IDatabase):
+    """Database manager."""

-class DB:
     def __init__(self, db_url=None):
         env_url = os.getenv("SQLALCHEMY_URL")
         if db_url is None and env_url is not None:
@@ -16,10 +39,12 @@ def __init__(self, db_url=None):
                 "No database URL provided. Consider setting SQLALCHEMY_URL environment variable."
             )

+        # TODO: Check database is available
+
         self.engine = create_engine(db_url)
         self.Session = sessionmaker(bind=self.engine)

-    def insert_files(self, files: list[File]):
+    def insert_files(self, files: list[File]) -> None:
         session = self.Session()
         try:
             for file in files:
@@ -40,3 +65,60 @@ def insert_files(self, files: list[File]):
             raise e
         finally:
             session.close()
+
+
+class DatabaseOutputManager(IOutputManager):
Collaborator commented:

I don't really like the name "output manager" because I can't really tell what it is, but I'm not sure there is anything better. Basically just talking to myself here.

+    """Decorator for adding files to database as well as output."""
+
+    __output_manager: IOutputManager
+    __database: IDatabase
+
+    def __init__(
+        self, output_manager: IOutputManager, database: Database | None = None
+    ):
+        """Initialize database and output manager."""
+
+        self.__output_manager = output_manager
+
+        if database is None:
+            self.__database = Database()
+        else:
+            self.__database = database
+
+    def add_file(
+        self, original_file: Path, metadata_provider: IFileMetadataProvider
+    ) -> tuple[Path, IFileMetadataProvider]:
+        (destination_file, metadata_provider) = self.__output_manager.add_file(
+            original_file, metadata_provider
+        )
+
+        file_hash: str = generate_hash(original_file)
+
+        if not (
+            destination_file.exists() and (generate_hash(destination_file) == file_hash)
+        ):
+            logging.error(
+                f"File {destination_file} does not exist or is not the same as original {original_file}."
+            )
+            destination_file.unlink(missing_ok=True)
+            raise typer.Abort()
+
+        logging.info(f"Inserting {destination_file} into database.")
+
+        try:
+            self.__database.insert_file(
+                File(
+                    name=destination_file.name,
+                    path=destination_file.absolute().as_posix(),
+                    version=metadata_provider.version,
+                    hash=file_hash,
+                    date=metadata_provider.date,
+                    software_version=__version__,
+                )
+            )
+        except Exception as e:
+            logging.error(f"Error inserting {destination_file} into database: {e}")
+            destination_file.unlink()
+            raise e
+
+        return (destination_file, metadata_provider)
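
The decorator idea in `DatabaseOutputManager` (delegate the copy to an inner manager, verify the copy by hash, then record it) can be sketched in isolation with the standard library only. Everything named here is hypothetical: `CopyOutput` and `RecordingOutput` stand in for the real `IOutputManager` implementations, a plain list stands in for the database, and the metadata provider is left out. The real `generate_hash` is not shown in this diff; SHA-256 is an assumption suggested only by the 64-character `hash` column.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def generate_hash(path: Path) -> str:
    """Hypothetical stand-in: SHA-256 hex digest (64 characters) of a file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


class CopyOutput:
    """Hypothetical inner manager: copies a file into an output folder."""

    def __init__(self, folder: Path) -> None:
        self.folder = folder

    def add_file(self, original: Path) -> Path:
        destination = self.folder / original.name
        shutil.copy(original, destination)
        return destination


class RecordingOutput:
    """Decorator: delegates the copy, verifies it, then records it."""

    def __init__(self, inner: CopyOutput, records: list[dict]) -> None:
        self.inner = inner
        self.records = records  # stands in for the database

    def add_file(self, original: Path) -> Path:
        destination = self.inner.add_file(original)
        file_hash = generate_hash(original)
        # Refuse to record a copy that is missing or differs from the source.
        if not destination.exists() or generate_hash(destination) != file_hash:
            destination.unlink(missing_ok=True)
            raise RuntimeError(f"{destination} does not match {original}")
        self.records.append({"path": destination.as_posix(), "hash": file_hash})
        return destination


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "input.bin"
    source.write_bytes(b"example payload")
    output_folder = Path(tmp) / "output"
    output_folder.mkdir()

    records: list[dict] = []
    manager = RecordingOutput(CopyOutput(output_folder), records)
    stored = manager.add_file(source)
    stored_name = stored.name
    hash_length = len(records[0]["hash"])
```

Because the wrapper exposes the same `add_file` call as the manager it wraps, callers do not need to know whether recording is enabled, which is presumably why the tests in this PR can swap the database part out.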
13 changes: 13 additions & 0 deletions src/imap_mag/__init__.py
@@ -0,0 +1,13 @@
+"""The main module for project."""
+
+from importlib.metadata import PackageNotFoundError, version
+
+
+def get_version() -> str:
+    try:
+        return version("imap-mag")
+    except PackageNotFoundError:
+        print("IMAP MAG CLI Version unknown, not installed via pip.")
+
+
+__version__ = get_version()
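
As written, `get_version()` falls through to an implicit `None` when the package metadata is missing, even though it is annotated `-> str` and `__version__` is later written to the non-nullable `software_version` column. One possible variant is sketched below. The `package` and `fallback` parameters are hypothetical, and the `"0.0.0"` value simply mirrors the default used in the migration; it is an assumption, not the project's choice.

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(package: str = "imap-mag", fallback: str = "0.0.0") -> str:
    """Return the installed version of `package`, or `fallback` if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        # Always return a string so callers never see None.
        return fallback


# A package name that should not exist in any environment.
missing = get_version("surely-not-an-installed-package-name")
```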