Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(api): add missing denormalized rules in refactor export_study_fla… #1676

Closed
wants to merge 13 commits into from

Conversation

TLAIDI
Copy link
Contributor

@TLAIDI TLAIDI commented Jul 27, 2023

…t (#1646)

@commit-lint
Copy link

commit-lint bot commented Jul 27, 2023

Tests

  • export: add unit tests for RawStudyService.export_study_flat method (961929d)
  • export: add unit tests for VariantStudyService.generate_task method (a31feec)
  • export: adjust the unit test to accommodate the updated signature in 'export_study_flat' (abd7490)

Code Refactoring

Bug Fixes

Contributors

laurent-laporte-pro, TLAIDI

Commit-Lint commands

You can trigger Commit-Lint actions by commenting on this PR:

  • @Commit-Lint merge patch will merge dependabot PR on "patch" versions (X.X.Y - Y change)
  • @Commit-Lint merge minor will merge dependabot PR on "minor" versions (X.Y.Y - Y change)
  • @Commit-Lint merge major will merge dependabot PR on "major" versions (Y.Y.Y - Y change)
  • @Commit-Lint merge disable will desactivate merge dependabot PR
  • @Commit-Lint review will approve dependabot PR
  • @Commit-Lint stop review will stop approve dependabot PR

@laurent-laporte-pro laurent-laporte-pro force-pushed the fix/denormalize_export_study_flat branch from 8d4dd66 to 02c7f67 Compare September 12, 2023 07:37
output_list_filter=output_list_filter,
output_src_path=output_src_path,
)
study = self.study_factory.create_from_fs(dst_path, "", use_cache=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

La "dénormalisation" doit rester optionnelle, il te faut donc conserver le paramètre denormalize.

@@ -325,28 +324,21 @@ def import_study(self, metadata: RawStudy, stream: IO[bytes]) -> Study:

def export_study_flat(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attention : la signature de cette fonction publique a changé. Cela entraîne une rupture de compatibilité.

Il faut donc adapter le code appelant (je crois que c'est fait), mais aussi les tests unitaires.

dst_path: Path,
outputs: bool = True,
output_src_path: Optional[Path] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pour moi, ce paramètre est inutile car le répertoire de sortie est et restera toujours "output'.

shutil.rmtree(output_dest_path)
if output_list_filter is not None:
os.mkdir(output_dest_path)
for output in output_list_filter:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Le défaut de cet algorithme réside dans la possibilité de dupliquer des données. En effet, lorsqu'une sortie au format ZIP est décompressée, il n'y a aucune garantie que le fichier ZIP d'origine soit supprimé à la fin de la décompression. Autrement dit, si l'on décompresse le fichier "20230802-1628eco.zip" (par exemple, via une tâche planifiée), il est possible de se retrouver avec à la fois le répertoire "20230802-1628eco" et le fichier "20230802-1628eco.zip" d'origine.

dst=output_dest_path / output,
)
else:
shutil.copytree(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ma remarque concernant la duplication des sorties et aussi valable ici.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Je te propose l'implémentation suivante :

import functools
import shutil
import typing as t
import zipfile
from pathlib import Path


# noinspection PyUnusedLocal
def _ignore_patterns(path_study: Path, directory: str, contents: t.Sequence[str]) -> t.Sequence[str]:
    return ["output"] if Path(directory) == path_study else []


def export_study_flat(
    path_study: Path,
    dest: Path,
    outputs: bool = True,
    output_list_filter: t.Sequence[str] = (),
) -> None:
    """
    Export study to destination

    Args:
        path_study: Study source path
        dest: Destination path.
        outputs: List of outputs to keep.
        output_list_filter: List of outputs to keep (None indicate all outputs).
    """
    output_src_path = path_study / "output"
    output_dest_path = dest / "output"

    ignore_patterns = functools.partial(_ignore_patterns, path_study)
    # noinspection PyTypeChecker
    shutil.copytree(src=path_study, dst=dest, ignore=ignore_patterns)

    if outputs and output_src_path.exists():
        if not output_list_filter:
            # Retrieve all directories or ZIP files without duplicates
            output_list_filter = list(
                {
                    # fmt: off
                    f.with_suffix("").name
                    for f in output_src_path.iterdir()
                    if f.is_dir() or f.suffix.lower() == ".zip"
                    # fmt: on
                }
            )
        # Copy each folder or uncompress each ZIP file to the destination dir.
        shutil.rmtree(output_dest_path, ignore_errors=True)
        output_dest_path.mkdir()
        for output in output_list_filter:
            zip_path = output_src_path / f"{output}.zip"
            if zip_path.exists():
                with zipfile.ZipFile(zip_path) as zf:
                    zf.extractall(output_dest_path / output)
            else:
                shutil.copytree(
                    src=output_src_path / output,
                    dst=output_dest_path / output,
                )

dst=output_dest_path,
)

stop_time = time.time()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bof ! on peut supprimer.

dst_path=dst_path,
outputs=False,
denormalize=False,
output_src_path=output_src_path,
Copy link
Contributor

@laurent-laporte-pro laurent-laporte-pro Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Le paramètre output_src_path n'est pas utile (car calculable).

dst_path: Path,
outputs: bool = True,
output_src_path: Optional[Path] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Il faut conserver le paramètre denormalize !

@laurent-laporte-pro laurent-laporte-pro force-pushed the fix/denormalize_export_study_flat branch from 02c7f67 to abd7490 Compare September 12, 2023 09:22
@laurent-laporte-pro laurent-laporte-pro added the wontfix This will not be worked on label Nov 16, 2023
@laurent-laporte-pro
Copy link
Contributor

wont fix

@laurent-laporte-pro laurent-laporte-pro deleted the fix/denormalize_export_study_flat branch November 16, 2023 09:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/L wontfix This will not be worked on
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactoring request for Study export function
2 participants