Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pheval-exomiser prepare-exomiser-batch command bugs #61

Open
AO33 opened this issue Jun 6, 2024 · 1 comment
Open

pheval-exomiser prepare-exomiser-batch command bugs #61

AO33 opened this issue Jun 6, 2024 · 1 comment

Comments

@AO33
Copy link
Member

AO33 commented Jun 6, 2024

prepare-exomiser-batch bugs

Ive created a local branch that fixes or implements the features listed below

  • poetry install failure Error
    • Was an issue regarding lxml (4.9.2) not supporting PEP 517 builds.
    • Changed the poetry.lock file to the latest version of lxml 5.2.2 and reran poetry install to get around

  • pheval-exomiser prepare-exomiser-batch Error
    • TypeError: prepare_exomiser_batch() got an unexpected keyword argument 'results_dir'
    • The prepare_exomiser_batch and create_batch_files aren't quite synced up with input arguments
    • Inconsistency with the @click options that are made available

  • pheval-exomiser prepare-exomiser-batch Bug
    • phenotype-only setting is getting written to batch files even when vcf / analysis mode is desired.
    • The CommandsWriter.write_local_commands will only write analysis commands if variant_analysis (phenotype_only) is set to True. This can never be the case because it will always be set to False because of the vcf_dir and command options that are set.
    • I replaced --phenotype-only with --variant-analysis for tool input

  • pheval-exomiser prepare-exomiser-batch slight refactor
    • Originally output number_of_samples/max_jobs number of batch files
    • Refactored the BatchFileWriter.create_split_batch_files files function so that max_jobs argument outputs max_jobs number of batch files instead (User specified number of jobs without having to perform manual calculation beforehand)

  • pheval-exomiser post_process_results Added --from-batch-file optional option
    • --from-batch-file option allows user to pass in file/path/to/exomiser_batch_commands.txt file
    • This allows the user to only post process results from samples found in the input batch_commands.txt argument (So if multiple jobs are running they won't try and process incomplete or same results files)
@yaseminbridges
Copy link
Contributor

Hi @AO33!

I just have a few comments/questions 🙂

poetry install failure Error

  • Was an issue regarding lxml (4.9.2) not supporting PEP 517 builds.
  • Changed the poetry.lock file to the latest version of lxml 5.2.2 and reran poetry install to get around

I believe it is generally discouraged to manually change the poetry.lock as this can break the reproducibility. Perhaps another workaround can be to update the dependencies. I have had this error before but after fixing my poetry installation I've not encountered it.

  • pheval-exomiser prepare-exomiser-batch Error

    • TypeError: prepare_exomiser_batch() got an unexpected keyword argument 'results_dir'
    • The prepare_exomiser_batch and create_batch_files aren't quite synced up with input arguments
    • Inconsistency with the @click options that are made available

Sounds good!

  • pheval-exomiser prepare-exomiser-batch Bug

    • phenotype-only setting is getting written to batch files even when vcf / analysis mode is desired.
    • The CommandsWriter.write_local_commands will only write analysis commands if variant_analysis (phenotype_only) is set to True. This can never be the case because it will always be set to False because of the vcf_dir and command options that are set.
    • I replaced --phenotype-only with --variant-analysis for tool input

Sounds good! I had removed the phenotype_only a while ago so using --variant analysis makes perfect sense.

  • pheval-exomiser prepare-exomiser-batch slight refactor

    • Originally output number_of_samples/max_jobs number of batch files
    • Refactored the BatchFileWriter.create_split_batch_files files function so that max_jobs argument outputs max_jobs number of batch files instead (User specified number of jobs without having to perform manual calculation beforehand)

I am a bit confused by this - can you explain this a bit more? Has this changed so that max_jobs is the number of files rather than the number of samples? If so I think I would prefer to keep it as the number of samples ran in a single job.

  • pheval-exomiser post_process_results Added --from-batch-file optional option

    • --from-batch-file option allows user to pass in file/path/to/exomiser_batch_commands.txt file
    • This allows the user to only post process results from samples found in the input batch_commands.txt argument (So if multiple jobs are running they won't try and process incomplete or same results files)

What is the use case of this (just so I can understand a bit better)? What we have set out in PhEval is that a each job will have it's own output directory so this type of thing shouldn't happen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants