-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Phylogenetic updates to match pathogen-repo-guide #238
Conversation
Part of work to update this repo to match the pathogen-repo-guide. Simplify directory structure by putting the core Snakemake rules in a "rules" directory.
Part of work to update this repo to match the pathogen-repo-guide. Renamed in preparation for adding the "build-configs" directory in following commits to hold additional build configurations and customizations.
Part of work to update this repo to match the pathogen-repo-guide. Moves the exisitng CI build config and customizations to the new "build-configs" directory. Includes renaming of the `builds.yaml` to `config.yaml` since the file no longer clashes with Snakemake Profiles.
Part of work to update this repo to match the pathogen-repo-guide. Since the chores rules are internal Nextstrain rules, they do not need to be part of the core workflow. This also resolves the DAG confusion that Snakemake occassionally runs into when running CI locally.¹ README.md includes the new instructions on how to invoke the workflow to update the example data. This requires two config files: 1. The CI config to provide all required config params and to ensure the example data uses correct `strain_id_field` for CI builds. 2. The chores config to include the custom rules ¹ #237
Part of work to update this repo to match the pathogen-repo-guide.
Part of work to update this repo to match the pathogen-repo-guide. LAPIS was the default data source until we switched to using data.nextstrain.org.¹ This commit moves the LAPIS related rules to a custom build-config and is able to successfully override the default download/decompress rules. However, there are downstream changes to the workflow that cause errors with the LAPIS data, e.g. hard-coded filters on columns do not exist in the LAPIS data.² Rather than spending the time to make sure the workflow runs with the LAPIS data, I will remove the lapis build-configs in the next commit. I wanted to do so separately to be able to easily revert the removal if needed. ¹ 5d0791a ² https://github.com/nextstrain/mpox/blob/34feb4a4e6f4d8b7929cc701734b15b8a5e6a6fc/phylogenetic/workflow/snakemake_rules/core.smk#L43
Removing build-configs/lapis instead of spending the time to make sure the workflow runs with the LAPIS data. If others need/want to use the LAPIS data, this custom build config can be easily restored.
Part of work to update this repo to match the pathogen-repo-guide. Following commit will move other rules for preparing sequences from core.smk to prepare_sequences.smk.
Part of work to update this repo to match the pathogen-repo-guide.
Part of work to update this repo to match the pathogen-repo-guide.
Part of work to update this repo to match the pathogen-repo-guide.
Part of work to update this repo to match the pathogen-repo-guide. The remaining rules in core.smk are all related to the final export of the Auspice JSONs.
Automated builds have been deployed directly to production since 4017f51.
If `auspice_name` param is not provided, then the workflow will use the required `build_name` as the default Auspice filename. Ensures that we are following the Nextstrain Snakemake style guide for configs¹ by not accessing configs with `config.get(key)`. ¹ https://docs.nextstrain.org/en/latest/reference/snakemake-style-guide.html#access-config-values-appropriately
Follow the Nextstrain Snakemake style guide¹ to use literal path strings as inputs for the core workflow rules. This will make it possible for users to override the default workflow with `custom_rules`. ¹ https://docs.nextstrain.org/en/latest/reference/snakemake-style-guide.html#define-input-paths-with-literal-path-strings
38441eb
to
ce5c2d5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The move and rename commits to match pathogen-repo-guide look correct to me, so there are no blocking comments from me.
I can't speak to the lapis changes but it does seem like the staged builds completed.
Thanks for the review @j23414! I'll merge and monitor the automatic builds over week. If there are issues with the removal of LAPIS, we can always revert the change in the future. |
Includes major file restructuring within the phylogenetic workflow to match the pathogen-repo-guide:
I had originally wanted to create a default config YAML as part of this PR. Then remembered this was already done in #173 and changed back to not use a default config in 5019420...
I'll leave this potentially controversial change to another PR. It will most likely involve some changes to the config schema as well as I was finding some potentially confusing uses of the config parameters (e.g.
other_filters
for non_b1 vs b1).