ingest: Filter out empty host "Query"
Since we need to join the host taxonomy info with the metadata via
the "Query" key, filter out lines where the "Query" column is empty.

This is also a work-around that resolves
<#17>.
joverlee521 committed Nov 4, 2024
1 parent 01f492a commit 418b686
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions ingest/rules/fetch_from_ncbi.smk
@@ -146,6 +146,7 @@ rule join_metadata_and_hostinfo:
 shell:
 """
 unzip -p {input.ncbi_hosttax_info} ncbi_dataset/data/taxonomy_summary.tsv \
+| tsv-filter -H --not-blank Query \
 | tsv-select -H -f {params.ncbi_hosttax_columns} \
 | tsv-join -H \
 --filter-file - \
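
The effect of the added filter step can be sketched on a small sample. This is an illustration only, not the rule's actual code: the sample rows and the `Tax name` column are hypothetical, and `filter_nonblank_query` is a hypothetical awk stand-in for `tsv-filter -H --not-blank Query`, used so the sketch runs without tsv-utils installed.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; the real taxonomy_summary.tsv has more columns.
# Row 2 has an empty Query and row 3 a whitespace-only Query.
sample_tsv="$(printf 'Query\tTax name\n9606\tHomo sapiens\n\tunknown host\n   \twhitespace only\n9913\tBos taurus\n')"

# Assumed stand-in for `tsv-filter -H --not-blank Query`:
# pass the header through, locate the Query column by name, and keep only
# rows whose Query field contains at least one non-whitespace character.
# (Unlike tsv-filter, this sketch does not fail if the column is missing;
# it simply prints the header alone.)
filter_nonblank_query() {
    awk -F'\t' '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "Query") col = i
            print
            next
        }
        col && $col ~ /[^ \t]/
    '
}

filtered="$(printf '%s\n' "$sample_tsv" | filter_nonblank_query)"
printf '%s\n' "$filtered"
```

Rows with a blank "Query" have no usable join key, so dropping them before `tsv-select` and `tsv-join` keeps them out of the filter file passed to the join.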
