fix: make legacy mapping use {clade} (Omicron) by default
Resolves #471
I forgot to update the clade-legacy-mapping.yml with the last clade update (24C). As a result, the Nextstrain_clade output for 24C was wrong (`?` instead of `24C (Omicron)`).

To reduce maintenance effort, we now use `{clade} (Omicron)` as the default mapping. This means one less file to update when clades are updated. If there's ever a non-Omicron clade again, we'd have to update `clade-legacy-mapping.yml`, but WHO has stopped giving variant names and there's nothing left that isn't Omicron. So this should work into the indefinite future (until Emma stops using `Nextstrain_clade`; she's possibly the only user of that backwards-compat column).
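As a rough illustration of the default described above, here is a minimal sketch of the lookup-with-fallback. The names `explicit_mapping` and `legacy_name` are hypothetical, and the dictionary holds only a few entries visible in this commit's diff rather than the full YAML file.

```python
# Hypothetical excerpt of the explicit mapping (not the full clade-legacy-mapping.yml).
explicit_mapping: dict[str, str] = {
    "21G": "21G (Lambda)",
    "21H": "21H (Mu)",
    "recombinant": "recombinant",
}


def legacy_name(clade: str) -> str:
    """Return the explicit legacy name if listed, else '<clade> (Omicron)'."""
    return explicit_mapping.get(clade, f"{clade} (Omicron)")


print(legacy_name("21H"))          # listed explicitly
print(legacy_name("24C"))          # not listed, falls back to the default
print(legacy_name("recombinant"))  # listed explicitly, no suffix added
```

One consequence of this sketch worth keeping in mind: any value that is not listed, including an empty string, also receives the ` (Omicron)` suffix.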
corneliusroemer committed Aug 22, 2024
1 parent 24e7907 commit fe20141
Showing 2 changed files with 5 additions and 22 deletions.
7 changes: 5 additions & 2 deletions bin/join-metadata-and-clades
@@ -93,7 +93,6 @@ def main():
metadata = pd.read_csv(args.metadata, index_col=METADATA_JOIN_COLUMN_NAME,
sep='\t', low_memory=False, na_filter = False)

# Read and rename clade column to be more descriptive
clades = pd.read_csv(args.nextclade_tsv, index_col=NEXTCLADE_JOIN_COLUMN_NAME,
usecols=[NEXTCLADE_JOIN_COLUMN_NAME, *(set(column_map.keys()) - clades_21L_columns)],
sep='\t', low_memory=True, dtype="object", na_filter = False) \
@@ -113,7 +112,11 @@ def main():
     # Add clade_legacy column as Nextstrain_clade
     # Use yml mapping
     with open(args.clade_legacy_mapping, 'r') as legacy_mapping_file:
-        clade_legacy_mapping = yaml.safe_load(legacy_mapping_file)
+        clade_legacy_mapping_dict: dict[str, str] = yaml.safe_load(legacy_mapping_file)
+
+    def clade_legacy_mapping(clade_nextstrain: str) -> str:
+        return clade_legacy_mapping_dict.get(clade_nextstrain, f"{clade_nextstrain} (Omicron)")
+
     clades["Nextstrain_clade"] = clades["clade_nextstrain"].map(clade_legacy_mapping)

     # Remove immune_escape and ace2_binding when clade <21L and not recombinant
20 changes: 0 additions & 20 deletions defaults/clade-legacy-mapping.yml
@@ -20,24 +20,4 @@
 21F: 21F (Iota)
 21G: 21G (Lambda)
 21H: 21H (Mu)
-21K: 21K (Omicron)
-21L: 21L (Omicron)
-21M: 21M (Omicron)
-22A: 22A (Omicron)
-22B: 22B (Omicron)
-22C: 22C (Omicron)
-22D: 22D (Omicron)
-22E: 22E (Omicron)
-22F: 22F (Omicron)
-23A: 23A (Omicron)
-23B: 23B (Omicron)
-23C: 23C (Omicron)
-23D: 23D (Omicron)
-23E: 23E (Omicron)
-23F: 23F (Omicron)
-23G: 23G (Omicron)
-23H: 23H (Omicron)
-23I: 23I (Omicron)
-24A: 24A (Omicron)
-24B: 24B (Omicron)
 recombinant: recombinant
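
A quick way to convince oneself that dropping these entries does not change output is to check that the `{clade} (Omicron)` default reproduces each removed value. The sketch below does this with the standard library only; `removed_lines`, `removed`, `default_name`, and `mismatches` are hypothetical names, and the entries are copied from the 20 lines removed from `defaults/clade-legacy-mapping.yml` in this commit.

```python
# Entries deleted from defaults/clade-legacy-mapping.yml in this commit.
removed_lines = """\
21K: 21K (Omicron)
21L: 21L (Omicron)
21M: 21M (Omicron)
22A: 22A (Omicron)
22B: 22B (Omicron)
22C: 22C (Omicron)
22D: 22D (Omicron)
22E: 22E (Omicron)
22F: 22F (Omicron)
23A: 23A (Omicron)
23B: 23B (Omicron)
23C: 23C (Omicron)
23D: 23D (Omicron)
23E: 23E (Omicron)
23F: 23F (Omicron)
23G: 23G (Omicron)
23H: 23H (Omicron)
23I: 23I (Omicron)
24A: 24A (Omicron)
24B: 24B (Omicron)
"""

# Parse the simple "key: value" lines into a dict (no YAML library needed here).
removed = dict(line.split(": ", 1) for line in removed_lines.splitlines())


def default_name(clade: str) -> str:
    """The fallback introduced by this commit."""
    return f"{clade} (Omicron)"


# Any entry the default would NOT reproduce would show up here.
mismatches = {k: v for k, v in removed.items() if default_name(k) != v}
print(len(removed), mismatches)  # prints: 20 {}
```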
