Synonym sync: Refactor capitalization columns #725

Open
2 tasks
joeflack4 opened this issue Dec 18, 2024 · 1 comment

joeflack4 commented Dec 18, 2024

Overview

In the first draft of the synonym sync, I introduced several extra columns intended to help with analysis and curation, specifically where capitalization differs between the source and Mondo. They were: synonym_case_diff_mondo and synonym_case_diff_source (see: v8). In #720, I added more columns to handle cases where Mondo or the source has multiple capitalization variations. The columns are now: synonym_case_mondo, synonym_case_diff_mondo, synonym_case_mondo_is_many, synonym_case_source, synonym_case_diff_source, synonym_case_source_is_many (see: v9).

Note: #720 has a UX issue: for rows without multiple capitalization variations (that is, most rows), the values in the new columns are blank. I could fix that, but I propose a better solution below.

Proposed changes

Existing columns (keep or drop):

  • synonym: Keep same. Shows the synonym capitalization in Mondo.
  • synonym_case_mondo: Drop. Duplicate of synonym column.
  • synonym_case_source: Keep.
  • synonym_case_diff_mondo & synonym_case_diff_source: Drop these in favor of the new columns below. Currently, each is empty for rows where there is no capitalization difference; otherwise it shows the synonym, pipe-delimited if there are multiple values.
  • synonym_case_mondo_is_many & synonym_case_source_is_many (bool): Drop these in favor of new columns below.

New columns

  • synonym_case_differs (bool): True if there is a capitalization difference between Mondo and the source.
  • synonym_case_alts_source: Pipe-delimited capitalization variations in the source.
  • synonym_case_alts_mondo: Pipe-delimited capitalization variations in Mondo.
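
As a rough illustration of how the three new columns could be derived for a single synonym, here is a minimal sketch. It is not the actual sync code: the helper name `case_columns` and its inputs are hypothetical. It also assumes that the `alts` columns list every capitalization variant (not only the extra ones), and that `synonym_case_differs` is true whenever the two sets of variants are not identical. Either assumption may need adjusting.

```python
def case_columns(mondo_variants, source_variants):
    """Derive the proposed columns for one synonym (hypothetical helper).

    Both arguments are lists of spellings that differ only in capitalization.
    """
    mondo = sorted(set(mondo_variants))
    source = sorted(set(source_variants))

    # Guard: all variants should be the same synonym apart from case.
    if len({v.casefold() for v in mondo + source}) > 1:
        raise ValueError("variants differ by more than capitalization")

    return {
        # Assumption: any mismatch between the two variant sets counts as a diff.
        "synonym_case_differs": set(mondo) != set(source),
        # Assumption: list all variants, so these columns are never blank.
        "synonym_case_alts_mondo": "|".join(mondo),
        "synonym_case_alts_source": "|".join(source),
    }


row = case_columns(
    ["Marfan syndrome"],
    ["Marfan Syndrome", "marfan syndrome"],
)
print(row)
```

Populating the `alts` columns for every row would also avoid the blank-value issue noted above for #720, at the cost of some redundancy with the synonym column.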

Sub-tasks

Additional info

@joeflack4 joeflack4 self-assigned this Dec 18, 2024
@joeflack4

@twhetzel Note that I would prefer to combine this with #720, but we do not have to. If we can merge #720 in time for the January release, I think we should do that instead.
