Current issues with lovd_fixHGVS() #580

Open
15 of 20 tasks
ifokkema opened this issue Jan 10, 2022 · 1 comment

Comments

ifokkema commented Jan 10, 2022

There are some issues with the current version of lovd_fixHGVS() as in the improve/getVariantInfo branch. These are mostly variants that are "fixed" in a wrong way, causing data loss. Especially when the result is valid HGVS, this can cause a big problem as the "corrected" variant can be used to overwrite the original variant.

  • Variants that are edited but don't produce correct results
    • Variants that are left as HGVS (the most dangerous category):

      • c.361A>T^362C>G to c.361A>T (the solution for now is to just return variants with a ^ unchanged)
    • Variants that are left as sort-of HGVS:

      • g.(?_102471082?_103268232_?)dup to g.(?)
        It should be clear that lovd_fixHGVS() can not fix this variant. When the center ? is removed, the variant remains untouched.
      • g.117232272AA>G to g.117232272delinsG
        This variant is then reported to need to be a substitution, which lovd_fixHGVS() cannot fix. However, it's simply not translated correctly; it needs to become g.117232272_117232273delinsG, or we should choose not to change it out of fear of ambiguity.
      • g.(135782758_135785957)_(135766735_?)del to g.(135766735_135785957)_(135782758_?)del
        Still not HGVS (positions still not in the right order), but I don't think we should have touched the positions at all.
      • g.(150138199_150142492)_(150145873_15147218)del to g.(15147218_150142492)_(150138199_150145873)del
        Positions are still not in the right order, but the last position misses a "0" in the description. I'm not super comfortable ignoring this issue even though the positions are still not in the right order, so the result is not HGVS.
      • g.(71706000_71841092)_(71841092_71841092)dup to g.(71706000_71841092)_71841092dup
        I'm not entirely sure how to interpret this variant, but I suppose we're almost there, but not quite.
    • Variants that are destroyed due to a bug (code generates a lot of notices):

      • c.-104-90_-104-899insAAAAA => c.
        Note that the actual issue is a typo in the first position.
      • c.(?_-1)_(-1_?)dup => c.
        I believe, technically, this is correct HGVS, so a fix may be required in lovd_getVariantInfo() for this.
        Related is c.(?_-182)_(-182_59)del. The problem appears to be that the inner positions are equal. I suspect a simple < should be changed to <= or so.
      • g.(31196786_3119694)? => g.
        I'm not sure if this is HGVS.
    • Small errors:

      • c.145+1_146-1::NM_032221.4:c.(-24+1_-23-1) to c.145+1_146-1::NM_032221.4:(-24+1_-23-1)
        Second prefix is removed.
      • g.(135100001_136800000)_qterdelins[NC_000020.10:g.pter_(5100001_17900000)inv] to g.(135100001_136800000)_qterdelins[NC_000020.10:pter_(5100001_17900000)inv]
        Second prefix is removed.
      • c.1537-?_2784+?delins(1757-?)_(2611) to c.1537-?_2784+?delins(1757-?)_N[2611]
        Last position is misinterpreted as an insertion of 2611 bases.
      • C.3242C>A to g.C.3242C>A
      • del ex4 to g.delex4
      • GRCh37 chr17:g.47332514_49565743del to g.GRCh37chr17:47332514_49565743del
      • :g.220290456C>T to g.:220290456C>T
    • Stuff we could learn to fix but is currently left unchanged:

      • g.6128749_6128787delins[NC_000022.11:17178886_17178924] (misses prefix in inserted sequence)
      • g.(7045869_7045967)[ins15] (rewrite into insN[15])
      • g.8701170T>. (rewrite to del)
    • Other:

      • Now that we have lovd_getVariantLength(), we could use it to check the suffixes, rather than just deleting the suffix for variants like c.1delAA, where the length obviously doesn't match the positions.
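
To make a few of the cases above concrete, here is a minimal Python sketch. It is not the actual lovd_fixHGVS() code; the function names conservative_fix() and suffix_matches_positions() and all regular expressions are hypothetical and only cover plain numeric positions. The idea is to rewrite only patterns that are fully understood and to return everything else untouched. For the AA>G case it shows the delins option, not the "leave it alone" option.

```python
import re

# Hypothetical patterns for illustration only; plain numeric positions,
# no intronic offsets, no uncertain ranges except in the [insN] case.
_MULTI_BASE_SUB = re.compile(r"^([cgmn])\.(\d+)([ACGT]{2,})>([ACGT]+)$")
_SUB_TO_NOTHING = re.compile(r"^([cgmn])\.(\d+)[ACGT]>\.$")
_BRACKETED_INS = re.compile(r"^([cgmn]\.\(\d+_\d+\))\[ins(\d+)\]$")
_DEL_WITH_SUFFIX = re.compile(r"^([cgmn])\.(\d+)(?:_(\d+))?del([ACGT]+)$")


def conservative_fix(variant: str) -> str:
    """Return a rewritten description, or the input unchanged when unsure."""
    # Descriptions with "^" are returned as-is, so that no alternative
    # is silently dropped (c.361A>T^362C>G must not become c.361A>T).
    if "^" in variant:
        return variant

    # g.117232272AA>G: the reference covers len("AA") bases, so the
    # range has to end at start + len(ref) - 1.
    m = _MULTI_BASE_SUB.match(variant)
    if m:
        prefix, start, ref, alt = m.group(1), int(m.group(2)), m.group(3), m.group(4)
        end = start + len(ref) - 1
        return f"{prefix}.{start}_{end}delins{alt}"

    # g.8701170T>. : a substitution to "nothing" is rewritten to a deletion.
    m = _SUB_TO_NOTHING.match(variant)
    if m:
        return f"{m.group(1)}.{m.group(2)}del"

    # g.(7045869_7045967)[ins15] is rewritten to ...insN[15].
    m = _BRACKETED_INS.match(variant)
    if m:
        return f"{m.group(1)}insN[{m.group(2)}]"

    # Anything not recognised is left alone.
    return variant


def suffix_matches_positions(variant: str):
    """Compare a deletion suffix with the positions.

    Returns True or False, or None when the description is not recognised.
    """
    m = _DEL_WITH_SUFFIX.match(variant)
    if not m:
        return None
    start = int(m.group(2))
    end = int(m.group(3)) if m.group(3) else start
    return end - start + 1 == len(m.group(4))
```

With this sketch, conservative_fix("g.117232272AA>G") gives g.117232272_117232273delinsG, and suffix_matches_positions("c.1delAA") returns False, which could be used to flag the variant instead of quietly dropping the suffix.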

ifokkema commented Feb 9, 2022

Most things are handled now. Leaving this open to remind us that some variants are still not handled properly, but continuing with other work now.
