You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When we create cohorts, variants in phenopackets are annotated and run through the _vep.py code.
One difficulty is that if there is an error, the ingest is killed, but the error message is not enough to know which phenopacket the error is coming from, e.g.
Expected a result but got an Error for variant: VariantCoordinates(chrom=9, start=127665287, end=127665288, ref=A, alt=G, change_length=0)
(note -- you will only see this error message after merge the PR I just made).
This is very difficult because (in this case) the original variant coordinates are hg19 and the above coordinates are hg38, and it took me about 15 minutes to track this down.
It would be better to catch this Exception and then rethrow it with the patient or phenopacket ID. I am not sure of the best place to add this?
The text was updated successfully, but these errors were encountered:
Hi @pnrobinson I think genome build mismatch is a situation that the code should handle. We have enough context to check if the builds match, see #84 for verbose info. So, I or Lauren will eventually fix this.
The only outstanding question is if we want to fail the entire ingest due to a bug like this. Currently, the framework must provide VariantCoordinates for a GenomicInterpretation, so, the only possible way to fail is to raise and explode at the parsing site. The upstream code that handles the phenopacket can either catch this exception or explode too.
Should we halt the entire ingest and point the user to the offending phenopacket or log the warning about invalid variant, swallow the exception, and ignore the variant? I think we should force the user to provide good input for now and defer the ergonomics to a later stage. What do you think?
Currently, the error message does not tell you the name of the erroneous phenopacket. It would make it much easier to include the name of the phenopacket in the error message for every time that we terminate the program like this -- otherwise it will be nearly impossible for others to use this package!
I can work on error messages today, there are a bunch that could be much more verbose, others that don't need to be printed where they are, etc.
I'll try to hit as many of them as I can and get a PR in soon!
When we create cohorts, variants in phenopackets are annotated and run through the _vep.py code.
One difficulty is that if there is an error, the ingest is killed, but the error message is not enough to know which phenopacket the error is coming from, e.g.
(note -- you will only see this error message after merge the PR I just made).
This is very difficult because (in this case) the original variant coordinates are hg19 and the above coordinates are hg38, and it took me about 15 minutes to track this down.
It would be better to catch this Exception and then rethrow it with the patient or phenopacket ID. I am not sure of the best place to add this?
The text was updated successfully, but these errors were encountered: