
determination of the end of 5'UTR and start of 3'UTR in a specific transcript #2

Open
JiaruiMi01 opened this issue Sep 23, 2024 · 1 comment

Comments

@JiaruiMi01

Dear Daianna,

I have used your pipeline to analyze new data and obtained promising results. Thank you.

During the analysis, I encountered one challenge: determining automatically the exact genomic positions where the 5' UTR ends and the 3' UTR starts. I have been calculating these positions manually using the exon panel on Ensembl. However, this is slow, and I've noticed that some cases involve alternative exon-intron splicing, which complicates the process, so I have to adjust these two positions by hand. I was wondering whether you might have a simpler, automatic method to address this issue.

All the best,
Jiarui

@daianna21
Owner

Hi Jiarui,

Unfortunately, I cannot think of an easier way to be certain about the genomic boundaries. You can decide to trust the variant annotations returned by Ensembl, gnomAD, etc. (e.g. exonic, intronic, 3' UTR variants). But if the exact positions are crucial for your research and you want to be sure, I think you should check them manually.
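If the exon coordinates and the first and last coding positions of a transcript are already at hand (for example, copied from the Ensembl exon panel), the two boundaries can be derived programmatically. The following is only a minimal Python sketch, not part of the pipeline: the function name `utr_boundaries`, its arguments, and the example coordinates are made up for illustration, and it assumes 1-based inclusive genomic coordinates. The results should still be spot-checked against the annotation.

```python
def utr_boundaries(exons, cds_start, cds_end, strand):
    """Return (five_prime_utr_end, three_prime_utr_start) as genomic positions.

    exons     : list of (start, end) tuples, 1-based inclusive (assumed).
    cds_start : lowest genomic coordinate of the coding sequence.
    cds_end   : highest genomic coordinate of the coding sequence.
    strand    : '+' or '-'.
    A value is None when the transcript has no UTR on that side.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    exons = sorted(exons)
    for pos in (cds_start, cds_end):
        if not any(start <= pos <= end for start, end in exons):
            raise ValueError(f"coding position {pos} is not inside any exon")

    # Highest exonic base lying below the CDS. If the CDS begins on the first
    # base of an exon, this falls back to the end of the previous exon.
    below = None
    for start, end in exons:
        if start < cds_start:
            below = min(end, cds_start - 1)

    # Lowest exonic base lying above the CDS. If the CDS stops on the last
    # base of an exon, this falls forward to the start of the next exon.
    above = None
    for start, end in reversed(exons):
        if end > cds_end:
            above = max(start, cds_end + 1)

    # On the minus strand the 5' UTR sits at the higher genomic coordinates.
    return (below, above) if strand == "+" else (above, below)


# Hypothetical three-exon transcript, coordinates invented for illustration.
example_exons = [(100, 200), (300, 400), (500, 600)]
print(utr_boundaries(example_exons, 150, 550, "+"))
print(utr_boundaries(example_exons, 300, 400, "+"))
```

The second call covers the case where the coding sequence starts and stops exactly at exon boundaries, so the UTR edges land on the neighbouring exons rather than on the adjacent bases.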

Regarding alternative splicing, I know that having multiple transcripts for the same gene adds a lot of complexity to the analyses in terms of mapping variants and annotating their functional impact (a variant can be exonic for one transcript and intronic for another). If your aim is not to study variants in specific transcripts but globally at the gene level, my recommendation is to analyze genetic variation only in the canonical transcripts of the genes. That way you avoid ambiguities in variant position and functional labels. But it depends on your aims and the gene(s)/gene family you're studying.
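To make the ambiguity concrete, here is a small Python sketch under stated assumptions: the helper `label_position`, the transcript names `tx_A` and `tx_B`, and all coordinates are hypothetical and only illustrate how the same position can receive different labels depending on the transcript.

```python
def label_position(pos, exons):
    """Label a genomic position relative to one transcript's exons.

    exons: list of (start, end) tuples, 1-based inclusive (assumed).
    Returns 'exonic', 'intronic', or 'outside' (beyond the transcript span).
    """
    exons = sorted(exons)
    if pos < exons[0][0] or pos > exons[-1][1]:
        return "outside"
    if any(start <= pos <= end for start, end in exons):
        return "exonic"
    return "intronic"


# Two hypothetical transcripts of the same gene; tx_B skips the middle exon.
transcripts = {
    "tx_A": [(100, 200), (300, 400), (500, 600)],
    "tx_B": [(100, 200), (500, 600)],
}

labels = {name: label_position(350, exons) for name, exons in transcripts.items()}
print(labels)  # {'tx_A': 'exonic', 'tx_B': 'intronic'}
```

Restricting the analysis to a single transcript per gene, such as the canonical one, means each position gets exactly one label.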

I'd be happy to talk more via Zoom. I hope this helps.

Best,
Daianna
