TPM from the count matrix ? #43

Open
ptranvan opened this issue Apr 26, 2024 · 3 comments

@ptranvan

Hi,

I just got the count matrix from TElocal.

I would like to compute TPM for within-sample comparison, but the matrix is a mix of genes (with Ensembl IDs, from the annotation provided with --GTF) and transcripts from the TE GTF.

Any advice on how to do this?

Thanks

@olivertam
Member

Hi,

I assume that for each gene you would use the total length of all its non-overlapping exonic regions as the gene length, whereas for each TE you would use the length of the TE copy/instance.
Let me know if that is unclear.

Thanks.
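The length convention described above can be sketched in a few lines. This is only an illustration, not part of TElocal: the helper names `merged_length` and `tpm`, and the toy counts and coordinates, are hypothetical. It assumes 1-based inclusive coordinates (as in GTF) and uses the usual TPM definition (reads per kilobase, rescaled so that the values sum to one million).

```python
def merged_length(intervals):
    """Bases covered by 1-based inclusive (start, end) intervals, overlaps counted once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def tpm(counts, lengths):
    """counts: {feature: read count}; lengths: {feature: length in bp}."""
    # Reads per kilobase for every feature (genes and TE copies alike).
    rpk = {f: counts[f] / (lengths[f] / 1000) for f in counts}
    scale = sum(rpk.values()) / 1e6
    if scale == 0:
        return {f: 0.0 for f in counts}
    return {f: value / scale for f, value in rpk.items()}


# Toy data (invented): one gene with overlapping exons, one TE copy.
gene_exons = [(100, 200), (150, 300), (400, 450)]
lengths = {
    "geneA": merged_length(gene_exons),      # union of exons
    "TE_copy1": merged_length([(1000, 1499)]),  # single interval
}
counts = {"geneA": 252, "TE_copy1": 500}
values = tpm(counts, lengths)
```

Genes and TE copies go through the same normalization here, so the resulting values are comparable within a sample; only the way the length is obtained differs.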

@ptranvan
Author

ptranvan commented Apr 27, 2024

Thanks, I will look into this.

It's somewhat surprising that in the GTF annotation, there are only transcripts with one exon (so I can compute the size easily).

Are there no transposable elements/transcripts with multiple exons?
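Since each entry is a single interval, its length can be read straight from the start and end columns. A minimal sketch, assuming a standard 9-column, tab-separated GTF with 1-based inclusive coordinates and a `transcript_id` attribute identifying each TE copy. The function name `te_lengths` and the sample lines are invented for illustration, and matching these IDs to the row names of the count matrix is left out.

```python
import re


def te_lengths(gtf_lines):
    """Map each transcript_id to the summed length of its GTF entries."""
    lengths = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        start, end = int(fields[3]), int(fields[4])
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            tid = match.group(1)
            # With one entry per copy this is simply end - start + 1.
            lengths[tid] = lengths.get(tid, 0) + (end - start + 1)
    return lengths


# Invented example records, not taken from a real annotation.
example = [
    'chr1\trmsk\texon\t1001\t1500\t.\t+\t.\tgene_id "L1HS"; transcript_id "L1HS_dup1";',
    'chr2\trmsk\texon\t201\t500\t.\t-\t.\tgene_id "AluY"; transcript_id "AluY_dup3";',
]
te_len = te_lengths(example)
```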

@olivertam
Member

Hi,

Since we're using RepeatMasker as our annotation, we typically don't have splicing information for TEs. While some studies have shown spliced TE transcripts (e.g. for HERVK in humans), we don't feel that there is a comprehensive source for this information, and thus have not included it in our TE GTF.

Thanks.
