Mismatch between TElocal counts and BAM file reads #54
Comments
Hi, Thank you for your interest in the software.
Thanks. |
Thanks @olivertam for your prompt answer, here is the file |
Hi, Thank you for that file.
Thanks. |
Hi! Which gene and TE annotation files were you using with this? |
BAM_header.txt The
Then TElocal
Gene annotation is from https://rapid.ensembl.org/Homo_sapiens_GCA_009914755.4/Info/Index and TE annotation: The feature that has 24 counts is |
Thank you for providing the information. We'll take a closer look. Thanks. |
Hi, Sorry for bothering you again, but could you rerun the
Thanks. |
No problem, it looks a bit more logical now indeed. Should I still run with |
Hi, I don't think you need to rerun with Based on the new SAM, it does appear that there are 25 distinct read pairs that contribute to the region (23 pairs that align uniquely, and 2 pairs with 2 alignments). The way that I think that might give rise to the count of 24 for ALR/Alpha_dup1372. The previous Sorry for the inconvenience. Let us know if you have further questions. Thanks. |
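One way the numbers above could add up to 24 is sketched below. It is only an illustration under two assumptions that the thread does not confirm: each pair contributes a weight of 1 divided by its number of alignments, and each of the 2 multi-mapped pairs has exactly one of its alignments inside the region. The function name `initial_fractional_count` is hypothetical and is not part of TElocal.

```python
from fractions import Fraction


def initial_fractional_count(alignments_per_pair):
    """Sum the weight each read pair contributes to one region.

    Assumption (not confirmed by the thread): a pair with n alignments
    contributes 1/n to the region when one of its alignments falls there.
    """
    return sum(Fraction(1, n) for n in alignments_per_pair)


# 23 uniquely aligned pairs (n = 1) and 2 pairs with 2 alignments each (n = 2)
pairs = [1] * 23 + [2] * 2
print(initial_fractional_count(pairs))  # 24
```

Under these assumptions the 25 distinct pairs yield 23 + 2 × 1/2 = 24. The value TElocal reports can still differ from this starting estimate once the EM step redistributes the multi-mapped pairs.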
Dear @olivertam, I come back to you with another similar issue. This is my command:
It seems to have one paired end mapped. However in my TElocal analysis, I don't see any count for that element (L1PA15_dup4168:L1PA15:L1:LINE) |
Hi, This is a case where you would need to see where the other 5 alignments for the read are located (as indicated by the Please let me know if that does not address your question. |
Oh, I see. How do you decide which copy gets the count assignment? |
The counts are initially split between the mapped copies, but then, through the EM loop, the distribution of the counts is recalculated. This is done iteratively until the overall difference between subsequent EM iterations falls below a threshold (i.e. it has converged). To answer your question, the copy that got this particular read assignment is most likely the copy with the highest count among those overlapped by one of the 5 alignments. For more details on the methodology of the EM, section 3.3 of the paper might be helpful. Thanks. |
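The redistribution described above can be illustrated with a small, simplified EM loop. This is a sketch under stated assumptions, not TElocal's actual implementation: it ignores element length and any other normalization the real tool may apply, it assumes an equal initial split, and the function name `em_multimapper_counts`, its parameters, and the convergence threshold are all hypothetical.

```python
def em_multimapper_counts(unique_counts, multi_reads, max_iter=100, tol=1e-6):
    """Toy EM for distributing multi-mapped reads among TE copies.

    unique_counts: dict mapping copy name -> number of uniquely aligned reads
    multi_reads:   list of lists; each inner list names the copies one
                   multi-mapped read aligns to
    """
    copies = set(unique_counts)
    for hits in multi_reads:
        copies.update(hits)

    # Initial estimate: unique reads plus an equal share of each multi-mapper
    counts = {c: float(unique_counts.get(c, 0)) for c in copies}
    for hits in multi_reads:
        for c in hits:
            counts[c] += 1.0 / len(hits)

    for _ in range(max_iter):
        new = {c: float(unique_counts.get(c, 0)) for c in copies}
        for hits in multi_reads:
            total = sum(counts[c] for c in hits)
            for c in hits:
                # Share the read in proportion to the current estimates;
                # fall back to an equal split if they all underflow to zero
                weight = counts[c] / total if total > 0 else 1.0 / len(hits)
                new[c] += weight
        # Overall change between subsequent iterations
        diff = sum(abs(new[c] - counts[c]) for c in copies)
        counts = new
        if diff < tol:
            break
    return counts


# One read aligns to both A and B; A already has 10 unique reads, B has none
result = em_multimapper_counts({"A": 10, "B": 0}, [["A", "B"]])
print(result)
```

In this toy case the shared read drifts almost entirely to copy A, which is why a copy that visibly overlaps an alignment in the BAM file can still end up with a count of zero.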
Hi,
I used TElocal to generate counts, and I’m particularly interested in one specific feature. For this feature, I obtained a count of 24. However, when I went back to the BAM file to extract the reads that map to this feature using `bedtools intersect`, I found only 12 reads (8 unique), which come from 6 paired-end reads (representing 4 unique pairs).
How is it possible for the count to be 24 in this case?
Thanks for your help