mapped reads count in reads_report #267

Open
QianGuoPKU opened this issue May 21, 2024 · 2 comments
Comments

@QianGuoPKU

Hi,

I am reaching out for your help with the interpretation of the reads_report.

In my results, the raw FASTQ data contained 26,362,661 reads in total. However, "#Assembly mapped" and "#Reference mapped" were reported as 42,433,073 and 31,999,068, respectively, and both exceed the total number of reads (26,362,661). How should "#Assembly mapped" and "#Reference mapped" be interpreted? Do they count mapped reads, alignment hits, or something else?
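To make the "mapped reads versus alignment hits" distinction concrete, here is a minimal sketch in plain Python that tallies SAM records by their FLAG bits. The bit values (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM specification; the function name `count_sam_records` and the toy records are hypothetical, and nothing here is taken from QUAST's own code. Whether QUAST's report counts records or reads is exactly the open question, so this only shows how the two numbers can differ.

```python
from collections import Counter

# FLAG bits defined by the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_sam_records(lines):
    """Tally SAM records into primary, extra (secondary/supplementary), unmapped."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])
        counts["records"] += 1
        if flag & UNMAPPED:
            counts["unmapped"] += 1
        elif flag & (SECONDARY | SUPPLEMENTARY):
            counts["extra_alignments"] += 1
        else:
            counts["primary_mapped"] += 1
    return counts


# Toy input (hypothetical): three reads, one split and one multi-mapped.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "r1\t2048\tchr1\t900\t60\t20M30H\t*\t0\t0\t*\t*",  # supplementary
    "r2\t16\tchr2\t5\t0\t50M\t*\t0\t0\t*\t*",
    "r2\t272\tchr3\t40\t0\t50M\t*\t0\t0\t*\t*",  # secondary, reverse strand
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",  # unmapped
]

counts = count_sam_records(example)
mapped_records = counts["primary_mapped"] + counts["extra_alignments"]
print(counts["primary_mapped"], "mapped reads;", mapped_records, "mapped records")
```

In this toy case two reads map, but four mapped records exist, so a tool that counts records rather than reads would report more "mapped" entries than there are reads.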

I would also like to ask for your help interpreting the transposed_report. Which columns convey information on misassemblies caused by transposable elements (TEs)? In my file, I couldn't find any information on TE-related misassemblies or mismatches.

Please find the reads_report and transposed_report enclosed.

Any information and insights into the above questions would be much appreciated!

Best,
Zoe

image

reads_report.txt
transposed_report.txt

@balags1

balags1 commented Nov 22, 2024

We face a similar issue when comparing read counts for assemblies built from public data. The number of reads counted directly from the file does not match what QUAST reports as "# total reads". For Illumina data, we can work out that the paired nature of the input data isn't taken into account, but that still doesn't explain why read counts for the same raw data vary so much.
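For reference, this is roughly how a "countable reads directly from the file" figure can be obtained, and how the paired-end factor enters. It is a minimal sketch that assumes plain four-line FASTQ records (no wrapped sequence lines); `count_fastq_records` and the in-memory toy files are hypothetical stand-ins for real R1/R2 files.

```python
import io


def count_fastq_records(handle):
    """Count records in a FASTQ stream, assuming exactly 4 lines per record."""
    n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4; file may be truncated")
    return n_lines // 4


# Toy paired-end input (hypothetical): two pairs, one file per mate.
r1 = io.StringIO("@a/1\nACGT\n+\nIIII\n@b/1\nTTGA\n+\nIIII\n")
r2 = io.StringIO("@a/2\nCCGT\n+\nIIII\n@b/2\nGGGA\n+\nIIII\n")

n_r1 = count_fastq_records(r1)
n_r2 = count_fastq_records(r2)
pairs = n_r1                 # what a per-spot or per-pair count reports
total_reads = n_r1 + n_r2    # what a per-read count over both files reports
print(pairs, "pairs;", total_reads, "individual reads")
```

A per-pair count and a per-read count differ by a factor of two for paired data, so it helps to state which one each tool reports before comparing numbers.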

@balags1

balags1 commented Nov 22, 2024

Here's a comparison from 12 samples we analyzed.

| NCBI SRA output ("reads written") | QUAST # reads |
| --- | --- |
| 5,101,034 | 5,106,239 |
| 6,054,396 | 6,057,690 |
| 3,578,388 | 3,579,502 |
| 3,004,772 | 3,011,147 |
| 1,133,924 | 1,173,786 |
| 23,372,520 | 23,391,254 |
| 1,706,432 | 1,711,560 |
| 1,199,028 | 1,283,124 |
| 1,824,162 | 2,068,696 |
| 9,160,700 | 9,177,607 |
| 1,937,778 | 2,446,295 |
| 1,033,722 | 1,034,632 |
| 1,587,460 | 1,590,464 |
| 1,191,014 | 1,266,523 |
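To quantify the gap, the short sketch below computes the absolute and relative excess of the QUAST count over the SRA count for each row listed above. The numbers are copied from the comparison; the variable names are illustrative only. One possible reading, not confirmed anywhere in this thread, is that the QUAST figure includes extra alignment records (secondary or supplementary) on top of the reads themselves.

```python
# (SRA "reads written", QUAST "# reads") pairs copied from the comparison above.
rows = [
    (5101034, 5106239),
    (6054396, 6057690),
    (3578388, 3579502),
    (3004772, 3011147),
    (1133924, 1173786),
    (23372520, 23391254),
    (1706432, 1711560),
    (1199028, 1283124),
    (1824162, 2068696),
    (9160700, 9177607),
    (1937778, 2446295),
    (1033722, 1034632),
    (1587460, 1590464),
    (1191014, 1266523),
]

# Absolute excess and excess as a percentage of the SRA count.
excess = [(sra, quast, quast - sra, 100.0 * (quast - sra) / sra) for sra, quast in rows]

for sra, quast, diff, pct in excess:
    print(f"{sra:>12,} {quast:>12,} {diff:>10,} {pct:6.2f}%")

always_higher = all(diff > 0 for _, _, diff, _ in excess)
largest = max(excess, key=lambda row: row[2])
print("QUAST count higher in every row:", always_higher)
```

The QUAST count is higher in every listed row, and the size of the excess varies widely between samples, which a fixed factor such as single versus paired counting would not explain on its own.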
