Merged libraries do not show lower-count k-mers #30

Open
bnavarrodominguez opened this issue Dec 18, 2024 · 0 comments

Dear Gene,

I have a large sequencing library that I needed to split into 10 smaller files so I could run FastK on different nodes. Following the instructions in the README, I ran FastK on the split files with the following command:

for file in library_split_*; do mkdir tmp.${file}; FastK -v -t5 -k31 -M50 -T24 -Ptmp.${file} $file; done

This produced a *.hist and a *.ktab file for each library_split_* input. I then examined the k-mer count histogram of each split file, for example:

Histex -G library_split_01.hist > library_split_01.histogram
$ head library_split_01.histogram
1       6062202409
2       3370987439
3       1728287765
4       894614808
5       482057568

I then merged the split tables with Fastmerge and generated a histogram for the merged k-mer database:

Fastmerge -T12 -t -h library_fastmerged library_split_*ktab
Histex -G library_fastmerged.hist > library_fastmerged.histogram
$ head library_fastmerged.histogram

4       2032698049
5       522131235
6       134785342
7       33514609
8       420971175

I noticed that the merged histogram contains no k-mers with a count below 4. I repeated the process several times with different combinations of files, and the merged histograms consistently lack the lowest counts (they start at 4 or 5). I am unsure whether this behavior is expected, since I do not understand why there would be no single-occurrence k-mers. Is this a bug, or am I misunderstanding or misusing the tool?
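
For anyone wanting to reproduce the check, here is a minimal sketch of how the lowest populated count could be read from each histogram file. It assumes the Histex output is plain text with two whitespace-separated columns (count, number of k-mers) sorted by ascending count, as in the listings above. The name `lowest_bin` is a hypothetical helper, not part of FastK.

```shell
# Hypothetical helper (not part of FastK): print the first count whose
# k-mer total is nonzero in a two-column histogram file.
# Assumes rows are sorted by ascending count; blank lines are skipped.
lowest_bin() {
    awk 'NF >= 2 && $2 > 0 { print $1; exit }' "$1"
}

# Report the first populated count for every histogram in the current directory.
for hist in *.histogram; do
    [ -e "$hist" ] || continue   # no matching files: skip the unexpanded glob
    printf '%s\t%s\n' "$hist" "$(lowest_bin "$hist")"
done
```

Running this next to the files above should make it easy to see which histograms start at 1 and which start at 4 or 5.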

Thanks for your assistance!
