Alessio Milanese edited this page Apr 3, 2020 · 11 revisions

1. Why do only a few reads map in my profiles?

If you profile your samples and get zero mapped reads, or very few reads when using the -c option, one possibility is that all the reads are being filtered out. By default, the mOTUs profiler filters out every read that maps with fewer than 75 nucleotides (-l 75). For example, in older samples the fastq reads often have an average length of 50, and in this case they would all be filtered out. Try -l 45 to keep more reads during the filtering process.
Note that the average read length of the reads in your sample is printed by the tool:

[main] Minimum alignment length: 75 (average read length: 50)

You can also add -g 1 to keep more reads (see Increase precision or recall page for more information).
Another possibility is that you are profiling samples from a biome that is covered neither by reference genomes nor by mOTUs. The biomes that we can currently profile with meta-mOTUs (unknown species) are oceans, human gut, human oral cavity, human vagina and human skin. If you have soil samples, mOTUs will be able to profile only the species with reference genomes, which cover only a small portion of all the species present.
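As a rough illustration of the read-length check described above, the following Python sketch parses the log line and reports whether the average read length falls below the minimum alignment length. The function names are hypothetical and not part of mOTUs, and the pattern assumes the log line has exactly the format shown above.

```python
import re

# Assumed format, copied from the log line shown above:
# "[main] Minimum alignment length: 75 (average read length: 50)"
LOG_PATTERN = re.compile(
    r"Minimum alignment length:\s*(\d+)\s*"
    r"\(average read length:\s*(\d+(?:\.\d+)?)\)"
)


def parse_length_line(line):
    """Return (min_alignment_length, average_read_length), or None if no match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


def reads_likely_filtered(line):
    """Return True when the average read length is below the -l cutoff."""
    parsed = parse_length_line(line)
    if parsed is None:
        return False
    min_len, avg_len = parsed
    return avg_len < min_len


log_line = "[main] Minimum alignment length: 75 (average read length: 50)"
if reads_likely_filtered(log_line):
    print("Average read length is below the cutoff; consider lowering -l.")
```

If the check reports that reads are likely filtered, lowering -l (for example to 45, as suggested above) is the first thing to try.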

2. What is the meaning of the -1 fraction?

The -1 at the end of the profile file represents the fraction of unmapped reads. These reads come from species that we know to be present in the sample but that we are not able to quantify. For almost all analyses it is better to remove this value, since it does not represent a single species/clade. The -1 becomes useful when we need to calculate relative abundances. See the following example:

True rel. ab.         mOTUs read counts     mOTUs rel. ab.
species1   20%        species1   200        species1   20%
species2   10%        species3   300        species3   30%
species3   30%        species4   100        species4   10%
species4   10%        -1         400        -1         40%
species5   30%

In the example, the sample (True rel. ab.) contains 5 species, of which only 3 are represented in the mOTUs profiler. Despite this, the relative abundance of these 3 species is correct, since we are able to measure the -1 (the unmapped reads). If you calculated the relative abundance without taking the -1 into account, you would overestimate the profiled species:

True rel. ab.         mOTUs read counts     mOTUs rel. ab.
species1   20%        species1   200        species1   33.3%
species2   10%        species3   300        species3   50.0%
species3   30%        species4   100        species4   16.7%
species4   10%
species5   30%
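
The arithmetic in the two tables can be reproduced with a short Python sketch. The helper `relative_abundance` and its `keep_unassigned` parameter are hypothetical names used for illustration only; the read counts are those from the example.

```python
def relative_abundance(counts, keep_unassigned=True):
    """Convert read counts to percentages.

    counts maps a species name (or "-1" for unmapped reads) to a read count.
    With keep_unassigned=False the "-1" entry is dropped before normalizing.
    """
    if not keep_unassigned:
        counts = {name: n for name, n in counts.items() if name != "-1"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {name: 100.0 * n / total for name, n in counts.items()}


# Read counts from the example above.
counts = {"species1": 200, "species3": 300, "species4": 100, "-1": 400}

with_unassigned = relative_abundance(counts)
without_unassigned = relative_abundance(counts, keep_unassigned=False)

for name in ("species1", "species3", "species4"):
    print(f"{name}: {with_unassigned[name]:.1f}% with -1, "
          f"{without_unassigned[name]:.1f}% without -1")
```

Normalizing over all 1000 reads gives 20%, 30% and 10%, matching the true values, while normalizing over only the 600 assigned reads inflates each species by the same factor (1000/600).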