Fixes and clarifications to doc homepage
kdm9 committed Aug 19, 2015
1 parent 8a8621a commit 3280e3a
Showing 1 changed file with 9 additions and 10 deletions.
19 changes: 9 additions & 10 deletions doc/kwip.rst
@@ -84,9 +84,9 @@ of ``$HOME`` with your preferred installation prefix.


The ``kwip`` executable is the core of ``kWIP``; its help statement is
-reproduced above. This program operates on the saved Count-min Sketches of
-``khmer`` (which they call Countgraphs). One can run with or without the
-population entropy weighting, using the ``-U`` parameter to disable weighting.
+reproduced above. This program operates on the saved Countgraphs of ``khmer``.
+One can run with or without the entropy weighting, using the ``-U`` parameter
+to disable weighting.

An example command could be:

@@ -111,10 +111,9 @@ The Concepts Behind ``kWIP``
The inner product between two vectors is directly related to the distance
between the vectors in Euclidean space. This has been utilised several times in
Bioinformatics to implement measures of genetic similarity between two
-sequences, including the :math:`D2` statistic, and the spectrum kernel for
-protein sequence compassion. Traditionally, the software which implement these
-and similar algorithms operate on known genetic sequences, e.g. those taken
-from a reference genome. ``kWIP``'s innovation is to weight the inner product
-operation by a weight vector, and to derive weights in a way which minimises
-the noise generated by next-gen sequencing protocols while maximising the
-signal of genetic distance between samples.
+sequences, including the :math:`D2` statistic. Traditionally, the software
+which implement these and similar algorithms operate on known genetic
+sequences, e.g. those taken from a reference genome. ``kWIP``'s innovation is
+to weight the inner product operation by a weight vector, and to derive weights
+in a way which minimises the noise inherent in next-gen sequencing datasets
+while maximising the signal of genetic distance between samples.
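
The idea in the paragraph above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ``kWIP``'s actual implementation: the function names are hypothetical, samples are plain lists of k-mer counts rather than ``khmer`` Countgraphs, and the weight is assumed to be the Shannon entropy of each k-mer's presence frequency across the population. The distance uses the standard identity relating an inner product to Euclidean distance, ``d(a, b)^2 = <a, a> + <b, b> - 2<a, b>``.

```python
import math


def weighted_inner_product(a, b, w):
    """Inner product of count vectors a and b, weighted element-wise by w."""
    return sum(wi * ai * bi for ai, bi, wi in zip(a, b, w))


def kernel_distance(a, b, w):
    """Distance induced by the weighted inner product.

    Uses d(a, b)^2 = <a, a> + <b, b> - 2<a, b>; with unit weights this is
    the ordinary Euclidean distance.
    """
    squared = (weighted_inner_product(a, a, w)
               + weighted_inner_product(b, b, w)
               - 2 * weighted_inner_product(a, b, w))
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))


def presence_entropy_weights(samples):
    """ASSUMED weighting scheme, for illustration only.

    For each k-mer (column), take the fraction p of samples in which it is
    present and use the binary Shannon entropy of p as its weight. K-mers
    present in every sample, or in none, receive a weight of zero.
    """
    n = len(samples)
    weights = []
    for counts in zip(*samples):
        p = sum(1 for c in counts if c > 0) / n
        if p == 0.0 or p == 1.0:
            weights.append(0.0)
        else:
            weights.append(-(p * math.log2(p) + (1 - p) * math.log2(1 - p)))
    return weights


if __name__ == "__main__":
    # Four toy samples, three k-mers each (made-up counts).
    samples = [[1, 0, 2], [1, 3, 0], [1, 0, 0], [1, 1, 0]]
    w = presence_entropy_weights(samples)
    print("weights:", w)
    print("weighted distance:", kernel_distance(samples[0], samples[1], w))
    print("unweighted distance:", kernel_distance(samples[0], samples[1], [1, 1, 1]))
```

Passing a vector of ones as the weights corresponds, in spirit, to disabling the weighting as the ``-U`` parameter does; how ``kWIP`` normalises counts and derives its weights in practice should be checked against its source.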
