k-seek

Version 4. de novo identification and quantification of satellite DNA from illumina whole genome sequence data.

k.seek.pl identifies and quantifies satellite DNAs (tandemly repeating simple sequences) where the base motif is <= 20bp and generates two outputs .rep and .total. The former contains the the fastq reads from which the repeat was called with an additional 5th line documenting the the called repeat and the number of occurences. The latter contains the summation for all the called repeats.

Usage: perl k.seek.pl input.fastq output_basename

k.compiler.pl will concatenate multiple .total files in a folder into a table where each row is a sample and each column a repeat. Different phases and reverse compliments of the same repeat (e.g AAC, ACA, CAA, TTG, etc) will be summed. The .toal files need to be put into a folder without other files. The output will be .rep.compiled

Usage: perl k.compile.pl folder/ output_basename

For reference see: https://www.ncbi.nlm.nih.gov/pubmed/25512552

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

k-seek

Files

README.md

Latest commit

History

README.md

File metadata and controls

k-seek