added support for gs2

arangrhie · Feb 23, 2021 · d70009b
1 parent b995a6c

Showing 3 changed files with 12 additions and 2 deletions.
diff --git a/README.md b/README.md
@@ -18,13 +18,21 @@ make -j 12
 
 Merfin can be used to assess collapsed or duplicated regions of the assembly (`-hist`, `-dump`) or to evaluate variant calls (`-vmer`). QV estimates for all scaffolds will also be generated with `-hist` and `-dump`.
 
-In all cases a haploid/diploid peak estimate must be provided (`-peak`), either from the kmer histogram, or computed using the script `lookup.R` available under `scripts/lookup` (kcov).
+In all cases, a haploid/diploid peak estimate must be provided (`-peak`), either read from the kmer histogram or computed using [Genomescope 2.0](https://github.com/gf777/genomescope2.0) (kcov).
 
 As a rule of thumb, the `-peak` should be:
 - haploid, if the reference used for read mapping and variant calling contains both the primary and the haplotigs, or both haplotypes of a trio
 - diploid (i.e. twice the haploid peak), for haploid representations of diploid genomes
 
-Optionally, a custom lookup table of kmer copy numbers with associated multiplicities and probabilities can be provided (`-lookup`). The lookup table is also generated using the script `lookup.R` under `scripts/lookup`. This is recommended, and can significantly improve the accuracy of all analyses.
+Optionally, a custom lookup table of kmer copy numbers with fitted multiplicities and probabilities can be provided (`-lookup`). Providing it is recommended, as it can significantly improve the accuracy of all analyses. The lookup table is generated by running our modified version of [Genomescope 2.0](https://github.com/gf777/genomescope2.0):
+
+```
+Rscript genomescope.R <kmer_histogram> <k_size> <output_folder> --fitted_hist [ploidy] [verbose]
+kmer_histogram  tab-delimited, 2-column file (same as for Genomescope2; usually generated by meryl hist or jellyfish)
+k_size          kmer length used for the histogram
+ploidy          haploid/diploid (default = 2)
+--fitted_hist   generates lookup_table.txt
+```
 
 ### Assess collapses/duplications ###
 

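The `-peak` rule of thumb in the README hunk above can be illustrated with a small Python sketch. It is not part of Merfin or Genomescope: the function names (`parse_hist`, `main_peak`, `merfin_peak`), the toy histogram, and the "first valley, then highest bin" heuristic are all assumptions made for illustration. The heuristic only finds the most prominent peak; whether that peak is the haploid or the diploid one still has to be judged from the histogram or from the Genomescope kcov.

```python
# Hypothetical helper names; this is an illustrative sketch, not Merfin code.

def parse_hist(text):
    """Parse a tab-delimited, 2-column histogram: multiplicity, count."""
    hist = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        mult, count = line.split()[:2]
        hist[int(mult)] = int(count)
    return hist


def main_peak(hist):
    """Return the multiplicity of the highest bin after the first valley.

    Assumption: the histogram starts with a decreasing slope of
    low-multiplicity (likely erroneous) kmers, which is skipped.
    """
    mults = sorted(hist)
    i = 0
    # Walk down the initial slope until the counts stop decreasing.
    while i + 1 < len(mults) and hist[mults[i + 1]] < hist[mults[i]]:
        i += 1
    # Highest bin from the valley onwards.
    return max(mults[i:], key=lambda m: hist[m])


def merfin_peak(haploid_peak, haploid_representation):
    """Apply the rule of thumb for -peak.

    haploid_representation=True: haploid representation of a diploid
    genome, so use the diploid peak (twice the haploid peak).
    haploid_representation=False: primary plus haplotigs, or both
    haplotypes of a trio, so use the haploid peak.
    """
    return 2 * haploid_peak if haploid_representation else haploid_peak


# Toy histogram (made-up numbers), in the same 2-column layout.
TOY_HIST = "1\t1000\n2\t400\n3\t100\n4\t50\n5\t80\n6\t150\n7\t300\n8\t500\n9\t300\n10\t100\n"

if __name__ == "__main__":
    peak = main_peak(parse_hist(TOY_HIST))
    # Assuming the detected peak is the haploid one:
    print(peak, merfin_peak(peak, haploid_representation=True))
```

In this toy case the detected peak is assumed to be the haploid one, so the value passed to `-peak` would be doubled for a haploid representation of a diploid genome and left unchanged otherwise.
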
diff --git a/scripts/lookup_table/README.md b/scripts/lookup_table/README.md
@@ -1,5 +1,7 @@
 # Generate lookup table
 
+--This is discontinued; follow the main README instead--
+
 The script `lookup.R` is based on [Genomescope 2.0](http://qb.cshl.edu/genomescope/genomescope2.0/).
 
 In addition to the canonical Genomescope output it generates fitted read multiplicity values and probabilities (`lookup_table.txt`).

diff --git a/qv.sh b/scripts/qv.sh