Distance Calculation
BiG-SCAPE's pairwise distance calculation combines three indices:
- the Jaccard Index (`JI`), which measures the percentage of shared domain types: the number of distinct shared domain types divided by the total number of distinct domain types.
- the Adjacency Index (`AI`), which measures the similarity of adjacent domain pairs: the number of distinct shared domain pairs divided by the total number of distinct domain pairs.
- the Domain Sequence Similarity (`DSS`) index, which measures the similarity between aligned domain sequences: a score that considers the sequence similarities for every domain type.
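As a rough illustration of the two set-based indices, the sketch below computes `JI` and `AI` for two ordered lists of domain labels. The function names, the short domain labels, and the choice to treat adjacent pairs as unordered are assumptions made for this example; they are not taken from the BiG-SCAPE code base.

```python
def jaccard_index(domains_a, domains_b):
    """Distinct shared domain types divided by all distinct domain types."""
    set_a, set_b = set(domains_a), set(domains_b)
    union = set_a | set_b
    if not union:
        return 0.0  # no domains at all: treat as no similarity
    return len(set_a & set_b) / len(union)


def adjacent_pairs(domains):
    """Distinct pairs of neighbouring domains (assumed unordered here)."""
    return {frozenset(pair) for pair in zip(domains, domains[1:])}


def adjacency_index(domains_a, domains_b):
    """Distinct shared adjacent pairs divided by all distinct adjacent pairs."""
    pairs_a, pairs_b = adjacent_pairs(domains_a), adjacent_pairs(domains_b)
    union = pairs_a | pairs_b
    if not union:
        return 0.0
    return len(pairs_a & pairs_b) / len(union)


# Hypothetical domain labels, listed in genomic order
bgc_a = ["KS", "AT", "ACP"]
bgc_b = ["KS", "AT", "TE"]

print(jaccard_index(bgc_a, bgc_b))
print(adjacency_index(bgc_a, bgc_b))
```

In this toy case two of the four distinct domain types are shared, so `JI` is 0.5, and one of the three distinct adjacent pairs is shared, so `AI` is 1/3.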
The `DSS` score is further subdivided into two components: one that accounts for anchor domains and one that accounts for non-anchor domains. Anchor domains are well-known core scaffold domains, e.g. those of the `PKS` and `NRPS` classes, and are given a higher weight in the `DSS` calculation.
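One way to picture the higher weight given to anchor domains is a weighted average in which every anchor domain type counts `anchor_boost` times. The sketch below is only an illustration under that assumption; `weighted_dss`, its parameters, and the per-domain similarity values are hypothetical, and the exact formula used by BiG-SCAPE may differ.

```python
def weighted_dss(anchor_scores, other_scores, anchor_boost=2.0):
    """Average per-domain-type similarities, counting each anchor
    domain type anchor_boost times (illustrative assumption)."""
    total_weight = anchor_boost * len(anchor_scores) + len(other_scores)
    if total_weight == 0:
        return 0.0  # nothing to compare
    weighted_sum = anchor_boost * sum(anchor_scores) + sum(other_scores)
    return weighted_sum / total_weight


# Hypothetical sequence similarities (0..1) per domain type
anchor = [0.9]      # one anchor domain type
other = [0.5, 0.7]  # two non-anchor domain types

print(weighted_dss(anchor, other, anchor_boost=1.0))  # plain average
print(weighted_dss(anchor, other, anchor_boost=2.0))  # anchor counted twice
```

With a boost of 1.0 the result is the plain mean of the three scores (about 0.7). Raising the boost to 2.0 pulls the result toward the anchor similarity (about 0.75).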
BiG-SCAPE 2's default anchor domains list resides in the `config.yml` file and can be modified by the user.
The contribution, i.e. weight, of each of the three scores (`JI`, `AI`, `DSS`) to the final BiG-SCAPE distance was tuned in BiG-SCAPE 1 for each of the BGC class groups defined in v1. To use these tuned weights, pass `--legacy-weights`. Otherwise, BiG-SCAPE 2 defaults to the distance metric based on the `mix` weight distribution. `--legacy-weights` can be combined with `--classify legacy` to fully reproduce BiG-SCAPE 1 behavior. To combine it with the newer `--classify [mode]` options, ensure that the input `.gbks` have been processed with antiSMASH v6 or above.
    # weights are in the order JI, AI, DSS, anchor boost
    LEGACY_WEIGHTS = {
        "PKSI": {"weights": (0.22, 0.02, 0.76, 1.0)},
        "PKSother": {"weights": (0.0, 0.68, 0.32, 4.0)},
        "NRPS": {"weights": (0.0, 0.0, 1.0, 4.0)},
        "RiPP": {"weights": (0.28, 0.01, 0.71, 1.0)},
        "saccharide": {"weights": (0.0, 1.0, 0.0, 1.0)},
        "terpene": {"weights": (0.2, 0.05, 0.75, 2.0)},
        "PKS-NRP_Hybrids": {"weights": (0.0, 0.22, 0.78, 1.0)},
        "other": {"weights": (0.01, 0.02, 0.97, 4.0)},
        "mix": {"weights": (0.2, 0.05, 0.75, 2.0)},
    }
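The first three weights of every class sum to 1, which suggests that they blend the three indices into a single similarity. The sketch below assumes the final distance is one minus that weighted sum, and that the fourth value (the anchor boost) is applied inside the `DSS` calculation rather than here. `combined_distance` and the input scores are hypothetical, and only a subset of the table is copied to keep the example short.

```python
# Subset of the weight table above: (JI, AI, DSS, anchor boost)
LEGACY_WEIGHTS = {
    "PKSI": {"weights": (0.22, 0.02, 0.76, 1.0)},
    "NRPS": {"weights": (0.0, 0.0, 1.0, 4.0)},
    "mix": {"weights": (0.2, 0.05, 0.75, 2.0)},
}


def combined_distance(ji, ai, dss, bgc_class="mix"):
    """Blend the three similarity indices and convert to a distance.

    Assumption: distance = 1 - (w_JI*JI + w_AI*AI + w_DSS*DSS).
    The anchor boost is unused here because it is assumed to act
    within the DSS score itself.
    """
    w_ji, w_ai, w_dss, _anchor_boost = LEGACY_WEIGHTS[bgc_class]["weights"]
    similarity = w_ji * ji + w_ai * ai + w_dss * dss
    return 1.0 - similarity


# Hypothetical index values for one pair of BGCs
print(combined_distance(0.5, 0.4, 0.8))                    # default "mix" weights
print(combined_distance(0.5, 0.4, 0.8, bgc_class="NRPS"))  # DSS-only weighting
```

Under these assumptions the same pair scores a distance of roughly 0.28 with the `mix` weights and roughly 0.2 with the `NRPS` weights, where only `DSS` contributes.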