silk-framework · mikel-egana-aranguren · Jun 2, 2021 · Jun 2, 2021
diff --git a/doc/Plugins.md b/doc/Plugins.md
@@ -1,4 +1,6 @@
-# Data Sources
+# Plugins
+
+## Data Sources
 
 - Sparql Endpoints
 - RDF files
@@ -7,10 +9,11 @@
 - CSV files
 - MySQL database
 
-# Similarity Measures
+## Similarity Measures
 
 The following similarity measures are included:
-## Characterbased
+
+### Character-based
 
 Character-based distance measures compare strings on the character level. They are well suited for
 handling typographical errors.
@@ -24,10 +27,10 @@ handling typographical errors.
 | qGrams([q: Int = '2'], [minChar: Char = '0'], [maxChar: Char = 'z']) | qGrams | String similarity based on q-grams (by default q=2). |
 | substring([granularity: String = '3']) | SubString | Return 0 to 1 for strong similarity to weak similarity |
 
-## Tokenbased
+### Token-based
+
+While character-based distance measures work well for typographical errors, there are a number of tasks where token-based distance measures are better suited:
 
-While character-based distance measures work well for typographical
-errors, there are a number of tasks where token-base distance measures are better suited:
 - Strings where parts are reordered e.g. &ldquo;John Doe&rdquo; and &ldquo;Doe, John&rdquo;
 - Texts consisting of multiple words
 
@@ -38,36 +41,44 @@ errors, there are a number of tasks where token-base distance measures are bette
 | jaccard() | Jaccard | Jaccard similarity coefficient. |
 | softjaccard([maxDistance: Int = '1']) | Soft Jaccard | Soft Jaccard similarity coefficient. Same as Jaccard distance but values within an levenhstein distance of 'maxDistance' are considered equivalent. |
 | tokenwiseDistance([ignoreCase: Boolean = 'true'], [metricName: String = 'levenshtein'], [splitRegex: String = '[\s\d\p{Punct}]+'], [stopwords: String = ''], [stopwordWeight: Double = '0.01'], [nonStopwordWeight: Double = '0.1'], [useIncrementalIdfWeights: Boolean = 'false'], [matchThreshold: Double = '0.0'], [orderingImpact: Double = '0.0'], [adjustByTokenLength: Boolean = 'false']) | Token-wise Distance | Token-wise string distance using the specified metric |
-## Numeric
+
+### Numeric
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | date() | Date | The distance in days between two dates ('YYYY-MM-DD' format). |
 | dateTime() | DateTime | Distance between two date time values (xsd:dateTime format) in seconds. |
 | insideNumericInterval([separator: String]) | Inside numeric interval | Checks if a number is contained inside a numeric interval, such as '1900 - 2000' |
 | num([minValue: Double = '-Infinity'], [maxValue: Double = 'Infinity']) | Numeric similarity | Computes the numeric distance between two numbers. |
 | wgs84([unit: String = 'km']) | Geographical distance | Computes the geographical distance between two points. Author: Konrad Höffner (MOLE subgroup of Research Group AKSW, University of Leipzig) |
-## Equality
+
+### Equality
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | constant([value: Double = '1.0']) | Constant | Always returns a constant similarity value. |
 | equality() | Equality | Return 0 if strings are equal, 1 otherwise. |
 | inequality() | Inequality | Return 1 if strings are equal, 0 otherwise. |
 | lowerThan([orEqual: Boolean = 'false']) | LowerThan | Return 1 if the source value is lower than the target value, 0 otherwise. If both strings are numbers, numerical order is used for comparison. Otherwise, alphanumerical order is used |
 | relaxedEquality() | RelaxedEquality | Return 1 if strings are equal, 0 otherwise. Lower/upper case and differences like ö/o, n/ñ, c/ç etc. are treated as equal. |
-## Asian
+
+### Asian
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | cjkReadingDistance([minChar: Char = '0'], [maxChar: Char = 'z']) | CJK Reading Distance | CJK Reading Distance. |
 | korean TranslitDistance([minChar: Char = '0'], [maxChar: Char = 'z']) | Korean translit distance | Transliterated Korean distance. |
 | koreanPhonemeDistance([minChar: Char = '0'], [maxChar: Char = 'z']) | Korean phoneme distance | Korean phoneme distance. |
 
-## Spatial Distances
+### Spatial Distances
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | CentroidDistanceMetric([blockingParameter: Double = 1.0]) | Centroid distance | Computes the distance between the centroids of two geometries in meters. |
 | MinDistanceMetric([blockingParameter: Double = 1.0]) | Min distance | Computes the minimum distance between two geometries in meters. |
 
-## Spatial Relations
+### Spatial Relations
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | SContainsMetric([blockingParameter: Double = 1.0]) | Spatial Contains | Computes the relation "contains" between two geometries. |
@@ -80,7 +91,8 @@ errors, there are a number of tasks where token-base distance measures are bette
 | TouchesMetric([blockingParameter: Double = 1.0]) | Touches | Computes the relation "touches" between two geometries. |
 | WithinMetric([blockingParameter: Double = 1.0]) | Within | Computes the relation "within" between two geometries. |
 
-## Temporal Distances
+### Temporal Distances
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | DaysDistanceMetric([blockingParameter: Double = 1.0]) | Days distance | Computes the distance in days between two time periods or instants. |
@@ -91,7 +103,8 @@ errors, there are a number of tasks where token-base distance measures are bette
 | SecsDistanceMetric([blockingParameter: Double = 1.0]) | Secs distance | Computes the distance in secs between two time periods or instants. |
 | YearsDistanceMetric([blockingParameter: Double = 1.0]) | Years distance | Computes the distance in years between two time periods or instants. |
 
-## Temporal Relations
+### Temporal Relations
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | AfterMetric([blockingParameter: Double = 1.0]) | After | Computes the relation "after" between two time periods or instants. |
@@ -108,21 +121,27 @@ errors, there are a number of tasks where token-base distance measures are bette
 | TOverlapsMetric([blockingParameter: Double = 1.0]) | Temporal Overlaps | Computes the relation "overlaps" between two time periods or instants. |
 | StartsMetric([blockingParameter: Double = 1.0]) | Starts | Computes the relation "starts" between two time periods or instants. |
 
-# Transformations
+## Transformations
 
 The following transform and normalization functions are included:
-## Replace
+
+### Replace
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | regexReplace(regex: String, replace: String) | Regex replace | Replace all occurrences of a regex "regex" with "replace" in a string. |
 | replace(search: String, replace: String) | Replace | Replace all occurrences of a string "search" with "replace" in a string. |
-## Combine
+
+### Combine
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | concat([glue: String = '']) | Concatenate | Concatenates strings from two inputs. |
 | concatMultiValues([glue: String = ''], [removeDuplicates: Boolean = 'false']) | ConcatenateMultipleValues | Concatenates multiple values received for an input. If applied to multiple inputs, yields at most one value per input. Optionally removes duplicate values. |
 | merge() | Merge | Merges the values of all inputs. |
-## Normalize
+
+### Normalize
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | alphaReduce() | Alpha reduce | Strips all non-alphabetic characters from a string. |
@@ -134,7 +153,9 @@ The following transform and normalization functions are included:
 | stripUriPrefix() | Strip URI prefix | Strips the URI prefix and decodes the remainder. Leaves values unchanged which don't start with 'http:' |
 | trim() | Trim | Remove leading and trailing whitespaces. |
 | upperCase() | Upper case | Converts a string to upper case. |
-## Linguistic
+
+### Linguistic
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | NYSIIS([refined: Boolean = 'true']) | NYSIIS | NYSIIS phonetic encoding. Provided by the StringMetric library: http://rockymadden.com/stringmetric/ |
@@ -143,7 +164,9 @@ The following transform and normalization functions are included:
 | soundex([refined: Boolean = 'true']) | Soundex | Soundex algorithm. Provided by the StringMetric library: http://rockymadden.com/stringmetric/ |
 | spotlight() | Spotlight | Concatenates all values to a string and gets a weighted entity vector from the Spotlight service. |
 | stem() | Stem | Stems a string using the Porter Stemmer. |
-## Substring
+
+### Substring
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | stripPostfix(postfix: String) | Strip postfix | Strips a postfix of a string. |
@@ -152,29 +175,39 @@ The following transform and normalization functions are included:
 | substring([beginIndex: Int = '0'], [endIndex: Int = '0']) | Substring | Returns a substring between 'beginIndex' (inclusive) and 'endIndex' (exclusive). If 'endIndex' is 0 (default), it is ignored and the entire remaining string starting with 'beginIndex' is returned. If 'endIndex' is negative, -endIndex characters are removed from the end.' |
 | trim() | Trim | Remove leading and trailing whitespaces. |
 | untilCharacter(untilCharacter: Char) | Until Character | Give a substring until the character given |
-## Conversion
+
+### Conversion
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | convertCharset([sourceCharset: String = 'ISO-8859-1'], [targetCharset: String = 'UTF-8']) | Convert Charset | Convert the string from "sourceCharset" to "targetCharset". |
-## Filter
+
+### Filter
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | filterByLength([min: Int = '0'], [max: Int = '2147483647']) | filter by length | Removes all strings that are shorter than 'min' characters and longer than 'max' characters. |
 | filterByRegex(regex: String, [negate: Boolean = 'false']) | filter by regex | Removes all strings that do NOT match a regex. If 'negate' is true, only strings will be removed that match the regex. |
 | removeEmptyValues() | Remove empty values | Removes empty values. |
 | removeValues(blacklist: String) | Remove values | Removes values. |
-## Tokenization
+
+### Tokenization
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | camelcasetokenizer() | Camel Case Tokenizer | Tokenizes a camel case string. That is it splits strings between a lower case characted and an upper case character. |
 | tokenize([regex: String = '\s']) | Tokenize | Tokenizes all input values. |
-## Numeric
+
+### Numeric
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | aggregateNumbers(operator: String) | Aggregate Numbers |  Aggregates all numbers in this set using a mathematical operation. |
 | compareNumbers([comparator: String = '<']) | Compare Numbers |  Compares the numbers of two sets. |
 | numReduce() | Numeric reduce | Strip all non-numeric characters from a string. |
-## Date
+
+### Date
+
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | compareDates([comparator: String = '<']) | Compare Dates |  Compares two dates. Returns 1 if the comparison yields true and 0 otherwise. If there are multiple dates in both sets, the comparator must be true for all dates. e.g. {2014-08-02,2014-08-03} < {2014-08-03} yields 0 as not all dates in the first set are smaller than in the second. Accepts one parameter: comparator: One of '<', '<=', '=', '>=', '>'  |
@@ -184,8 +217,8 @@ The following transform and normalization functions are included:
 | durationInSeconds() | Duration in Seconds | Converts an xsd:duration to seconds. |
 | timeToDate() | Timestamp to date | convert Unix timestamp to xsd:date |
 
+### Spatial
 
-## Spatial
 | Function and parameters | Name | Description |
 | --- | --- | --- |
 | AreaTransformer | Area Transformer | Returns the area of the input geometry. |

diff --git a/doc/Readme.md b/doc/Readme.md
@@ -1,8 +1,10 @@
 # Silk Documentation
 
-- [Overview of the Silk Workbench](Workbench.md)
+- [Overview of the Silk Workbench](Overview.md)
   - [Linking with the Silk Workbench](Linking.md)
-  - [REST API](RestInterface.md)
+  - [Command Line](CommandLine.md)
+  - [Configuration](Configuration.md)
+  - [Internal API](Internal.md)
 - [Linkage Rules](LinkageRules.md)
 - [The Silk Link Specification Language](LinkSpecificationLanguage.md)
 - [Available Plugins](Plugins.md)
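
Review note: two descriptions in `doc/Plugins.md` carry enough logic that a small sketch may help when checking the wording. The Python below is only an illustration of the documented behaviour of `substring` and `compareDates`, not Silk's Scala implementation. The function names `substring` and `compare_dates`, their signatures, and the choice to return 1 for empty inputs are assumptions made for this sketch.

```python
from datetime import date
from itertools import product
import operator

# Hypothetical stand-in for the documented `substring` transformation.
def substring(value: str, begin_index: int = 0, end_index: int = 0) -> str:
    if end_index == 0:
        # Documented default: end_index is ignored and the rest of the
        # string starting at begin_index is returned.
        return value[begin_index:]
    # Python slicing treats a positive end as exclusive and a negative end
    # as "drop that many characters from the end", matching the description.
    return value[begin_index:end_index]

# Comparators listed in the `compareDates` description.
_COMPARATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}

# Hypothetical stand-in for the documented `compareDates` transformation.
def compare_dates(source, target, comparator="<"):
    compare = _COMPARATORS[comparator]
    # The comparator must hold for every (source, target) pair of dates.
    pairs = product(
        (date.fromisoformat(s) for s in source),
        (date.fromisoformat(t) for t in target),
    )
    # Assumption: with no pairs at all, the result is 1.
    return 1 if all(compare(s, t) for s, t in pairs) else 0
```

With these definitions, `compare_dates(["2014-08-02", "2014-08-03"], ["2014-08-03"])` reproduces the example from the table and yields 0, because 2014-08-03 is not smaller than 2014-08-03.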