From 8ff9932730a122a62dbfa51c6aa6e5c73c112bc1 Mon Sep 17 00:00:00 2001
From: FiveMovesAhead <ying@tig.foundation>
Date: Sun, 12 May 2024 13:21:10 +0800
Subject: [PATCH] Initial version of challenge descriptions.

---
 tig-challenges/docs/knapsack.md        | 59 +++++++++++++++++++++-
 tig-challenges/docs/satisfiability.md  | 70 +++++++++++++++++++++++++-
 tig-challenges/docs/vehicle_routing.md | 61 +++++++++++++++++++++-
 3 files changed, 186 insertions(+), 4 deletions(-)
diff --git a/tig-challenges/docs/knapsack.md b/tig-challenges/docs/knapsack.md
index b1e3567d..343922a3 100644
--- a/tig-challenges/docs/knapsack.md
+++ b/tig-challenges/docs/knapsack.md
@@ -1,3 +1,60 @@
 # Knapsack Problem
 
-Description placeholder
\ No newline at end of file
+[The area of Knapsack problems is one of the most active research areas of combinatorial optimization](https://en.wikipedia.org/wiki/Knapsack_problem). The problem is to maximise the value of items placed in a knapsack given the constraint that the total weight of items cannot exceed some limit.
+
+Figure 1 (below) illustrates a simple example. In this case, the total weight of the items in the knapsack cannot exceed 15kg. The maximum attainable value is $15, which is achieved by including every item except the green one.
+
+<img src="../assets/knapsack_example.png" alt="Knapsack Example" width="100%"/>
+<figcaption>Figure 1. Example of Knapsack Problem</figcaption>
+<br/>
+
+# Example
+
+For our challenge, we use a version of the knapsack problem with configurable difficulty, where the following two parameters can be adjusted in order to vary the difficulty of the challenge:
+
+- Parameter 1:  $num\textunderscore{ }items$ is the number of items from which you need to select a subset to put in the knapsack. 
+- Parameter 2: $better\textunderscore{ }than\textunderscore{ }baseline \geq 1$ is the factor by which a solution must be better than the baseline value [link TIG challenges for explanation of baseline value].
+
+
+The larger $num\textunderscore{ }items$ is, the greater the number of possible selections $S_{knapsack}$ (subsets of items placed in the knapsack), making the challenge more difficult. Also, the higher $better\textunderscore{ }than\textunderscore{ }baseline$ is, the less likely a given $S_{knapsack}$ is to be a solution, making the challenge more difficult.
+
+The weight $w_j$ of each item is an integer chosen independently and uniformly at random such that $1 \leq w_j \leq 50$, for $j=1,2,\ldots,num\textunderscore{ }items$. The values of the items $v_j$ are similarly selected at random from the same distribution.
+
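+We impose a weight constraint $W(S_{knapsack}) \leq 0.5 \cdot W(S_{all})$, where $W(S)$ denotes the total weight of the items in $S$ and $S_{all}$ is the set of all items. In other words, the knapsack can hold at most half the total weight of all items.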
+
+
+Consider an example of a challenge instance with `num_items=6` and `better_than_baseline = 1.09`. Let the baseline value be 100:
+
+```
+weights = [48, 20, 39, 13, 25, 16]
+values = [24, 42, 27, 31, 44, 31]
+max_weight = 80
+min_value = baseline*better_than_baseline = 109
+```
+The objective is to find a set of items whose total weight is at most 80 and whose total value is at least 109.
+
+Now consider the following selection:
+
+```
+selected_items =  [1, 3, 4, 5]
+```
+
+When evaluating this selection (item indices start at 0), we can confirm that the total weight does not exceed 80 and the total value is at least 109, so this selection of items is a solution:
+
+* Total weight = 20 + 13 + 25 + 16 = 74
+* Total value = 42 + 31 + 44 + 31 = 148
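+
+The check above can be sketched in Python. This is a minimal illustration rather than the TIG verifier: the function name `is_knapsack_solution` is hypothetical, items are assumed to be indexed from zero, and each item is assumed to be selectable at most once.
+
+```python
+def is_knapsack_solution(weights, values, max_weight, min_value, selected_items):
+    """Return True if the selection respects the weight limit and reaches min_value."""
+    # Assumption: an item may be selected at most once and indices must be valid.
+    if len(set(selected_items)) != len(selected_items):
+        return False
+    if any(i < 0 or i >= len(weights) for i in selected_items):
+        return False
+    total_weight = sum(weights[i] for i in selected_items)
+    total_value = sum(values[i] for i in selected_items)
+    return total_weight <= max_weight and total_value >= min_value
+
+
+weights = [48, 20, 39, 13, 25, 16]
+values = [24, 42, 27, 31, 44, 31]
+print(is_knapsack_solution(weights, values, 80, 109, [1, 3, 4, 5]))
+```
+
+For the instance above the call prints `True`, since 74 <= 80 and 148 >= 109.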
+
+# Our Challenge 
+In TIG, the baseline value is determined by a greedy algorithm that simply iterates through the items sorted by value-to-weight ratio, adding each item if the knapsack remains within the weight constraint.
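+
+A greedy pass of this kind might look like the sketch below. It is an illustration under stated assumptions, not the TIG implementation: the name `greedy_knapsack_value` is hypothetical, items are assumed to be visited in descending ratio order, and an item that does not fit is assumed to be skipped while the scan continues. Please see the challenge code for the precise rules.
+
+```python
+def greedy_knapsack_value(weights, values, max_weight):
+    """Total value collected by a ratio-ordered greedy pass (illustrative only)."""
+    # Assumption: items are visited in descending value/weight order.
+    order = sorted(
+        range(len(weights)),
+        key=lambda i: values[i] / weights[i],
+        reverse=True,
+    )
+    total_weight = 0
+    total_value = 0
+    for i in order:
+        # Assumption: an item that does not fit is skipped and the scan continues.
+        if total_weight + weights[i] <= max_weight:
+            total_weight += weights[i]
+            total_value += values[i]
+    return total_value
+```
+
+Because weights are at least 1 in this challenge, the ratio is always well defined.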
+
+# Applications
+
+Knapsack problems have a wide variety of practical applications. The [use of knapsack in integer programming](https://www.sciencedirect.com/science/article/pii/0012365X71900057) led to breakthroughs in several disciplines, including [energy management](https://www.sciencedirect.com/science/article/abs/pii/S0301421513003418) and [cellular network frequency planning](https://www.slideshare.net/deepakecrbs/gsm-frequency-planning).
+
+Although originally studied in the context of logistics, Knapsack problems appear regularly in diverse areas of science and technology. For example, in gene expression data, there are usually thousands of genes, but only a subset of them are informative for a specific problem. The Knapsack Problem can be used to select a subset of genes (items) that maximises the total information (value) without exceeding the limit of the number of genes that can be included in the analysis (weight limit).
+
+<img src="../assets/gene_clustering.jfif" alt="Gene Clustering" width="100%"/>
+
+
+<figcaption>Figure 2: <a href="https://openres.ersjournals.com/content/4/4/00031-2018" target="_blank">Microarray clustering of differentially expressed genes in blood</a>. Genes are clustered in rows, with red indicating high expression, yellow intermediate expression and blue low expression. The Knapsack problem is <a href="https://www.sciencedirect.com/science/article/abs/pii/S0305054821003877" target="_blank">used to analyse</a> gene expression clustering.</figcaption>
+<br/>
diff --git a/tig-challenges/docs/satisfiability.md b/tig-challenges/docs/satisfiability.md
index 03616006..b458030a 100644
--- a/tig-challenges/docs/satisfiability.md
+++ b/tig-challenges/docs/satisfiability.md
@@ -1,3 +1,71 @@
 # Boolean Satisfiability
 
-Description placeholder
\ No newline at end of file
+[The SAT (or Boolean Satisfiability) problem is a decision problem in computer science](https://en.wikipedia.org/wiki/Boolean_satisfiability_problem). It's the problem of determining if there exists a truth assignment to a given Boolean formula that makes the formula true (satisfies all clauses).
+
+A Boolean formula is built from:
+
+- Boolean variables: $x_1, x_2, x_3, \ldots$
+- Logical connectives: AND ($\land$), OR ($\lor$), NOT ($\neg$)
+- Parentheses for grouping: ( )
+
+3-SAT is a special case of SAT where each clause is limited to exactly three literals (a literal is a variable or its negation). An example with 4 variables and 3 clauses can be seen below:
+
+$$(x_1 \lor x_2 \lor x_3) \land (\neg x_1 \lor \neg x_3 \lor \neg x_4) \land (\neg x_2 \lor x_3 \lor x_4)$$
+
+For this particular example, one possible truth assignment that satisfies the formula is $x_1 = True$, $x_2 = False$, $x_3 = True$, $x_4 = False$. This can be verified by substituting the values and checking that every clause evaluates to $True$.
+
+# Example
+
+The following is an example of the 3-SAT problem with configurable difficulty. Two parameters can be adjusted in order to vary the difficulty of the challenge instance:
+
+- Parameter 1: $num\textunderscore{ }variables$ = **The number of variables**.  
+- Parameter 2: $clauses\textunderscore{ }to\textunderscore{ }variables\textunderscore{ }percent$ = **The number of clauses as a percentage of the number of variables**.
+
+The number of clauses is derived from the above parameters.
+
+$$num\textunderscore{ }clauses = floor(num\textunderscore{ }variables \cdot \frac{clauses\textunderscore{ }to\textunderscore{ }variables\textunderscore{ }percent}{100})$$
+
+Where $floor$ is a function that rounds a floating point number down to the closest integer.
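+
+As a quick check of the rounding, take the purely illustrative values $num\textunderscore{ }variables = 5$ and $clauses\textunderscore{ }to\textunderscore{ }variables\textunderscore{ }percent = 425$ (these numbers are assumptions chosen to show the effect of $floor$, not recommended settings):
+
+$$num\textunderscore{ }clauses = floor(5 \cdot \frac{425}{100}) = floor(21.25) = 21$$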
+
+Consider an example `Challenge` instance with `num_variables=4` and `clauses_to_variables_percent=75`:
+
+```
+clauses = [
+    [1, 2, -3],
+    [-1, 3, 4],
+    [2, -3, 4]
+]
+```
+
+Each clause is an array of three integers. The absolute value of each integer represents a variable, and the sign represents whether the variable is negated in the clause (negative means it's negated).
+
+The clauses represent the following Boolean formula:
+
+```
+(X1 or X2 or not X3) and (not X1 or X3 or X4) and (X2 or not X3 or X4)
+```
+
+Now consider the following assignment:
+
+```
+assignment = [False, True, True, False]
+```
+
+This assignment corresponds to the variable assignment $X1=False, X2=True, X3=True, X4=False$.
+
+When substituted into the Boolean formula, each clause evaluates to True, so this assignment is a solution: it satisfies all clauses.
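+
+The substitution can be sketched in Python as follows. This is a minimal illustration, not the TIG verifier: the function name `is_satisfying` is hypothetical, and the assignment is assumed to be a list in which variable $k$ is stored at index $k-1$.
+
+```python
+def is_satisfying(clauses, assignment):
+    """Return True if every clause contains at least one true literal."""
+    for clause in clauses:
+        satisfied = False
+        for literal in clause:
+            value = assignment[abs(literal) - 1]  # variable k is stored at index k - 1
+            if literal < 0:
+                value = not value  # a negative integer denotes a negated variable
+            if value:
+                satisfied = True
+                break
+        if not satisfied:
+            return False
+    return True
+
+
+clauses = [[1, 2, -3], [-1, 3, 4], [2, -3, 4]]
+print(is_satisfying(clauses, [False, True, True, False]))
+```
+
+For the example above the call prints `True`. Changing the assignment to `[True, False, True, False]` leaves the third clause without a true literal, so the check fails.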
+
+# Our Challenge
+In TIG, the 3-SAT Challenge is based on the example above with configurable difficulty.  Please see the challenge code for a precise specification. 
+
+# Applications
+
+SAT has a vast range of applications in science and industry in fields including computational biology, formal verification, and electronic circuit design. For example:
+
+- SAT is used in computational biology to solve the "cell formation problem" of [organising a plant into cells](https://www.sciencedirect.com/science/article/abs/pii/S0957417412006173).
+- SAT is also heavily utilised in [electronic circuit design](https://dl.acm.org/doi/abs/10.1145/337292.337611).
+
+<img src="../assets/circuit.jfif" alt="Application of SAT" width="100%"/>
+
+<figcaption>Figure 1: Chips made possible by electronic circuit design.</figcaption>
+<br/>
diff --git a/tig-challenges/docs/vehicle_routing.md b/tig-challenges/docs/vehicle_routing.md
index 64db3042..16bc9fff 100644
--- a/tig-challenges/docs/vehicle_routing.md
+++ b/tig-challenges/docs/vehicle_routing.md
@@ -1,3 +1,60 @@
-# Vehicle Routing
+# Capacitated Vehicle Routing
+
+[The CVRP, or Capacitated Vehicle Routing Problem, is a well-studied optimisation problem in the field of operations research and transportation logistics](https://en.wikipedia.org/wiki/Vehicle_routing_problem). It involves the task of determining the optimal set of routes a fleet of vehicles should undertake in order to service a given set of customers, while meeting certain constraints.
+
+In the CVRP, a fleet of identical vehicles based at a central depot must be routed to deliver goods to a set of geographically dispersed customers. Each vehicle has a fixed capacity, and each customer has a known demand for goods. The objective is to determine the minimum total distance that the fleet must travel to deliver goods to all customers and return to the depot, such that:
+
+1. Each customer is visited by exactly one vehicle,
+2. The total demand serviced by each vehicle does not exceed its capacity, and
+3. Each vehicle starts and ends its route at the depot.
+
+# Example
+
+The following is an example of the Capacitated Vehicle Routing problem with configurable difficulty. Two parameters can be adjusted in order to vary the difficulty of the challenge instance:
+
+- Parameter 1: $num\textunderscore{ }nodes$ is the number of nodes, that is, the customers plus 1 depot. Customers are placed uniformly at random on a 500x500 grid, with the depot at the centre (250, 250).
+- Parameter 2: $better\textunderscore{ }than\textunderscore{ }baseline$ is the factor by which a solution must be better than the baseline value [link TIG challenges for explanation of baseline value].
+
+The demand of each customer is selected independently and uniformly at random from the range [25, 50]. The maximum capacity of each vehicle is set to 100.
+
+Consider an example `Challenge` instance with `num_nodes=5` and `better_than_baseline=0.8`. Let the baseline value be 175:
+
+```
+demands = [0, 25, 30, 40, 50] # an array of length N where index i is the demand at node i
+distance_matrix = [ # an NxN symmetric matrix where entry (i, j) is the distance from node i to node j
+    [0, 10, 20, 30, 40],
+    [10, 0, 15, 25, 35],
+    [20, 15, 0, 20, 30],
+    [30, 25, 20, 0, 10],
+    [40, 35, 30, 10, 0]
+]
+max_capacity = 100 # total demand for each route must not exceed this number
+max_total_distance = baseline*better_than_baseline = 140 # routes must have a total distance that does not exceed this number to be a solution
+```
+
+The depot is the first node (node 0) with demand 0. The vehicle capacity is set to 100. In this example, routes must have a total distance of 140 or less to be a solution.
+
+Now consider the following routes:
+
+```
+routes = [
+  [0, 3, 4, 0], 
+  [0, 1, 2, 0]
+]
+```
+
+When evaluating these routes, each route's demand does not exceed 100 and the total distance does not exceed 140, so these routes are a solution:
+
+* Route 1: 
+    * Depot -> 3 -> 4 -> Depot
+    * Demand = 40 + 50 = 90
+    * Distance = 30 + 10 + 40 = 80
+* Route 2: 
+    * Depot -> 1 -> 2 -> Depot
+    * Demand = 25 + 30 = 55
+    * Distance = 10 + 15 + 20 = 45
+* Total Distance = 80 + 45 = 125
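+
+The evaluation above can be sketched in Python. This is a minimal illustration, not the TIG verifier: the function name `is_cvrp_solution` is hypothetical, and node 0 is assumed to be the depot, as in the example.
+
+```python
+def is_cvrp_solution(demands, distance_matrix, max_capacity, max_total_distance, routes):
+    """Return True if the routes serve every customer once within both limits."""
+    n = len(demands)
+    visited = set()
+    total_distance = 0
+    for route in routes:
+        # Each route must start and end at the depot and visit at least one customer.
+        if len(route) < 3 or route[0] != 0 or route[-1] != 0:
+            return False
+        load = 0
+        for node in route[1:-1]:
+            # Reject the depot mid-route, unknown nodes and repeated customers.
+            if node <= 0 or node >= n or node in visited:
+                return False
+            visited.add(node)
+            load += demands[node]
+        if load > max_capacity:
+            return False
+        total_distance += sum(distance_matrix[a][b] for a, b in zip(route, route[1:]))
+    return len(visited) == n - 1 and total_distance <= max_total_distance
+
+
+demands = [0, 25, 30, 40, 50]
+distance_matrix = [
+    [0, 10, 20, 30, 40],
+    [10, 0, 15, 25, 35],
+    [20, 15, 0, 20, 30],
+    [30, 25, 20, 0, 10],
+    [40, 35, 30, 10, 0],
+]
+print(is_cvrp_solution(demands, distance_matrix, 100, 140, [[0, 3, 4, 0], [0, 1, 2, 0]]))
+```
+
+For the routes above the call prints `True`, matching the hand calculation (demands 90 and 55, total distance 125).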
+
+# Our Challenge
+In TIG, the baseline route is determined by using a greedy algorithm that iteratively selects the closest unvisited node (returning to the depot when necessary) until all drop-offs are made. Please see the challenge code for a precise specification. 
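+
+A nearest-neighbour construction of this kind might look like the sketch below. It is an illustration under stated assumptions, not the TIG implementation: the names `nearest_neighbour_routes` and `total_distance` are hypothetical, the vehicle is assumed to return to the depot as soon as the closest unvisited node no longer fits, and ties are assumed to be broken by the lower node index.
+
+```python
+def nearest_neighbour_routes(demands, distance_matrix, max_capacity):
+    """Build routes by repeatedly driving to the closest unvisited customer."""
+    if any(d > max_capacity for d in demands):
+        raise ValueError("a single demand exceeds the vehicle capacity")
+    unvisited = set(range(1, len(demands)))  # node 0 is the depot
+    routes = []
+    while unvisited:
+        route, load, current = [0], 0, 0
+        while unvisited:
+            # Ties are broken by the lower node index (an arbitrary choice).
+            nearest = min(unvisited, key=lambda j: (distance_matrix[current][j], j))
+            if load + demands[nearest] > max_capacity:
+                break  # assumption: return to the depot once the closest node no longer fits
+            route.append(nearest)
+            load += demands[nearest]
+            unvisited.remove(nearest)
+            current = nearest
+        route.append(0)
+        routes.append(route)
+    return routes
+
+
+def total_distance(routes, distance_matrix):
+    """Sum the length of every leg over all routes."""
+    return sum(distance_matrix[a][b] for r in routes for a, b in zip(r, r[1:]))
+```
+
+The baseline value would then be the total distance of the routes returned, which a solution has to beat by the factor $better\textunderscore{ }than\textunderscore{ }baseline$.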
 
-Description placeholder
\ No newline at end of file