From 86e7d3339cc4cca1b5d21c3f4f877756b0c48d1e Mon Sep 17 00:00:00 2001
From: Caio Ramos Casimiro <caiorcasimiro@gmail.com>
Date: Thu, 9 May 2024 11:15:31 +0100
Subject: [PATCH] docs(metrics) adr

---
 docs/adr/005-metrics.md | 219 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 219 insertions(+)
 create mode 100644 docs/adr/005-metrics.md

diff --git a/docs/adr/005-metrics.md b/docs/adr/005-metrics.md
new file mode 100644
index 000000000..b9942d68e
--- /dev/null
+++ b/docs/adr/005-metrics.md
@@ -0,0 +1,219 @@
+# Metrics
+
+* Status: proposed
+* Deciders: WasmX
+* Date: 2024-05-03
+
+## Table of Contents
+
+- [Problem Statement](#problem-statement)
+- [Technical Context](#technical-context)
+- [Decision Drivers](#decision-drivers)
+- [Proposal](#proposal)
+    - [Histograms](#histograms)
+        - [Binning](#binning)
+        - [Allocation and update handling](#allocation-and-update-handling)
+        - [Growth](#growth)
+
+## Problem Statement
+
+Support definition, update and retrieval of metrics from Proxy-Wasm filters,
+ngx_wasm_module itself and Lua land. How exactly are metrics stored and how 
+access to them is coordinated to ensure two Nginx workers never write to the
+same memory space?
+
+[Back to TOC](#table-of-contents)
+
+## Technical Context
+
+A metric can be either a counter, a gauge or a histogram.
+A counter is an integer that can only be incremented.
+A gauge is an integer that can take arbitrary positive values. 
+
+A histogram, used to represent ranges frequency of a variable, can be defined
+as a set of pairs of range and counter. For example, the distribution of the
+response time of a group of HTTP requests, can be represented as a histogram
+with ranges `[0, 10]`, `(10, 100]` and `(100, Inf]`. The 1st range's counter,
+would be the number of requests whose response time <= 10ms; the 2nd range's
+counter, requests whose 10ms < response time <= l00ms; and the last range's
+counter, requests whose response time > 100ms.
+
+A metric's value should reflect updates from all worker processes. If a counter
+is `0`, after being incremented by workers 0 and 1, it should be `2` -- despite
+the worker it's retrieved from. A gauge, however, is whatever value last set by
+any of the workers. Histograms, like counters, account for values recorded by
+all workers.
+
+[Back to TOC](#table-of-contents)
+
+## Decision Drivers
+
+* Full Proxy-Wasm ABI compatibility
+* Build atop ngx_wasm_shm
+* Minimize memory usage
+* Minimize metrics access cost
+
+[Back to TOC](#table-of-contents)
+
+## Proposal
+
+The proposed scheme for metrics storage builds atop ngx_wasm_shm's key-value
+store. Metric name is stored as a key in a red-black tree node along with metric
+value. Metric value is represented by `ngx_wa_metric_t`, see below. The member
+`type` is the metric type while the flexible array member `slots`, stores actual
+metric data.
+
+The length of `slots` equals the number of worker processes running when the
+metric is defined. This ensures each worker has its own dedicated slot to write
+metric updates.
+
+For counters, each entry in the `slots` array is simply an unsigned integer that
+its assigned worker increments. When a counter is retrieved, the values in the
+`slots` array are then summed and returned.
+
+For gauges, each of the `slots` is a pair of unsigned integer and timestamp.
+When a worker sets a gauge, the value is stored along with the time
+it's being updated in its slot. When a gauge is retrieved, the values in the
+`slots` are iterated and the most recent value is returned.
+
+For histograms, each of the `slots` points to a `ngx_wa_metrics_histogram_t`
+instance. Each worker updates the histogram pointed to by its slot. When a
+histogram is retrieved, the `slots` array is iterated and each worker's
+histogram is merged into a temporary histogram, which can then be serialized.
+
+```c
+typedef enum {
+    NGX_WA_METRIC_COUNTER,
+    NGX_WA_METRIC_GAUGE,
+    NGX_WA_METRIC_HISTOGRAM,
+} ngx_wa_metric_type_e;
+
+typedef struct {
+    ngx_uint_t  value;
+    ngx_msec_t  last_update;
+} ngx_wa_metrics_gauge_t;
+
+
+typedef struct {
+    uint32_t  upper_bound;
+    uint32_t  count;
+} ngx_wa_metrics_bin_t;
+
+typedef struct {
+    uint8_t               n_bins;
+    ngx_wa_metrics_bin_t  bins[];
+} ngx_wa_metrics_histogram_t;
+
+
+typedef union {
+    ngx_uint_t                   counter;
+    ngx_wa_metrics_gauge_t       gauge;
+    ngx_wa_metrics_histogram_t  *histogram;
+} ngx_wa_metric_val_t;
+
+typedef struct {
+    ngx_wa_metric_type_e  type;
+    ngx_wa_metric_val_t   slots[];
+} ngx_wa_metric_t;
+```
+
+This storage strategy ensures that two workers **never** write to the same
+memory address when updating a metric as long as no memory allocation is
+performed. This is indeed the case for counters and gauges and it's also the
+case for most histogram updates.
+
+This is an important feature of this design as it allows the more frequent
+update operations to be performed without the aid of a locks. The cost of a
+lock-less metric update is merely the cost of searching the key-value store red
+black tree, O(logn).
+
+The capacity of updating a metric without acquiring a lock is particularly
+attractive when a set of worker processes is under heavy load. In such
+conditions, lock contention is likely to impact proxy throughput as workers are
+more likely to wait for a lock to be released before proceeding with its metric
+update and resume its workload.
+
+Metric definition and removal still require locks to be safely performed as two
+workers might end up attempting to write to the same memory location. This is
+also true for histogram updates which cause them to grow in number of `bins`.
+
+The ABI proposed to accomplish the described system closely resembles the one
+from Proxy-Wasm specification itself:
+
+```c
+ngx_int_t ngx_wa_metrics_add(ngx_wa_metrics_t *metrics, ngx_str_t *name,
+    ngx_wa_metric_type_e type, uint32_t *out);
+ngx_int_t ngx_wa_metrics_get(ngx_wa_metrics_t *metrics, uint32_t metric_id,
+    ngx_uint_t *out);
+ngx_int_t ngx_wa_metrics_increment(ngx_wa_metrics_t *metrics,
+    uint32_t metric_id, ngx_int_t val);
+ngx_int_t ngx_wa_metrics_record(ngx_wa_metrics_t *metrics, uint32_t metric_id,
+    ngx_int_t val);
+```
+
+[Back to TOC](#table-of-contents)
+
+### Histograms
+
+This proposal includes a scheme composed of `ngx_wa_metrics_bin_t`, a pair of
+upper bound and counter, and `ngx_wa_metrics_histogram_t`, a list of
+`ngx_wa_metrics_bin_t` ordered by upper bound, to represent histogram data in
+memory. A bin's counter is the number of recorded values less than or equal to
+its upper bound and bigger than the previous bin's upper bound.
+
+This storage layout can represent both histograms with user-defined bins and
+those following an automatic binning strategy, like logarithmic binning. This
+document will focus, however, on logarithmic binning; user-defined bins are left
+for a future iteration.
+
+[Back to TOC](#table-of-contents)
+
+#### Binning
+
+The proposed binning strategy assumes the domain of the variables being measured
+is the set of nonnegative integers and divides this domain into bins whose upper
+bound grows in powers of 2, i.e., 1, 2, 4, 8, 16, etc. The mapping of a value
+`v` to its bin is given by the function `pow(2, ceil(log2(v)))` which calculates
+the bin's upper bound. The value 10, for example, is mapped to the bin whose
+upper bound is `pow(2, ceil(log2(10)))`, or 16. The bin with upper bound `16`
+represents recorded values between `8` and `16`.
+
+This logarithmic scaling provides good enough resolution for small values in
+return for low resolution for large values while keeping the memory footprint
+reasonably low: values up to 65,536 can be represented with only 16 bins. These
+characteristics fit the typical use case of measuring HTTP response time in
+milliseconds.
+
+[Back to TOC](#table-of-contents)
+
+#### Allocation and update handling
+
+Histograms are created with enough space for 5 bins, one of which is initialized
+with NGX_MAX_UINT32_VALUE as upper bound, leaving 4 uninitialized.
+
+If a value v is recorded into a histogram and its respective bin is part of the
+histogram's bins, its counter is simply incremented. If not, and there's at
+least one uninitialized bin, then one bin is initialized with v's upper bound,
+the bins are rearranged to ensure ascending order with respect to upper bound,
+and the new bin's counter is finally incremented.
+
+[Back to TOC](#table-of-contents)
+
+#### Growth
+
+If a value v is recorded but its bin isn't part of the histogram's bins and
+there aren't any uninitialized bins left, the histogram needs to grow to
+accommodate the new value's bin.
+
+Growing a histogram means allocating memory for a new histogram instance with
+enough space for the additional bin, copying memory from the old instance to
+the new one finally releasing the memory occupied by the old histogram. The new
+uninitialized bin is then initialized with v's upper bound and its counter is
+incremented.
+
+Histograms, however, can only grow up to a maximum number of bins. When a value
+`v` is recorded into a histogram, but its bin isn't part of the bins and the
+histogram's reached the bin limit, the bin with the smallest upper bound bigger
+than `v` is incremented.
+
+[Back to TOC](#table-of-contents)