[WIP] feat(proxy-wasm) metrics
casimiro committed May 17, 2024
1 parent 86e7d33 commit e8cf21c
Showing 38 changed files with 3,210 additions and 75 deletions.
9 changes: 7 additions & 2 deletions config
@@ -129,6 +129,7 @@ NGX_WASMX_INCS="\
$ngx_addon_dir/src/common \
$ngx_addon_dir/src/common/proxy_wasm \
$ngx_addon_dir/src/common/shm \
$ngx_addon_dir/src/common/metrics \
$ngx_addon_dir/src/common/lua"

NGX_WASMX_DEPS="\
@@ -141,7 +142,9 @@ NGX_WASMX_DEPS="\
$ngx_addon_dir/src/common/proxy_wasm/ngx_proxy_wasm_properties.h \
$ngx_addon_dir/src/common/shm/ngx_wasm_shm.h \
$ngx_addon_dir/src/common/shm/ngx_wasm_shm_kv.h \
$ngx_addon_dir/src/common/shm/ngx_wasm_shm_queue.h"
$ngx_addon_dir/src/common/shm/ngx_wasm_shm_queue.h \
$ngx_addon_dir/src/common/metrics/ngx_wa_histogram.h \
$ngx_addon_dir/src/common/metrics/ngx_wa_metrics.h"

NGX_WASMX_SRCS="\
$ngx_addon_dir/src/ngx_wasmx.c \
@@ -155,7 +158,9 @@ NGX_WASMX_SRCS="\
$ngx_addon_dir/src/common/proxy_wasm/ngx_proxy_wasm_util.c \
$ngx_addon_dir/src/common/shm/ngx_wasm_shm.c \
$ngx_addon_dir/src/common/shm/ngx_wasm_shm_kv.c \
$ngx_addon_dir/src/common/shm/ngx_wasm_shm_queue.c"
$ngx_addon_dir/src/common/shm/ngx_wasm_shm_queue.c \
$ngx_addon_dir/src/common/metrics/ngx_wa_histogram.c \
$ngx_addon_dir/src/common/metrics/ngx_wa_metrics.c"

# wasm

53 changes: 52 additions & 1 deletion docs/DIRECTIVES.md
@@ -6,6 +6,7 @@ By alphabetical order:
- [cache_config](#cache-config)
- [compiler](#compiler)
- [flag](#flag)
- [max_metric_name_length](#max_metric_name_length)
- [module](#module)
- [proxy_wasm](#proxy_wasm)
- [proxy_wasm_isolation](#proxy_wasm_isolation)
@@ -16,6 +17,7 @@ By alphabetical order:
- [resolver_timeout](#resolver_timeout)
- [shm_kv](#shm_kv)
- [shm_queue](#shm_queue)
- [slab_size](#slab_size)
- [socket_buffer_size](#socket_buffer_size)
- [socket_buffer_reuse](#socket_buffer_reuse)
- [socket_connect_timeout](#socket_connect_timeout)
@@ -57,6 +59,9 @@ By context:
- [tls_trusted_certificate](#tls_trusted_certificate)
- [tls_verify_cert](#tls_verify_cert)
- [tls_verify_host](#tls_verify_host)
- `metrics{}`
- [max_metric_name_length](#max_metric_name_length)
- [slab_size](#slab_size)
- `wasmtime{}`
- [cache_config](#cache-config)
- [flag](#flag)
@@ -205,6 +210,24 @@ wasm {

[Back to TOC](#directives)

max_metric_name_length
---------

**usage** | `max_metric_name_length <size>;`
------------:|:----------------------------------------------------------------
**contexts** | `metrics{}`
**default** | `128`
**example** | `max_metric_name_length 256;`

Set the maximum allowed length of a metric name.

> Notes
See [Metrics] for a complete description of how metrics are represented in
memory.

[Back to TOC](#directives)

module
------

@@ -525,6 +548,33 @@ policy, and writes will fail when the allocated memory slab is full.

[Back to TOC](#directives)

slab_size
---------

**usage** | `slab_size <size>;`
------------:|:----------------------------------------------------------------
**contexts** | `metrics{}`
**default** | `5m`
**example** | `slab_size 12m;`

Set the `size` of the shared memory slab dedicated to metrics storage. The value
must be at least 3 * pagesize, e.g. `15k` on Linux.

> Notes
The space in memory occupied by a metric depends on its name length, type and
the number of worker processes running. As an example, if all metric names are
64 chars long and 4 workers are running, `5m` can accommodate 20k counters, 20k
gauges, or up to 16k histograms.

See the [max_metric_name_length](#max_metric_name_length) directive to configure
the max name length in chars for metrics.

See [Metrics] for a complete description of how metrics are represented in
memory.

[Back to TOC](#directives)

socket_buffer_reuse
-------------------

@@ -939,7 +989,8 @@ the `http{}` contexts.

[Contexts]: USER.md#contexts
[Execution Chain]: USER.md#execution-chain
[SLRU eviction algorithm]: SLRU.md
[Metrics]: METRICS.md
[OpenResty]: https://openresty.org/en/
[resolver]: https://nginx.org/en/docs/http/ngx_http_core_module.html#resolver
[resolver_timeout]: https://nginx.org/en/docs/http/ngx_http_core_module.html#resolver_timeout
[SLRU eviction algorithm]: SLRU.md
88 changes: 88 additions & 0 deletions docs/METRICS.md
@@ -0,0 +1,88 @@
# Metrics

## Introduction

In the context of ngx_wasm_module, in accordance with Proxy-Wasm, a metric is
either a counter, a gauge or a histogram.

A counter is an unsigned 64-bit int that can only be incremented.
A gauge is an unsigned 64-bit int that can take arbitrary values.

## Histograms

A histogram represents the frequency distribution of a variable and can be
defined as a set of pairs of range and counter. For example, the distribution of
the response time of HTTP requests can be represented as a histogram with ranges
`[0, 1]`, `(1, 2]`, `(2, 4]` and `(4, Inf)`. The 1st range's counter would be
the number of requests with a response time less than or equal to 1ms; the 2nd
range's counter, requests with a response time greater than 1ms and up to 2ms;
the 3rd range's counter, requests with a response time greater than 2ms and up
to 4ms; and the last range's counter, requests with a response time greater
than 4ms.
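
The example above can be sketched in C as follows. This is only an illustration of the range/counter pairs described here, not code from ngx_wasm_module; the `example_*` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_NBINS 4

/* Inclusive upper bounds (in ms) of the first three example ranges;
 * the fourth range is open-ended. */
static const uint64_t example_bounds[EXAMPLE_NBINS - 1] = { 1, 2, 4 };

typedef struct {
    uint64_t  counters[EXAMPLE_NBINS];
} example_histogram_t;

/* Increment the counter of the first range whose upper bound is >= ms. */
static void
example_histogram_record(example_histogram_t *h, uint64_t ms)
{
    size_t  i;

    for (i = 0; i < EXAMPLE_NBINS - 1; i++) {
        if (ms <= example_bounds[i]) {
            h->counters[i]++;
            return;
        }
    }

    /* no bounded range matched: the value belongs to the last range */
    h->counters[EXAMPLE_NBINS - 1]++;
}
```

Recording response times of 0, 1, 2, 3, 4 and 250 ms leaves the four counters at 2, 1, 2 and 1 respectively.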

### Binning

The above example demonstrates a histogram with ranges, or bins, whose upper
bounds grow in powers of 2, i.e. 2^0, 2^1 and 2^2. This is usually called
logarithmic binning and is how histogram bins are represented in
ngx_wasm_module. This binning strategy implies that when a value `v` is
recorded, it is matched with the smallest power of two that is greater than or
equal to `v`; this value is the upper bound of the bin associated with `v`. If
the histogram contains, or can contain, such a bin, its counter is incremented;
if not, the bin with the next smallest upper bound greater than `v` has its
counter incremented.
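
A minimal sketch of the bound computation, assuming that upper bounds are inclusive (as the `(1, 2]`-style ranges of the example suggest) and that bounds never exceed the 2^32 catch-all bin described under "Update and expansion". The helper name `example_bin_upper_bound` is hypothetical and not part of ngx_wasm_module.

```c
#include <stdint.h>

/* Assumption: the largest upper bound a bin can have is 2^32. */
#define EXAMPLE_MAX_BOUND  ((uint64_t) 1 << 32)

/* Return the smallest power of two >= v, capped at EXAMPLE_MAX_BOUND. */
static uint64_t
example_bin_upper_bound(uint64_t v)
{
    uint64_t  bound = 1;   /* 2^0 */

    while (bound < v && bound < EXAMPLE_MAX_BOUND) {
        bound <<= 1;
    }

    return bound;
}
```

Under these assumptions, a value of 3 maps to the bin with upper bound 4, while a value of 4 maps to that same bin rather than to the next one.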

### Update and expansion

Histograms are created with 5 bins, 1 initialized and 4 uninitialized. If a
value `v` is recorded and its bin isn't among the initialized bins, one of the
uninitialized bins is initialized with the upper bound associated with `v` and
its counter is incremented. If the histogram has run out of uninitialized bins,
it can be expanded, up to 18 bins, to accommodate the additional bin for `v`.
The bin initialized upon histogram creation has an upper bound of 2^32, and its
counter is incremented when it's the only bin whose upper bound is bigger than
the recorded value.
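
The update and expansion rules can be sketched as below. This is a simplified model under stated assumptions, not the module's implementation: all `ex_*` names are hypothetical, a fixed 18-element array stands in for reallocation, expansion happens one slot at a time, and upper bounds are treated as inclusive.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_BINS_INIT  5                      /* slots at creation */
#define EX_BINS_MAX   18                     /* expansion limit */
#define EX_BOUND_MAX  ((uint64_t) 1 << 32)   /* bound of the initial bin */

typedef struct {
    uint64_t  upper_bound;
    uint64_t  count;
} ex_bin_t;

typedef struct {
    size_t    n_slots;             /* slots currently available */
    size_t    n_used;              /* initialized bins */
    ex_bin_t  bins[EX_BINS_MAX];   /* stands in for growable storage */
} ex_histogram_t;

static void
ex_histogram_init(ex_histogram_t *h)
{
    h->n_slots = EX_BINS_INIT;
    h->n_used = 1;
    h->bins[0].upper_bound = EX_BOUND_MAX;
    h->bins[0].count = 0;
}

/* Smallest power of two >= v, capped at EX_BOUND_MAX. */
static uint64_t
ex_upper_bound(uint64_t v)
{
    uint64_t  b = 1;

    while (b < v && b < EX_BOUND_MAX) {
        b <<= 1;
    }

    return b;
}

static void
ex_histogram_record(ex_histogram_t *h, uint64_t v)
{
    size_t    i, best = 0;   /* bins[0] (2^32) always qualifies */
    uint64_t  bound = ex_upper_bound(v);

    /* find the initialized bin with the smallest upper bound >= bound */
    for (i = 0; i < h->n_used; i++) {
        if (h->bins[i].upper_bound >= bound
            && h->bins[i].upper_bound <= h->bins[best].upper_bound)
        {
            best = i;
        }
    }

    if (h->bins[best].upper_bound != bound) {
        /* out of uninitialized slots: expand if the limit allows it */
        if (h->n_used == h->n_slots && h->n_slots < EX_BINS_MAX) {
            h->n_slots++;
        }

        if (h->n_used < h->n_slots) {
            best = h->n_used++;
            h->bins[best].upper_bound = bound;
            h->bins[best].count = 0;
        }

        /* otherwise fall back to the next larger initialized bin */
    }

    h->bins[best].count++;
}
```

In this model, once all 18 bins are in use, a value whose bin does not exist is counted in the initialized bin with the next larger upper bound, which in the worst case is the 2^32 bin.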

## Memory consumption

The space in memory occupied by a metric contains its name, value and the
underlying structure representing them in the key-value store. While the
key-value structure has a fixed size of 96 bytes, the sizes of name and value
vary.

The size in memory of the value of a counter or gauge is 8 bytes plus 16 bytes
per worker process. The value size grows with the number of workers because the
metric value is segmented across them: each worker has its own segment of the
value to write updates to. When a metric is retrieved, the segments are
consolidated and returned as a single value. This storage strategy allows
metric updates to be performed without locks, at the cost of 16 bytes per
worker.
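
The idea of per-worker segments can be illustrated with a counter, as in the sketch below. It is a simplified model with hypothetical `ex_*` names: it uses a fixed worker count and plain 8-byte slots, so it does not reproduce the 16-byte segment layout described above, nor the memory-ordering care a real shared-memory implementation would need.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_WORKERS 4   /* assumed number of worker processes */

/* One slot per worker; a worker only ever writes to its own slot,
 * so concurrent updates never touch the same memory location. */
typedef struct {
    uint64_t  slots[EX_WORKERS];
} ex_counter_t;

/* Called by worker `worker_id` to add `n` to its own segment. */
static void
ex_counter_increment(ex_counter_t *c, size_t worker_id, uint64_t n)
{
    c->slots[worker_id] += n;
}

/* Consolidate all segments into the value reported to the caller. */
static uint64_t
ex_counter_get(const ex_counter_t *c)
{
    size_t    i;
    uint64_t  sum = 0;

    for (i = 0; i < EX_WORKERS; i++) {
        sum += c->slots[i];
    }

    return sum;
}
```

Summing on read is what makes the write path lock-free in this model: the cost of consolidation is moved to retrieval, which is assumed to be much less frequent than updates.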

Histogram values also have a baseline size of 8 bytes plus 16 bytes per worker.
However, histograms need extra space per worker for bin storage, which costs 4
bytes plus 8 bytes per bin. A 5-bin histogram thus takes 8 bytes plus
16 + 4 + 5*8 = 60 bytes per worker.

As such, in a 4-worker setup, a counter or gauge whose name is 64 chars long
takes 232 bytes (96 + 64 + 8 + 4*16), a 5-bin histogram with the same name takes
408 bytes (96 + 64 + 8 + 4*60) and an 18-bin histogram with the same name takes
824 bytes (96 + 64 + 8 + 4*164).
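
The sizing rules of this section can be written down as a few helper functions. The constants come from the text above; the function and macro names are hypothetical, and the sketch ignores any rounding applied by the shared memory allocator.

```c
#include <stddef.h>

#define EX_KV_NODE_SIZE      96   /* fixed key-value structure */
#define EX_VALUE_BASE        8    /* baseline size of any value */
#define EX_VALUE_PER_WORKER  16   /* per-worker segment */
#define EX_BINS_HEADER       4    /* per-worker bins storage header */
#define EX_BIN_SIZE          8    /* per bin, per worker */

/* Value size of a counter or gauge. */
static size_t
ex_scalar_value_size(size_t workers)
{
    return EX_VALUE_BASE + workers * EX_VALUE_PER_WORKER;
}

/* Value size of a histogram with `bins` bins. */
static size_t
ex_histogram_value_size(size_t workers, size_t bins)
{
    return EX_VALUE_BASE
           + workers * (EX_VALUE_PER_WORKER + EX_BINS_HEADER
                        + bins * EX_BIN_SIZE);
}

/* Total footprint: key-value structure + name + value. */
static size_t
ex_metric_size(size_t name_len, size_t value_size)
{
    return EX_KV_NODE_SIZE + name_len + value_size;
}
```

With 4 workers and a 64-char name, `ex_metric_size(64, ex_histogram_value_size(4, 5))` evaluates to 408 and `ex_metric_size(64, ex_histogram_value_size(4, 18))` to 824, matching the histogram figures above.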

### Shared memory allocation

Nginx employs an allocation model for shared memory that enforces allocation
size to be a power of 2 and greater than 8; nonconforming values are rounded up,
see [Nginx shared memory].

This means that an allocation of 168 bytes, for instance, ends up taking 256
bytes from the shared memory.
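
The rounding rule as stated here can be sketched as follows. The helper `ex_slab_alloc_size` is hypothetical and assumes a minimum allocation of 8 bytes; it models only the power-of-2 rounding described in this section, not Nginx's slab allocator as a whole.

```c
#include <stddef.h>

/* Round `size` up to the next power of two, with an assumed minimum of
 * 8 bytes. Sizes close to SIZE_MAX are not handled by this sketch. */
static size_t
ex_slab_alloc_size(size_t size)
{
    size_t  s = 8;

    while (s < size) {
        s <<= 1;
    }

    return s;
}
```

For example, `ex_slab_alloc_size(168)` returns 256 and `ex_slab_alloc_size(408)` returns 512, so the effective footprint of a metric can be noticeably larger than the sum of its parts.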

## Nginx Reconfiguration

If Nginx is reconfigured with a different number of workers or a different size
for the metrics shared memory zone, existing metrics need to be reallocated into
a brand new shared memory zone. This is due to the metric values being segmented
across workers.

As such, it's important to ensure that the new size of the metrics' shared
memory zone is enough to accommodate existing metrics and that the value of
`max_metric_name_length` isn't less than the length of any existing metric name.

[Nginx shared memory]: https://nginx.org/en/docs/dev/development_guide.html#shared_memory
8 changes: 4 additions & 4 deletions docs/PROXY_WASM.md
@@ -536,10 +536,10 @@ SDK ABI `0.2.1`) and their present status in ngx_wasm_module:
`proxy_enqueue_shared_queue` | :heavy_check_mark: | No automatic eviction mechanism if the queue is full.
`proxy_resolve_shared_queue` | :x: |
*Stats/metrics* | |
`proxy_define_metric` | :x: |
`proxy_get_metric` | :x: |
`proxy_record_metric` | :x: |
`proxy_increment_metric` | :x: |
`proxy_define_metric` | :heavy_check_mark: | Support for histograms NYI.
`proxy_get_metric` | :heavy_check_mark: |
`proxy_record_metric` | :heavy_check_mark: |
`proxy_increment_metric` | :heavy_check_mark: |
*Custom extension points* | |
`proxy_call_foreign_function` | :x: |

26 changes: 26 additions & 0 deletions src/common/debug/ngx_wasm_debug_module.c
@@ -9,6 +9,8 @@
#include <ngx_http_wasm.h>
#endif

#include <ngx_wa_metrics.h>

#if (!NGX_DEBUG)
# error ngx_wasm_debug_module included in a non-debug build
#endif
@@ -20,6 +22,11 @@
static ngx_int_t
ngx_wasm_debug_init(ngx_cycle_t *cycle)
{
    size_t                   long_metric_name_len = NGX_MAX_ERROR_STR;
    uint32_t                 mid;
    ngx_str_t                metric_name;
    u_char                   buf[long_metric_name_len];

    static ngx_wasm_phase_t  ngx_wasm_debug_phases[] = {
        { ngx_string("a_phase"), 0, 0, 0 },
        { ngx_null_string, 0, 0, 0 }
@@ -41,6 +48,25 @@ ngx_wasm_debug_init(ngx_cycle_t *cycle)
        ngx_wasm_phase_lookup(&ngx_wasm_debug_subsystem, 3) == NULL
    );

    metric_name.len = long_metric_name_len;
    metric_name.data = buf;

    /* invalid metric name length */
    ngx_wa_assert(
        ngx_wa_metrics_add(ngx_wasmx_metrics(),
                           &metric_name,
                           NGX_WA_METRIC_COUNTER,
                           &mid) == NGX_ERROR
    );

    /* invalid metric type */
    ngx_wa_assert(
        ngx_wa_metrics_add(ngx_wasmx_metrics(),
                           &metric_name,
                           100,
                           &mid) == NGX_ERROR
    );

    return NGX_OK;
}
