feat: add OpenTelemetry metrics support
This adds support for OpenTelemetry metrics.
rowanmanning committed Jun 12, 2024
1 parent 85bed10 commit f1bac27
Showing 16 changed files with 518 additions and 64 deletions.
140 changes: 140 additions & 0 deletions package-lock.json


48 changes: 44 additions & 4 deletions packages/opentelemetry/README.md
@@ -12,12 +12,19 @@ An [OpenTelemetry](https://opentelemetry.io/docs/what-is-opentelemetry/) client
* [Automated setup with `require()`](#automated-setup-with-require)
* [Manual setup](#manual-setup)
* [Running in production](#running-in-production)
* [Production metrics](#production-metrics)
* [Production tracing](#production-tracing)
* [Running locally](#running-locally)
* [Running a backend](#running-a-backend)
* [Sending traces to your local backend](#sending-traces-to-your-local-backend)
* [Local metrics](#local-metrics)
* [Local tracing](#local-tracing)
* [Running a backend](#running-a-backend)
* [Sending traces to your local backend](#sending-traces-to-your-local-backend)
* [Implementation details](#implementation-details)
* [Configuration options](#configuration-options)
* [`options.authorizationHeader`](#optionsauthorizationheader)
* [`options.metrics`](#optionsmetrics)
* [`options.metrics.endpoint`](#optionsmetricsendpoint)
* [`options.metrics.apiGatewayKey`](#optionsmetricsapigatewaykey)
* [`options.tracing`](#optionstracing)
* [`options.tracing.endpoint`](#optionstracingendpoint)
* [`options.tracing.authorizationHeader`](#optionstracingauthorizationheader)
@@ -146,6 +153,15 @@ setupOpenTelemetry({ /* ... */ });

### Running in production

#### Production metrics

To send metrics in production, you'll need an API Gateway key and the URL of the FT's official metrics collector. [You can find this information in Tech Hub](https://tech.in.ft.com/tech-topics/observability/opentelemetry).
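
As a rough sketch, the two values could be read from the environment and passed in as `options.metrics`. The environment variable names are the ones documented under the configuration options; the `buildMetricsOptions` helper is hypothetical and only illustrates the shape of the options object, and the commented-out `require` path is an assumption based on this package's name.

```js
// Hypothetical helper (not part of this package): builds the `metrics`
// options object from environment variables.
function buildMetricsOptions(env = process.env) {
	const endpoint = env.OPENTELEMETRY_METRICS_ENDPOINT;

	// Without an endpoint, metrics are not sent, so there is nothing to configure
	if (!endpoint) {
		return undefined;
	}

	return {
		endpoint,
		apiGatewayKey: env.OPENTELEMETRY_API_GATEWAY_KEY
	};
}

// Assumed usage with manual setup:
// const setupOpenTelemetry = require('@dotcom-reliability-kit/opentelemetry');
// setupOpenTelemetry({ metrics: buildMetricsOptions() });
```

Returning `undefined` when no endpoint is set matches the documented default for `options.metrics`, under which metrics are not sent.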

#### Production tracing

> [!WARNING]<br />
> Tracing is not yet supported centrally, so these instructions assume that your team or group will set up its own collector.

To use this package in production you'll need a [Collector](https://opentelemetry.io/docs/collector/) that can receive traces over HTTP. This could be something you run (e.g. the [AWS Distro for OpenTelemetry](https://aws.amazon.com/otel/)) or a third-party service.

Having traces collected centrally will give you a good view of how your production application is performing, allowing you to debug issues more effectively.
@@ -154,15 +170,21 @@ OpenTelemetry can generate a huge amount of data which, depending on where you s

### Running locally

#### Local metrics

We don't recommend setting up a metrics exporter locally. If your `NODE_ENV` environment variable is not set to `production`, then local metrics will be available in Grafana under `OpenTelemetry Test`.

#### Local tracing

If you want to debug specific performance issues then setting up a local Collector can help you. You shouldn't be sending traces in local development to your production backend as this could make it harder to debug real production issues. You probably also don't want to sample traces in local development – you'll want to collect all traffic because the volume will be much lower.

#### Running a backend
##### Running a backend

To view traces locally, you'll need a backend for them to be sent to. In this example we'll be using [Jaeger](https://www.jaegertracing.io/) via [Docker](https://www.docker.com/). You'll need Docker (or a compatible [alternative](https://podman.io/)) to be set up first.

[Jaeger maintains a useful guide for this](https://www.jaegertracing.io/docs/1.53/getting-started/#all-in-one).

#### Sending traces to your local backend
##### Sending traces to your local backend

Once your backend is running you'll need to make some configuration changes.

@@ -212,6 +234,24 @@ setupOpenTelemetry({

**Deprecated**. This will still work but has been replaced with [`options.tracing.authorizationHeader`](#optionstracingauthorizationheader), which is now the preferred way to set this option.

#### `options.metrics`

An object containing metrics-specific configuration. Defaults to `undefined`, which means that OpenTelemetry metrics will not be sent.

#### `options.metrics.endpoint`

A URL to send OpenTelemetry metrics to. E.g. `http://localhost:4318/v1/metrics`. Defaults to `undefined` which means that OpenTelemetry metrics will not be sent.

**Environment variable:** `OPENTELEMETRY_METRICS_ENDPOINT`<br/>
**Option:** `metrics.endpoint` (`String`)

#### `options.metrics.apiGatewayKey`

The API key to send with requests to the central API-Gateway-backed OpenTelemetry metrics collector. It is sent in the `X-OTel-Key` HTTP header. Defaults to `undefined`.

**Environment variable:** `OPENTELEMETRY_API_GATEWAY_KEY`<br/>
**Option:** `metrics.apiGatewayKey` (`String`)

#### `options.tracing`

An object containing tracing-specific configuration. Defaults to `undefined`, which means that OpenTelemetry traces will not be sent.
6 changes: 5 additions & 1 deletion packages/opentelemetry/lib/config/instrumentations.js
@@ -1,6 +1,9 @@
const {
getNodeAutoInstrumentations
} = require('@opentelemetry/auto-instrumentations-node');
const {
RuntimeNodeInstrumentation
} = require('@opentelemetry/instrumentation-runtime-node');
const { logRecoverableError } = require('@dotcom-reliability-kit/log-error');
const { UserInputError } = require('@dotcom-reliability-kit/errors');

@@ -21,7 +24,8 @@ exports.createInstrumentationConfig = function createInstrumentationConfig() {
'@opentelemetry/instrumentation-fs': {
enabled: false
}
})
}),
new RuntimeNodeInstrumentation()
];
};

60 changes: 60 additions & 0 deletions packages/opentelemetry/lib/config/metrics.js
@@ -0,0 +1,60 @@
const {
	OTLPMetricExporter
} = require('@opentelemetry/exporter-metrics-otlp-proto');
const { CompressionAlgorithm } = require('@opentelemetry/otlp-exporter-base');
const { PeriodicExportingMetricReader } = require('@opentelemetry/sdk-metrics');
const logger = require('@dotcom-reliability-kit/logger');
const { METRICS_USER_AGENT } = require('./user-agents');

/**
 * @typedef {object} MetricsOptions
 * @property {string} [apiGatewayKey]
 *     The API key to send to the metrics collector if you're using the FT's official metrics collector endpoint.
 * @property {string} [endpoint]
 *     The URL to send OpenTelemetry metrics to, for example http://localhost:4318/v1/metrics.
 */

/**
 * Create an OpenTelemetry metrics configuration.
 *
 * @param {MetricsOptions} options
 * @returns {Partial<import('@opentelemetry/sdk-node').NodeSDKConfiguration>}
 */
exports.createMetricsConfig = function createMetricsConfig(options) {
	/** @type {Partial<import('@opentelemetry/sdk-node').NodeSDKConfiguration>} */
	const config = {};

	// If we have an OpenTelemetry metrics endpoint then set it up
	if (options?.endpoint) {
		const headers = {
			'user-agent': METRICS_USER_AGENT
		};
		if (options.apiGatewayKey) {
			headers['X-OTel-Key'] = options.apiGatewayKey;
		}
		config.metricReader = new PeriodicExportingMetricReader({
			exporter: new OTLPMetricExporter({
				url: options.endpoint,
				compression: CompressionAlgorithm.GZIP,
				headers
			})
		});

		logger.info({
			event: 'OTEL_METRICS_STATUS',
			message: `OpenTelemetry metrics are enabled and exporting to endpoint ${options.endpoint}`,
			enabled: true,
			endpoint: options.endpoint
		});
	} else {
		logger.warn({
			event: 'OTEL_METRICS_STATUS',
			message:
				'OpenTelemetry metrics are disabled because no metrics endpoint was set',
			enabled: false,
			endpoint: null
		});
	}

	return config;
};
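
The header handling in `createMetricsConfig` can be sketched on its own, without any OpenTelemetry dependencies. This is an illustration only: `buildMetricsHeaders` is a hypothetical name that does not exist in the package, and the user agent is passed in as a plain string rather than imported from `./user-agents`.

```js
// Hypothetical, dependency-free sketch of the header logic above.
// Returns null when metrics would be disabled (no endpoint configured).
function buildMetricsHeaders(options, userAgent) {
	if (!options?.endpoint) {
		return null;
	}

	const headers = { 'user-agent': userAgent };

	// The API Gateway key is optional and is only sent when provided
	if (options.apiGatewayKey) {
		headers['X-OTel-Key'] = options.apiGatewayKey;
	}

	return headers;
}

// Example values below are made up for illustration
const exampleHeaders = buildMetricsHeaders(
	{ endpoint: 'http://localhost:4318/v1/metrics', apiGatewayKey: 'example-key' },
	'example-user-agent'
);
```

Keeping the key out of the headers when it is not set means a local collector that needs no authentication receives only the user agent.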
14 changes: 8 additions & 6 deletions packages/opentelemetry/lib/config/resource.js
@@ -1,9 +1,10 @@
const appInfo = require('@dotcom-reliability-kit/app-info');
const appInfo = require('@dotcom-reliability-kit/app-info').semanticConventions;
const { Resource } = require('@opentelemetry/resources');
const {
SEMRESATTRS_CLOUD_PROVIDER,
SEMRESATTRS_CLOUD_REGION,
SEMRESATTRS_DEPLOYMENT_ENVIRONMENT,
SEMRESATTRS_SERVICE_INSTANCE_ID,
SEMRESATTRS_SERVICE_NAME,
SEMRESATTRS_SERVICE_VERSION
} = require('@opentelemetry/semantic-conventions');
@@ -16,10 +17,11 @@ const {
exports.createResourceConfig = function createResourceConfig() {
// We set OpenTelemetry resource attributes based on app data
return new Resource({
[SEMRESATTRS_SERVICE_NAME]: appInfo.systemCode || undefined,
[SEMRESATTRS_SERVICE_VERSION]: appInfo.releaseVersion || undefined,
[SEMRESATTRS_CLOUD_PROVIDER]: appInfo.cloudProvider || undefined,
[SEMRESATTRS_CLOUD_REGION]: appInfo.region || undefined,
[SEMRESATTRS_DEPLOYMENT_ENVIRONMENT]: appInfo.environment || undefined
[SEMRESATTRS_SERVICE_NAME]: appInfo.service.name,
[SEMRESATTRS_SERVICE_VERSION]: appInfo.service.version,
[SEMRESATTRS_SERVICE_INSTANCE_ID]: appInfo.service.instance.id,
[SEMRESATTRS_CLOUD_PROVIDER]: appInfo.cloud.provider,
[SEMRESATTRS_CLOUD_REGION]: appInfo.cloud.region,
[SEMRESATTRS_DEPLOYMENT_ENVIRONMENT]: appInfo.deployment.environment
});
};
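
To make the new mapping easier to follow, here is a small sketch of how a nested `semanticConventions`-style object flattens into resource attribute keys. The object shape is inferred from the property accesses in this diff and its values are invented for illustration. The string keys are the values of the corresponding `SEMRESATTRS_*` constants, and `toResourceAttributes` is a hypothetical helper, not part of the package.

```js
// Hypothetical stand-in for
// require('@dotcom-reliability-kit/app-info').semanticConventions
// (shape inferred from the diff, values are made up)
const semanticConventions = {
	service: {
		name: 'example-app',
		version: '1.2.3',
		instance: { id: 'example-instance-id' }
	},
	cloud: { provider: 'aws', region: 'eu-west-1' },
	deployment: { environment: 'production' }
};

// Flatten the nested object into OpenTelemetry resource attribute keys
function toResourceAttributes(info) {
	return {
		'service.name': info.service.name,
		'service.version': info.service.version,
		'service.instance.id': info.service.instance.id,
		'cloud.provider': info.cloud.provider,
		'cloud.region': info.cloud.region,
		'deployment.environment': info.deployment.environment
	};
}

const attributes = toResourceAttributes(semanticConventions);
```

Because the nested property paths mirror the attribute names, each line of the mapping reads the same on both sides.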
2 changes: 1 addition & 1 deletion packages/opentelemetry/lib/config/tracing.js
@@ -13,7 +13,7 @@ const DEFAULT_SAMPLE_PERCENTAGE = 5;
/**
* @typedef {object} TracingOptions
* @property {string} [authorizationHeader]
* The HTTP `Authorization` header to send with OpenTelemetry tracing requests.
* The HTTP `Authorization` header to send with OpenTelemetry tracing requests if you're using the Customer Products trace collector endpoint.
* @property {string} [endpoint]
* The URL to send OpenTelemetry trace segments to, for example http://localhost:4318/v1/traces.
* @property {number} [samplePercentage]
2 changes: 2 additions & 0 deletions packages/opentelemetry/lib/config/user-agents.js
@@ -1,7 +1,9 @@
const appInfo = require('@dotcom-reliability-kit/app-info');
const packageJson = require('../../package.json');
const metricExporterPackageJson = require('@opentelemetry/exporter-metrics-otlp-proto/package.json');
const traceExporterPackageJson = require('@opentelemetry/exporter-trace-otlp-proto/package.json');

const BASE_USER_AGENT = `FTSystem/${appInfo.systemCode} (${packageJson.name}/${packageJson.version})`;

exports.METRICS_USER_AGENT = `${BASE_USER_AGENT} (${metricExporterPackageJson.name}/${metricExporterPackageJson.version})`;
exports.TRACING_USER_AGENT = `${BASE_USER_AGENT} (${traceExporterPackageJson.name}/${traceExporterPackageJson.version})`;
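
For reference, the resulting user-agent strings have the following shape. This sketch swaps the real `require` calls for plain objects; the system code and version numbers are placeholders rather than real values, and `buildUserAgent` is a hypothetical helper.

```js
// Hypothetical helper that mirrors the template strings above
function buildUserAgent(systemCode, pkg, exporterPkg) {
	const base = `FTSystem/${systemCode} (${pkg.name}/${pkg.version})`;
	return `${base} (${exporterPkg.name}/${exporterPkg.version})`;
}

// Placeholder inputs: the system code and versions are illustrative only
const exampleUserAgent = buildUserAgent(
	'example-app',
	{ name: '@dotcom-reliability-kit/opentelemetry', version: '0.0.0' },
	{ name: '@opentelemetry/exporter-metrics-otlp-proto', version: '0.0.0' }
);
```

Using a separate string per exporter lets a collector tell metrics and tracing requests apart by exporter package and version.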