From 1fc8c098f603a4172bc54f165b0246aeec34de3c Mon Sep 17 00:00:00 2001 From: Yuri Shkuro Date: Mon, 14 Oct 2024 21:43:27 -0400 Subject: [PATCH 1/2] Remove references to jaeger-agent Signed-off-by: Yuri Shkuro --- content/docs/next-release/apis.md | 32 +++------- content/docs/next-release/architecture.md | 8 --- content/docs/next-release/deployment.md | 63 +------------------ content/docs/next-release/external-guides.md | 2 - content/docs/next-release/faq.md | 16 ++--- content/docs/next-release/getting-started.md | 10 +-- content/docs/next-release/monitoring.md | 1 - content/docs/next-release/operator.md | 6 +- .../docs/next-release/performance-tuning.md | 46 ++------------ content/docs/next-release/sampling.md | 2 +- content/docs/next-release/security.md | 19 ------ content/docs/next-release/troubleshooting.md | 21 +------ content/docs/next-release/windows.md | 13 +--- 13 files changed, 32 insertions(+), 207 deletions(-) diff --git a/content/docs/next-release/apis.md b/content/docs/next-release/apis.md index 9ea63d19..e092f0ae 100644 --- a/content/docs/next-release/apis.md +++ b/content/docs/next-release/apis.md @@ -15,7 +15,7 @@ Since Jaeger v1.32, **jaeger-collector** and **jaeger-query** Service ports that ## Span reporting APIs -**jaeger-agent** and **jaeger-collector** are the two components of the Jaeger backend that can receive spans. At this time they support two sets of non-overlapping APIs. +**jaeger-collector** is the component of the Jaeger backend that can receive spans. At this time it supports two sets of non-overlapping APIs. 
### OpenTelemetry Protocol (stable) @@ -31,21 +31,17 @@ The OTLP data is accepted in these formats: (1) binary gRPC, (2) Protobuf over H [otlp-rcvr]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/receiver/otlpreceiver/README.md [otlp]: https://github.com/open-telemetry/opentelemetry-proto/blob/main/docs/specification.md -### Thrift over UDP (stable) - -**jaeger-agent** can only receive spans over UDP in Thrift format. The primary API is a UDP packet that contains a Thrift-encoded `Batch` struct defined in the [jaeger.thrift] IDL file, located in the [jaeger-idl] repository. Most Jaeger Clients use Thrift's `compact` encoding, however some client libraries do not support it (notably, Node.js) and use Thrift's `binary` encoding (sent to a different UDP port). **jaeger-agent**'s API is defined by the [agent.thrift] IDL file. - -For legacy reasons, **jaeger-agent** also accepts spans over UDP in Zipkin format, however, only very old versions of Jaeger clients can send data in that format and it is officially deprecated. - ### Protobuf via gRPC (stable) -In a typical Jaeger deployment, **jaeger-agent**s receive spans from Clients and forward them to **jaeger-collector**s. Since Jaeger v1.11, the official and recommended protocol between **jaeger-agent**s and **jaeger-collector**s is `jaeger.api_v2.CollectorService` gRPC endpoint defined in [collector.proto] IDL file. The same endpoint can be used to submit trace data from SDKs directly to **jaeger-collector**. +**Deprecated**: we recommend the OpenTelemetry protocol. + +Since Jaeger v1.11, the official protocol between applications and **jaeger-collector**s is the `jaeger.api_v2.CollectorService` gRPC endpoint defined in the [collector.proto] IDL file. The same endpoint can be used to submit trace data from SDKs directly to **jaeger-collector**. 
### Thrift over HTTP (stable) -In some cases it is not feasible to deploy **jaeger-agent** next to the application, for example, when the application code is running as a serverless function. In these scenarios the SDKs can be configured to submit spans directly to **jaeger-collector**s over HTTP/HTTPS. +**Deprecated**: we recommend the OpenTelemetry protocol. -The same [jaeger.thrift] payload can be submitted in an HTTP POST request to the `/api/traces` endpoint, for example, `https://jaeger-collector:14268/api/traces`. The `Batch` struct needs to be encoded using Thrift's `binary` encoding, and the HTTP request should specify the content type header: +The payload in [jaeger.thrift] format can be submitted in an HTTP POST request to the `/api/traces` endpoint, for example, `https://jaeger-collector:14268/api/traces`. The `Batch` struct needs to be encoded using Thrift's `binary` encoding, and the HTTP request should specify the content type header: ``` Content-Type: application/vnd.apache.thrift.binary @@ -83,7 +79,7 @@ When using the `grpc` storage type (a.k.a. [remote storage](../deployment/#remot This API supports Jaeger's [Remote Sampling](../sampling/#remote-sampling) protocol, defined in the [sampling.proto] IDL file. -Both **jaeger-agent** and **jaeger-collector** implement the API. See [Remote Sampling](../sampling/#remote-sampling) for details on how to configure the Collector with sampling strategies. **jaeger-agent** is merely acting as a proxy to **jaeger-collector**. +**jaeger-collector** implements this API. See [Remote Sampling](../sampling/#remote-sampling) for details on how to configure the Collector with sampling strategies. The following table lists different endpoints and formats that can be used to query for sampling strategies. The official HTTP/JSON endpoints use standard [Protobuf-to-JSON mapping](https://developers.google.com/protocol-buffers/docs/proto3#json). 
@@ -91,8 +87,6 @@ Component | Port | Endpoint | Format | Notes --------- | ----- | ----------------- | --------- | ----- Collector | 14268 | `/api/sampling` | HTTP/JSON | Recommended for most SDKs Collector | 14250 | [sampling.proto] | gRPC | For SDKs that want to use gRPC (e.g. OpenTelemetry Java SDK) -Agent | 5778 | `/sampling` | HTTP/JSON | Recommended for most SDKs if the Agent is used in a deployment -Agent | 5778 | `/` (deprecated) | HTTP/JSON | Legacy format, with enums encoded as numbers. **Not recommended.** **Examples** @@ -102,19 +96,10 @@ $ go run ./cmd/all-in-one \ --sampling.strategies-file=cmd/all-in-one/sampling_strategies.json ``` -Query different endpoints in another terminal: +Query the endpoint in another terminal: ```shell -# Collector $ curl "http://localhost:14268/api/sampling?service=foo" {"strategyType":"PROBABILISTIC","probabilisticSampling":{"samplingRate":1}} - -# Agent -$ curl "http://localhost:5778/sampling?service=foo" -{"strategyType":"PROBABILISTIC","probabilisticSampling":{"samplingRate":1}} - -# Agent, legacy endpoint / (not recommended) -$ curl "http://localhost:5778/?service=foo" -{"strategyType":0,"probabilisticSampling":{"samplingRate":1}} ``` ## Service dependencies graph (internal) @@ -134,7 +119,6 @@ Please refer to the [SPM Documentation](../spm#api) [jaeger-idl]: https://github.com/jaegertracing/jaeger-idl/ [jaeger.thrift]: https://github.com/jaegertracing/jaeger-idl/blob/main/thrift/jaeger.thrift -[agent.thrift]: https://github.com/jaegertracing/jaeger-idl/blob/main/thrift/agent.thrift [sampling.thrift]: https://github.com/jaegertracing/jaeger-idl/blob/main/thrift/sampling.thrift [collector.proto]: https://github.com/jaegertracing/jaeger-idl/blob/main/proto/api_v2/collector.proto [query.proto]: https://github.com/jaegertracing/jaeger-idl/blob/main/proto/api_v2/query.proto diff --git a/content/docs/next-release/architecture.md b/content/docs/next-release/architecture.md index e0b11458..22cc9437 100644 --- 
a/content/docs/next-release/architecture.md +++ b/content/docs/next-release/architecture.md @@ -104,14 +104,6 @@ Instrumentation typically should not depend on specific tracing SDKs, but only o The instrumentation is designed to be always on in production. To minimize overhead, the SDKs employ various sampling strategies. When a trace is sampled, the profiling span data is captured and transmitted to the Jaeger backend. When a trace is not sampled, no profiling data is collected at all, and the calls to the tracing API are short-circuited to incur a minimal amount of overhead. For more information, please refer to the [Sampling](../sampling/) page. -### Agent - -{{< warning >}} -**jaeger-agent** is [deprecated](https://github.com/jaegertracing/jaeger/issues/4739). The OpenTelemetry data can be sent from the OpenTelemetry SDKs (equipped with OTLP exporters) directly to **jaeger-collector**. Alternatively, use the OpenTelemetry Collector as a local agent. -{{< /warning >}} - -**jaeger-agent** is a network daemon that listens for spans sent over UDP, which are batched and sent to the collector. It is designed to be deployed to all hosts as an infrastructure component. The agent abstracts the routing and discovery of the collectors away from the client. **jaeger-agent** is **not** a required component. - ### Collector **jaeger-collector** receives traces, runs them through a processing pipeline for validation and clean-up/enrichment, and stores them in a storage backend. Jaeger comes with built-in support for several storage backends (see [Deployment](../deployment)), as well as extensible plugin framework for implementing custom storage plugins. 
diff --git a/content/docs/next-release/deployment.md b/content/docs/next-release/deployment.md index 13780dd9..14f7aa93 100644 --- a/content/docs/next-release/deployment.md +++ b/content/docs/next-release/deployment.md @@ -23,7 +23,6 @@ The main Jaeger backend components are released as Docker images on [Docker Hub] Component | Docker Hub | Quay --------------------- | -------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------- **jaeger-all-in-one** | [hub.docker.com/r/jaegertracing/all-in-one/](https://hub.docker.com/r/jaegertracing/all-in-one/) | [quay.io/repository/jaegertracing/all-in-one](https://quay.io/repository/jaegertracing/all-in-one) -**jaeger-agent** | [hub.docker.com/r/jaegertracing/jaeger-agent/](https://hub.docker.com/r/jaegertracing/jaeger-agent/) | [quay.io/repository/jaegertracing/jaeger-agent](https://quay.io/repository/jaegertracing/jaeger-agent) **jaeger-collector** | [hub.docker.com/r/jaegertracing/jaeger-collector/](https://hub.docker.com/r/jaegertracing/jaeger-collector/) | [quay.io/repository/jaegertracing/jaeger-collector](https://quay.io/repository/jaegertracing/jaeger-collector) **jaeger-query** | [hub.docker.com/r/jaegertracing/jaeger-query/](https://hub.docker.com/r/jaegertracing/jaeger-query/) | [quay.io/repository/jaegertracing/jaeger-query](https://quay.io/repository/jaegertracing/jaeger-query) **jaeger-ingester** | [hub.docker.com/r/jaegertracing/jaeger-ingester/](https://hub.docker.com/r/jaegertracing/jaeger-ingester/) | [quay.io/repository/jaegertracing/jaeger-ingester](https://quay.io/repository/jaegertracing/jaeger-ingester) @@ -65,7 +64,7 @@ Command line option | Environment variable ## All-in-one -Jaeger all-in-one is a special distribution that combines three Jaeger components, [agent](#agent), [collector](#collector), and [query service/UI](#query-service--ui), in 
a single binary or container image. It is useful for single-node deployments where your trace volume is light enough to be handled by a single instance. By default, all-in-one starts with `memory` storage, meaning it will lose all data upon restart. All other [span storage backends](#span-storage-backends) can also be used with all-in-one, but `memory` and `badger` are exclusive to all-in-one because they cannot be shared between instances. +Jaeger all-in-one is a special distribution that combines two Jaeger components, [collector](#collector) and [query service/UI](#query-service--ui), in a single binary or container image. It is useful for single-node deployments where your trace volume is light enough to be handled by a single instance. By default, all-in-one starts with `memory` storage, meaning it will lose all data upon restart. All other [span storage backends](#span-storage-backends) can also be used with all-in-one, but `memory` and `badger` are exclusive to all-in-one because they cannot be shared between instances. All-in-one listens to the same ports as the components it contains (described below), with the exception of the admin port. @@ -94,64 +93,6 @@ docker run -d --name jaeger \ You can navigate to `http://localhost:16686` to access the Jaeger UI. -## Agent - -{{< warning >}} -**jaeger-agent** is [deprecated](https://github.com/jaegertracing/jaeger/issues/4739). The OpenTelemetry data can be sent from the OpenTelemetry SDKs (equipped with OTLP exporters) directly to **jaeger-collector**. See the [Architecture](../architecture) page for alternative deployment options. -{{< /warning >}} - -**jaeger-agent** is designed to receive tracing data in Thrift format over UDP and run locally on each host, either as a host agent / daemon or as an application sidecar. 
**jaeger-agent** exposes the following ports: - -Port | Protocol | Function ------ | ------- | --- -6831 | UDP | Accepts [jaeger.thrift][jaeger-thrift] in `compact` Thrift protocol used by most current Jaeger clients. -6832 | UDP | Accepts [jaeger.thrift][jaeger-thrift] in `binary` Thrift protocol used by Node.js Jaeger client (because [thriftrw][thriftrw] npm package does not support `compact` protocol). -5778 | HTTP | Serves SDK configs, namely sampling strategies at `/sampling` (see [Remote Sampling](../sampling/#remote-sampling)). -5775 | UDP | Accepts [zipkin.thrift][zipkin-thrift] in `compact` Thrift protocol (deprecated; only used by very old Jaeger clients, circa 2016). -14271 | HTTP | Admin port: health check at `/` and metrics at `/metrics`. - -It can be executed directly on the host or via Docker, as follows: - -```sh -## make sure to expose only the ports you use in your deployment scenario! -docker run \ - --rm \ - -p6831:6831/udp \ - -p6832:6832/udp \ - -p5778:5778/tcp \ - -p5775:5775/udp \ - jaegertracing/jaeger-agent:{{< currentVersion >}} -``` - -### Discovery System Integration - -**jaeger-agent**s can connect point-to-point to a single **jaeger-collector** address, which could be -load balanced by another infrastructure component (e.g. DNS) across multiple **jaeger-collector**s. -**jaeger-agent** can also be configured with a static list of **jaeger-collector** addresses. - -On Docker, a command like the following can be used: - -```sh -docker run \ - --rm \ - -p5775:5775/udp \ - -p6831:6831/udp \ - -p6832:6832/udp \ - -p5778:5778/tcp \ - jaegertracing/jaeger-agent:{{< currentVersion >}} \ - --reporter.grpc.host-port=jaeger-collector.jaeger-infra.svc:14250 -``` - -When using gRPC, you have several options for load balancing and name resolution: - -* Single connection and no load balancing. This is the default if you specify a single `host:port`. 
(example: `--reporter.grpc.host-port=jaeger-collector.jaeger-infra.svc:14250`) -* Static list of hostnames and round-robin load balancing. This is what you get with a comma-separated list of addresses. (example: `reporter.grpc.host-port=jaeger-collector1:14250,jaeger-collector2:14250,jaeger-collector3:14250`) -* Dynamic DNS resolution and round-robin load balancing. To get this behavior, prefix the address with `dns:///` and gRPC will attempt to resolve the hostname using SRV records (for [external load balancing](https://github.com/grpc/grpc/blob/master/doc/load-balancing.md)), TXT records (for [service configs](https://github.com/grpc/grpc/blob/master/doc/service_config.md)), and A records. Refer to the [gRPC Name Resolution docs](https://github.com/grpc/grpc/blob/master/doc/naming.md) and the [dns_resolver.go implementation](https://github.com/grpc/grpc-go/blob/master/resolver/dns/dns_resolver.go) for more info. (example: `--reporter.grpc.host-port=dns:///jaeger-collector.jaeger-infra.svc:14250`) - -### Agent level tags - -Jaeger supports agent level tags, that can be added to the process tags of all spans passing through **jaeger-agent**. This is supported through the command line flag `--agent.tags=key1=value1,key2=value2,...,keyn=valuen`. Tags can also be set through an environment flag like so - `--agent.tags=key=${envFlag:defaultValue}` - The tag value will be set to the value of the `envFlag` environment key and `defaultValue` if not set. - ## Collector **jaeger-collector**s are stateless and thus many instances of **jaeger-collector** can be run in parallel. @@ -171,7 +112,7 @@ At default settings **jaeger-collector** exposes the following ports: | 14269 | HTTP | `/` | Admin port: health check (`GET`). | | | `/metrics` | Prometheus-style metrics (`GET`). | 9411 | HTTP | `/api/v1/spans` and `/api/v2/spans` | Accepts Zipkin spans in Thrift, JSON and Proto (disabled by default). 
-| 14250 | gRPC | n/a | Used by **jaeger-agent** to send spans in [model.proto][] Protobuf format. +| 14250 | gRPC | n/a | Accepts spans in [model.proto][] Protobuf format. ## Ingester diff --git a/content/docs/next-release/external-guides.md b/content/docs/next-release/external-guides.md index ae89605f..0a4d0c63 100644 --- a/content/docs/next-release/external-guides.md +++ b/content/docs/next-release/external-guides.md @@ -23,10 +23,8 @@ weight: 13 ## Deployment -* [Running Jaeger on bare metal](https://medium.com/jaegertracing/running-jaeger-agent-on-bare-metal-d1fc47d31fab) by Juraci Paixão Kröhling * [Protecting Jaeger UI with an OAuth sidecar Proxy](https://medium.com/jaegertracing/protecting-jaeger-ui-with-an-oauth-sidecar-proxy-34205cca4bb1) by Juraci Paixão Kröhling * [Jaeger and multi-tenancy](https://medium.com/jaegertracing/jaeger-and-multitenancy-99dfa1d49dc0) by Juraci Paixão Kröhling -* [Deployment strategies for the Jaeger Agent](https://medium.com/jaegertracing/deployment-strategies-for-the-jaeger-agent-1d6f91796d09) by Juraci Paixão Kröhling * [How to deploy Jaeger on AWS: a comprehensive step-by-step guide](https://www.aspecto.io/blog/how-to-deploy-jaeger-on-aws-a-comprehensive-step-by-step-guide/) by Tom Zach ## Jaeger Performance Tuning diff --git a/content/docs/next-release/faq.md b/content/docs/next-release/faq.md index ac330bc3..2491cfdc 100644 --- a/content/docs/next-release/faq.md +++ b/content/docs/next-release/faq.md @@ -16,17 +16,17 @@ The Dependencies page shows a graph of services traced by Jaeger and connections Please refer to the [Troubleshooting](../troubleshooting/) guide. -## Do I need to run jaeger-agent? +## What happened to jaeger-agent? -{{< warning >}} -Since the Jaeger client libraries [are deprecated](../client-libraries) and the OpenTelemetry SDKs are phasing out support for Jaeger Thrift format, the **jaeger-agent** is no longer required or recommended. 
See the [Architecture](../architecture) page for alternative deployment options. -{{< /warning >}} +Since the Jaeger client libraries [are deprecated](../client-libraries) and the OpenTelemetry SDKs are phasing out support for Jaeger Thrift format, the **jaeger-agent** is no longer required and no longer supported. See the [Architecture](../architecture) page for alternative deployment options. -`jaeger-agent` is not always necessary. Jaeger client libraries can be configured to export trace data directly to `jaeger-collector`. However, the following are the reasons why running `jaeger-agent` is recommended: +Sometimes it is still desirable to run a **host agent**: - * If we want Jaeger client libraries to send trace data directly to **jaeger-collector**s, we must provide them with a URL of the HTTP endpoint. It means that our applications require additional configuration containing this parameter, especially if we are running multiple Jaeger installations (e.g. in different availability zones or regions) and want the data sent to a nearby installation. In contrast, when using the agent, the libraries require no additional configuration because the agent is always accessible via `localhost`. It acts as a sidecar and proxies the requests to the appropriate **jaeger-collector**s. - * **jaeger-agent** can be configured to enrich the tracing data with infrastructure-specific metadata by adding extra tags to the spans, such as the current zone, region, etc. If **jaeger-agent** is running as a host daemon, it will be shared by all applications running on the same host. If **jaeger-agent** is running as a true sidecar, i.e. one per application, it can provide additional functionality such as strong authentication, multi-tenancy (see [this blog post](https://medium.com/jaegertracing/jaeger-and-multitenancy-99dfa1d49dc0)), pod name, etc. - * **jaeger-agent**s allow implementing traffic control to **jaeger-collector**s. 
If we have thousands of hosts in the data center, each running many applications, and each application sending data directly to **jaeger-collector**s, there may be too many open connections for each **jaeger-collector** to handle. The agents can load balance this traffic with fewer connections. + * If we want SDKs to send trace data directly to **jaeger-collector**s, we must provide them with a URL of the HTTP endpoint. It means that our applications require additional configuration containing this parameter, especially if we are running multiple Jaeger installations (e.g. in different availability zones or regions) and want the data sent to a nearby installation. In contrast, when using the host agent, the libraries require no additional configuration because the agent is always accessible via `localhost`. It acts as a sidecar and proxies the requests to the appropriate **jaeger-collector**s. + * A host agent can be configured to enrich the tracing data with infrastructure-specific metadata by adding extra tags to the spans, such as the current zone, region, etc. If the host agent is running as a host daemon, it will be shared by all applications running on the same host. If the host agent is running as a true sidecar, i.e. one per application, it can provide additional functionality such as strong authentication, multi-tenancy (see [this blog post](https://medium.com/jaegertracing/jaeger-and-multitenancy-99dfa1d49dc0)), pod name, etc. + * Host agents allow implementing traffic control to **jaeger-collector**s. If we have thousands of hosts in the data center, each running many applications, and each application sending data directly to **jaeger-collector**s, there may be too many open connections for each **jaeger-collector** to handle. The agents can load balance this traffic with fewer connections. + +If your circumstances require a host agent, you can deploy the OpenTelemetry Collector in that capacity. ## What is the recommended storage backend? 
diff --git a/content/docs/next-release/getting-started.md b/content/docs/next-release/getting-started.md index 9423ea38..6dc897e5 100644 --- a/content/docs/next-release/getting-started.md +++ b/content/docs/next-release/getting-started.md @@ -14,16 +14,13 @@ Historically, the Jaeger project supported its own SDKs (aka tracers, client lib ## All in One -**all-in-one** is an executable designed for quick local testing. It includes the Jaeger UI, **jaeger-collector**, **jaeger-query**, and **jaeger-agent**, with an in memory storage component. +**all-in-one** is an executable designed for quick local testing. It includes the Jaeger UI, **jaeger-collector**, and **jaeger-query**, with an in memory storage component. The simplest way to start the all-in-one is to use the pre-built image published to DockerHub (a single command line). ```bash docker run --rm --name jaeger \ -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \ - -p 6831:6831/udp \ - -p 6832:6832/udp \ - -p 5778:5778 \ -p 16686:16686 \ -p 4317:4317 \ -p 4318:4318 \ @@ -46,11 +43,6 @@ The container exposes the following ports: Port | Protocol | Component | Function ----- | ------- | --------- | --- -6831 | UDP | agent | accept `jaeger.thrift` over Thrift-compact protocol (used by most SDKs) -6832 | UDP | agent | accept `jaeger.thrift` over Thrift-binary protocol (used by Node.js SDK) -5775 | UDP | agent | (deprecated) accept `zipkin.thrift` over compact Thrift protocol (used by legacy clients only) -5778 | HTTP | agent | serve configs (sampling, etc.) 
- | | | 16686 | HTTP | query | serve frontend | | | 4317 | HTTP | collector | accept OpenTelemetry Protocol (OTLP) over gRPC diff --git a/content/docs/next-release/monitoring.md b/content/docs/next-release/monitoring.md index 8e82b6e3..1ead37d0 100644 --- a/content/docs/next-release/monitoring.md +++ b/content/docs/next-release/monitoring.md @@ -18,7 +18,6 @@ Each Jaeger component exposes the metrics scraping endpoint on the admin port: Component | Port --------------------- | --- -**jaeger-agent** | 14271 **jaeger-collector** | 14269 **jaeger-query** | 16687 **jaeger-ingester** | 14270 diff --git a/content/docs/next-release/operator.md b/content/docs/next-release/operator.md index 134d4c6b..d924eb54 100644 --- a/content/docs/next-release/operator.md +++ b/content/docs/next-release/operator.md @@ -20,7 +20,7 @@ While we intend to have the Jaeger Operator working for as many Kubernetes versi While multiple operators might coexist watching the same set of namespaces, which operator will succeed in setting itself as the owner of the CR is undefined behavior. Automatic injection of the sidecars might also result in undefined behavior. Therefore, it's highly recommended to have at most one operator watching each namespace. Note that namespaces might contain any number of Jaeger instances (CRs). {{< info >}} -The Jaeger Operator version tracks one version of the Jaeger components (**jaeger-query**, **jaeger-collector**, **jaeger-agent**). When a new version of the Jaeger components is released, a new version of the operator will be released that understands how running instances of the previous version can be upgraded to the new version. +The Jaeger Operator version tracks one version of the Jaeger components (**jaeger-query**, **jaeger-collector**). When a new version of the Jaeger components is released, a new version of the operator will be released that understands how running instances of the previous version can be upgraded to the new version. 
{{< /info >}} ## Prerequisite @@ -119,7 +119,7 @@ After the role is granted, switch back to a non-privileged user. # Quick Start - Deploying the AllInOne image -The simplest possible way to create a Jaeger instance is by creating a YAML file like the following example. This will install the default AllInOne strategy, which deploys the **all-in-one** image (combining **jaeger-agent**, **jaeger-collector**, **jaeger-query**, and Jaeger UI) in a single pod, using in-memory storage by default. +The simplest possible way to create a Jaeger instance is by creating a YAML file like the following example. This will install the default AllInOne strategy, which deploys the **all-in-one** image (combining **jaeger-collector**, **jaeger-query**, and Jaeger UI) in a single pod, using in-memory storage by default. {{< info >}} This default strategy is intended for development, testing, and demo purposes, not for production. @@ -240,7 +240,7 @@ The available strategies are described in the following sections. This strategy is intended for development, testing, and demo purposes. -The main backend components,**jaeger-agent**, **jaeger-collector** and **jaeger-query** service, are all packaged into a single executable which is configured (by default) to use in-memory storage. This strategy cannot be scaled beyond one replica. +The main backend components, **jaeger-collector** and **jaeger-query** service, are all packaged into a single executable which is configured (by default) to use in-memory storage. This strategy cannot be scaled beyond one replica. 
## Production strategy diff --git a/content/docs/next-release/performance-tuning.md b/content/docs/next-release/performance-tuning.md index 6d019a1f..7f6e7bf5 100644 --- a/content/docs/next-release/performance-tuning.md +++ b/content/docs/next-release/performance-tuning.md @@ -21,14 +21,6 @@ Adding **jaeger-collector** instances is recommended when your platform provides Each span is written to the storage by **jaeger-collector** using one worker, blocking it until the span has been stored. When the storage is too slow, the number of workers blocked by the storage might be too high, causing spans to be dropped. To help diagnose this situation, the histogram `jaeger_collector_save_latency_bucket` can be analyzed. Ideally, the latency should remain the same over time. When the histogram shows that most spans are taking longer and longer over time, it’s a good indication that your storage might need some attention. -### Place the Agents close to your applications - -{{< warning >}} -Since the Jaeger client libraries [are deprecated](../client-libraries) and the OpenTelemetry SDKs are phasing out support for Jaeger Thrift format, the **jaeger-agent** is no longer required or recommended. See the [Architecture](../architecture) page for alternative deployment options. -{{< /warning >}} - -**jaeger-agent** is meant to be placed on the same host as the instrumented application, in order to avoid UDP packet loss over the network. This is typically accomplished by having one **jaeger-agent** per bare metal host for traditional applications, or as a sidecar in container environments like Kubernetes, as this helps spread the load handled by **jaeger-agent**s with the additional advantage of allowing each **jaeger-agent** to be tweaked individually, according to the application’s needs and importance. 
- ### Consider using Apache Kafka as intermediate buffer Jaeger [can use Apache Kafka](../architecture/) as a buffer between **jaeger-collector** and the actual backing storage (Elasticsearch, Apache Cassandra). This is ideal for cases where the traffic spikes are relatively frequent (prime time traffic) but the storage can eventually catch up once the traffic normalizes. For that, the `SPAN_STORAGE_TYPE` environment variable should be set to `kafka` in **jaeger-collector**, and **jaeger-ingester** component must be used, reading data from Kafka and writing it to the storage. @@ -61,52 +53,26 @@ We recommend setting your clients/SDKs to use the [`remote` sampling strategy](. ### Increase in-memory queue size -Most of the Jaeger clients, such as the Java, Go, and C# clients, buffer spans in memory before sending them to **jaeger-agent**/**jaeger-collector**. The maximum size of this buffer is defined by the environment variable `JAEGER_REPORTER_MAX_QUEUE_SIZE` (default value: about `100` spans): the larger the size, the higher the potential memory consumption. When the instrumented application is generating a large number of spans, it’s possible that the queue will be full causing the Client to discard the new spans (metric `jaeger_tracer_reporter_spans_total{result="dropped",}`). - -In most common scenarios, the queue will be close to empty (metric: `jaeger_tracer_reporter_queue_length`), as spans are flushed to **jaeger-agent** or **jaeger-collector** at regular intervals or when a certain size of the batch is reached. The detailed behavior of this queue is described in this [GitHub issue](https://github.com/jaegertracing/jaeger-client-java/issues/607). +Most of the SDKs buffer spans in memory before sending them to **jaeger-collector**. The maximum size of this buffer is configurable (see respective OpenTelemetry SDK documentation): the larger the size, the higher the potential memory consumption. 
When the instrumented application is generating a large number of spans, it’s possible that the queue will be full causing the SDK to discard the new spans. -Thrift clients also report their dropped spans to **jaeger-agent**. These are then published by **jaeger-agent** itself as `jaeger_agent_client_stats_spans_dropped_total{cause="full-queue|send-failure|too-large",}`. This can be useful if client metrics are unavailable for some reason. +In most common scenarios, the queue will be close to empty, as spans are flushed to **jaeger-collector** at regular intervals or when a certain size of the batch is reached. ### Modify the batched spans flush interval -The Java, Go, NodeJS, Python and C# Clients allow the customization of the flush interval (default value: `1000` milliseconds, or 1 second) used by the reporters, such as the `RemoteReporter`, to trigger a `flush` operation, sending all in-memory spans to **jaeger-agent** or **jaeger-collector**. The lower the flush interval is set to, the more frequent the flush operations happen. As most reporters will wait until enough data is in the queue, this setting will force a flush operation at periodic intervals, so that spans are sent to the backend in a timely fashion. - -When the instrumented application is generating a large number of spans and **jaeger-agent**/**jaeger-collector** is close to the application, the networking overhead might be low, justifying a higher number of flush operations. When the `HttpSender` is being used and the **jaeger-collector** is not close enough to the application, the networking overhead might be too high so that a higher value for this property makes sense. - -## Agent settings - -{{< warning >}} -Since the Jaeger client libraries [are deprecated](../client-libraries) and the OpenTelemetry SDKs are phasing out support for Jaeger Thrift format, the **jaeger-agent** is no longer required or recommended. See the [Architecture](../architecture) page for alternative deployment options. 
-{{< /warning >}}
+The SDKs allow the customization of the flush interval used by the exporters. The lower the flush interval is set to, the more frequent the flush operations happen. As most exporters will wait until enough data is in the queue, this setting will force a flush operation at periodic intervals, so that spans are sent to the backend in a timely fashion.

-**jaeger-agent**s receive data from Clients, sending them in batches to **jaeger-collector**. When not properly configured, it might end up discarding data even if the host machine has plenty of resources.
-
-### Adjust server queue sizes
-
-The set of “server queue size” properties ( `processor.jaeger-binary.server-queue-size`, `processor.jaeger-compact.server-queue-size`, `processor.zipkin-compact.server-queue-size`) indicate the maximum number of span batches that **jaeger-agent** can accept and store in memory. It’s safe to assume that `jaeger-compact` is the most important processor in your **jaeger-agent** setup, as it’s the only one available in most Clients, such as the Java and Go Clients.
-
-The default value for each queue is `1000` span batches. Given that each span batch has up to 64KiB worth of spans, each queue can hold up to 64MiB worth of spans.
-
-In typical scenarios, the queue will be close to empty (metric `jaeger_agent_thrift_udp_server_queue_size`) as span batches should be quickly picked up and processed by a worker. However, sudden spikes in the number of span batches submitted by Clients might occur, causing the batches to be queued. When the queue is full, the older batches are overridden causing spans to be discarded (metric `jaeger_agent_thrift_udp_server_packets_dropped_total`).
-
-### Adjust processor workers
-
-The set of “processor workers” properties ( `processor.jaeger-binary.workers`, `processor.jaeger-compact.workers`, `processor.zipkin-compact.workers`) indicate the number of parallel span batch processors to start. Each worker type has a default size of `10`. In general, span batches are processed as soon as they are placed in the server queue and will block a worker until the whole packet is sent to **jaeger-collector**. For **jaeger-agent**s processing data from multiple Clients, the number of workers should be increased. Given that the cost of each worker is low, a good rule of thumb is 10 workers per Client with moderate traffic: given that each span batch might contain up to 64KiB worth of spans, it means that 10 workers are able to send about 640KiB concurrently to a **jaeger-collector**.
+When the instrumented application is generating a large number of spans and **jaeger-collector** is close to the application, the networking overhead might be low, justifying a higher number of flush operations. When the `HttpSender` is being used and the **jaeger-collector** is not close enough to the application, the networking overhead might be too high so that a higher value for this property makes sense.

 ## Collector settings

-**jaeger-collector** receives data from Clients and **jaeger-agent**s. When not properly configured, it might process less data than what would be possible on the same host, or it might overload the host by consuming more memory than permitted.
+**jaeger-collector** receives data from SDKs. When not properly configured, it might process less data than what would be possible on the same host, or it might overload the host by consuming more memory than permitted.

 ### Adjust queue size

-Similar to the **jaeger-agent**, **jaeger-collector** is able to receive spans and place them in an internal queue for processing. This allows **jaeger-collector** to return immediately to the Client/**jaeger-agent** instead of waiting for the span to make its way to the storage.
+**jaeger-collector** is able to receive spans and place them in an internal queue for processing. This allows **jaeger-collector** to return immediately to the SDK instead of waiting for the span to make its way to the storage.

 The setting `collector.queue-size` (default: `2000`) dictates how many spans the queue should support. In the typical scenario, the queue will be close to empty, as enough workers should exist picking up spans from the queue and sending them to the storage. When the number of items in the queue (metric `jaeger_collector_queue_length`) is permanently high, it’s an indication that either the number of workers should be increased or that the storage cannot keep up with the volume of data that it’s receiving. When the queue is full, the older items in the queue are overridden, causing spans to be discarded (metric `jaeger_collector_spans_dropped_total`).

-{{< warning >}}
-The queue size for **jaeger-agent** is about _span batches_, whereas the queue size for the Collector is about _individual spans_.
-{{< /warning >}}
-
 Given that the queue size should be close to empty most of the time, this setting should be as high as the available memory for the Collector, to provide maximum protection against sudden traffic spikes. However, if your storage layer is under-provisioned and cannot keep up, even a large queue will quickly fill up and start dropping data.

 Experimental: starting from Jaeger 1.17, **jaeger-collector** can adjust the queue size automatically based on the memory requirements and average span size. Set the flag `collector.queue-size-memory` to the maximum memory size in MiB that **jaeger-collector** should use, and Jaeger will periodically calculate the ideal queue size based on the average span size it has seen. For safety reasons, the maximum queue size is hard-coded to 1 million records. If you are using this feature, [give us your feedback](https://www.jaegertracing.io/get-in-touch/)!
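
Reviewer note on the `collector.queue-size-memory` paragraph above: the arithmetic it describes can be sketched roughly as follows. This is a minimal illustration, not Jaeger's actual implementation; it assumes the ideal queue size is simply the memory budget divided by the average span size, and the function name `estimate_queue_size` is hypothetical. Only the 1 million cap comes from the text.

```python
# Hypothetical sketch: the real collector may compute this differently.
MIB = 1024 * 1024
MAX_QUEUE_SIZE = 1_000_000  # hard-coded safety cap mentioned in the docs text


def estimate_queue_size(memory_mib: int, avg_span_bytes: int) -> int:
    """Estimate how many spans fit in the memory budget, capped at 1 million.

    Assumes queue memory is roughly queue_size * average span size.
    """
    if memory_mib <= 0 or avg_span_bytes <= 0:
        raise ValueError("memory_mib and avg_span_bytes must be positive")
    return min((memory_mib * MIB) // avg_span_bytes, MAX_QUEUE_SIZE)


if __name__ == "__main__":
    # 100 MiB budget with 2 KiB spans
    print(estimate_queue_size(100, 2048))
    # a very large budget hits the safety cap
    print(estimate_queue_size(4096, 1024))
```

Under these assumptions, a 100 MiB budget with 2 KiB spans gives a queue of 51200 spans, while a 4 GiB budget with 1 KiB spans is limited by the cap rather than by memory.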
diff --git a/content/docs/next-release/sampling.md b/content/docs/next-release/sampling.md
index 01e818db..b55c7677 100644
--- a/content/docs/next-release/sampling.md
+++ b/content/docs/next-release/sampling.md
@@ -17,7 +17,7 @@ When using configuration object to instantiate the tracer, the type of sampling
 * **Constant** (`sampler.type=const`) sampler always makes the same decision for all traces. It either samples all traces (`sampler.param=1`) or none of them (`sampler.param=0`).
 * **Probabilistic** (`sampler.type=probabilistic`) sampler makes a random sampling decision with the probability of sampling equal to the value of `sampler.param` property. For example, with `sampler.param=0.1` approximately 1 in 10 traces will be sampled.
 * **Rate Limiting** (`sampler.type=ratelimiting`) sampler uses a leaky bucket rate limiter to ensure that traces are sampled with a certain constant rate. For example, when `sampler.param=2.0` it will sample requests with the rate of 2 traces per second.
-* **Remote** (`sampler.type=remote`, which is also the default) sampler consults **jaeger-agent** for the appropriate sampling strategy to use in the current service. This allows controlling the sampling strategies in the services from a central configuration in Jaeger backend (see [Remote Sampling](#remote-sampling)), or even dynamically (see [Adaptive Sampling](#adaptive-sampling)).
+* **Remote** (`sampler.type=remote`, which is also the default) sampler consults **jaeger-collector** for the appropriate sampling strategy to use in the current service. This allows controlling the sampling strategies in the services from a central configuration in Jaeger backend (see [Remote Sampling](#remote-sampling)), or even dynamically (see [Adaptive Sampling](#adaptive-sampling)).

 ## Remote Sampling

diff --git a/content/docs/next-release/security.md b/content/docs/next-release/security.md
index 3c47787f..194a4c51 100644
--- a/content/docs/next-release/security.md
+++ b/content/docs/next-release/security.md
@@ -5,17 +5,6 @@ hasparent: true

 This page documents the existing security mechanisms in Jaeger, organized by the pairwise connections between Jaeger components. We ask for community help with implementing additional security measures (see [issue-1718][]).

-## SDK to Agent
-
-{{< warning >}}
-**jaeger-agent** is [deprecated](https://github.com/jaegertracing/jaeger/issues/4739). The OpenTelemetry data can be sent from the OpenTelemetry SDKs (equipped with OTLP exporters) directly to **jaeger-collector**. Alternatively, use the OpenTelemetry Collector as a local agent.
-{{< /warning >}}
-
-Deployments that involve **jaeger-agent** are meant for trusted environments where the agent is run as a sidecar within the container's network namespace, or as a host agent. Therefore, there is currently no support for traffic encryption between clients and agents.
-
-* {{< check_no >}} Sending trace data over UDP - no TLS/authentication.
-* {{< check_no >}} Retrieving sampling configuration via HTTP - no TLS/authentication.
-
 ## SDK to Collector

 OpenTelemetry SDKs can be configured to communicate directly with **jaeger-collector** via gRPC or HTTP, with optional TLS enabled.
@@ -24,14 +13,6 @@ OpenTelemetry SDKs can be configured to communicate directly with **jaeger-colle
 * {{< check_yes >}} gRPC - TLS with mTLS (client cert authentication) supported.
   * Covers both span export and sampling configuration querying.

-## Agent to Collector
-
-{{< warning >}}
-**jaeger-agent** is [deprecated](https://github.com/jaegertracing/jaeger/issues/4739).
-{{< /warning >}}
-
-* {{< check_yes >}} gRPC - TLS with client cert authentication supported.
-
 ## Collector/Ingester/Query-Service to Storage

 * {{< check_yes >}} Cassandra - TLS with mTLS (client cert authentication) supported.
diff --git a/content/docs/next-release/troubleshooting.md b/content/docs/next-release/troubleshooting.md
index 90487924..b6ba8811 100644
--- a/content/docs/next-release/troubleshooting.md
+++ b/content/docs/next-release/troubleshooting.md
@@ -57,33 +57,16 @@ If you suspect the remote sampling is not working correctly, try these steps:
 {"strategyType":"PROBABILISTIC","probabilisticSampling":{"samplingRate":0.001}}
 ```

-## Bypass the Jaeger Agent
-
-{{< warning >}}
-This only applies when using Jaeger SDKs. The use of **jaeger-agent** [is deprecated](../deployment/#agent) when using OpenTelemetry SDKs.
-{{< /warning >}}
-
-By default, the Jaeger SDK is configured to send spans via UDP to a **jaeger-agent** running on `localhost`. As some networking setups might drop or block UDP packets, or impose size limits, the Jaeger SDK can be configured to bypass **jaeger-agent**, sending spans directly to **jaeger-collector**. Some SDKs, such as the Jaeger SDK for Java, support the environment variable `JAEGER_ENDPOINT` which can be used to specify **jaeger-collector**'s location, such as `http://jaeger-collector:14268/api/traces`. Refer to the Jaeger SDK documentation for the language you are using. For example, when you have configured the `JAEGER_ENDPOINT` property in the Jaeger SDK for Java, it logs the following when the tracer is created (notice `sender=HttpSender`):
-
-    2018-12-10 17:06:30 INFO Configuration:236 - Initialized tracer=JaegerTracer(..., reporter=CompositeReporter(reporters=[RemoteReporter(sender=HttpSender(), ...), ...]), ...)
-
-Note: the Jaeger SDK for Java will not fail when a connection to **jaeger-collector** cannot be established. Spans will be collected and placed in an internal buffer. They might eventually reach **jaeger-collector** once a connection is established, or get dropped in case the buffer reaches its maximum size.
-
 ## Networking Namespace

 If your Jaeger backend is still not able to receive spans (see the following sections on how to check logs and metrics for that), then the issue is most likely with your networking namespace configuration. When running the Jaeger backend components as Docker containers, the typical mistakes are:

  * Not exposing the appropriate ports outside of the container. For example, the collector may be listening on `:14268` inside the container network namespace, but the port is not reachable from the outside.
- * Not making **jaeger-agent**'s or **jaeger-collector**'s host name visible from the application's network namespace. For example, if you run both your application and Jaeger backend in separate containers in Docker, they either need to be in the same namespace, or the application's container needs to be given access to Jaeger backend using the `--link` option of the `docker` command.
+ * Not making **jaeger-collector**'s host name visible from the application's network namespace. For example, if you run both your application and Jaeger backend in separate containers in Docker, they either need to be in the same namespace, or the application's container needs to be given access to Jaeger backend using the `--link` option of the `docker` command.

 ## Increase the logging in the backend components

-**jaeger-agent** and **jaeger-collector** provide useful debugging information when the log level is set to `debug`. Every UDP packet that is received by **jaeger-agent** is logged, as well as every batch that is sent by **jaeger-agent** to **jaeger-collector**. **jaeger-collector** also logs every batch it receives and logs every span that is stored in the permanent storage.
-
-Here's what to expect when **jaeger-agent** is started with the `--log-level=debug` flag:
-
-    {"level":"debug","ts":1544458854.5367086,"caller":"processors/thrift_processor.go:113","msg":"Span(s) received by the agent","bytes-received":359}
-    {"level":"debug","ts":1544458854.5408711,"caller":"tchannel/reporter.go:133","msg":"Span batch submitted by the agent","span-count":3}
+**jaeger-collector** provides useful debugging information when the log level is set to `debug`. **jaeger-collector** logs every batch it receives and logs every span that is stored in the permanent storage.

 On the **jaeger-collector** side, these are the expected log entries when the flag `--log-level=debug` is specified:

diff --git a/content/docs/next-release/windows.md b/content/docs/next-release/windows.md
index 8e8bbbf8..136d257b 100644
--- a/content/docs/next-release/windows.md
+++ b/content/docs/next-release/windows.md
@@ -5,17 +5,6 @@ hasparent: true

 In Windows environments, Jaeger processes can be hosted and managed as Windows services controlled via the `sc` utility. To configure such services on Windows, download [nssm.exe](https://nssm.cc/download) for the appropriate architecture, and issue commands similar to how Jaeger is typically run. The example below showcases a basic Elasticsearch setup, configured using both environment variables and process arguments.

-## Agent
-```bat
-nssm install JaegerAgent C:\Jaeger\jaeger-agent.exe --reporter.grpc.host-port=localhost:14250
-
-nssm set JaegerAgent AppStdout C:\Jaeger\jaeger-agent.out.log
-nssm set JaegerAgent AppStderr C:\Jaeger\jaeger-agent.err.log
-nssm set JaegerAgent Description Jaeger Agent service
-
-nssm start JaegerAgent
-```
-
 ## Collector
 ```bat
 nssm install JaegerCollector C:\Jaeger\jaeger-collector.exe --es.server-urls=http://localhost:9200 --es.username=jaeger --es.password=PASSWORD
@@ -40,4 +29,4 @@ nssm set JaegerUI AppEnvironmentExtra SPAN_STORAGE_TYPE=elasticsearch
 nssm start JaegerUI
 ```

-For additional information & docs, please see [the NSSM usage guide.](https://nssm.cc/usage)
\ No newline at end of file
+For additional information & docs, please see [the NSSM usage guide.](https://nssm.cc/usage)

From 4329486bb1534074a258caaee5d928aa3ac98a73 Mon Sep 17 00:00:00 2001
From: Yuri Shkuro
Date: Mon, 14 Oct 2024 21:46:13 -0400
Subject: [PATCH 2/2] fix

Signed-off-by: Yuri Shkuro
---
 content/docs/next-release/apis.md | 2 +-
 content/docs/next-release/faq.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/docs/next-release/apis.md b/content/docs/next-release/apis.md
index e092f0ae..9dcf966c 100644
--- a/content/docs/next-release/apis.md
+++ b/content/docs/next-release/apis.md
@@ -35,7 +35,7 @@ The OTLP data is accepted in these formats: (1) binary gRPC, (2) Protobuf over H

 **Deprecated**: we recommend the OpenTelemetry protocol.

-Since Jaeger v1.11, the official protocol between applicationss and **jaeger-collector**s is `jaeger.api_v2.CollectorService` gRPC endpoint defined in [collector.proto] IDL file. The same endpoint can be used to submit trace data from SDKs directly to **jaeger-collector**.
+Since Jaeger v1.11, the official protocol between user applications and **jaeger-collector**s is `jaeger.api_v2.CollectorService` gRPC endpoint defined in [collector.proto] IDL file. The same endpoint can be used to submit trace data from SDKs directly to **jaeger-collector**.

 ### Thrift over HTTP (stable)

diff --git a/content/docs/next-release/faq.md b/content/docs/next-release/faq.md
index 2491cfdc..8b83ee6c 100644
--- a/content/docs/next-release/faq.md
+++ b/content/docs/next-release/faq.md
@@ -20,7 +20,7 @@ Please refer to the [Troubleshooting](../troubleshooting/) guide.

 Since the Jaeger client libraries [are deprecated](../client-libraries) and the OpenTelemetry SDKs are phasing out support for Jaeger Thrift format, the **jaeger-agent** is no longer required and no longer supported. See the [Architecture](../architecture) page for alternative deployment options.

-Sometimes it is still desireable to run a **host agent**:
+Sometimes it is still desirable to run a **host agent**:

 * If we want SDKs to send trace data directly to **jaeger-collector**s, we must provide them with a URL of the HTTP endpoint. It means that our applications require additional configuration containing this parameter, especially if we are running multiple Jaeger installations (e.g. in different availability zones or regions) and want the data sent to a nearby installation. In contrast, when using the host agent, the libraries require no additional configuration because the agent is always accessible via `localhost`. It acts as a sidecar and proxies the requests to the appropriate **jaeger-collector**s.
 * A host agent can be configured to enrich the tracing data with infrastructure-specific metadata by adding extra tags to the spans, such as the current zone, region, etc. If the host agent is running as a host daemon, it will be shared by all applications running on the same host. If the host agent is running as a true sidecar, i.e. one per application, it can provide additional functionality such as strong authentication, multi-tenancy (see [this blog post](https://medium.com/jaegertracing/jaeger-and-multitenancy-99dfa1d49dc0)), pod name, etc.