APM: add known issue of lazy rollover bug #4459

Merged
merged 7 commits into from
Oct 31, 2024
Changes from 2 commits
45 changes: 45 additions & 0 deletions docs/en/observability/apm/known-issues.asciidoc
@@ -21,6 +21,51 @@ _Versions: XX.XX.XX, YY.YY.YY, ZZ.ZZ.ZZ_
// If applicable, link to fix
////

[discrete]
== Upgrading to v8.15.0 may cause ingestion to fail

_Elastic Stack versions: 8.15.0_ +
_Fixed in Elastic Stack version 8.15.1_

// The conditions in which this issue occurs
The issue only occurs when _upgrading_ the {stack} from 8.12.2 directly to any 8.15.x version.
The issue does _not_ occur when creating a _new_ cluster using any 8.15.x version, or when upgrading
from 8.12.2 to 8.13.x and then to 8.15.x.

// Describe why it happens
In APM Server versions prior to 8.13.0, an ingest pipeline performs a version check.
The check rejects any APM document produced by an APM Server whose version differs from that of the installed APM ingest pipeline.
In 8.13.0 the version check was removed from the ingest pipeline.
However, 8.15.0 changed how APM Server manages data streams, and the new mechanism is not aware of the version check still present in the ingest pipeline.
Thus, the leftover version check remains and prevents the ingestion of data.
Contributor

I find this a bit confusing, and the main RC is the lazy rollover bug; how about:

Due to the combination of an internal change in how APM data management assets are set up from 8.15 onwards and a bug in Elasticsearch related to lazy rollover of data streams, the ingest pipeline conducting the version check is not removed on upgrade and prevents the ingestion of data.

Contributor Author

I was hesitant to add a link to the ES issue, but yeah, that makes it read better.


// How to fix it
If the deployment is running 8.15.0, upgrade the deployment to 8.15.1 or above.
A manual rollover of all APM data streams is required to pick up the new index templates and remove the faulty ingest pipeline version check.
Perform the following requests (they assume the `default` namespace; adjust if necessary):
+
[source,txt]
----
POST /traces-apm-default/_rollover
POST /traces-apm.rum-default/_rollover
POST /logs-apm.error-default/_rollover
POST /logs-apm.app-default/_rollover
POST /metrics-apm.app-default/_rollover
POST /metrics-apm.internal-default/_rollover
POST /metrics-apm.service_destination.1m-default/_rollover
POST /metrics-apm.service_destination.10m-default/_rollover
POST /metrics-apm.service_destination.60m-default/_rollover
POST /metrics-apm.service_summary.1m-default/_rollover
POST /metrics-apm.service_summary.10m-default/_rollover
POST /metrics-apm.service_summary.60m-default/_rollover
POST /metrics-apm.service_transaction.1m-default/_rollover
POST /metrics-apm.service_transaction.10m-default/_rollover
POST /metrics-apm.service_transaction.60m-default/_rollover
POST /metrics-apm.transaction.1m-default/_rollover
POST /metrics-apm.transaction.10m-default/_rollover
POST /metrics-apm.transaction.60m-default/_rollover
----
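
For deployments that use a namespace other than `default`, the request list above could be generated rather than edited by hand. The following is only a sketch: the function name `rollover_requests` and the `production` namespace in the usage note are hypothetical, while the data stream names are copied from the list above. It builds the request lines as strings and does not contact Elasticsearch.

```python
# Data stream names from the request list above, without the namespace suffix.
BASE_DATA_STREAMS = [
    "traces-apm",
    "traces-apm.rum",
    "logs-apm.error",
    "logs-apm.app",
    "metrics-apm.app",
    "metrics-apm.internal",
]

# Aggregated metrics data streams exist once per aggregation interval.
AGGREGATED_METRICS = [
    "service_destination",
    "service_summary",
    "service_transaction",
    "transaction",
]
INTERVALS = ["1m", "10m", "60m"]


def rollover_requests(namespace="default"):
    """Return the rollover request lines for one namespace (hypothetical helper)."""
    data_streams = list(BASE_DATA_STREAMS)
    for metric in AGGREGATED_METRICS:
        for interval in INTERVALS:
            data_streams.append(f"metrics-apm.{metric}.{interval}")
    return [f"POST /{name}-{namespace}/_rollover" for name in data_streams]
```

Calling `print("\n".join(rollover_requests("production")))` would print the same 18 requests with `-production` in place of `-default`, ready to paste into a console.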

[discrete]
== Upgrading to v8.15.0 may cause APM indices to lose their lifecycle policy
