We actively use Elastic Observability to monitor several distributed services in production. Cost optimization is an essential priority in operating such observability platforms, and sampling is an effective tool to achieve this.
When choosing an observability platform, users naturally focus on how efficiently they can optimize costs while quickly identifying problematic traces.
Problem Statement
Currently, the sampling methods provided by Elastic Observability are quite generic and basic. We aim to use this tool more effectively to identify and resolve issues.
As of the current version (8.17.0), the only Tail-Based Sampling (TBS) policy condition that reflects how a trace actually behaved is trace.outcome, which is useful but limited. The other policy conditions filter on static, known information and do not significantly improve the ability to capture interesting traces.
Customers like us who use APM for monitoring commonly need to track failed transactions and slow traces. The ability to sample traces whose total duration, or whose root transaction duration, exceeds a certain threshold would be extremely beneficial, as it would help us identify and resolve problematic traces more effectively.
Request
We propose adding a policy to Tail-Based Sampling that enables sampling of slower traces based on their duration.
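To make the request concrete, here is a minimal sketch of how such a policy could be evaluated. It is not how APM Server is implemented. The `min_duration_us` condition is hypothetical (it is the option being requested, not an existing one), the `Policy` class and helper names are made up for illustration, and the sketch assumes that policies are matched in order with the first match deciding the sample rate. The duration passed in is assumed to come from the root transaction's transaction.duration.us field.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Policy:
    sample_rate: float
    # Mirrors the existing trace.outcome condition.
    trace_outcome: Optional[str] = None
    # HYPOTHETICAL condition: not an existing TBS option, this is the request.
    min_duration_us: Optional[int] = None


def matches(policy: Policy, outcome: str, root_duration_us: int) -> bool:
    """A policy matches when every condition it sets is satisfied."""
    if policy.trace_outcome is not None and policy.trace_outcome != outcome:
        return False
    if policy.min_duration_us is not None and root_duration_us < policy.min_duration_us:
        return False
    return True


def sample_rate_for(policies: List[Policy], outcome: str, root_duration_us: int) -> float:
    """Return the sample rate of the first matching policy (assumed ordering)."""
    for policy in policies:
        if matches(policy, outcome, root_duration_us):
            return policy.sample_rate
    return 0.0  # no policy matched, so the trace is not sampled


policies = [
    Policy(sample_rate=1.0, trace_outcome="failure"),     # keep all failed traces
    Policy(sample_rate=0.5, min_duration_us=2_000_000),   # keep half of traces >= 2 s
    Policy(sample_rate=0.01),                             # default policy
]
```

With these example policies, a successful trace whose root transaction took 3 seconds would be kept at a rate of 0.5 instead of falling through to the 0.01 default.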
Given that the duration of a trace or transaction is available in the transaction.duration.us field, implementing this feature in the TBS mechanism should not involve overly complex logic. Elastic Observability’s official documentation even describes duration-aware sampling, which reinforces its importance:
Unlike head-based sampling, each trace does not have an equal probability of being sampled. Because slower traces are more interesting than faster ones, tail-based sampling uses weighted random sampling — so traces with a longer root transaction duration are more likely to be sampled than traces with a fast root transaction duration.
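For readers unfamiliar with the idea, the sketch below illustrates weighted random sampling in which the chance of being picked grows with the root transaction duration. It uses the Efraimidis-Spirakis key method as a stand-in technique; it is not taken from APM Server and makes no claim about how Elastic implements this. The function name and the `(trace_id, root_duration_us)` input shape are assumptions for illustration.

```python
import heapq
import math
import random


def weighted_sample(traces, k, rng=None):
    """Pick k traces without replacement, weighted by root transaction duration.

    traces: list of (trace_id, root_duration_us) tuples.
    """
    rng = rng or random.Random()
    keyed = []
    for trace_id, duration_us in traces:
        weight = max(duration_us, 1)   # avoid dividing by zero for 0 us durations
        u = 1.0 - rng.random()         # uniform in (0, 1], so log(u) is defined
        # log(u) / weight orders items the same way as u ** (1 / weight).
        keyed.append((math.log(u) / weight, trace_id))
    # The k largest keys form the weighted sample.
    return [trace_id for _, trace_id in heapq.nlargest(k, keyed)]
```

When a single trace is drawn, each trace is chosen with probability proportional to its duration, so a 5 s trace is about 1000 times more likely to be kept than a 5 ms trace.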
We are confident that this feature would be extremely valuable not just for us but for many other users as well.
Additional Notes
If this feature is already planned, we would appreciate an estimated timeline or version.
If not, we hope it can be positively considered for inclusion in future roadmaps.
Thank you for your support and consideration!