[Lens] Unique count aggregation should have control for precision threshold and warning about estimates #69832
Comments
Pinging @elastic/kibana-app (Team:KibanaApp) |
To understand your sentence here:
How would you be able to limit the input in a numeric input? |
By using the |
But how would they know what values are valid if you truly are restricting them? With the EuiRange you could give them very specific allowed increments and values that they can select by using the ticks. |
@cchaos I like that proposal. I think we could use a slider with predefined increments at the 1000, 3000, 10000, and 40000 thresholds. I don't think we need the extra color indicator, just a slider with predefined ticks. |
Sweet! You'll probably also want to shorten the labels to |
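For illustration, the predefined increments discussed above could be enforced by snapping any requested value to the nearest allowed tick. This is only a sketch: the tick values come from the comment above, while `ALLOWED_THRESHOLDS` and `snapToAllowedThreshold` are hypothetical names, not Lens or EUI APIs.

```typescript
// Tick values proposed in the thread; the names below are hypothetical.
const ALLOWED_THRESHOLDS: readonly number[] = [1000, 3000, 10000, 40000];

// Return the allowed threshold closest to `value`.
// On a tie, the lower threshold wins because the comparison is strict.
function snapToAllowedThreshold(value: number): number {
  return ALLOWED_THRESHOLDS.reduce((best, tick) =>
    Math.abs(tick - value) < Math.abs(best - value) ? tick : best
  );
}

console.log(snapToAllowedThreshold(2500));
console.log(snapToAllowedThreshold(50000));
```

Restricting the control to a handful of ticks means users never have to guess which values are valid.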
+1 #179934 |
In my opinion we should not surface an imprecise aggregation type in our core visualization engine to handle what many would expect to be an exact deterministic aggregation result. This impacts business reporting for customers and they seek alternatives. |
Hello, just some feedback: I recently ran into an issue because of this. I have an index where I'm storing data about my Elastic Agents, each document corresponds to one Elastic Agent, one of the fields is the Using this data I create some dashboards with the Metric visualization. The goal is to have a quick glimpse of the deployment of agents on my infrastructure, since Elastic does not provide a native dashboard for this. The issue is that I had 6997 documents in the index and creating a Metric visualization over the After some investigation and a quick chat in the Slack channel I learned that the Metric visualization is an estimate only. I do not create queries manually, I only use the built-in visualizations. I would expect it to not be an exact value for large datasets, things in the range of high hundred thousands and millions, but for it to not be exact for under 10k documents was a surprise. Changing from unique count to count solved my issue in this case, but now I will need to review every single dashboard that uses a unique count metric because I can no longer trust the information. In my opinion this needs to be made clearer in the Kibana documentation. Maybe I'm missing something, but I couldn't find any Kibana documentation mentioning that the Unique Count in the Metric visualization is not exact and just an estimate. It may be present in the Elasticsearch documentation about cardinality queries, but not in the Kibana documentation. Also, the |
Hi @leandrojmp, thanks for the feedback. You are perfectly right that both the documentation and the editor are missing a description of the cardinality aggregation's inaccuracy. This behavior should be documented and clearly described. |
@markov00 yeah, the main issue is that this information only exists in the Elasticsearch docs, no mention of it in Kibana. Another issue is that the |
A user has raised an issue with me around this last week. They didn't realise that the Unique Count might deliver approximate results. They also want a way to deliver precise results in Lens. I understand Vega can do it, but this is outside of what this user is able to accomplish given their experience & familiarity. |
+1 |
The default precision threshold of the Cardinality aggregation is 3,000: above roughly 3,000 unique values, the precision will drop off. The max value is 40,000 in Elasticsearch. Users should be able to tune this parameter in Lens. I propose that we use a numeric text input with validation instead of a slider, but the second-best option would be a button group at the 1000, 3000, 10000, and 40000 thresholds.
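As a rough sketch of what tuning this parameter means at the query level, the snippet below builds a `cardinality` aggregation with an explicit `precision_threshold` and validates the input first. The helper name `buildUniqueCountAgg`, the aggregation key `unique_agents`, and the field `agent.id` are made up for illustration; only the `cardinality` aggregation, its `precision_threshold` option, the 3,000 default, and the 40,000 maximum come from Elasticsearch. Rejecting negative and non-integer input is an assumption about what reasonable validation would look like, not a description of how Lens behaves.

```typescript
// Limits as described in the Elasticsearch cardinality aggregation docs.
const DEFAULT_PRECISION_THRESHOLD = 3000;
const MAX_PRECISION_THRESHOLD = 40000; // higher values behave like 40000

interface CardinalityAgg {
  cardinality: { field: string; precision_threshold: number };
}

// Hypothetical helper: validate the user-supplied threshold, then build the
// aggregation clause. Values above the maximum are clamped instead of rejected.
function buildUniqueCountAgg(
  field: string,
  threshold: number = DEFAULT_PRECISION_THRESHOLD
): CardinalityAgg {
  if (!Number.isInteger(threshold) || threshold < 0) {
    throw new RangeError(
      `precision_threshold must be a non-negative integer, got ${threshold}`
    );
  }
  return {
    cardinality: {
      field,
      precision_threshold: Math.min(threshold, MAX_PRECISION_THRESHOLD),
    },
  };
}

// Example search body; "agent.id" is a placeholder field name.
const searchBody = {
  size: 0,
  aggs: { unique_agents: buildUniqueCountAgg("agent.id", 10000) },
};
console.log(JSON.stringify(searchBody));
```

A numeric input with this kind of validation would accept any value in range, whereas a button group would only ever pass one of the four preset thresholds.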
Lens should also provide some helper text to indicate that this is not a precise aggregation. I propose that we put this helper text in the editor panel for Unique count, and that the text should be:
This text is trying to indicate that the queries won't be slower, but that there are other costs associated with running the high-precision queries. This is based on some of the docs here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html
cc @cchaos do you agree with the proposal to use a numeric input instead of grouped buttons? A slider would be a bad option here since there aren't many possible options. No design needed.