[Stack Monitoring] Viewing monitoring data in Cloud production cluster #128959

Closed
neptunian opened this issue Mar 30, 2022 · 10 comments
Labels
Feature:Stack Monitoring Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services

Comments

@neptunian
Contributor

neptunian commented Mar 30, 2022

In Cloud, after a user sets up a production cluster to ship monitoring data to a separate cluster, some expect to view the monitoring data in the production cluster, or at least to have a more obvious indication that they need to open Kibana in the monitoring deployment to view it. Other users simply want to view monitoring data in their production cluster alongside their production data. Outside Cloud, we document this scenario by telling users to set monitoring.ui.elasticsearch.hosts to point to their monitoring cluster, but this setting is not supported in Cloud. I was able to achieve this in Cloud by adding the monitoring cluster as a remote in the production cluster, but users may not be aware of it.

A support engineer suggested this behavior should be default (view monitoring data in the production cluster). Could we add the monitoring cluster as a remote by default? Stack monitoring already has CCS enabled by default.

If not, perhaps we should update the UI to make this more obvious or at least the docs to address this scenario.

Stack Monitoring from the production cluster when shipping data to another cluster:
image (4)

https://www.elastic.co/guide/en/kibana/current/monitoring-data.html
https://www.elastic.co/guide/en/cloud/current/ec-enable-logging-and-monitoring.html#ec-access-kibana-monitoring

@ravikesarwani

@neptunian neptunian added Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services Feature:Stack Monitoring labels Mar 30, 2022
@elasticmachine
Contributor

Pinging @elastic/infra-monitoring-ui (Team:Infra Monitoring UI)

@kunisen
Contributor

kunisen commented Apr 1, 2022

Adding a note here because it's important:

I was able to achieve this in Cloud by adding the monitoring cluster as a remote in the production cluster, but users may not be aware of it.

As @neptunian explained, if there are multiple remote clusters, configuring the above gives a person on any given production cluster access to all the data in the monitoring cluster, regardless of which production cluster they are on.

So this workaround is suitable only if:

  • The user has only one prod cluster and one monitoring cluster
  • Or the user has multiple prod clusters and doesn't mind seeing monitoring data for ALL clusters, regardless of which cluster they are on
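
For readers on a self-managed setup, the "add the monitoring cluster as a remote" step can be sketched as a cluster settings update on the production cluster. This is only a rough illustration and not the Cloud procedure: the alias `monitoring`, the seed address, and the helper name `remote_cluster_settings` are all placeholders I made up. The code just builds the body for a `PUT _cluster/settings` request.

```python
import json


def remote_cluster_settings(alias: str, seeds: list[str]) -> dict:
    """Build a `PUT _cluster/settings` body that registers one remote cluster.

    `alias` and `seeds` are illustrative placeholders, not values from this issue.
    """
    return {"persistent": {"cluster": {"remote": {alias: {"seeds": seeds}}}}}


# Hypothetical alias and transport address of the monitoring cluster.
body = remote_cluster_settings("monitoring", ["monitoring-es.example.internal:9300"])
print(json.dumps(body, indent=2))
```

Once a remote is registered this way, anyone who can query it from the production cluster can read whatever the remote exposes, which is why the two conditions above matter.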

@kunisen
Contributor

kunisen commented Apr 2, 2022

Another update for future reference:

I was able to achieve this in Cloud by adding the monitoring cluster as a remote in the production cluster, but users may not be aware of it.

After further discussion with @neptunian: if a user has multiple prod clusters with one central monitoring cluster, doing the above gives a person on any given production cluster access to all the data in the monitoring cluster, regardless of which production cluster they are on.

This is because Stack Monitoring today is configured to query all configured remote clusters for monitoring data (an index pattern like *:.monitoring-* is used), so for now it suffices to configure the remotes.
There's another issue, #120384, to allow configuring which remotes to use.

This ticket may need to wait on #120384.
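
To make the index pattern point concrete, here is a minimal sketch of how a cross-cluster pattern is formed from remote aliases. It is not Kibana's actual code: the helper `ccs_index_patterns` is hypothetical, and including the local (unprefixed) pattern alongside the remote ones is my assumption. The second call shows what narrowing to a single named remote might look like if something like #120384 lands.

```python
def ccs_index_patterns(base_pattern: str, remotes: list[str]) -> str:
    """Return the local pattern plus one `<remote>:<pattern>` entry per remote alias."""
    patterns = [base_pattern]
    patterns += [f"{remote}:{base_pattern}" for remote in remotes]
    return ",".join(patterns)


# Behavior described above: the "*" alias matches every configured remote.
all_remotes = ccs_index_patterns(".monitoring-*", ["*"])
print(all_remotes)  # -> .monitoring-*,*:.monitoring-*

# Hypothetical narrowing to one remote alias named "monitoring".
one_remote = ccs_index_patterns(".monitoring-*", ["monitoring"])
print(one_remote)  # -> .monitoring-*,monitoring:.monitoring-*
```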

@matschaffer
Contributor

matschaffer commented Apr 4, 2022

If not, perhaps we should update the UI to make this more obvious or at least the docs to address this scenario.

I know I've discussed this before (I think with @jasonrhodes) but maybe I never opened an issue so thanks for opening this one.

Automatically adding CCS remotes seems like the kind of thing that could catch a user off-guard, especially after seeing how the default *: pattern has caught people off-guard already.

But an option I've had on my mind is having the orchestration layer (cloud, eck, etc) provide info about which deployment is handling monitoring for the currently viewed deployment.

Then we could at least provide something in the UI like:

Monitoring Data Not available
Note: This deployment is being monitored by (monitoring deployment), click here to view monitoring data in that UI.

@kunisen
Contributor

kunisen commented Apr 7, 2022

Thanks @matschaffer !! ❤️

Then we could at least provide something in the UI like:

Monitoring Data Not available
Note: This deployment is being monitored by (monitoring deployment), click here to view monitoring data in that UI.

That's what I proposed in an internal ticket: the UI should at least tell users when the cluster is using remote monitoring.
Otherwise there's no way to tell whether the cluster is being monitored remotely or monitoring is simply not turned on.

@jasonrhodes
Member

jasonrhodes commented Apr 18, 2022

So I think there are two ideas here but I just want to make sure I'm tracking. Given the following setup:

Cluster | Type       | Components                         | Monitoring Cluster?
A       | Production | Elasticsearch A + Kibana A + APM A | D
B       | Staging    | Elasticsearch B + Kibana B + APM B | D
C       | Test       | Elasticsearch C + Kibana C + APM C | D
D       | Monitoring | Elasticsearch D + Kibana D         | -

Idea 1: Local Mirroring (originally "Local Duplication")

Enable the local Stack Monitoring UI to display its own monitoring data by using CCS to pull that data in from the monitoring cluster. This would likely require filtering that data so that it only returns information about the current local cluster rather than enabling access to all monitoring data.

In the above example, a user visiting the Stack Monitoring UI in Kibana A would see data via CCS pointed to ES D, but that data should be filtered to only be about ES A, Kibana A, and APM A. The same for a user visiting the Stack Monitoring UI in Kibanas B or C.
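
As a rough illustration of the filtering Idea 1 would need, the sketch below wraps an arbitrary query so it only matches monitoring documents for one cluster. It assumes the documents carry a `cluster_uuid` field; the helper name `local_cluster_filter` and the UUID value are hypothetical, and this is not how the Stack Monitoring UI is implemented today.

```python
def local_cluster_filter(query: dict, cluster_uuid: str) -> dict:
    """Wrap `query` so it only matches monitoring documents of one cluster."""
    return {
        "bool": {
            "must": [query],
            # Assumption: monitoring documents expose a `cluster_uuid` field.
            "filter": [{"term": {"cluster_uuid": cluster_uuid}}],
        }
    }


# Kibana A would pass the UUID of ES A (placeholder value here).
search_body = {"query": local_cluster_filter({"match_all": {}}, "uuid-of-es-a")}
```

A filter like this only restricts what the UI shows. The underlying CCS connection could still read every cluster's monitoring data unless access is also restricted on the monitoring cluster.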

Idea 2: Link Out

Detect when a cluster is set up for monitoring but is sending data to a remote monitoring cluster, change the Stack Monitoring UI message to reflect this, and provide a link to the monitoring cluster's Stack Monitoring UI. It'd be great if that link went directly to the cluster in question, if possible.

In the above example, a user visiting the Stack Monitoring UI in Kibana A would see a message about monitoring already being set up and a link to the SM UI in Kibana D (if possible, to the Cluster A page there).
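
A minimal sketch of the decision Idea 2 implies, assuming the orchestration layer can tell Kibana where monitoring data is being sent. Everything here is hypothetical (the `MonitoringState` type, its fields, and the message wording); it only shows the branching, not real Kibana code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MonitoringState:
    has_local_data: bool
    # Hypothetical: URL of the monitoring deployment's UI, if known.
    remote_monitoring_url: Optional[str] = None


def no_data_message(state: MonitoringState) -> Optional[str]:
    """Return the notice to show, or None when the normal UI can render."""
    if state.has_local_data:
        return None
    if state.remote_monitoring_url:
        return (
            "This deployment is being monitored by another deployment. "
            f"View monitoring data at {state.remote_monitoring_url}."
        )
    return "Monitoring data not available."
```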

Next Steps

It sounds like what I'm hearing in this thread is that Idea 2 would be good enough and Idea 1 is probably somewhat significantly more complicated than Idea 2, yes? If so let's either edit these AC or create a new issue for that work and we can put it in our queue for a future cycle.

@kunisen
Contributor

kunisen commented Apr 19, 2022

Thanks @jasonrhodes for the comment!

It sounds like what I'm hearing in this thread is that Idea 2 would be good enough and Idea 1 is probably somewhat significantly more complicated than Idea 2, yes?

Yes, you are right. We had several rounds of discussion internally and concluded that using CCS is much simpler than local duplication in terms of complexity (while with the local duplication idea, self-monitoring itself isn't really an issue, i.e. self-monitoring can block monitoring visibility).

If so let's either edit these AC or create a new issue for that work and we can put it in our queue for a future cycle.

Sorry but can you share some more insights about how to edit the AC please?
I think I didn't get your point clearly.

@jasonrhodes
Member

@kunisen I changed the name of Idea 1 to "Local Mirroring" instead of "Local Duplication" to be more clear.

This ticket is not about the specific problem you mention in your SDH, but the more general problem of "User visits Stack Monitoring UI in a Kibana instance that is being monitored via a separate dedicated monitoring cluster". We agree that seeing a "no monitoring data" screen there is confusing, and we'd like to fix it. The two ideas are either to read back in via CCS or to provide a link to the monitoring cluster.

I do however see now that providing that link could be problematic if the user doesn't have access to that central monitoring cluster, and/or isn't supposed to visit there because it would provide that user with access to all the monitoring data that is collected there, so I don't think we've quite found a good solution yet.

@kunisen
Contributor

kunisen commented Apr 20, 2022

Thanks @jasonrhodes for the clarification and detailed explanation!

We agree that seeing a "no monitoring data" screen there is confusing, and we'd like to fix it. The two ideas are either to read back in via CCS or to provide a link to the monitoring cluster.

I do however see now that providing that link could be problematic if the user doesn't have access to that central monitoring cluster, and/or isn't supposed to visit there because it would provide that user with access to all the monitoring data that is collected there, so I don't think we've quite found a good solution yet.

[1]

100% agree that simply providing the link could be problematic as by doing that user may have more visibility than expected.
If this is the case, we probably can simply say "you have remote monitoring in use", instead of showing nothing (same as the situation the user never has configured monitoring), when the user accesses the Stack Monitoring page in their non monitoring (production/staging/test) cluster.

[2]

Previously @matschaffer and I had a Zoom session and came up with a potential workaround. It is not planned yet, but it could help in this situation.
I copied the content from the internal ticket and pasted it here for better visibility.

[Workaround] Sending prod cluster's monitoring data to monitoring cluster, production cluster accesses monitoring data via CCS, with kibana configuration to control visibility

graph LR

subgraph team 1
  ProdKibana["Kibana"]
  ProdElasticsearch["Elasticsearch"]
  ProdKibana-.->|"monitoring:.monitoring-* & configuration filter"|ProdElasticsearch
  Metricbeat-.->|Poll|ProdElasticsearch
  EndUser1([EndUser])-.->|Monitoring UI|ProdKibana
end

subgraph admin team
  Admin([Admin])-.->|Monitoring UI|MonitoringElasticsearch
  MonitoringElasticsearch["Elasticsearch"]
end

Metricbeat-->|monitoring data|MonitoringElasticsearch
ProdElasticsearch-.->|CCS|MonitoringElasticsearch

We can use a Kibana configuration setting to filter visibility.

For example:

monitoring.ui.visible_cluster_uuids:
  - (prod cluster 1 uuid)

This avoids the cost of storing the monitoring data in both places.

The production clusters would still be capable of directly querying all monitoring data via CCS, but the UI would only show the configured cluster.

This would require changes to stack monitoring UI code, and there may be unforeseen issues trying to apply global filtering to all queries in both the UI and possibly alerting rules.

Moreover, on ESS this setting could default to the deployment's own cluster and be handled by the orchestration layer without being exposed to the user.
That way, users would see only their own cluster when viewing Stack Monitoring in their non-monitoring (prod/staging/test) cluster.
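
A sketch of how the proposed setting and its ESS default could behave. Note that `monitoring.ui.visible_cluster_uuids` is only a proposal in this thread, not an existing Kibana setting, and both helper functions below are hypothetical.

```python
from typing import Optional


def resolve_visible_uuids(
    configured: Optional[list[str]], own_uuid: str, on_ess: bool
) -> Optional[set[str]]:
    """Pick the cluster UUIDs the UI may show; None means no restriction."""
    if configured:
        return set(configured)
    if on_ess:
        # Proposed default: the orchestration layer pins the deployment to itself.
        return {own_uuid}
    return None


def visible_clusters(clusters: list[dict], visible: Optional[set[str]]) -> list[dict]:
    """Drop clusters that are not in the visible set."""
    if visible is None:
        return clusters
    return [c for c in clusters if c["cluster_uuid"] in visible]


clusters = [{"cluster_uuid": "prod-1"}, {"cluster_uuid": "prod-2"}]
shown = visible_clusters(clusters, resolve_visible_uuids(None, "prod-1", on_ess=True))
```

With no explicit configuration on ESS, `shown` contains only the `prod-1` entry, which matches the "see only their own cluster" behavior described above.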


Option [1] above should be very simple, but it doesn't address the real problem.
Option [2] above can address it with relatively small impact, at least from the user's side: on Cloud, users would not need to configure anything to view monitoring metrics on the Stack Monitoring page of their prod/staging/test cluster.

@jasonrhodes
Member

If this is the case, we probably can simply say "you have remote monitoring in use", instead of showing nothing (same as the situation the user never has configured monitoring), when the user accesses the Stack Monitoring page in their non monitoring (production/staging/test) cluster.

Yes, we came to this same conclusion this morning! We're going to move ahead with a very, very simple change of mentioning remote monitoring in the message for all users, but in the future we may also check to see if remote monitoring is enabled and change the message based on that fact.

cc: @miltonhultgren
