docs: add troubleshoot for missing metrics for Argo installation #3493

Merged 1 commit on Jan 11, 2024
29 changes: 29 additions & 0 deletions docs/troubleshoot-collection.md
@@ -21,6 +21,7 @@
- [Check the `/metrics` endpoint for Kubernetes services](#check-the-metrics-endpoint-for-kubernetes-services)
- [Check the Prometheus UI](#check-the-prometheus-ui)
- [Check Prometheus Remote Storage](#check-prometheus-remote-storage)
- [Missing metrics for ArgoCD installation](#missing-metrics-for-argocd-installation)
- [Common Issues](#common-issues)
- [Missing metrics - cannot see cluster in Explore](#missing-metrics---cannot-see-cluster-in-explore)
- [Pod stuck in `ContainerCreating` state](#pod-stuck-in-containercreating-state)
@@ -340,6 +341,34 @@ You [check Prometheus logs](#prometheus-logs) to verify there are no errors duri

You can also check `prometheus_remote_storage_.*` metrics to look for success/failure attempts.

### Missing metrics for ArgoCD installation

There is a known issue with Argo CD and metrics collection. If you override `spec.source.helm.releaseName` in the `Application` or
`ApplicationSet` that configures your application in Argo CD, Kube State and Node metrics are not collected, for the following reason:

The Service Monitor looks for a service labeled `app.kubernetes.io/instance: <spec.source.helm.releaseName>`, but the service is actually
labeled `app.kubernetes.io/instance: <metadata.name>`.

To fix this, make sure the labels match by adding the following to `user-values.yaml`:

```
kube-prometheus-stack:
  kube-state-metrics:
    prometheus:
      monitor:
        selectorOverride:
          app.kubernetes.io/instance: <metadata.name>
          app.kubernetes.io/name: kube-state-metrics
  prometheus-node-exporter:
    prometheus:
      monitor:
        selectorOverride:
          app.kubernetes.io/name: prometheus-node-exporter
```

where `<metadata.name>` is the `metadata.name` value from the Argo CD `Application` manifest.

Review comment on lines +366 to +367 (Contributor Author):

We cannot use more than one label here, because our prometheus-node-exporter is at version 4.3.1, and the bug was fixed in 4.4.2: prometheus-community/helm-charts#2618

That fix is included in kube-prometheus-stack 42.1.1 (which uses 4.4.x), or definitely in 45.3.0.

We are using kube-prometheus-stack version 40.5.0.

Follow-up comment (Contributor Author):

Converted to #3494
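
The label mismatch described in this section can be illustrated with a small sketch. It is only a model of equality-based label matching, not code from the chart or from Prometheus Operator: the `selector_matches` helper is hypothetical, the default selector is simplified to the single `app.kubernetes.io/instance` label, and the names `collection` and `collection2` are taken from the example `Application` added in this PR (`vagrant/k8s/argo.yaml`).

```python
def selector_matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """Return True when every key/value pair of the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Names from the example Application: metadata.name and helm.releaseName differ.
app_name = "collection"
release_name = "collection2"

# Per the description above, the Service ends up labeled with metadata.name.
service_labels = {
    "app.kubernetes.io/instance": app_name,
    "app.kubernetes.io/name": "kube-state-metrics",
}

# Simplified assumption: the default selector is built from the Helm release name.
default_selector = {"app.kubernetes.io/instance": release_name}

# Selector after applying the selectorOverride values from user-values.yaml.
override_selector = {
    "app.kubernetes.io/instance": app_name,
    "app.kubernetes.io/name": "kube-state-metrics",
}

print(selector_matches(default_selector, service_labels))   # False
print(selector_matches(override_selector, service_labels))  # True
```

With the default selector the Service is never selected, so no scrape target is created. With the override, both labels match and the Service is picked up.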

## Common Issues

### Missing metrics - cannot see cluster in Explore
16 changes: 16 additions & 0 deletions vagrant/Makefile
@@ -299,3 +299,19 @@ istio-disable:

restart-pods:
	kubectl -n sumologic delete pod --all --force --grace-period=0

install-argocd-binary:
	curl -LO https://github.com/argoproj/argo-cd/releases/download/v2.9.3/argocd-linux-amd64
	chmod +x argocd-linux-amd64
	sudo mv argocd-linux-amd64 /usr/local/bin/argocd

install-argocd:
	${mkfile_path} \
		apply-receiver-mock
	kubectl create namespace argocd
	kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
	kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "LoadBalancer"}}'
	argocd admin initial-password -n argocd

expose-argocd:
	kubectl port-forward svc/argocd-server -n argocd 8080:443 --address 0.0.0.0
22 changes: 22 additions & 0 deletions vagrant/k8s/argo.yaml
@@ -0,0 +1,22 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: collection
  namespace: argocd
spec:
  destination:
    namespace: sumologic
    server: https://kubernetes.default.svc
  project: default
  source:
    chart: sumologic
    helm:
      releaseName: collection2
      values:
        '{"sumologic": {"accessId": "dummy", "accessKey": "dummy", "endpoint": "http://receiver-mock.receiver-mock:3000/terraform/api/"}}'
    repoURL: https://sumologic.github.io/sumologic-kubernetes-collection
    targetRevision: 4.3.1
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true