Failure to start cluster with unmanaged: true on latest versions #1607

Open
jonathon2nd opened this issue Jul 24, 2024 · 16 comments

jonathon2nd commented Jul 24, 2024

I opened this issue after trying a previous version, but even the latest versions do not work, so I updated the title.

Report

Creating a cluster with unmanaged: true fails; the MongoDB nodes boot-loop.

The operator logs the following repeatedly while the mongo nodes boot-loop:
INFO Replset is not exposed. Make sure each pod in the replset can reach each other. {"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "4071c15b-9595-443f-bb20-5705204cbd3d", "replset": "rs0"}

More about the problem

I am attempting to follow this guide to migrate one of our legacy MongoDB instances without downtime.

SSL is off because the internal DB we are using does not have it enabled. I will turn it on later, with a short downtime, once the migration is successful.

Yaml

---
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDB
metadata:
  name: example-mongodb
  namespace: mongodb
spec:
  allowUnsafeConfigurations: true
  unsafeFlags:
    tls: true
  unmanaged: true

  crVersion: 1.15.0
  image: percona/percona-server-mongodb:4.4.24
  tls:
    mode: disabled
  replsets:
    - affinity:
        antiAffinityTopologyKey: kubernetes.io/hostname
      name: rs0
      size: 3
      volumeSpec:
        persistentVolumeClaim:
          resources:
            requests:
              storage: 100Gi
      expose:
        enabled: false
        exposeType: LoadBalancer
  secrets:
    users: users
  sharding:
    enabled: false
  backup:
    enabled: false
    pitr:
      enabled: false
  upgradeOptions:
    apply: disabled
    schedule: 0 2 * * *

Steps to reproduce

  1. Deploy yaml as described, wait for failure.

Versions

  1. Kubernetes: v1.26.13 +rke2r1
  2. Operator: First tried with 1.16.1, upgraded to 1.16.3, no change

Anything else?

The DB I am attempting to migrate requires a target of 4.4, hence the cr version selection.

The deployment works without unmanaged: true

Two pod logs from start up to bootloop
example-mongodb-rs0-0_mongod.log
example-mongodb-rs0-1_mongod.log

Operator logs:

2024-07-24T18:21:17.244954170-04:00 2024-07-24T22:21:17.244Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "de793d73-e501-4bac-93cd-351957acaa4a", "replset": "rs0"}
2024-07-24T18:21:17.519518649-04:00 2024-07-24T22:21:17.518Z	INFO	Created a new mongo key	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "de793d73-e501-4bac-93cd-351957acaa4a", "KeyName": "example-mongodb-mongodb-keyfile"}
2024-07-24T22:21:17.535Z	INFO	Created a new mongo key	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "de793d73-e501-4bac-93cd-351957acaa4a", "KeyName": "example-mongodb-mongodb-encryption-key"}
2024-07-24T18:21:17.699208811-04:00 2024-07-24T22:21:17.698Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "de793d73-e501-4bac-93cd-351957acaa4a", "replset": "rs0", "size": 3, "pods": 1}
2024-07-24T18:21:17.751281423-04:00 2024-07-24T22:21:17.750Z	INFO	add new job	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "de793d73-e501-4bac-93cd-351957acaa4a", "name": "ensure-version/mongodb/example-mongodb", "schedule": "0 2 * * *"}
2024-07-24T18:21:17.779567104-04:00 2024-07-24T22:21:17.779Z	INFO	Cluster state changed	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "de793d73-e501-4bac-93cd-351957acaa4a", "previous": "", "current": "initializing"}
2024-07-24T18:21:17.816437308-04:00 2024-07-24T22:21:17.815Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "a48c61c2-ac90-4f17-b581-cf8d1a3c5dbe", "replset": "rs0"}
2024-07-24T18:21:18.096219954-04:00 2024-07-24T22:21:18.095Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "a48c61c2-ac90-4f17-b581-cf8d1a3c5dbe", "replset": "rs0", "size": 3, "pods": 1}
2024-07-24T18:21:22.815267111-04:00 2024-07-24T22:21:22.814Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "97ee1bca-d292-44c0-ae9b-4adf1dc2570d", "replset": "rs0"}
2024-07-24T22:21:23.071Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "97ee1bca-d292-44c0-ae9b-4adf1dc2570d", "replset": "rs0", "size": 3, "pods": 1}
2024-07-24T18:21:28.187110713-04:00 2024-07-24T22:21:28.186Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "9be35d66-6137-4d9d-aa13-2d98d1658fc2", "replset": "rs0"}
2024-07-24T18:21:28.489985716-04:00 2024-07-24T22:21:28.489Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "9be35d66-6137-4d9d-aa13-2d98d1658fc2", "replset": "rs0", "size": 3, "pods": 1}
2024-07-24T18:21:33.639456067-04:00 2024-07-24T22:21:33.639Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "10c49def-f0d8-4766-96e6-adf1c185f949", "replset": "rs0"}
2024-07-24T22:21:33.943Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "10c49def-f0d8-4766-96e6-adf1c185f949", "replset": "rs0", "size": 3, "pods": 1}
2024-07-24T22:21:39.363Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "21bb72e2-b804-4246-9ce7-45b87bce823d", "replset": "rs0"}
2024-07-24T22:21:39.630Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "21bb72e2-b804-4246-9ce7-45b87bce823d", "replset": "rs0", "size": 3, "pods": 1}
2024-07-24T22:21:45.632Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "e8703952-b61d-47a0-9193-27b57146df16", "replset": "rs0"}
2024-07-24T22:21:45.919Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "e8703952-b61d-47a0-9193-27b57146df16", "replset": "rs0", "size": 3, "pods": 2}
2024-07-24T22:21:46.084Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "14129a56-3d0e-47fc-981a-7c86b951d285", "replset": "rs0"}
2024-07-24T22:21:46.405Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "14129a56-3d0e-47fc-981a-7c86b951d285", "replset": "rs0", "size": 3, "pods": 2}
2024-07-24T22:21:51.312Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "c667ba1e-1d82-4850-b17f-759d61e0e4dd", "replset": "rs0"}
2024-07-24T22:21:51.619Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "c667ba1e-1d82-4850-b17f-759d61e0e4dd", "replset": "rs0", "size": 3, "pods": 2}
2024-07-24T22:21:57.004Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "fce820a1-c739-44e9-b578-47c9e3defcee", "replset": "rs0"}
2024-07-24T22:21:57.282Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "fce820a1-c739-44e9-b578-47c9e3defcee", "replset": "rs0", "size": 3, "pods": 2}
2024-07-24T22:22:02.817Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "79d72aa6-f450-41f9-8997-e8869282f337", "replset": "rs0"}
2024-07-24T22:22:03.090Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "79d72aa6-f450-41f9-8997-e8869282f337", "replset": "rs0", "size": 3, "pods": 2}
2024-07-24T22:22:08.509Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "fbcd6837-1e99-49c7-a801-37a6ab0d143d", "replset": "rs0"}
2024-07-24T22:22:08.795Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "fbcd6837-1e99-49c7-a801-37a6ab0d143d", "replset": "rs0", "size": 3, "pods": 2}
2024-07-24T22:22:14.213Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "10fdccca-4165-452b-8b56-043fdf63f8db", "replset": "rs0"}
2024-07-24T22:22:14.511Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "10fdccca-4165-452b-8b56-043fdf63f8db", "replset": "rs0", "size": 3, "pods": 2}
2024-07-24T22:22:19.927Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "8a52d2f9-57a4-4a20-94a9-507980a8a66b", "replset": "rs0"}
2024-07-24T22:22:31.851Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "6e9ad316-6787-4cb8-8146-ed34accbea83", "replset": "rs0"}
2024-07-24T22:22:43.739Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "41260c8c-bd30-4b66-a827-5fbb1a2b7081", "replset": "rs0"}
2024-07-24T22:22:55.687Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "7c3eb48f-59d5-4892-a2d6-f75118109e5c", "replset": "rs0"}
2024-07-24T22:23:07.532Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "727a406f-4f22-4e05-b1f3-7d4856077402", "replset": "rs0"}
2024-07-24T22:23:19.398Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "18cf442f-00ab-4605-82e3-4eec2c330abd", "replset": "rs0"}
2024-07-24T22:23:31.221Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "d018f4e4-efa9-4749-95b4-fdd6e9929538", "replset": "rs0"}
2024-07-24T22:23:43.030Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "5ebc4aed-b855-4483-b404-58580864283d", "replset": "rs0"}
2024-07-24T22:23:54.899Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "8b2c2559-2804-4388-a798-732bd5986536", "replset": "rs0"}
2024-07-24T22:24:06.718Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "1680e0de-9a6f-4669-9b8c-8a88e98c5397", "replset": "rs0"}
2024-07-24T22:24:18.539Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "81a4d31a-1abc-4eaa-8b22-823200d9c57f", "replset": "rs0"}
2024-07-24T22:24:30.317Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", "object": {"name":"example-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "example-mongodb", "reconcileID": "58e30cba-39a6-4d2a-88af-2cb736dce724", "replset": "rs0"}
2024-07-24T22:24:43.680Z	INFO	Replset is not exposed. Make sure each pod in the replset can reach each other.	{"controller": "psmdb-controller", 
@jonathon2nd
Author

I attempted this on a new k8s cluster, with the 1.15.4 version of the operator, and it failed in the same way.

@jonathon2nd
Author

Attempting the following cluster also fails.

spec:
  allowUnsafeConfigurations: true
  unsafeFlags:
    tls: true
  unmanaged: true
  crVersion: 1.16.2
  image: percona/percona-server-mongodb:6.0

@jonathon2nd
Author

The following, based on https://github.com/percona/percona-server-mongodb-operator/blob/v1.16.2/deploy/cr-minimal.yaml, fails in exactly the same way:

apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDB
metadata:
  name: minimal-cluster
  namespace: mongodb
spec:
  unmanaged: true
  crVersion: 1.16.2
  image: percona/percona-server-mongodb:7.0.8-5
  upgradeOptions:
    apply: disabled
    schedule: "0 2 * * *"
  secrets:
    users: example-users
  replsets:

  - name: rs0
    size: 3
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 3Gi

  sharding:
    enabled: false

@jonathon2nd
Author

Yeah, at this point I am really confused. Either I am missing something I need to get an unmanaged cluster working, or unmanaged clusters are fully broken. I am assuming I am missing something.

jonathon2nd changed the title from "Failure to start cluster with unmanaged: true" to "Failure to start cluster with unmanaged: true on latest versions" on Aug 1, 2024
@spron-in
Collaborator

spron-in commented Aug 1, 2024

Hey @jonathon2nd, to make it work, all your replica set nodes need to be exposed. That way you can form a full mesh connection across all of the nodes.
The operator tells you exactly that.
To get rid of that message, make sure that you expose your replica set nodes somehow.

  replsets:
    - name: rs0      
      ...
      expose:
        enabled: true
        exposeType: LoadBalancer

Make sure that expose.enabled is set to true.

@jonathon2nd
Author

That is not it, unfortunately. I had already tried that. I thought I had added it here, but I had not.

minimal-cluster-rs0-0_mongod.log
One of the mongo logs from start to crash/shutdown

percona/percona-server-mongodb-operator:1.16.2
Operator logs

2024-08-01T13:58:34.998413859-04:00 2024-08-01T17:58:34.997Z	INFO	createSSLByCertManager	updating cert-manager certificates	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "e1cefcf8-6d9d-4d66-8575-8b3da2338bf3"}
2024-08-01T17:58:34.998Z	INFO	Creating old secrets	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "e1cefcf8-6d9d-4d66-8575-8b3da2338bf3"}
2024-08-01T13:58:35.011585549-04:00 2024-08-01T17:58:35.011Z	INFO	applying new certificates	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "e1cefcf8-6d9d-4d66-8575-8b3da2338bf3"}
2024-08-01T13:58:37.196462958-04:00 2024-08-01T17:58:37.196Z	INFO	migrating new ca	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "e1cefcf8-6d9d-4d66-8575-8b3da2338bf3"}
2024-08-01T13:58:37.256609683-04:00 2024-08-01T17:58:37.256Z	INFO	Created a new mongo key	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "e1cefcf8-6d9d-4d66-8575-8b3da2338bf3", "KeyName": "minimal-cluster-mongodb-keyfile"}
2024-08-01T13:58:37.511286475-04:00 2024-08-01T17:58:37.510Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "e1cefcf8-6d9d-4d66-8575-8b3da2338bf3", "replset": "rs0", "size": 3, "pods": 1}
2024-08-01T13:58:40.665246032-04:00 2024-08-01T17:58:40.664Z	INFO	Cluster state changed	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "e1cefcf8-6d9d-4d66-8575-8b3da2338bf3", "previous": "", "current": "initializing"}
2024-08-01T13:58:42.893580309-04:00 2024-08-01T17:58:42.893Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "3f6c7ae6-d381-48a0-8bc7-1fe877440dab", "replset": "rs0", "size": 3, "pods": 1}
2024-08-01T13:58:44.658645715-04:00 2024-08-01T17:58:44.658Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "7506e4c5-e59d-4c42-a423-a94456e5b26c", "replset": "rs0", "size": 3, "pods": 1}
2024-08-01T13:58:46.337067374-04:00 2024-08-01T17:58:46.336Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "723783ee-99a1-4bc6-a2e8-424f3c318845", "replset": "rs0", "size": 3, "pods": 1}
2024-08-01T13:58:51.308303487-04:00 2024-08-01T17:58:51.307Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "2e8f4e1e-f4c6-443f-b476-c7e02b761504", "replset": "rs0", "size": 3, "pods": 1}
2024-08-01T13:58:57.962204631-04:00 2024-08-01T17:58:57.961Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "2bdb8728-5ed3-4a50-827e-d27ef252dab1", "replset": "rs0", "size": 3, "pods": 1}
2024-08-01T13:59:04.604879319-04:00 2024-08-01T17:59:04.604Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "76e8a774-c13d-45c5-babd-dd2e4c521c01", "replset": "rs0", "size": 3, "pods": 1}
2024-08-01T13:59:11.307682203-04:00 2024-08-01T17:59:11.307Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "fa9e18ad-50fb-4537-88e7-44566a81db88", "replset": "rs0", "size": 3, "pods": 2}
2024-08-01T13:59:15.414887733-04:00 2024-08-01T17:59:15.414Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "43d149dc-931f-40f0-a6c3-01554a4a6ee7", "replset": "rs0", "size": 3, "pods": 2}
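
For anyone skimming the operator log above, a small sketch of how the "Waiting for the pods" lines could be summarised. It assumes the tab-separated layout seen in the excerpt (timestamp, level, message, JSON payload). The helper names `parse_operator_line` and `pod_progress` are made up for illustration, and the sample payloads are trimmed copies of the lines above, not the full records.

```python
import json


def parse_operator_line(line):
    """Split one operator log line into (message, JSON fields)."""
    # Assumed layout: <timestamp(s)> \t <level> \t <message> \t <json payload>
    parts = line.rstrip("\n").split("\t")
    return parts[-2], json.loads(parts[-1])


def pod_progress(lines):
    """Distinct "pods" values from "Waiting for the pods" lines, in order seen."""
    seen = []
    for line in lines:
        message, fields = parse_operator_line(line)
        if message == "Waiting for the pods" and fields["pods"] not in seen:
            seen.append(fields["pods"])
    return seen


# Trimmed sample lines (payloads shortened from the excerpt above).
sample = [
    '2024-08-01T17:58:37.510Z\tINFO\tWaiting for the pods\t{"replset": "rs0", "size": 3, "pods": 1}',
    '2024-08-01T17:58:40.664Z\tINFO\tCluster state changed\t{"previous": "", "current": "initializing"}',
    '2024-08-01T17:59:11.307Z\tINFO\tWaiting for the pods\t{"replset": "rs0", "size": 3, "pods": 2}',
    '2024-08-01T17:59:15.414Z\tINFO\tWaiting for the pods\t{"replset": "rs0", "size": 3, "pods": 2}',
]

print(pod_progress(sample))
```

In the excerpt the reported `pods` value moves from 1 to 2, and no line shown reports 3 for a replset of `size` 3.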

yaml


  name: minimal-cluster
  namespace: mongodb
  resourceVersion: '953812129'
  uid: 356137d2-aaec-4278-9fca-2076f7a33933
spec:
  crVersion: 1.16.2
  image: percona/percona-server-mongodb:7.0.8-5
  replsets:
    - expose:
        enabled: true
        exposeType: LoadBalancer
      name: rs0
      size: 3
      volumeSpec:
        persistentVolumeClaim:
          resources:
            requests:
              storage: 3Gi
  secrets:
    users: production-users
  sharding:
    enabled: false
  unmanaged: true
  upgradeOptions:
    apply: disabled
    schedule: 0 2 * * *

@spron-in
Collaborator

spron-in commented Aug 1, 2024

Can you show the output of:
kubectl get psmdb
kubectl get pods
kubectl describe pod PODNAME - where PODNAME is the one that is stuck in init.

If I can't figure it out from that, I might need to see the logs of the replica set Pod itself.

@jonathon2nd
Author

I have already provided a log from a replica pod:

minimal-cluster-rs0-0_mongod.log
One of the mongo logs from start to crash/shutdown

jonathon@jonathon-framework:~$ kubectl get psmdb -n mongodb
NAME              ENDPOINT                                          STATUS         AGE
minimal-cluster   10.1.9.54:27017,10.1.9.55:27017,10.1.9.56:27017   initializing   15m
jonathon@jonathon-framework:~$ kubectl get pods -n mongodb
NAME                    READY   STATUS    RESTARTS        AGE
minimal-cluster-rs0-0   1/1     Running   4 (2m58s ago)   15m
minimal-cluster-rs0-1   1/1     Running   4 (2m32s ago)   14m
minimal-cluster-rs0-2   1/1     Running   4 (2m10s ago)   14m
jonathon@jonathon-framework:~$ kubectl  describe pod minimal-cluster-rs0-0 -n mongodb
Name:             minimal-cluster-rs0-0
Namespace:        mongodb
Priority:         0
Service Account:  default
Node:             ovbh-vprod-k8s04-worker03/10.1.8.155
Start Time:       Thu, 01 Aug 2024 10:58:42 -0700
Labels:           app.kubernetes.io/component=mongod
                  app.kubernetes.io/instance=minimal-cluster
                  app.kubernetes.io/managed-by=percona-server-mongodb-operator
                  app.kubernetes.io/name=percona-server-mongodb
                  app.kubernetes.io/part-of=percona-server-mongodb
                  app.kubernetes.io/replset=rs0
                  controller-revision-hash=minimal-cluster-rs0-659865b7d4
                  statefulset.kubernetes.io/pod-name=minimal-cluster-rs0-0
Annotations:      cni.projectcalico.org/containerID: 057b14a0ecbc739f3d1ca0ad22c95a28891b3281f58e4e6e995e9051f9237e82
                  cni.projectcalico.org/podIP: 10.42.175.36/32
                  cni.projectcalico.org/podIPs: 10.42.175.36/32
                  percona.com/ssl-hash: 18dd5ab83c340a20c75a82b7258bf44a
                  percona.com/ssl-internal-hash: 4a9f6b2c937f20eee7990e0f9d82e6a8
Status:           Running
IP:               10.42.175.36
IPs:
  IP:           10.42.175.36
Controlled By:  StatefulSet/minimal-cluster-rs0
Init Containers:
  mongo-init:
    Container ID:  containerd://a902039fcce27daf5d72cdfbe8cdeb7a4210da1187dda6130bebd739c8e35162
    Image:         percona/percona-server-mongodb-operator:1.16.2
    Image ID:      docker.io/percona/percona-server-mongodb-operator@sha256:75abf186c7eb7b5ee56ed209f4a929b10b8bb6a7810f9f892694f93861d1dd14
    Port:          <none>
    Host Port:     <none>
    Command:
      /init-entrypoint.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 01 Aug 2024 10:58:54 -0700
      Finished:     Thu, 01 Aug 2024 10:58:54 -0700
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /data/db from mongod-data (rw)
      /opt/percona from bin (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-42r78 (ro)
Containers:
  mongod:
    Container ID:  containerd://9b655b1539229d81787557165409537958556a5062fcf355d9e63d6954d76d29
    Image:         percona/percona-server-mongodb:7.0.8-5
    Image ID:      docker.io/percona/percona-server-mongodb@sha256:f81d1353d5497c5be36ee525f742d498ee6e1df9aba9502660c50f0fc98743b6
    Port:          27017/TCP
    Host Port:     0/TCP
    Command:
      /opt/percona/ps-entry.sh
    Args:
      --bind_ip_all
      --auth
      --dbpath=/data/db
      --port=27017
      --replSet=rs0
      --storageEngine=wiredTiger
      --relaxPermChecks
      --sslAllowInvalidCertificates
      --clusterAuthMode=x509
      --tlsMode=preferTLS
      --enableEncryption
      --encryptionKeyFile=/etc/mongodb-encryption/encryption-key
      --wiredTigerIndexPrefixCompression=true
      --quiet
    State:          Running
      Started:      Thu, 01 Aug 2024 11:11:03 -0700
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 01 Aug 2024 11:08:03 -0700
      Finished:     Thu, 01 Aug 2024 11:11:02 -0700
    Ready:          True
    Restart Count:  4
    Liveness:       exec [/opt/percona/mongodb-healthcheck k8s liveness --ssl --sslInsecure --sslCAFile /etc/mongodb-ssl/ca.crt --sslPEMKeyFile /tmp/tls.pem --startupDelaySeconds 7200] delay=60s timeout=10s period=30s #success=1 #failure=4
    Readiness:      exec [/opt/percona/mongodb-healthcheck k8s readiness --component mongod] delay=10s timeout=2s period=3s #success=1 #failure=8
    Environment Variables from:
      internal-minimal-cluster-users  Secret  Optional: false
    Environment:
      SERVICE_NAME:     minimal-cluster
      NAMESPACE:        mongodb
      MONGODB_PORT:     27017
      MONGODB_REPLSET:  rs0
    Mounts:
      /data/db from mongod-data (rw)
      /etc/mongodb-encryption from minimal-cluster-mongodb-encryption-key (ro)
      /etc/mongodb-secrets from minimal-cluster-mongodb-keyfile (ro)
      /etc/mongodb-ssl from ssl (ro)
      /etc/mongodb-ssl-internal from ssl-internal (ro)
      /etc/users-secret from users-secret-file (rw)
      /opt/percona from bin (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-42r78 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  mongod-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  mongod-data-minimal-cluster-rs0-0
    ReadOnly:   false
  minimal-cluster-mongodb-keyfile:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  minimal-cluster-mongodb-keyfile
    Optional:    false
  bin:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  minimal-cluster-mongodb-encryption-key:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  minimal-cluster-mongodb-encryption-key
    Optional:    false
  ssl:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  minimal-cluster-ssl
    Optional:    false
  ssl-internal:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  minimal-cluster-ssl-internal
    Optional:    true
  users-secret-file:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  internal-minimal-cluster-users
    Optional:    false
  kube-api-access-42r78:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               13m                   default-scheduler        Successfully assigned mongodb/minimal-cluster-rs0-0 to ovbh-vprod-k8s04-worker03
  Normal   SuccessfulAttachVolume  13m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-b0a37fa3-322f-4f0f-b5ab-c3822892debe"
  Normal   Pulling                 13m                   kubelet                  Pulling image "percona/percona-server-mongodb-operator:1.16.2"
  Normal   Pulled                  13m                   kubelet                  Successfully pulled image "percona/percona-server-mongodb-operator:1.16.2" in 249.375832ms (249.47691ms including waiting)
  Normal   Created                 13m                   kubelet                  Created container mongo-init
  Normal   Started                 13m                   kubelet                  Started container mongo-init
  Normal   Pulled                  13m                   kubelet                  Successfully pulled image "percona/percona-server-mongodb:7.0.8-5" in 228.966574ms (228.977485ms including waiting)
  Normal   Pulled                  10m                   kubelet                  Successfully pulled image "percona/percona-server-mongodb:7.0.8-5" in 222.54261ms (222.554789ms including waiting)
  Normal   Killing                 7m16s (x2 over 10m)   kubelet                  Container mongod failed liveness probe, will be restarted
  Normal   Pulling                 7m15s (x3 over 13m)   kubelet                  Pulling image "percona/percona-server-mongodb:7.0.8-5"
  Normal   Created                 7m15s (x3 over 13m)   kubelet                  Created container mongod
  Normal   Pulled                  7m15s                 kubelet                  Successfully pulled image "percona/percona-server-mongodb:7.0.8-5" in 215.427972ms (215.446863ms including waiting)
  Normal   Started                 7m14s (x3 over 13m)   kubelet                  Started container mongod
  Warning  Unhealthy               2m46s (x13 over 11m)  kubelet                  Liveness probe failed: command "/opt/percona/mongodb-healthcheck k8s liveness --ssl --sslInsecure --sslCAFile /etc/mongodb-ssl/ca.crt --sslPEMKeyFile /tmp/tls.pem --startupDelaySeconds 7200" timed out

@spron-in
Collaborator

spron-in commented Aug 1, 2024

My hypothesis is that you need to form a replica set for the liveness probe to pass. A standalone unmanaged node will fail the liveness probe.
What happens when you try to add these nodes into a replica set?
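
For reference, a minimal sketch of what adding one of the new nodes could look like in mongosh, run against the primary of the existing replica set. `rs.add()` is the standard mongosh helper. The helper name `newMemberDoc`, the placeholder host, and the choice of `priority: 0` and `votes: 0` (so the new node cannot be elected or vote while it syncs) are assumptions for illustration, not something confirmed in this thread.

```javascript
// Hypothetical helper: builds the member document passed to rs.add().
// priority 0 / votes 0 are an assumption: they keep the new node from
// being elected or voting while it performs its initial sync.
function newMemberDoc(host) {
  return { host: host, priority: 0, votes: 0 };
}

// In mongosh, connected to the primary of the existing replica set:
//   rs.add(newMemberDoc("<new-node-host>:27017"));
//   rs.status().members.map((m) => [m.name, m.stateStr]);
```

The host must be an address that the existing members can reach, which is what the operator's "Replset is not exposed" message is warning about.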

@jonathon2nd
Author

I am getting auth errors.
I am not able to fix them, because attempting to log in to the new nodes with MONGODB_DATABASE_ADMIN_USER fails.
I have set up the users secret and am using that.

@spron-in
Collaborator

spron-in commented Aug 1, 2024

What errors are you getting on the "managed" side?

@jonathon2nd
Author

This is from rs.status(). What else should I be looking at?

{
--
_id: 2,
name: '10.1.9.54:27017',
health: 0,
state: 6,
stateStr: '(not reachable/healthy)',
uptime: 0,
optime: [Object],
optimeDurable: [Object],
optimeDate: 1970-01-01T00:00:00.000Z,
optimeDurableDate: 1970-01-01T00:00:00.000Z,
lastHeartbeat: 2024-08-01T18:49:27.616Z,
lastHeartbeatRecv: 1970-01-01T00:00:00.000Z,
pingMs: Long("0"),
authenticated: false,
configVersion: -1
}
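
A small sketch for pulling the relevant fields out of the full `rs.status()` result rather than reading one member at a time. The function name `unhealthyMembers` is hypothetical. The fields it reads (`members`, `name`, `health`, `stateStr`, `authenticated`, `lastHeartbeatMessage`) are the ones `rs.status()` reports per member. `lastHeartbeatMessage` is often where the reason for a failed heartbeat shows up, though it may be empty.

```javascript
// Hypothetical helper: list members the current node cannot reach or
// authenticate with. `status` is the document returned by rs.status().
function unhealthyMembers(status) {
  return (status.members || [])
    .filter((m) => m.health !== 1 || m.authenticated === false)
    .map((m) => ({
      name: m.name,
      stateStr: m.stateStr,
      authenticated: m.authenticated,
      // Empty string when the server reported no heartbeat message.
      lastHeartbeatMessage: m.lastHeartbeatMessage || "",
    }));
}

// In mongosh: unhealthyMembers(rs.status());
```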

@jonathon2nd
Author

I am not able to connect to the new nodes with any credentials from the pods' command line.

@jonathon2nd
Author

The operator is still waiting for the pods

2024-08-01T18:32:27.829Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"test-mongodb","namespace":"mongodb"}, "namespace": "mongodb", "name": "test-mongodb", "reconcileID": "856bb0f5-a482-479b-9148-dc10cf78c425", "replset": "rs0", "size": 3, "pods": 2}

I suspect that the operator has not added the users to the nodes yet.

@jonathon2nd
Author

Yes, that seems to be what is happening: the operator does not set up the users until later in the setup process.

Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "f794e4c3-cfc2-4181-8021-86936d867c4e", "replset": "rs0", "size": 3, "pods": 2}
2024-08-01T19:07:21.431Z	INFO	Waiting for the pods	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "1c2e85c8-ea22-4331-b0a5-68f11417f704", "replset": "rs0", "size": 3, "pods": 2}
2024-08-01T19:07:37.012Z	INFO	initiating replset	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "74f8b860-3eab-46ff-a6b6-d721f9fd531e", "replset": "rs0", "pod": "minimal-cluster-rs0-0"}
2024-08-01T19:07:45.353Z	INFO	replset initialized	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "74f8b860-3eab-46ff-a6b6-d721f9fd531e", "replset": "rs0", "pod": "minimal-cluster-rs0-0"}
2024-08-01T19:07:47.928Z	INFO	Fixing member tags	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "764764d0-22b2-4d60-b266-d7ed196131b4", "replset": "rs0"}
2024-08-01T19:07:47.933Z	INFO	Adding new nodes	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "764764d0-22b2-4d60-b266-d7ed196131b4", "replset": "rs0"}
2024-08-01T19:07:47.959Z	INFO	Configuring member votes and priorities	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "764764d0-22b2-4d60-b266-d7ed196131b4", "replset": "rs0"}
2024-08-01T19:07:51.548Z	INFO	Adding new nodes	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "1cde9c98-3e40-484b-87e4-0cfd6965c693", "replset": "rs0"}
2024-08-01T19:07:51.570Z	INFO	Configuring member votes and priorities	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "1cde9c98-3e40-484b-87e4-0cfd6965c693", "replset": "rs0"}
2024-08-01T19:08:03.304Z	INFO	Cluster state changed	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "9808766e-49eb-4515-89c7-b7d5bfd5d37a", "previous": "initializing", "current": "ready"}
2024-08-01T19:08:04.098Z	INFO	Secret data changed. Updating users...	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "d8347221-aaa5-40b4-93a3-17c4488df241"}
2024-08-01T19:08:04.658Z	INFO	update Mongo version to 7.0.8-5 (fetched from db)	{"controller": "psmdb-controller", "object": {"name":"minimal-cluster","namespace":"mongodb"}, "namespace": "mongodb", "name": "minimal-cluster", "reconcileID": "d8347221-aaa5-40b4-93a3-17c4488df241"}

The log above is from a run where the cluster is not unmanaged.
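
To make the ordering in operator output like this easier to see, here is a rough sketch that reduces each log line to its timestamp and message. It assumes the fields are tab-separated (timestamp, level, message, JSON), which is how the lines above appear. The function names are hypothetical, and lines that do not match that shape (for example one with its timestamp cut off) are skipped.

```javascript
// Hypothetical helper: parse one operator log line of the assumed form
// <timestamp>\t<LEVEL>\t<message>\t<JSON fields>
function parseOperatorLogLine(line) {
  const parts = line.split("\t");
  if (parts.length < 4) return null; // not a complete log line
  const [timestamp, level, message, ...rest] = parts;
  try {
    // Re-join in case the JSON payload itself contained a tab.
    const fields = JSON.parse(rest.join("\t"));
    return { timestamp, level, message, fields };
  } catch (err) {
    return null; // JSON payload missing or truncated
  }
}

// Reduce a whole log to "timestamp message" lines, in order.
function messageTimeline(logText) {
  return logText
    .split("\n")
    .map(parseOperatorLogLine)
    .filter(Boolean)
    .map((e) => `${e.timestamp} ${e.message}`);
}
```

Applied to the log above, the timeline would show "Secret data changed. Updating users..." only after "Cluster state changed", which matches the observation that users are created late in the setup.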

@jonathon2nd
Author

Any ideas @spron-in ?
