
error: unable to retrieve the complete list of server APIs: packages.operators.coreos.com/v1: the server is currently unable to handle the request #2001

Closed
judexzhu opened this issue Feb 10, 2021 · 11 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments


judexzhu commented Feb 10, 2021

Bug Report

What did you do?

git clone https://github.com/operator-framework/operator-lifecycle-manager.git

~ # k apply -f deploy/upstream/quickstart/crds.yaml
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/clusterserviceversions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com created

# wait a long time, over 5 mins 
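# (a deterministic alternative to a fixed wait -- just a sketch, assuming the CRDs applied above:)
# k wait --for=condition=established crd --all --timeout=120s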

~ #  k apply -f deploy/upstream/quickstart/olm.yaml
namespace/olm created
namespace/operators created
serviceaccount/olm-operator-serviceaccount created
clusterrole.rbac.authorization.k8s.io/system:controller:operator-lifecycle-manager created
clusterrolebinding.rbac.authorization.k8s.io/olm-operator-binding-olm created
deployment.apps/olm-operator created
deployment.apps/catalog-operator created
clusterrole.rbac.authorization.k8s.io/aggregate-olm-edit created
clusterrole.rbac.authorization.k8s.io/aggregate-olm-view created
operatorgroup.operators.coreos.com/global-operators created
operatorgroup.operators.coreos.com/olm-operators created
clusterserviceversion.operators.coreos.com/packageserver created
catalogsource.operators.coreos.com/operatorhubio-catalog created

What did you expect to see?
clusterserviceversion.operators.coreos.com/packageserver installs successfully and packages.operators.coreos.com/v1 works.
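
A quick way to confirm that state (a sketch, using the resource names from the manifests above):

k get csv packageserver -n olm                        # PHASE should be Succeeded
k get apiservice v1.packages.operators.coreos.com     # AVAILABLE should be True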

What did you see instead? Under which circumstances?

➜  olm git:(master) ✗ k api-resources
NAME                              SHORTNAMES   APIGROUP                       NAMESPACED   KIND
....
error: unable to retrieve the complete list of server APIs: packages.operators.coreos.com/v1: the server is currently unable to handle the request

From the API server logs (repeating):

...
W0210 20:19:32.418237       1 handler_proxy.go:102] no RequestInfo found in the context
E0210 20:19:32.418496       1 controller.go:114] loading OpenAPI spec for "v1.packages.operators.coreos.com" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0210 20:19:32.418523       1 controller.go:127] OpenAPI AggregationController: action for item v1.packages.operators.coreos.com: Rate Limited Requeue.
E0210 20:19:36.420936       1 available_controller.go:420] v1.packages.operators.coreos.com failed with: failing or missing response from https://10.218.33.124:5443/apis/packages.operators.coreos.com/v1: Get https://10.218.33.124:5443/apis/packages.operators.coreos.com/v1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
...


~ k get ev -n olm | grep clusterserviceversion/packageserver
17m         Normal    RequirementsUnknown   clusterserviceversion/packageserver      requirements not yet checked
6m58s       Normal    AllRequirementsMet    clusterserviceversion/packageserver      all requirements found, attempting install
6m58s       Normal    InstallSucceeded      clusterserviceversion/packageserver      waiting for install components to report healthy
6m57s       Normal    InstallWaiting        clusterserviceversion/packageserver      APIServices not installed
118s        Warning   InstallCheckFailed    clusterserviceversion/packageserver      install timeout
118s        Normal    NeedsReinstall        clusterserviceversion/packageserver      APIServices not installed
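
The APIService's Available condition and the packageserver pods usually show where this breaks; a minimal check (a sketch; the app=packageserver label comes from the CSV deployment spec below):

k get apiservice v1.packages.operators.coreos.com -o jsonpath='{.status.conditions[?(@.type=="Available")]}'
k get pods -n olm -l app=packageserver -o wide
k get endpoints -n olm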

Environment

  • operator-lifecycle-manager version:

0.17.0

  • Kubernetes version information:
~ k version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:58:53Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.12", GitCommit:"7cd5e9086de8ae25d6a1514d0c87bac67ca4a481", GitTreeState:"clean", BuildDate:"2020-11-12T09:11:15Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster kind:
    Vanilla Kubernetes deployed on bare metal via Ansible, using hyperkube on Flatcar Linux OS

Possible Solution
N/A; tried the 0.16.1 install.sh, same result.

Additional context

~ k get clusterserviceversion/packageserver -n olm -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"operators.coreos.com/v1alpha1","kind":"ClusterServiceVersion","metadata":{"annotations":{},"labels":{"olm.version":"0.17.0"},"name":"packageserver","namespace":"olm"},"spec":{"apiservicedefinitions":{"owned":[{"containerPort":5443,"deploymentName":"packageserver","description":"A PackageManifest is a resource generated from existing CatalogSources and their ConfigMaps","displayName":"PackageManifest","group":"packages.operators.coreos.com","kind":"PackageManifest","name":"packagemanifests","version":"v1"}]},"description":"Represents an Operator package that is available from a given CatalogSource which will resolve to a ClusterServiceVersion.","displayName":"Package Server","install":{"spec":{"clusterPermissions":[{"rules":[{"apiGroups":["authorization.k8s.io"],"resources":["subjectaccessreviews"],"verbs":["create","get"]},{"apiGroups":[""],"resources":["configmaps"],"verbs":["get","list","watch"]},{"apiGroups":["operators.coreos.com"],"resources":["catalogsources"],"verbs":["get","list","watch"]},{"apiGroups":["packages.operators.coreos.com"],"resources":["packagemanifests"],"verbs":["get","list"]}],"serviceAccountName":"olm-operator-serviceaccount"}],"deployments":[{"name":"packageserver","spec":{"replicas":2,"selector":{"matchLabels":{"app":"packageserver"}},"strategy":{"type":"RollingUpdate"},"template":{"metadata":{"labels":{"app":"packageserver"}},"spec":{"containers":[{"command":["/bin/package-server","-v=4","--secure-port","5443","--global-namespace","olm"],"image":"quay.io/operator-framework/olm@sha256:de396b540b82219812061d0d753440d5655250c621c753ed1dc67d6154741607","imagePullPolicy":"Always","livenessProbe":{"httpGet":{"path":"/healthz","port":5443,"scheme":"HTTPS"}},"name":"packageserver","ports":[{"containerPort":5443}],"readinessProbe":{"httpGet":{"path":"/healthz","port":5443,"scheme":"HTTPS"}},"resources":{"requests":{"cpu":"10m","memory":"50Mi"}},"securityContext":{"runAsUser":1000},"terminationMessagePolicy":"FallbackToLogsOnError","volumeMounts":[{"mountPath":"/tmp","name":"tmpfs"}]}],"nodeSelector":{"kubernetes.io/os":"linux"},"serviceAccountName":"olm-operator-serviceaccount","volumes":[{"emptyDir":{},"name":"tmpfs"}]}}}}]},"strategy":"deployment"},"installModes":[{"supported":true,"type":"OwnNamespace"},{"supported":true,"type":"SingleNamespace"},{"supported":true,"type":"MultiNamespace"},{"supported":true,"type":"AllNamespaces"}],"keywords":["packagemanifests","olm","packages"],"links":[{"name":"Package Server","url":"https://github.com/operator-framework/operator-lifecycle-manager/tree/master/pkg/package-server"}],"maintainers":[{"email":"[email protected]","name":"Red Hat"}],"maturity":"alpha","minKubeVersion":"1.11.0","provider":{"name":"Red Hat"},"version":"0.17.0"}}
    olm.operatorGroup: olm-operators
    olm.operatorNamespace: olm
    olm.targetNamespaces: olm
  creationTimestamp: "2021-02-10T20:07:23Z"
  generation: 2
  labels:
    olm.api.4bca9f23e412d79d: provided
    olm.version: 0.17.0
  managedFields:
  - apiVersion: operators.coreos.com/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
        f:labels:
          .: {}
          f:olm.version: {}
      f:spec:
        .: {}
        f:apiservicedefinitions:
          .: {}
          f:owned: {}
        f:description: {}
        f:displayName: {}
        f:install:
          .: {}
          f:spec:
            .: {}
            f:clusterPermissions: {}
          f:strategy: {}
        f:installModes: {}
        f:keywords: {}
        f:links: {}
        f:maintainers: {}
        f:maturity: {}
        f:minKubeVersion: {}
        f:provider:
          .: {}
          f:name: {}
        f:version: {}
    manager: kubectl
    operation: Update
    time: "2021-02-10T20:07:23Z"
  - apiVersion: operators.coreos.com/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:olm.operatorGroup: {}
          f:olm.operatorNamespace: {}
          f:olm.targetNamespaces: {}
        f:labels:
          f:olm.api.4bca9f23e412d79d: {}
      f:spec:
        f:customresourcedefinitions: {}
        f:install:
          f:spec:
            f:deployments: {}
      f:status:
        .: {}
        f:certsLastUpdated: {}
        f:certsRotateAt: {}
        f:conditions: {}
        f:lastTransitionTime: {}
        f:lastUpdateTime: {}
        f:message: {}
        f:phase: {}
        f:reason: {}
        f:requirementStatus: {}
    manager: olm
    operation: Update
    time: "2021-02-10T20:28:39Z"
  name: packageserver
  namespace: olm
  resourceVersion: "7929990"
  selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/olm/clusterserviceversions/packageserver
  uid: fbba0eeb-6543-4acf-926e-0d9ab66ce3d3
spec:
  apiservicedefinitions:
    owned:
    - containerPort: 5443
      deploymentName: packageserver
      description: A PackageManifest is a resource generated from existing CatalogSources
        and their ConfigMaps
      displayName: PackageManifest
      group: packages.operators.coreos.com
      kind: PackageManifest
      name: packagemanifests
      version: v1
  customresourcedefinitions: {}
  description: Represents an Operator package that is available from a given CatalogSource
    which will resolve to a ClusterServiceVersion.
  displayName: Package Server
  install:
    spec:
      clusterPermissions:
      - rules:
        - apiGroups:
          - authorization.k8s.io
          resources:
          - subjectaccessreviews
          verbs:
          - create
          - get
        - apiGroups:
          - ""
          resources:
          - configmaps
          verbs:
          - get
          - list
          - watch
        - apiGroups:
          - operators.coreos.com
          resources:
          - catalogsources
          verbs:
          - get
          - list
          - watch
        - apiGroups:
          - packages.operators.coreos.com
          resources:
          - packagemanifests
          verbs:
          - get
          - list
        serviceAccountName: olm-operator-serviceaccount
      deployments:
      - name: packageserver
        spec:
          replicas: 2
          selector:
            matchLabels:
              app: packageserver
          strategy:
            type: RollingUpdate
          template:
            metadata:
              creationTimestamp: null
              labels:
                app: packageserver
            spec:
              containers:
              - command:
                - /bin/package-server
                - -v=4
                - --secure-port
                - "5443"
                - --global-namespace
                - olm
                image: quay.io/operator-framework/olm@sha256:de396b540b82219812061d0d753440d5655250c621c753ed1dc67d6154741607
                imagePullPolicy: Always
                livenessProbe:
                  httpGet:
                    path: /healthz
                    port: 5443
                    scheme: HTTPS
                name: packageserver
                ports:
                - containerPort: 5443
                  protocol: TCP
                readinessProbe:
                  httpGet:
                    path: /healthz
                    port: 5443
                    scheme: HTTPS
                resources:
                  requests:
                    cpu: 10m
                    memory: 50Mi
                securityContext:
                  runAsUser: 1000
                terminationMessagePolicy: FallbackToLogsOnError
                volumeMounts:
                - mountPath: /tmp
                  name: tmpfs
              nodeSelector:
                kubernetes.io/os: linux
              serviceAccountName: olm-operator-serviceaccount
              volumes:
              - emptyDir: {}
                name: tmpfs
    strategy: deployment
  installModes:
  - supported: true
    type: OwnNamespace
  - supported: true
    type: SingleNamespace
  - supported: true
    type: MultiNamespace
  - supported: true
    type: AllNamespaces
  keywords:
  - packagemanifests
  - olm
  - packages
  links:
  - name: Package Server
    url: https://github.com/operator-framework/operator-lifecycle-manager/tree/master/pkg/package-server
  maintainers:
  - email: [email protected]
    name: Red Hat
  maturity: alpha
  minKubeVersion: 1.11.0
  provider:
    name: Red Hat
  version: 0.17.0
status:
  certsLastUpdated: "2021-02-10T20:27:28Z"
  certsRotateAt: "2023-02-09T20:27:28Z"
  conditions:
  - lastTransitionTime: "2021-02-10T20:12:25Z"
    lastUpdateTime: "2021-02-10T20:12:25Z"
    message: install timeout
    phase: Failed
    reason: InstallCheckFailed
  - lastTransitionTime: "2021-02-10T20:12:25Z"
    lastUpdateTime: "2021-02-10T20:12:25Z"
    message: APIServices not installed
    phase: Pending
    reason: NeedsReinstall
  - lastTransitionTime: "2021-02-10T20:12:26Z"
    lastUpdateTime: "2021-02-10T20:12:26Z"
    message: all requirements found, attempting install
    phase: InstallReady
    reason: AllRequirementsMet
  - lastTransitionTime: "2021-02-10T20:12:26Z"
    lastUpdateTime: "2021-02-10T20:12:26Z"
    message: waiting for install components to report healthy
    phase: Installing
    reason: InstallSucceeded
  - lastTransitionTime: "2021-02-10T20:12:26Z"
    lastUpdateTime: "2021-02-10T20:12:28Z"
    message: APIServices not installed
    phase: Installing
    reason: InstallWaiting
  - lastTransitionTime: "2021-02-10T20:17:26Z"
    lastUpdateTime: "2021-02-10T20:17:26Z"
    message: install timeout
    phase: Failed
    reason: InstallCheckFailed
  - lastTransitionTime: "2021-02-10T20:17:26Z"
    lastUpdateTime: "2021-02-10T20:17:26Z"
    message: APIServices not installed
    phase: Pending
    reason: NeedsReinstall
  - lastTransitionTime: "2021-02-10T20:17:26Z"
    lastUpdateTime: "2021-02-10T20:17:26Z"
    message: all requirements found, attempting install
    phase: InstallReady
    reason: AllRequirementsMet
  - lastTransitionTime: "2021-02-10T20:17:27Z"
    lastUpdateTime: "2021-02-10T20:17:27Z"
    message: waiting for install components to report healthy
    phase: Installing
    reason: InstallSucceeded
  - lastTransitionTime: "2021-02-10T20:17:27Z"
    lastUpdateTime: "2021-02-10T20:17:28Z"
    message: APIServices not installed
    phase: Installing
    reason: InstallWaiting
  - lastTransitionTime: "2021-02-10T20:22:26Z"
    lastUpdateTime: "2021-02-10T20:22:26Z"
    message: install timeout
    phase: Failed
    reason: InstallCheckFailed
  - lastTransitionTime: "2021-02-10T20:22:27Z"
    lastUpdateTime: "2021-02-10T20:22:27Z"
    message: APIServices not installed
    phase: Pending
    reason: NeedsReinstall
  - lastTransitionTime: "2021-02-10T20:22:27Z"
    lastUpdateTime: "2021-02-10T20:22:27Z"
    message: all requirements found, attempting install
    phase: InstallReady
    reason: AllRequirementsMet
  - lastTransitionTime: "2021-02-10T20:22:27Z"
    lastUpdateTime: "2021-02-10T20:22:27Z"
    message: waiting for install components to report healthy
    phase: Installing
    reason: InstallSucceeded
  - lastTransitionTime: "2021-02-10T20:22:27Z"
    lastUpdateTime: "2021-02-10T20:22:29Z"
    message: APIServices not installed
    phase: Installing
    reason: InstallWaiting
  - lastTransitionTime: "2021-02-10T20:27:26Z"
    lastUpdateTime: "2021-02-10T20:27:26Z"
    message: install timeout
    phase: Failed
    reason: InstallCheckFailed
  - lastTransitionTime: "2021-02-10T20:27:27Z"
    lastUpdateTime: "2021-02-10T20:27:27Z"
    message: APIServices not installed
    phase: Pending
    reason: NeedsReinstall
  - lastTransitionTime: "2021-02-10T20:27:27Z"
    lastUpdateTime: "2021-02-10T20:27:27Z"
    message: all requirements found, attempting install
    phase: InstallReady
    reason: AllRequirementsMet
  - lastTransitionTime: "2021-02-10T20:27:27Z"
    lastUpdateTime: "2021-02-10T20:27:27Z"
    message: waiting for install components to report healthy
    phase: Installing
    reason: InstallSucceeded
  - lastTransitionTime: "2021-02-10T20:27:27Z"
    lastUpdateTime: "2021-02-10T20:27:29Z"
    message: APIServices not installed
    phase: Installing
    reason: InstallWaiting
  lastTransitionTime: "2021-02-10T20:27:27Z"
  lastUpdateTime: "2021-02-10T20:27:29Z"
  message: APIServices not installed
  phase: Installing
  reason: InstallWaiting
  requirementStatus:
  - group: operators.coreos.com
    kind: ClusterServiceVersion
    message: CSV minKubeVersion (1.11.0) less than server version (v1.18.12)
    name: packageserver
    status: Present
    version: v1alpha1
  - group: apiregistration.k8s.io
    kind: APIService
    message: ""
    name: v1.packages.operators.coreos.com
    status: DeploymentFound
    version: v1
  - dependents:
    - group: rbac.authorization.k8s.io
      kind: PolicyRule
      message: cluster rule:{"verbs":["create","get"],"apiGroups":["authorization.k8s.io"],"resources":["subjectaccessreviews"]}
      status: Satisfied
      version: v1
    - group: rbac.authorization.k8s.io
      kind: PolicyRule
      message: cluster rule:{"verbs":["get","list","watch"],"apiGroups":[""],"resources":["configmaps"]}
      status: Satisfied
      version: v1
    - group: rbac.authorization.k8s.io
      kind: PolicyRule
      message: cluster rule:{"verbs":["get","list","watch"],"apiGroups":["operators.coreos.com"],"resources":["catalogsources"]}
      status: Satisfied
      version: v1
    - group: rbac.authorization.k8s.io
      kind: PolicyRule
      message: cluster rule:{"verbs":["get","list"],"apiGroups":["packages.operators.coreos.com"],"resources":["packagemanifests"]}
      status: Satisfied
      version: v1
    group: ""
    kind: ServiceAccount
    message: ""
    name: olm-operator-serviceaccount
    status: Present
    version: v1
judexzhu added the kind/bug label on Feb 10, 2021
@judexzhu (Author)

Update:

The issue went away after I upgraded my cluster from v1.18.12 to v1.19.7.

I don't know what caused the issue, the Kubernetes version or hyperkube.

Hyperkube was completely dropped in v1.19.7, so I replaced it with binaries and separate Kubernetes component Docker images.

k get clusterserviceversion -n olm
NAME            DISPLAY          VERSION   REPLACES   PHASE
packageserver   Package Server   0.17.0               Succeeded

Feel free to close this issue; if you need more information, I'll be glad to help.

Thanks

exdx (Member) commented Mar 26, 2021

Thanks @judexzhu, I will close this since it looks like your issue has been resolved. Packageserver, like any APIService, can have intermittent connectivity issues with the api-server (or the api-server may not be configured to handle APIServices correctly). It should recover, but in some cases it may not. It may be hyperkube related.

@exdx exdx closed this as completed Mar 26, 2021

SoMuchForSubtlety commented Aug 21, 2021

I am experiencing the same issue on a k3s cluster. It's blocking me from deleting namespaces.

status:
  conditions:
  - lastTransitionTime: "2021-08-21T19:55:14Z"
    message: 'Discovery failed for some groups, 1 failing: unable to retrieve the
      complete list of server APIs: packages.operators.coreos.com/v1: the server is
      currently unable to handle the request'
    reason: DiscoveryFailed
    status: "True"
    type: NamespaceDeletionDiscoveryFailure

Running kubectl delete apiservices.apiregistration.k8s.io v1.packages.operators.coreos.com "fixes" the issue.
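
One way to confirm that namespace deletion then unblocks (a sketch; <stuck-namespace> is a placeholder):

kubectl get namespace <stuck-namespace> -o jsonpath='{.status.conditions}'   # the NamespaceDeletionDiscoveryFailure condition should clear, then the namespace disappears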


ddevaal commented Aug 25, 2021

I have the same issue as the person above me. This is really annoying. Can this be fixed?

@ooichman

Thanks guys,
I had the same issue on k3s, and the command
kubectl delete apiservices.apiregistration.k8s.io v1.packages.operators.coreos.com
did the trick.

Thanks


gisyrus commented Jun 13, 2022

Thank you @SoMuchForSubtlety you saved my day!

@romulus-ai

@SoMuchForSubtlety @gisyrus I had a quite similar issue; the problem was that I had not opened the ports for the kube-apiserver to reach the OLM packageserver on port 5443. That may be a less harmful solution than deleting the apiservice.
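
One way to check that path (a sketch; the backing service is read from the APIService rather than assumed, and <pod-ip> is a placeholder):

# which Service/port the aggregated API points at
kubectl get apiservice v1.packages.operators.coreos.com -o jsonpath='{.spec.service.namespace}/{.spec.service.name}:{.spec.service.port}'
# pod IPs serving it (label taken from the CSV deployment spec above)
kubectl get pods -n olm -l app=packageserver -o wide
# from a control-plane node, the /healthz endpoint used by the probes should answer
curl -k https://<pod-ip>:5443/healthz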

@Crazy-Hopper

@SoMuchForSubtlety @gisyrus I had a quite similar issue; the problem was that I had not opened the ports for the kube-apiserver to reach the OLM packageserver on port 5443. That may be a less harmful solution than deleting the apiservice.

This seems to be the only correct solution.


relaxdiego commented Feb 5, 2024

I didn't have any issues with ports like @romulus-ai. Instead, the cause of this error:

error: unable to retrieve the complete list of server APIs: packages.operators.coreos.com/v1: the server is currently unable to handle the request

was that everything in the olm namespace was deleted at the same time, which prevented the olm operator from cleaning up all resources. I think the deletion sequence should be (see the sketch below):

  1. Delete the ClusterServiceVersion named packageserver, which will cause the olm operator to clean it up, including the apiservice registration v1.packages.operators.coreos.com.
  2. Delete the olm namespace.

I believe the permanent fix for this is to separate the packageserver definition in olm.yaml into its own file so that it's more obvious what the sequence should be.
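
A sketch of that teardown order, assuming the default quickstart install:

# 1. delete the packageserver CSV so the olm operator tears down the APIService it owns
kubectl delete clusterserviceversion packageserver -n olm
# 2. wait until the aggregated API is gone
kubectl get apiservice v1.packages.operators.coreos.com   # should eventually return NotFound
# 3. then delete the namespace
kubectl delete namespace olm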

@rrmistry

For us, the certificate for the package server had expired.

The solution was to delete the secrets, delete the relevant pods, and wait until the secrets and pods are recreated.

Source: https://access.redhat.com/solutions/6999798

Commands:

# OpenShift Environment (can be compatible with kubectl / Kubernetes relevant commands)

# Delete the certs
oc delete secret catalog-operator-serving-cert olm-operator-serving-cert packageserver-service-cert -n openshift-operator-lifecycle-manager

# Delete the relevant pods
oc delete pod -l 'app in (catalog-operator, olm-operator, packageserver, package-server-manager)' -n openshift-operator-lifecycle-manager

# Wait and keep checking
oc get pods -n openshift-operator-lifecycle-manager
# Repeat 🔁 until all `Running` or `Completed`

# Delete the old API once all pods are healthy
oc delete apiservice v1.packages.operators.coreos.com

# Verify certificate
oc get apiservice v1.packages.operators.coreos.com -o jsonpath='{.spec.caBundle}' | base64 -d | openssl x509 -noout -text

@XiangyuFan17

Just a friendly reminder: make sure that your packageserver deployment is running.
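
For example (a sketch):

kubectl get deployment packageserver -n olm
kubectl rollout status deployment/packageserver -n olm
kubectl logs -n olm -l app=packageserver --tail=50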
