**Platform**
EKS 1.20, with Bottlerocket 1.1.2
**To Reproduce**
Apply the howto-k8s-mtls-sds-based walkthrough.
**Describe the bug**
After applying the howto-k8s-mtls-sds-based walkthrough, the agents keep restarting with the following logs:
time="2021-07-13T16:17:23Z" level=warning msg="Current umask 0022 is too permissive; setting umask 0027." time="2021-07-13T16:17:23Z" level=info msg="Starting agent with data directory: \"/run/spire\"" time="2021-07-13T16:17:23Z" level=info msg="Plugin loaded." built-in_plugin=true plugin_name=k8s_sat plugin_services="[]" plugin_type=NodeAttestor subsystem_name=catalog time="2021-07-13T16:17:23Z" level=info msg="Plugin loaded." built-in_plugin=true plugin_name=memory plugin_services="[]" plugin_type=KeyManager subsystem_name=catalog time="2021-07-13T16:17:23Z" level=info msg="Plugin loaded." built-in_plugin=true plugin_name=k8s plugin_services="[]" plugin_type=WorkloadAttestor subsystem_name=catalog time="2021-07-13T16:17:23Z" level=info msg="Plugin loaded." built-in_plugin=true plugin_name=unix plugin_services="[]" plugin_type=WorkloadAttestor subsystem_name=catalog time="2021-07-13T16:17:23Z" level=debug msg="No pre-existing agent SVID found. Will perform node attestation" path=/run/spire/agent_svid.der subsystem_name=attestor time="2021-07-13T16:17:23Z" level=debug msg="Starting checker" name=agent subsystem_name=health time="2021-07-13T16:17:23Z" level=info msg="Starting workload API" subsystem_name=endpoints time="2021-07-13T16:18:20Z" level=debug msg="New active connection to workload API" subsystem_name=workload_api time="2021-07-13T16:18:20Z" level=warning msg="container id not found" attempt=1 container_id=containerd-01ea2a5c2c7fa5f455900a8b3e15ec27b983e982c9226d78a5ad3cafa004e568 retry_interval=500ms subsystem_name=built-in_plugin.k8s time="2021-07-13T16:18:20Z" level=warning msg="container id not found" attempt=2 container_id=containerd-01ea2a5c2c7fa5f455900a8b3e15ec27b983e982c9226d78a5ad3cafa004e568 retry_interval=500ms subsystem_name=built-in_plugin.k8s time="2021-07-13T16:18:21Z" level=warning msg="container id not found" attempt=3 container_id=containerd-01ea2a5c2c7fa5f455900a8b3e15ec27b983e982c9226d78a5ad3cafa004e568 retry_interval=500ms subsystem_name=built-in_plugin.k8s time="2021-07-13T16:18:21Z" level=warning msg="container id not found" attempt=4 container_id=containerd-01ea2a5c2c7fa5f455900a8b3e15ec27b983e982c9226d78a5ad3cafa004e568 retry_interval=500ms subsystem_name=built-in_plugin.k8s time="2021-07-13T16:18:22Z" level=warning msg="container id not found" attempt=5 container_id=containerd-01ea2a5c2c7fa5f455900a8b3e15ec27b983e982c9226d78a5ad3cafa004e568 retry_interval=500ms subsystem_name=built-in_plugin.k8s time="2021-07-13T16:18:22Z" level=warning msg="container id not found" attempt=6 container_id=containerd-01ea2a5c2c7fa5f455900a8b3e15ec27b983e982c9226d78a5ad3cafa004e568 retry_interval=500ms subsystem_name=built-in_plugin.k8s time="2021-07-13T16:18:22Z" level=error msg="Failed to collect all selectors for PID" error="workload attestor \"k8s\" failed: rpc error: code = Canceled desc = context canceled" pid=2363728 subsystem_name=workload_api time="2021-07-13T16:18:22Z" level=debug msg="PID attested to have selectors" pid=2363728 selectors="[type:\"unix\" value:\"uid:0\" type:\"unix\" value:\"user:root\" type:\"unix\" value:\"gid:0\" type:\"unix\" value:\"group:root\" ]" subsystem_name=workload_api time="2021-07-13T16:18:22Z" level=debug msg="Closing connection to workload API" subsystem_name=workload_api
When trying to list the agents, the number of attested agents keeps increasing:
```
kubectl exec -n spire spire-server-0 -- /opt/spire/bin/spire-server agent list
Found 73 attested agents
```
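While debugging, I have been cleaning up the stale entries one at a time. This is a sketch that assumes the server build includes the `agent evict` subcommand (which I believe ships alongside `agent list`); `<agent-uuid>` is a placeholder copied from the `agent list` output:

```sh
# Evict a single stale agent; the SPIFFE ID follows the k8s_sat naming
# scheme spiffe://<trust-domain>/spire/agent/k8s_sat/<cluster>/<uuid>.
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server agent evict \
  -spiffeID spiffe://${TRUST_DOMAIN}/spire/agent/k8s_sat/${EKS_CLUSTER_NAME}/<agent-uuid>
```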
When trying to test agent connectivity from inside the agent container, I get the following error:
```
/opt/spire/bin/spire-agent api fetch -socketPath /run/spire/sockets/agent.sock
rpc error: code = DeadlineExceeded desc = context deadline exceeded
```
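For comparison, the same socket can be probed with the agent's healthcheck, which is the exact command the liveness probe in the DaemonSet below runs (`<agent-pod-name>` is a placeholder):

```sh
# Run the liveness probe's command by hand to see whether the socket
# itself is healthy before blaming `api fetch`.
kubectl exec -n spire <agent-pod-name> -- \
  /opt/spire/bin/spire-agent healthcheck -socketPath /run/spire/sockets/agent.sock
```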
The following command is used for registration:
```sh
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://${TRUST_DOMAIN}/ns/spire/sa/spire-agent \
  -selector k8s_sat:cluster:${EKS_CLUSTER_NAME} \
  -selector k8s_sat:agent_ns:spire \
  -selector k8s_sat:agent_sa:spire-agent \
  -node
```
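To confirm the registration landed, the entries can be listed back:

```sh
# List all registration entries; the node alias created above should appear.
kubectl exec -n spire spire-server-0 -- /opt/spire/bin/spire-server entry show
```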
The following configurations are used:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: spire
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spire-server
  namespace: spire
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spire-agent
  namespace: spire
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: spire-server-trust-role
rules:
  - apiGroups: ["authentication.k8s.io"]
    resources: ["tokenreviews"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["patch", "get", "list"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: spire-server-trust-role-binding
subjects:
  - kind: ServiceAccount
    name: spire-server
    namespace: spire
roleRef:
  kind: ClusterRole
  name: spire-server-trust-role
  apiGroup: rbac.authorization.k8s.io
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: spire-agent-cluster-role
rules:
  - apiGroups: [""]
    resources: ["pods", "nodes", "nodes/proxy"]
    verbs: ["get"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: spire-agent-cluster-role-binding
subjects:
  - kind: ServiceAccount
    name: spire-agent
    namespace: spire
roleRef:
  kind: ClusterRole
  name: spire-agent-cluster-role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: spire-bundle
  namespace: spire
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: spire-server
  namespace: spire
data:
  server.conf: |
    server {
      bind_address = "0.0.0.0"
      bind_port = "8081"
      registration_uds_path = "/tmp/spire-registration.sock"
      trust_domain = "${TRUST_DOMAIN}"
      data_dir = "/run/spire/data"
      log_level = "DEBUG"
      ca_key_type = "rsa-2048"
      default_svid_ttl = "1h"
      ca_subject = {
        country = ["US"],
        organization = ["SPIFFE"],
        common_name = "",
      }
    }
    plugins {
      DataStore "sql" {
        plugin_data {
          database_type = "sqlite3"
          connection_string = "/run/spire/data/datastore.sqlite3"
        }
      }
      NodeAttestor "k8s_sat" {
        plugin_data {
          clusters = {
            "${EKS_CLUSTER_NAME}" = {
              use_token_review_api_validation = true
              service_account_whitelist = ["spire:spire-agent"]
            }
          }
        }
      }
      NodeResolver "noop" {
        plugin_data {}
      }
      KeyManager "disk" {
        plugin_data {
          keys_path = "/run/spire/data/keys.json"
        }
      }
      Notifier "k8sbundle" {
        plugin_data {
        }
      }
    }
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: spire-server
  namespace: spire
  labels:
    app: spire-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spire-server
  serviceName: spire-server
  template:
    metadata:
      namespace: spire
      labels:
        app: spire-server
    spec:
      serviceAccountName: spire-server
      containers:
        - name: spire-server
          image: gcr.io/spiffe-io/spire-server:0.10.0
          args:
            - -config
            - /run/spire/config/server.conf
          ports:
            - containerPort: 8081
          volumeMounts:
            - name: spire-config
              mountPath: /run/spire/config
              readOnly: true
            - name: spire-data
              mountPath: /run/spire/data
              readOnly: false
          livenessProbe:
            exec:
              command:
                - /opt/spire/bin/spire-server
                - healthcheck
            failureThreshold: 2
            initialDelaySeconds: 15
            periodSeconds: 60
            timeoutSeconds: 3
      volumes:
        - name: spire-config
          configMap:
            name: spire-server
  volumeClaimTemplates:
    - metadata:
        name: spire-data
        namespace: spire
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: spire-server
  namespace: spire
spec:
  type: NodePort
  ports:
    - name: grpc
      port: 8081
      targetPort: 8081
      protocol: TCP
  selector:
    app: spire-server
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: spire-agent
  namespace: spire
data:
  agent.conf: |
    agent {
      data_dir = "/run/spire"
      log_level = "DEBUG"
      server_address = "spire-server"
      server_port = "8081"
      socket_path = "/run/spire/sockets/agent.sock"
      trust_bundle_path = "/run/spire/bundle/bundle.crt"
      trust_domain = "${TRUST_DOMAIN}"
      enable_sds = true
    }
    plugins {
      NodeAttestor "k8s_sat" {
        plugin_data {
          cluster = "${EKS_CLUSTER_NAME}"
        }
      }
      KeyManager "memory" {
        plugin_data {
        }
      }
      WorkloadAttestor "k8s" {
        plugin_data {
          skip_kubelet_verification = true
        }
      }
      WorkloadAttestor "unix" {
        plugin_data {
        }
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: spire-agent
  namespace: spire
  labels:
    app: spire-agent
spec:
  selector:
    matchLabels:
      app: spire-agent
  template:
    metadata:
      namespace: spire
      labels:
        app: spire-agent
    spec:
      hostPID: true
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      serviceAccountName: spire-agent
      initContainers:
        - name: init
          image: gcr.io/spiffe-io/wait-for-it
          args: ["-t", "30", "spire-server:8081"]
      containers:
        - name: spire-agent
          image: gcr.io/spiffe-io/spire-agent:0.10.0
          args: ["-config", "/run/spire/config/agent.conf"]
          volumeMounts:
            - name: spire-config
              mountPath: /run/spire/config
              readOnly: true
            - name: spire-bundle
              mountPath: /run/spire/bundle
            - name: spire-agent-socket
              mountPath: /run/spire/sockets
              readOnly: false
          livenessProbe:
            exec:
              command:
                - /opt/spire/bin/spire-agent
                - healthcheck
                - -socketPath
                - /run/spire/sockets/agent.sock
            failureThreshold: 2
            initialDelaySeconds: 15
            periodSeconds: 60
            timeoutSeconds: 3
      volumes:
        - name: spire-config
          configMap:
            name: spire-agent
        - name: spire-bundle
          configMap:
            name: spire-bundle
        - name: spire-agent-socket
          hostPath:
            path: /run/spire/sockets
            type: DirectoryOrCreate
```
Please advise if we are doing anything wrong.

Best regards