Troubleshooting Supply Chain Security Tools - Scan

Debugging commands

Run these commands to get more logs and details about scanning errors. TaskRuns and pods persist for a predefined number of seconds before they are deleted. The deleteScanJobsSecondsAfterFinished value in the Tanzu Application Platform package configuration sets this duration.

Debugging Tekton TaskRun

To retrieve TaskRun events:

kubectl describe taskrun TASKRUN-NAME -n DEV-NAMESPACE

Where:

  • TASKRUN-NAME is the name of the TaskRun.
  • DEV-NAMESPACE is the name of the developer namespace you want to use.

Debugging Scan pods

Run the following to get error logs from a pod when scan pods are in a failing state:

kubectl logs SCAN-POD-NAME -n DEV-NAMESPACE

Where:

  • SCAN-POD-NAME is the name of the failing scan pod.
  • DEV-NAMESPACE is the name of the developer namespace you want to use.

See here for more details about debugging Kubernetes pods.

The following is an example of a successful scan run output:

scan:
  cveCount:
    critical: 20
    high: 120
    medium: 114
    low: 9
    unknown: 0
  scanner:
    name: Grype
    vendor: Anchore
    version: v0.37.0
  reports:
  - /workspace/scan.xml
eval:
  violations:
  - CVE node-fetch GHSA-w7rc-rwvf-8q5r Low
store:
  locations:
  - https://metadata-store-app.metadata-store.svc.cluster.local:8443/api/sources?repo=hound&sha=5805c6502976c10f5529e7f7aeb0af0c370c0354&org=houndci

If a scan run reports an error, one of the init containers (scan-plugin, metadata-store-plugin, compliance-plugin, or summary) or an additional container failed.

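To inspect the logs of a specific init container in a pod:

kubectl logs SCAN-POD-NAME -n DEV-NAMESPACE -c INIT-CONTAINER-NAME

Where:

  • SCAN-POD-NAME is the name of the scan pod.
  • DEV-NAMESPACE is the name of the developer namespace you want to use.
  • INIT-CONTAINER-NAME is the name of the init container to inspect, for example scan-plugin.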

For tips on debugging init containers, see Debug Init Containers in the Kubernetes documentation.
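When it is not obvious which container failed, it can help to print the logs of every init container in turn. The following is a minimal sketch, not part of the product: the function name init_container_logs is hypothetical, and it assumes the scan steps run as init containers as described above. If your pod runs them as regular containers, read .spec.containers instead.

```shell
# Print the logs of every init container in a pod, one labelled section each.
# Usage: init_container_logs SCAN-POD-NAME DEV-NAMESPACE
# Assumption: the containers of interest are listed under .spec.initContainers.
init_container_logs() {
  for container in $(kubectl get pod "$1" -n "$2" \
      -o jsonpath='{.spec.initContainers[*].name}'); do
    echo "=== $container ==="
    kubectl logs "$1" -n "$2" -c "$container"
  done
}
```

Call it as init_container_logs SCAN-POD-NAME DEV-NAMESPACE and look for the first section that ends in an error.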

Debugging SourceScan and ImageScan

To retrieve the status conditions of a SourceScan or ImageScan, run:

kubectl describe sourcescan SOURCE-SCAN-NAME -n DEV-NAMESPACE

kubectl describe imagescan IMAGE-SCAN-NAME -n DEV-NAMESPACE

Where:

  • SOURCE-SCAN-NAME is the name of the SourceScan.
  • IMAGE-SCAN-NAME is the name of the ImageScan.
  • DEV-NAMESPACE is the name of the developer namespace you want to use.

Under Status.Conditions, look for conditions whose Reason, Type, or Message values contain the keyword Error, and use them to investigate the issue.
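As an alternative to reading the full describe output, the conditions can be printed one per line and filtered. This is a sketch under assumptions: the function names are hypothetical, and it assumes the scan resources expose the standard type, reason, and message fields under .status.conditions.

```shell
# Print one line per status condition: TYPE, REASON, MESSAGE (tab-separated).
# Usage: scan_conditions sourcescan SOURCE-SCAN-NAME DEV-NAMESPACE
scan_conditions() {
  kubectl get "$1" "$2" -n "$3" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
}

# Keep only the conditions that mention "Error" in any column.
# Exits non-zero when no condition matches.
scan_error_conditions() {
  scan_conditions "$@" | grep -i 'error'
}
```

The first argument is the resource kind, so the same helper works for both sourcescan and imagescan.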

Debugging Scanning within a SupplyChain

See here for Tanzu workload commands for tailing build and runtime logs and getting workload status and details.

Viewing the Scan-Controller manager logs

To retrieve scan-controller manager logs:

kubectl -n scan-link-system logs -f deployment/scan-link-controller-manager -c manager

Restarting Deployment

If the scan-link controller fails to start, restart the deployment to check whether the failure is reproducible or intermittent:

kubectl rollout restart deployment scan-link-controller-manager -n scan-link-system

Troubleshooting issues

Troubleshooting Grype in air-gapped environments

For information about issues with Grype in air-gap environments, see Using Grype in offline and air-gapped environments.

Missing target SSH secret

Scanning source code from a private source repository requires an SSH secret present in the namespace and referenced as grype.targetSourceSshSecret in tap-values.yaml. See Installing the Tanzu Application Platform Package and Profiles.

If a private source scan is triggered and the secret cannot be found, the scan pod includes a FailedMount warning in Events with the message MountVolume.SetUp failed for volume "ssh-secret" : secret "secret-ssh-auth" not found, where secret-ssh-auth is the value specified in grype.targetSourceSshSecret.
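Before triggering a scan, you can check that the referenced secret is present in the namespace. The helper below is an illustrative sketch, and the name secret_exists is hypothetical. It wraps kubectl get secret and reports the result.

```shell
# Report whether a secret exists in a namespace.
# Usage: secret_exists secret-ssh-auth DEV-NAMESPACE
secret_exists() {
  if kubectl get secret "$1" -n "$2" >/dev/null 2>&1; then
    echo "secret/$1 found in $2"
  else
    echo "secret/$1 not found in $2"
    return 1
  fi
}
```

The same check applies to any secret that a scan references by name, as long as you look in the namespace where the scan runs.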

Missing target image pull secret

Scanning an image from a private registry requires an image pull secret that exists in the Scan CR's namespace and is referenced as grype.targetImagePullSecret in tap-values.yaml. See Installing the Tanzu Application Platform Package and Profiles.

If a private image scan is triggered and the secret is not configured, the scan TaskRun's pod's step-scan-plugin container fails with the following error:

Error: GET https://dev.registry.tanzu.vmware.com/v2/vse-dev/spring-petclinic/manifests/sha256:128e38c1d3f10401a595c253743bee343967c81e8f22b94e30b2ab8292b3973f: UNAUTHORIZED: unauthorized to access repository: vse-dev/spring-petclinic, action: pull: unauthorized to access repository: vse-dev/spring-petclinic, action: pull

Deactivate Supply Chain Security Tools (SCST) - Store

SCST - Store is a prerequisite for installing SCST - Scan. To install SCST - Scan without SCST - Store, edit the configuration to deactivate the Store integration:

---
metadataStore:
  url: ""

Install the package with the edited configurations by running:

tanzu package install scan-controller \
  --package-name scanning.apps.tanzu.vmware.com \
  --version VERSION \
  --namespace tap-install \
  --values-file tap-values.yaml

Resolving Incompatible Syft Schema Version

You might encounter the following error:

The provided SBOM has a Syft Schema Version which doesn't match the version that is supported by Grype...

This means that the Syft Schema Version from the provided SBOM doesn't match the version supported by the installed grype-scanner. There are two different methods to resolve this incompatibility issue:

  • (Preferred method) Install a version of Tanzu Build Service that provides an SBOM with a compatible Syft Schema Version.

  • Set failOnSchemaErrors to false in grype-values.yaml. See Install Supply Chain Security Tools - Scan. Although this change bypasses the check on the Syft Schema Version, it does not resolve the incompatibility and produces a partial scanning result.

    syft:
      failOnSchemaErrors: false

Resolving incompatible scan policy

If your scan policy appears to not be enforced, it might be because the Rego file defined in the scan policy is incompatible with the scanner that is being used. For example, the Grype Scanner outputs in the CycloneDX XML format while the Snyk Scanner outputs SPDX JSON.

See Sample ScanPolicy for Snyk in SPDX JSON format for an example of a ScanPolicy formatted for SPDX JSON.

Could not find CA in secret

If you encounter the following issue, it might be due to not exporting app-tls-cert to the correct namespace:

{"level":"error","ts":"2022-06-08T15:20:48.43237873Z","logger":"setup","msg":"Could not find CA in Secret","err":"unable to set up connection to Supply Chain Security Tools - Store"}

Configure ns_for_export_app_cert in your tap-values.yaml.

metadata_store:
  ns_for_export_app_cert: "DEV-NAMESPACE"

Where DEV-NAMESPACE is the name of the developer namespace you want to use.

If there are multiple developer namespaces, use ns_for_export_app_cert: "*".

Blob Source Scan is reporting wrong source URL

A Source Scan for a blob artifact can report the wrong URL for the resource in status.artifact and status.compliantArtifact, passing the remote SSH URL instead of the cluster-local fluxcd URL. One symptom of this issue is the image-builder failing with an ssh:// is an unsupported protocol error message.

You can confirm that you have this problem by running kubectl describe on the affected resource and comparing the spec.blob.url value against the status.artifact.blob.url value. The problem is present if the URLs differ. For example:

kubectl describe sourcescan SOURCE-SCAN-NAME -n DEV-NAMESPACE

Where:

  • SOURCE-SCAN-NAME is the name of the source scan you want to inspect.
  • DEV-NAMESPACE is the name of the developer namespace you want to use.

Compare the URLs in the output:

...
spec:
  blob:
    ...
    url: http://source-controller.flux-system.svc.cluster.local./gitrepository/sample/repo/8d4cea98b0fa9e0112d58414099d0229f190f7f1.tar.gz
    ...
status:
  artifact:
    blob:
      ...
      url: ssh://[email protected]:sample/repo.git
  compliantArtifact:
    blob:
      ...
      url: ssh://[email protected]:sample/repo.git
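The comparison can also be scripted by reading both fields directly. The following is a sketch only: the function name blob_url_mismatch is hypothetical, and the field paths are the ones named in this section.

```shell
# Compare spec.blob.url with status.artifact.blob.url for a SourceScan.
# Returns 0 and prints both URLs when they differ (the problem is present),
# returns 1 when they match.
# Usage: blob_url_mismatch SOURCE-SCAN-NAME DEV-NAMESPACE
blob_url_mismatch() {
  spec_url="$(kubectl get sourcescan "$1" -n "$2" -o jsonpath='{.spec.blob.url}')"
  status_url="$(kubectl get sourcescan "$1" -n "$2" -o jsonpath='{.status.artifact.blob.url}')"
  if [ "$spec_url" != "$status_url" ]; then
    echo "mismatch: spec=$spec_url status=$status_url"
    return 0
  fi
  echo "match: $spec_url"
  return 1
}
```

A mismatch where the status URL starts with ssh:// corresponds to the symptom described in this section.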

Workaround: This problem happens in SCST - Scan v1.2.0 when you use Grype Scanner ScanTemplates earlier than v1.2.0, because this is a deprecated path. To fix this problem, upgrade your Grype Scanner deployment to v1.2.0 or later. See Upgrading Supply Chain Security Tools - Scan for step-by-step instructions.

Resolving failing scans that block a Supply Chain

If the Supply Chain is not progressing due to CVEs found in either the SourceScan or ImageScan, see the CVE triage workflow in Triaging and Remediating CVEs.

Policy not defined in the Tanzu Application Platform GUI

If you encounter No policy has been defined, it might be because the Tanzu Application Platform GUI is unable to view the Scan Policy resource.

Confirm that the Scan Policy associated with the SourceScan or ImageScan exists. That is, verify that the scanPolicy value in the scan matches the name of an existing Scan Policy.

kubectl describe sourcescan NAME -n DEV-NAMESPACE
kubectl describe imagescan NAME -n DEV-NAMESPACE
kubectl get scanpolicy NAME -n DEV-NAMESPACE

Where DEV-NAMESPACE is the name of the developer namespace you want to use.
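The two lookups can be combined into one check. This is a sketch under assumptions: the function name check_scan_policy is hypothetical, and it assumes the scan stores the policy name at .spec.scanPolicy. If your resource version differs, confirm the field path with kubectl describe first.

```shell
# Read the policy name referenced by a scan, then check that the policy exists.
# Usage: check_scan_policy imagescan NAME DEV-NAMESPACE
# Assumption: the policy name is stored at .spec.scanPolicy.
check_scan_policy() {
  policy="$(kubectl get "$1" "$2" -n "$3" -o jsonpath='{.spec.scanPolicy}')"
  if [ -z "$policy" ]; then
    echo "no scanPolicy set on $1/$2"
    return 1
  fi
  if kubectl get scanpolicy "$policy" -n "$3" >/dev/null 2>&1; then
    echo "found scanpolicy/$policy"
  else
    echo "missing scanpolicy/$policy"
    return 1
  fi
}
```

A missing result means the scan refers to a policy name that does not exist in the namespace.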

Add the app.kubernetes.io/part-of label to the Scan Policy. See Enable Tanzu Application Platform GUI to view ScanPolicy Resource.

Lookup error when connecting to SCST - Store

If your scan pod is failing, you might see the following connection error in the logs:

dial tcp: lookup metadata-store-app.metadata-store.svc.cluster.local on 10.100.0.10:53: no such host

This error occurs when the scan pod cannot connect to the cluster-local URL of SCST - Store. If this is a multicluster deployment, set the grype.metadataStore.url property in your Build profile values.yaml to the ingress domain of SCST - Store, which is deployed in the View cluster. For information about this configuration, see Install Build profile.

SourceScan error with SCST - Store endpoint without a prefix

If your Source Scan resource is failing, the status might show this error:

Error: endpoint require 'http://' or 'https://' prefix

This is because the grype.metadataStore.url value in the Tanzu Application Platform profile values.yaml was not configured with the correct prefix. Verify that the URL starts with either http:// or https://.
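The prefix rule is simple enough to check before you apply the values file. The following is a minimal sketch with a hypothetical function name, has_url_scheme. It tests a URL string only and does not read values.yaml.

```shell
# Succeed only if the URL begins with http:// or https://.
# Usage: has_url_scheme "URL"
has_url_scheme() {
  case "$1" in
    http://*|https://*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, has_url_scheme "$STORE_URL" || echo "add an http:// or https:// prefix" prints the reminder when the prefix is missing, where STORE_URL is a shell variable you set to the configured value.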

Deprecated pre-v1.2 templates

The scan phase might be in Error with the following status condition message:

Summary logs could not be retrieved: . error opening stream pod logs reader: container summary is not valid for pod scan-grypeimagescan-sample-public-zmj2g-hqv5g

This error might be a consequence of using Grype Scanner ScanTemplates shipped with SCST - Scan v1.1 or earlier. These ScanTemplates are deprecated and are not supported in Tanzu Application Platform v1.4.0 and later.

There are two options to resolve this issue:

  • Option 1: Upgrade to the latest Grype Scanner version. This automatically replaces the old ScanTemplates with the upgraded ScanTemplates.

  • Option 2: Create a ScanTemplate. Follow the steps in Create a scan template.

Incorrectly configured self-signed certificate

The following error in the pod logs indicates that the self-signed certificate might be incorrectly configured:

x509: certificate signed by unknown authority

To resolve this issue, ensure that shared.ca_cert_data contains the required certificate. For an example of setting up the shared self-signed certificate, see Build profile.

For information about shared.ca_cert_data, see View possible configuration settings for your package.

Unable to pull scan controller and scanner images from a specified registry

The docker field and related sub-fields used by SCST - Scan Controller, Grype Scanner, or Snyk Scanner are deprecated in Tanzu Application Platform v1.4.0. Previously, these fields could be used to populate the registry-credentials secret. You might encounter the following error during installation:

UNAUTHORIZED: unauthorized to access repository

The recommended migration path for users setting up their namespaces manually is to add registry credentials to both the developer namespace and the scan-link-system namespace, using these instructions.

Important: This step does not apply to users who used --export-to-all-namespaces when setting up the Tanzu Application Platform package repository.

Grype database not available

Prior to running a scan, the Grype scanner downloads a copy of its database. If the database fails to download, the following log message might appear.

Vulnerability DB [no update available] New version of grype is available: 0.50.2 [0000] WARN unable to check for vulnerability database update 1 error occurred: * failed to load vulnerability db: vulnerability database is corrupt (run db update to correct): database metadata not found: ~/Library/Caches/grype/db/3

To resolve this issue, ensure that Grype has access to its vulnerability database:

  • If you have set up a mirror of the vulnerability database, verify that it is populated and reachable.
  • If you did not set up a mirror, Grype manages its database behind the scenes. Verify that the cluster has access to https://anchore.com/.

This issue is unrelated to Supply Chain Security Tools for Tanzu – Store.

Scanner Pod restarts once in SCST - Scan v1.5.0 or later

In SCST - Scan v1.5.0 or later, you might see that scanner pods restart once:

Pods
   NAME                                  READY   STATUS      RESTARTS   AGE
   my-scan-45smk-pod                     0/9     Completed   1          14m

One restart in scanner pods is expected with successful scans. To support Tanzu Service Mesh (TSM) integration, jobs were replaced with TaskRuns. This restart is an artifact of how Tekton cleans up sidecar containers by patching the container spec.