Skaffold running out of memory if there are many configurations and there is artifact conflict #9536

Open
dmavrommatis opened this issue Oct 3, 2024 · 0 comments


I have a monorepo with a central skaffold.yaml file that pulls in all the other skaffold.yaml files via requires. At some point, someone added an artifact that duplicates another one. Normally Skaffold reports an error like this:

source: /tmp/skaffold/skaffold.yaml, in module "other": source: /tmp/skaffold/other2.yaml, in module "postgresql2": source: /tmp/skaffold/other.yaml, in module "postgresql": source: /tmp/skaffold/skaffold.yaml, in module "traefik": duplicate image "postgresql-ci" found in sources /tmp/skaffold/other.yaml and /tmp/skaffold/other2.yaml: artifact image names must be unique across all configurations
source: /tmp/skaffold/skaffold.yaml, in module "other" on line 9 column 14: source: /tmp/skaffold/other2.yaml, in module "postgresql2" on line 9 column 14: source: /tmp/skaffold/other.yaml, in module "postgresql" on line 9 column 14: source: /tmp/skaffold/skaffold.yaml, in module "traefik" on line 9 column 14: duplicate image "postgresql-ci" found in sources /tmp/skaffold/other.yaml and /tmp/skaffold/other2.yaml: artifact image names must be unique across all configurations

In my case, deploying the traefik service with skaffold deploy -m traefik hangs at "Helm release traefik not installed. Installing..." and then takes down my laptop as it runs out of memory. Scaling the configuration down so that it requires only two other configs makes Skaffold print the error instead.

For the example I provided, this message appears 15 times instead of just once; adding 2 more requires brings it to 134 times, adding 2 more brings it to 518, and so on. My real repository has 47 configurations, so memory use appears to explode. I think having multiple configurations somehow introduces a recursive loop in this check that can kill the host.
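To illustrate the kind of growth suspected above, here is a toy model in Python. It is not Skaffold's code, and the graph shape is an assumption: a hypothetical set of configs `cfg0..cfgN` in which each config requires every later one. It only shows how re-walking shared dependencies along every path grows exponentially, while remembering visited configs keeps the work linear.

```python
from typing import Dict, List, Optional, Set

Graph = Dict[str, List[str]]


def layered_graph(n: int) -> Graph:
    """Hypothetical dependency graph: cfg_i requires every cfg_j with j > i."""
    return {f"cfg{i}": [f"cfg{j}" for j in range(i + 1, n)] for i in range(n)}


def visits_without_memo(graph: Graph, node: str) -> int:
    """Count visits when every path to a shared config is walked again."""
    return 1 + sum(visits_without_memo(graph, dep) for dep in graph.get(node, []))


def visits_with_memo(graph: Graph, node: str, seen: Optional[Set[str]] = None) -> int:
    """Count visits when each config is processed at most once."""
    seen = set() if seen is None else seen
    if node in seen:
        return 0
    seen.add(node)
    return 1 + sum(visits_with_memo(graph, dep, seen) for dep in graph.get(node, []))


if __name__ == "__main__":
    for n in (5, 10, 15):
        g = layered_graph(n)
        print(n, visits_without_memo(g, "cfg0"), visits_with_memo(g, "cfg0"))
```

In this model the unmemoized count doubles with each added config, so a few dozen configs would already be out of reach. The counts in the report (15, 134, 518) do not follow this exact pattern, so the real traversal is presumably shaped differently.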

Expected behavior

Skaffold should not hang and run out of memory when someone introduces the same artifact twice in a multi-configuration environment.
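As a rough sketch of the behavior I would expect, a single pass over all loaded configs can report each duplicate image once. This is Python for illustration only, not Skaffold's implementation. The function name and the `(source, images)` input shape are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def find_duplicate_images(configs: Iterable[Tuple[str, List[str]]]) -> List[str]:
    """Return one message per image that is defined in more than one source file.

    `configs` is a sequence of (source_path, image_names) pairs. The same
    source may appear several times (for example, when it is reached through
    several requires paths) without producing extra messages.
    """
    first_seen: Dict[str, str] = {}  # image name -> first source that defined it
    reported: Set[str] = set()       # images already reported once
    errors: List[str] = []
    for source, images in configs:
        for image in images:
            prev = first_seen.setdefault(image, source)
            if prev != source and image not in reported:
                reported.add(image)
                errors.append(
                    f'duplicate image "{image}" found in sources {prev} and {source}'
                )
    return errors


if __name__ == "__main__":
    loaded = [
        ("/tmp/skaffold/other.yaml", ["postgresql-ci"]),
        ("/tmp/skaffold/other2.yaml", ["postgresql-ci"]),
        ("/tmp/skaffold/other.yaml", ["postgresql-ci"]),  # reached again via another path
    ]
    for message in find_duplicate_images(loaded):
        print(message)
```

The sketch ignores an image listed twice inside the same file. It only covers the cross-file case from this report.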

Actual behavior

Skaffold hangs on the deploy step indefinitely and kills the host with OOM.

Information

  • Skaffold version: v2.13.2
  • Operating system: Fedora 41
  • Installed via: skaffold.dev
  • Contents of skaffold.yaml:

skaffold.yaml

apiVersion: skaffold/v4beta10
kind: Config
metadata:
  name: traefik
deploy:
  helm:
    releases:
      - name: traefik
        repo: https://traefik.github.io/charts
        remoteChart: traefik
        version: 30.0.0
---
apiVersion: skaffold/v4beta10
kind: Config
metadata:
  name: other
requires:
  - path: ./other.yaml
  - path: ./other2.yaml

other.yaml

apiVersion: skaffold/v4beta10
kind: Config
metadata:
  name: postgresql
build:
  tagPolicy:
    gitCommit: {}
  artifacts:
    - image: postgresql-ci
      docker:
        dockerfile: Dockerfile.postgres
  local:
    useBuildkit: true
deploy:
  helm:
    releases:
      - name: postgresql
        repo: https://charts.bitnami.com/bitnami
        remoteChart: postgresql
        version: 14.3.1
        setValueTemplates:
          image:
            registry: "{{.IMAGE_DOMAIN_postgresql_ci}}"
            repository: "{{.IMAGE_REPO_NO_DOMAIN_postgresql_ci}}"
            tag: "{{.IMAGE_TAG_postgresql_ci}}"

other2.yaml

apiVersion: skaffold/v4beta10
kind: Config
metadata:
  name: postgresql2
build:
  tagPolicy:
    gitCommit: {}
  artifacts:
    - image: postgresql-ci
      docker:
        dockerfile: Dockerfile.postgres
  local:
    useBuildkit: true
deploy:
  helm:
    releases:
      - name: postgresql
        repo: https://charts.bitnami.com/bitnami
        remoteChart: postgresql
        version: 14.3.1
        setValueTemplates:
          image:
            registry: "{{.IMAGE_DOMAIN_postgresql_ci}}"
            repository: "{{.IMAGE_REPO_NO_DOMAIN_postgresql_ci}}"
            tag: "{{.IMAGE_TAG_postgresql_ci}}"

Dockerfile.postgres

FROM bitnami/postgresql:15
USER 1001:0

Steps to reproduce the behavior

  1. skaffold deploy -m traefik

Then add more files like other.yaml to the requires list and rerun.
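To make the scaling step easier to repeat, the extra files can be generated with a small Python helper. This is only a sketch. The file names `other3.yaml`, `other4.yaml`, ... and the config names `postgresql3`, `postgresql4`, ... are hypothetical; the body is copied from other.yaml above, so every generated file declares the same `postgresql-ci` image.

```python
from pathlib import Path
from string import Template
from typing import Dict, List

# Same content as other.yaml above; only metadata.name varies.
OTHER_TEMPLATE = Template("""\
apiVersion: skaffold/v4beta10
kind: Config
metadata:
  name: $name
build:
  tagPolicy:
    gitCommit: {}
  artifacts:
    - image: postgresql-ci
      docker:
        dockerfile: Dockerfile.postgres
  local:
    useBuildkit: true
deploy:
  helm:
    releases:
      - name: postgresql
        repo: https://charts.bitnami.com/bitnami
        remoteChart: postgresql
        version: 14.3.1
        setValueTemplates:
          image:
            registry: "{{.IMAGE_DOMAIN_postgresql_ci}}"
            repository: "{{.IMAGE_REPO_NO_DOMAIN_postgresql_ci}}"
            tag: "{{.IMAGE_TAG_postgresql_ci}}"
""")


def make_repro(extra: int) -> Dict[str, str]:
    """Return {filename: yaml_text} for `extra` additional configs (other3.yaml, ...)."""
    return {
        f"other{i}.yaml": OTHER_TEMPLATE.substitute(name=f"postgresql{i}")
        for i in range(3, 3 + extra)
    }


def requires_block(filenames: List[str]) -> str:
    """Render the requires section to paste into the "other" module of skaffold.yaml."""
    return "requires:\n" + "".join(f"  - path: ./{name}\n" for name in filenames)


if __name__ == "__main__":
    files = make_repro(2)
    for filename, text in files.items():
        Path(filename).write_text(text)
    print(requires_block(["other.yaml", "other2.yaml", *files]))
```

Running it writes the extra files into the current directory and prints a requires block that replaces the one in the "other" module of skaffold.yaml.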
