[Hackathon 2024][Gadolinium][Docker] Use lightweight base image #112

Open
rducasse opened this issue May 30, 2024 · 0 comments
Labels: ecoCode-CI/CD, Hackathon 2024 (new issues tagged during the hackathon 2024), spotter

rducasse commented May 30, 2024

Rule title

Use lightweight base image

Language and platform

Docker

Rule description

When creating Docker images, developers may start from base images that include software components or libraries the application does not need. This leads to bloated images that are larger and consume more resources. For instance, a base image such as Ubuntu or CentOS ships with many pre-installed packages, whether or not the application requires them, and produces heavy Docker images. These images take longer to build, consume more storage space, and slow down deployments.
Developers should instead choose lightweight base images tailored to the needs of their applications. For example, replacing a full-fledged Linux distribution with a slim or alpine variant significantly reduces image size and resource overhead. With minimal base images, developers optimize resource utilization, improve build and start times, conserve storage space...

Non-compliant:

FROM python:3.8
# Image size: 356MB

Compliant:

FROM python:3.8-alpine
# Image size: 17MB

Non-compliant:

FROM debian:12.5
…
# Only the last FROM instruction impacts the final image size
FROM node:18
# Image size: 352MB

Compliant:

FROM debian:12.5
…
# Only the last FROM instruction impacts the final image size
FROM node:18-slim
# Image size: 65MB

Non-compliant:

FROM azul/zulu-openjdk:21
# Image size: 190MB

Compliant:

FROM azul/zulu-openjdk-distroless:21
# Image size: 153MB

Rule short description

Avoid using heavy Docker base images. Use minimal, application-specific base images, like alpine, or specialized, resource-optimized versions (e.g. python:3.8-alpine instead of python:3.8).

Rule justification

Why it matters:

  • For the image itself:
    • Image size reduction: Larger images require more storage, more time to transfer, and more resources to load and execute.
  • For the containers created from the image:
    • Startup time: The more layers an image has, the longer a container can take to start on a host that does not already have the image, because each layer must be pulled from the registry and extracted first. With many small layers, the per-layer overhead adds up.
    • Disk usage: Each layer adds to the size of the final image, which in turn affects the disk usage of the container. This becomes a problem if you have many images or limited disk space.
    • Security: Each layer adds to the potential attack surface, because it may contain vulnerabilities or malicious code that could be exploited. Reducing what the image contains therefore improves the security of the container.

Do these versions frequently exist?

  • Yes, around 40% of the 1000 most popular images on Docker Hub have a slim or alpine version (verified with a Python script that queried the Docker Hub API for the 1000 most popular images).
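
The script used for this measurement is not included in the issue. As a rough illustration only, the counting step could look like the sketch below. It assumes the tag names have already been fetched per image (the Docker Hub API calls are left out), and the function names and the `example/custom` image are hypothetical.

```python
def has_lightweight_variant(tags):
    """Return True if any tag name contains 'slim' or 'alpine'."""
    return any("slim" in t.lower() or "alpine" in t.lower() for t in tags)


def lightweight_share(tags_by_image):
    """Return the percentage of images that offer a slim or alpine tag."""
    if not tags_by_image:
        return 0.0
    hits = sum(1 for tags in tags_by_image.values()
               if has_lightweight_variant(tags))
    return 100.0 * hits / len(tags_by_image)


# Illustrative input only: tag lists would normally come from the registry.
sample = {
    "python": ["3.8", "3.8-alpine"],
    "node": ["18", "18-slim"],
    "example/custom": ["1.0"],  # hypothetical image without a light variant
}
share = lightweight_share(sample)
print(f"{share:.1f}% of the sampled images have a slim or alpine tag")
```

Matching on the substrings `slim` and `alpine` mirrors the naming convention the rule relies on, so it shares the same blind spot for images that use other names.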

Official documentation:
https://docs.docker.com/develop/develop-images/guidelines/

Expert article:
https://www.fullstack.com/labs/resources/blog/small-is-beautiful-how-container-size-impacts-deployment-and-resource-usage

Measurement (from scientific article https://assets-eu.researchsquare.com/files/rs-3276965/v1_covered_8dc408b5-6997-486a-89c8-a5c66fddf60e.pdf?c=1693976849):

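| Docker images with          | Original image size | Obtained image size | Image size reduced |
|-----------------------------|---------------------|---------------------|--------------------|
| Being lazy couch potatoes   | 1.47GB              | 1.47GB              | 0%                 |
| Optimizing the parent image | 1.47GB              | 642MB               | 56.32%             |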

Severity / Remediation Cost

Severity: Major (huge differences in image size, e.g. 100 MB instead of 1 GB; see the impacts in the previous section)

Cost: Easy (only the image tag needs to change, so there is no need to understand the application logic, although the change may have side effects)

Implementation principle

To detect that a Dockerfile does not use a lightweight image, we have to find the last FROM instruction in the Dockerfile (earlier FROM instructions do not affect the final size of the image).
If the image it references contains alpine or slim in the tag (version), or distroless in the repository (image name), we consider the image lightweight; otherwise we consider the rule broken.

Possible false positive: corporate custom images that do not follow naming conventions (alpine, slim, distroless).
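
The detection principle above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the ecoCode implementation: all function names are hypothetical, the parsing is line-based, and it does not resolve ARG variables or FROM lines that reference an earlier build stage by name.

```python
LIGHTWEIGHT_TAG_MARKERS = ("alpine", "slim")
LIGHTWEIGHT_REPO_MARKERS = ("distroless",)


def last_base_image(dockerfile_text):
    """Return the image reference of the last FROM instruction, or None."""
    last = None
    for line in dockerfile_text.splitlines():
        parts = line.strip().split()
        if len(parts) >= 2 and parts[0].upper() == "FROM":
            # Ignore optional flags such as --platform=linux/amd64.
            args = [p for p in parts[1:] if not p.startswith("--")]
            if args:
                last = args[0]
    return last


def split_image(ref):
    """Split 'repo:tag' into (repo, tag); tag is '' when absent."""
    ref = ref.split("@", 1)[0]  # drop a digest suffix if present
    # A colon before the last slash belongs to a registry port, not a tag.
    if ref.rfind(":") > ref.rfind("/"):
        repo, tag = ref.rsplit(":", 1)
        return repo, tag
    return ref, ""


def is_lightweight(ref):
    """Apply the naming convention: alpine/slim in tag, distroless in repo."""
    repo, tag = split_image(ref.lower())
    return (any(m in tag for m in LIGHTWEIGHT_TAG_MARKERS)
            or any(m in repo for m in LIGHTWEIGHT_REPO_MARKERS))


def is_compliant(dockerfile_text):
    """Return True when the last FROM image looks lightweight."""
    image = last_base_image(dockerfile_text)
    # Assumption: a file without any FROM instruction is not reported.
    return True if image is None else is_lightweight(image)


dockerfile = "FROM debian:12.5\nRUN true\nFROM node:18\n"
print(is_compliant(dockerfile))
```

Because the check relies only on names, a corporate image that is small but named differently is still reported, which is the false positive noted above.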

@rducasse rducasse added Hackathon 2024 New issues tagged during the hackathon 2024 spotter labels May 30, 2024