[Hackathon 2024][Gadolinium][Docker] Don’t include unnecessary files in image #110

Open
Neobody-0 opened this issue May 30, 2024 · 0 comments
Labels
ecoCode-CI/CD, Hackathon 2024 (new issues tagged during the hackathon 2024), spotter

Neobody-0 commented May 30, 2024

Rule title

Don’t include unnecessary files in image

Language and platform

Docker

Rule description

Eliminating unnecessary files will naturally decrease the image size. A smaller image shortens build time and lowers storage costs, contributing to energy savings and environmentally friendly coding practices.

One way to avoid unnecessary files is to use a .dockerignore file.

When executing a build command, the build client searches for a .dockerignore file in the context's root directory. Should this file be present, it excludes files and directories matching the patterns specified within from the build context prior to sending it to the builder.
https://docs.docker.com/build/building/context/#dockerignore-files
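
As an illustration, a minimal .dockerignore at the root of the build context might look like the fragment below. The entries are only assumptions about a typical project (the `data` and `.env` paths in particular are hypothetical); the right list depends on what the build actually needs.

```
# Version control metadata is not needed inside the image
.git

# Locally installed dependencies; install them during the build instead
node_modules

# Logs and temporary files
*.log
*.tmp

# Hypothetical data directory and secrets file
data
.env
```

Each non-comment line is a pattern matched against paths relative to the context root, and any matching file or directory is left out of the context sent to the builder.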

Rule short description

Exclude files not relevant to the build, without restructuring your source repository. https://docs.docker.com/develop/develop-images/guidelines/

Rule justification

As an example, the article Ten simple rules for writing Dockerfiles for reproducible data science - PMC (nih.gov) points out that a build can pull in content the final image does not need (data, temporary files, dependencies, etc.). As the article puts it: “Storing data files outside of the container allows handling of very large or sensitive datasets, e.g., proprietary data or private information. Do not include such data in an image! To avoid publishing sensitive data by accident, you can add the data directory to the .dockerignore file, which excludes files and directories from the build context, i.e., the set of files considered by docker build. Ignoring data files also speeds up the build in cases where there are very large files or many small files.”

Why it matters:

  • Image size reduction: Larger images necessitate increased storage space, longer transfer times, and more resources for loading into memory and processing.
  • Faster builds: Ignoring data files can accelerate the build process when large files or numerous small files are involved.
  • Security: Temporary files may hold sensitive data, including secrets or debug information, posing a security risk and potential for data leakage.

Severity / Remediation Cost

Severity: Major. Some files or directories can be huge; for example, node_modules can reach 400 MB, which is more than 20x the size of an Alpine base image.

Remediation cost: Easy. Users only need to add a .dockerignore file or complete the existing one.

Implementation principle

The feasibility of the implementation hinges on the ability to scan the .dockerignore file with SonarQube. If this is achievable, we can verify its presence and possibly employ a template (similar to .gitignore) to enumerate all the files that should be omitted.

An enhancement to this rule, though potentially challenging to implement, would be to examine the base image in the Dockerfile to identify the technology and apply a corresponding template for the .dockerignore file.
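
The two ideas above (checking that a .dockerignore exists, then comparing it with a template chosen from the base image) can be sketched outside SonarQube as a small standalone script. This is only a rough illustration, not a SonarQube rule: the `TEMPLATES` and `COMMON` pattern sets, the function names, and the exact-string comparison are all assumptions made for the example. A real check would need to understand .dockerignore pattern semantics (for instance, `**/node_modules` would not be recognized here).

```python
from __future__ import annotations

import re
from pathlib import Path
from typing import Optional

# Hypothetical templates: patterns we assume should be ignored per technology.
TEMPLATES = {
    "node": {"node_modules", "npm-debug.log"},
    "python": {"__pycache__", ".venv"},
}
# Hypothetical patterns expected whatever the technology.
COMMON = {".git"}


def read_patterns(text: str) -> set[str]:
    """Return the patterns in a .dockerignore, skipping blanks and comments."""
    patterns = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        line = line.rstrip("/")  # treat "dir/" and "dir" as the same entry
        if line:
            patterns.add(line)
    return patterns


def detect_technology(dockerfile_text: str) -> Optional[str]:
    """Guess the technology from the first FROM instruction, if it is known."""
    match = re.search(
        r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)",
        dockerfile_text,
        re.IGNORECASE | re.MULTILINE,
    )
    if not match:
        return None
    image = match.group(1).lower()
    # Drop the registry/namespace prefix, then the tag or digest.
    name = image.split("/")[-1].split(":")[0].split("@")[0]
    return name if name in TEMPLATES else None


def check_context(context_dir: Path) -> list[str]:
    """List the issues found for one build context directory."""
    ignore_file = context_dir / ".dockerignore"
    if not ignore_file.is_file():
        return ["no .dockerignore file found in the build context root"]

    present = read_patterns(ignore_file.read_text(encoding="utf-8"))
    expected = set(COMMON)

    dockerfile = context_dir / "Dockerfile"
    if dockerfile.is_file():
        tech = detect_technology(dockerfile.read_text(encoding="utf-8"))
        if tech:
            expected |= TEMPLATES[tech]

    return [
        f"expected pattern missing from .dockerignore: {pattern}"
        for pattern in sorted(expected - present)
    ]


if __name__ == "__main__":
    for issue in check_context(Path(".")):
        print(issue)
```

Run from a project root, the script prints one line per missing pattern, or a single line if there is no .dockerignore at all. Keying the templates on the base image name keeps the lookup trivial, at the cost of missing custom or multi-stage images.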

Neobody-0 added the Hackathon 2024 and spotter labels on May 30, 2024