Use staged builds to minimize final image sizes #1031

eero-t · 2024-10-25T17:33:05Z

Description

Stage image build so that final image does not have redundant things like:

Git tool and its deps (e.g. Perl)
Git repo history
Test directories

And drop explicit installation of:

langchain_core: GenAIComps installs langchain which already depends on that
jemalloc & GLX: nothing uses them (in any of the ChatQnA services), and for testing[1] it's trivial to create separate image adding those on top
File descriptor limit increase to in ~/.bashrc, as these images run Python programs, not Bash scripts

[1] I assume those files were there to test this: https://pytorch.org/tutorials/recipes/recipes/tuning_guide.html#switch-memory-allocator

Issues

#225

Type of change

Others (image size improvement / Dockerfile cleanup)

Dependencies

n/a (this removes redundant Git, Perl, jemalloc, GLX dependencies from final images)

Tests

This is draft / example for fixing #225

I have not tested it apart from verifying that images still build.

Notes

In a proper fix, non-unique part of the Dockerfiles would be a separate base image, generated with GenAIComps repo Dockerfile, and Dockerfiles in this repository would depend on that image instead of python-slim.

However, that requires co-operation between these two repositories and ability to push image(s) to OPEA Docker hub project i.e. it needs to be done by a member of this project.

So that redundant things do not end in final image: - Git repo history - Test directories - Git tool and its deps And drop explicit installation of: - jemalloc & GLX: nothing uses them (in ChatQnA at least), and for testing it's trivial to create image adding those on top: https://pytorch.org/tutorials/recipes/recipes/tuning_guide.html#switch-memory-allocator - langchain_core: GenAIComps install langchain which already depends on that Signed-off-by: Eero Tamminen <[email protected]>

eero-t · 2024-10-25T21:24:08Z

None of the test failures are due to my changes.

CodeGen Gaudi test TGI fail is due to it trying to load HuggingFace model it has no rights for:

Access to model meta-llama/CodeLlama-7b-hf is restricted and you are not in the authorized list.
Visit https://huggingface.co/meta-llama/CodeLlama-7b-hf to ask for access.

CodeGen Xeon test TGI seems to fail due to: Could not import SGMV kernel from Punica.

VisualQnA Gaudi & Xeon tests fail is due to NPM dependency conflict for it's Node.js Svelte UI container build (which spec is not touched by this PR).

eero-t requested a review from lvliang-intel as a code owner October 25, 2024 17:33

eero-t marked this pull request as draft October 25, 2024 17:33

eero-t mentioned this pull request Oct 25, 2024

Why containers use hundreds of MBs for Vim/Perl/OpenGL? #225

Open

eero-t force-pushed the staged-images branch from 734bfd0 to 3e49050 Compare October 25, 2024 18:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use staged builds to minimize final image sizes #1031

Use staged builds to minimize final image sizes #1031

eero-t commented Oct 25, 2024

eero-t commented Oct 25, 2024

Use staged builds to minimize final image sizes #1031

Are you sure you want to change the base?

Use staged builds to minimize final image sizes #1031

Conversation

eero-t commented Oct 25, 2024

Description

Issues

Type of change

Dependencies

Tests

Notes

eero-t commented Oct 25, 2024