Hey everyone. I decided to optimize this image. Instead of installing all the libraries in the final stage, I'm copying the `site-packages` directory over from the build stage. Is there a better way to do this? And how can I reduce the image even more? Personally, I was shocked that libraries can take up so much space.

Here is my Dockerfile:

```dockerfile
FROM mambaorg/micromamba:1.5.6 as base
WORKDIR /app

# pip requirements are already in this file; it's a handy feature of newer conda versions
COPY _container_files/from_history_mamba_env.yaml from_history_mamba_env.yaml
RUN --mount=type=cache,target=/var/cache/apt \
    micromamba install -y -n base -f from_history_mamba_env.yaml && \
    micromamba clean --all --force-pkgs-dirs --yes && \
    rm -f from_history_mamba_env.yaml

ARG MAMBA_DOCKERFILE_ACTIVATE=1
RUN python -m spacy download ru_core_news_sm && \
    python -m nltk.downloader stopwords -d '/opt/conda/nltk_data'

######### SECOND STAGE #########
FROM mambaorg/micromamba:1.5.6 as filter_bot_container

# micromamba has no Python on board, so we have to install it
RUN --mount=type=cache,target=/var/cache/apt \
    micromamba install -y -n base -c conda-forge python==3.11 && \
    micromamba clean --all --force-pkgs-dirs --yes

ARG MAMBA_DOCKERFILE_ACTIVATE=1
WORKDIR /app
COPY --from=base /opt/conda/lib/python3.11/site-packages /python_packages
COPY --from=base /opt/conda/nltk_data /opt/conda/nltk_data

USER root
# some necessary system dependencies
RUN apt-get update && apt-get install --yes libatlas-base-dev libpq5
USER mambauser

ENV PYTHONPATH=/python_packages:$PYTHONPATH
COPY . .
CMD python -m main
```

And here is the head of the content size list, from `du -h python_packages/ | sort -rh` inside the container:
```
627M	python_packages/
86M	python_packages/lingua/language-models
86M	python_packages/lingua
76M	python_packages/pandas
71M	python_packages/scipy
52M	python_packages/sklearn
45M	python_packages/ru_core_news_sm/ru_core_news_sm-3.7.0
45M	python_packages/ru_core_news_sm
39M	python_packages/numpy
38M	python_packages/pandas/tests
33M	python_packages/blis
32M	python_packages/ru_core_news_sm/ru_core_news_sm-3.7.0/vocab
31M	python_packages/spacy
25M	python_packages/pyrogram
22M	python_packages/pandas/_libs
20M	python_packages/pyrogram/raw
19M	python_packages/sqlalchemy
18M	python_packages/numpy/core
17M	python_packages/pip
16M	python_packages/pymorphy3_dicts_ru/data
16M	python_packages/pymorphy3_dicts_ru
14M	python_packages/pip/_vendor
14M	python_packages/nltk
12M	python_packages/pandas/core
11M	python_packages/scipy/stats
11M	python_packages/scipy/sparse
11M	python_packages/pyrogram/raw/types
9.1M	python_packages/cryptography
9.0M	python_packages/pygments
8.9M	python_packages/scipy/linalg
8.6M	python_packages/cryptography/hazmat
8.5M	python_packages/scipy/optimize
7.7M	python_packages/pygments/lexers
7.7M	python_packages/cryptography/hazmat/bindings
7.4M	python_packages/scipy/special
7.2M	python_packages/spacy/pipeline
6.8M	python_packages/numpy/core/tests
6.6M	python_packages/sklearn/utils
6.3M	python_packages/sklearn/metrics
...
```

As you can see, a handful of packages account for most of the size. I also read the blog post mentioned in the docs, where the author suggests removing these kinds of files by hand. Is there no magic trick for a better reduction mechanism?
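Before deleting anything, it can help to measure how much of those 627M actually sits in test directories. A minimal sketch (the `/python_packages` path comes from the Dockerfile above; everything else here is illustrative, not part of my actual setup):

```python
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total size in bytes of all regular files under path."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def measure_test_dirs(site_packages: str) -> dict[str, int]:
    """Map each test/tests directory under site_packages to its size in bytes.

    Note: a test dir nested inside another test dir is also counted
    as part of its parent, so the sum is an upper bound.
    """
    root = Path(site_packages)
    return {
        str(d.relative_to(root)): dir_size(d)
        for d in root.rglob("*")
        if d.is_dir() and d.name in ("test", "tests")
    }


if __name__ == "__main__":
    root = Path("/python_packages")  # path used in the Dockerfile above
    if root.exists():
        sizes = measure_test_dirs(str(root))
        for name, size in sorted(sizes.items(), key=lambda kv: -kv[1])[:10]:
            print(f"{size / 2**20:6.1f}M  {name}")
        print(f"{sum(sizes.values()) / 2**20:6.1f}M  total in test/tests dirs")
```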
Hi @aulasau. It is unclear to me how a multi-stage build is helping you here. I guess you avoid having `from_history_mamba_env.yaml` in one of your layers, but this optimization seems like overkill for a file that small.

Have you tried using one of our alpine based images? That would be an easy way to save a few MBs.

You are missing a `&& rm -rf /var/lib/apt/lists/*` on the end of your `RUN apt-get ...`. That will save you a little bit as well.

I imagine with a couple of `find ... -delete` commands you could trim this down by removing `test`/`tests` directories, but those types of optimizations cannot be universally applied.
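The `find` idea above could be sketched roughly as follows, assuming the `/python_packages` layout from the Dockerfile. The function name and the exact set of pruned directories are illustrative, and as noted, deleting `test`/`tests` can break packages that import them at runtime, so verify your app still works afterwards:

```shell
# prune_site_packages DIR: drop test directories and bytecode caches under DIR.
# In the Dockerfile this would run in a RUN step after COPY --from=base, e.g.:
#   RUN find /python_packages -type d \( -name test -o -name tests \) \
#           -prune -exec rm -rf {} +
prune_site_packages() {
    sp="$1"
    # remove test/tests directories (e.g. pandas/tests, 38M in the listing)
    find "$sp" -type d \( -name test -o -name tests \) -prune -exec rm -rf {} +
    # remove bytecode caches; Python regenerates them at import time anyway
    find "$sp" -type d -name __pycache__ -prune -exec rm -rf {} +
    find "$sp" -type f -name '*.pyc' -delete
}
```

And separately, as mentioned, append `&& rm -rf /var/lib/apt/lists/*` to the `apt-get` step in the same `RUN` instruction so the package lists never land in a layer.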