docker in stages

#advanced

Stone by stone the wall is built

When working with Docker containers we may end up with a larger-than-we-want image. Many times this happens because we need to build software from source: the usual download of a tar.gz file, then running configure, make, and make install. This generates a lot of unnecessary files and requires installing extra libraries just to compile.

Here I'll post a recipe to avoid unnecessarily large containers, while keeping all the libraries/software we need. The process is based on a few simple ideas:

  1. build heavy software/libraries in a large container
  2. create a small container and copy just what you need from the large container
  3. run the small container, with all the functionality you need.

In the following Dockerfile pseudo-code we have a large build stage named baselayer, inside which we build the heavy software called SOME_PACKAGE. In the same Dockerfile we start a second stage named small. In this second stage we use apt-get, conda, or other package managers to install just the libraries we need, and nothing more.

But we also need SOME_PACKAGE, which is already built in baselayer. The trick is that we copy only the files that make SOME_PACKAGE work! In the small stage, run a COPY of the files generated in the large stage:

COPY --from=baselayer /usr/local/lib/SOME_PACKAGE/include /usr/local/include
COPY --from=baselayer /usr/local/lib/SOME_PACKAGE/lib /usr/local/lib
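
One detail worth knowing: if SOME_PACKAGE ships shared libraries, refresh the dynamic linker cache after the COPY so the copied .so files are found at runtime. On Debian/Ubuntu-based images /usr/local/lib is already on the linker path, so a single extra line is enough:

RUN ldconfig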

Putting it all together, the Dockerfile (here called stages.Dockerfile) might look like the sketch below. The base image, the tarball URL, the install prefix, and SOME_RUNTIME_DEP are all placeholders; swap in whatever your package actually needs:
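
# ---- large stage: build SOME_PACKAGE from source ----
FROM ubuntu:22.04 AS baselayer

# compilers and build tools live only in this stage
RUN apt-get update && apt-get install -y build-essential wget

# the usual tar.gz download, then configure, make, make install
# (placeholder URL: replace with your package's real source)
RUN wget https://example.com/SOME_PACKAGE.tar.gz \
    && tar -xzf SOME_PACKAGE.tar.gz \
    && cd SOME_PACKAGE \
    && ./configure --prefix=/usr/local/lib/SOME_PACKAGE \
    && make \
    && make install

# ---- small stage: the image we actually ship ----
FROM ubuntu:22.04 AS small

# install only the runtime libraries we need, nothing extra
RUN apt-get update \
    && apt-get install -y --no-install-recommends SOME_RUNTIME_DEP \
    && rm -rf /var/lib/apt/lists/*

# copy just the files that make SOME_PACKAGE work
COPY --from=baselayer /usr/local/lib/SOME_PACKAGE/include /usr/local/include
COPY --from=baselayer /usr/local/lib/SOME_PACKAGE/lib /usr/local/lib

# refresh the linker cache for the copied shared libraries
RUN ldconfig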

To build the image, simply tell Docker which stage you want to target (baselayer or small). In this case we want to build our image up to the small stage:

docker build --target small -t small_container -f stages.Dockerfile .
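
To check how much space you actually saved, you can also build an image for the large stage and compare the two sizes (baselayer_container is just an illustrative tag, mirroring the command above):

docker build --target baselayer -t baselayer_container -f stages.Dockerfile .
docker images | grep -E 'baselayer_container|small_container'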

This will save you some space, making the deployment of your containers leaner. Besides, it's more elegant... isn't it?