
Creating Smaller Docker Containers for Your Apps

Engineering for Site Reliability · 1980 words · 10 minutes to read

When it comes to Docker containers, the smaller, the better. Smaller containers are easier to work with, deploy faster, and tend to have fewer security vulnerabilities.

Docker Logo

Big is Bad

I worked at WePay during the transition from a monolithic application in the datacenter to a series of microservices running in the cloud. I spent a lot of time working on the Vagrant-based CentOS development environment for the monolith, and also started maintaining a custom CentOS base image in Google Cloud.

As we were all learning about Docker, images, containers, and how it all worked together, the director of DevOps declared (unilaterally) that we should create Docker base images for the various languages we were using (PHP, Python, Java, Go), and they should all be built on a core CentOS 7 Docker image.

Now, many parts of this make sense:

  1. Having a base disk image for our hosts that builds in all of the shared functionality we needed.

  2. Having a base Docker image, with shared patterns for logging and metrics built in, that every application referenced with FROM in its Dockerfile.

  3. Having an optimized image for specific languages made it easier for developers using those languages to rapidly spin up new application containers.

But there were also some major drawbacks to this approach.

  1. Developers wanted to run composer install and pip install -r requirements.txt from inside the container. This often required development dependencies to be installed in the containers.

  2. One of our Java microservice applications (CentOS 7 + Oracle Java + application code + development dependencies) clocked in at 1.8 GB.

  3. Our build system was frequently buckling under the weight of caching and transferring large Docker images between its cluster and our Artifactory installation.

Fat Guy eating a donut and cheese-whiz

Now, some of this can be chalked up to learning a new technology. Some of these are growing pains that were incurred at the same time as chunking apart our monolithic PHP app into Java/Python/Golang microservices. Some of this was hubris by people who made unilateral decisions. But we’d made it to the cloud. We’d made it to microservices. And I’m sure that WePay’s development practices have improved greatly over the last couple of years since I left.

Smaller is better

In my current gig, my team has gone all-in with Docker, the AWS cloud, Infrastructure-as-Code, CI/CD practices, and the SRE support model. I’ll spend some time talking about these other topics in a future post, but I do want to talk about some process magic that makes it nearly effortless to deploy to Production multiple times per day with exceptionally little stress.

Use Alpine Linux

Alpine Linux is a roughly 5 MB distribution built on top of Busybox; it adds a package manager (apk) and a few additional tools to Busybox’s roughly 2 MB image.

Alpine Linux size compared to other Docker images.

Generally speaking, you should always use Alpine Linux.

I use the word “generally” because there are certain exceptions to this (otherwise) strong recommendation. The most important is that, while larger Linux distributions use GNU’s glibc as their implementation of the C standard library, Alpine, Busybox, and others use a different library called musl.

You can take a look at the differences between musl and glibc, but the part that matters to you is that some software depends on non-standard parts of glibc that haven’t been implemented in musl yet. What this means, practically speaking, is that things like the %P marker for strftime() don’t work as documented.

Learn to love the layer cache

Docker images use layers to overlay newer changes over previous changes using a union filesystem (for example, OverlayFS or AUFS). This works similarly to Git, where all of the changes that ever happened are still inside the repository, but when you pull the master branch, you’re pulling down dozens (or hundreds, or thousands) of commits that all need to resolve into the current state of the branch.

With Docker, each of these layers is introduced by an instruction inside a Dockerfile, most commonly RUN (COPY and ADD create layers as well).

FROM nginx:1.15.1-alpine

ENV RUNTIME_DEPS ca-certificates curl

RUN echo "http://dl-cdn.alpinelinux.org/alpine/v3.7/main" >> /etc/apk/repositories
RUN apk upgrade --no-cache --update
RUN apk add --no-cache --virtual .runtime-deps $RUNTIME_DEPS
RUN chmod -Rf 0777 /var/log

...

If a change is made to an earlier layer, then the layer cache is invalidated for all of the later layers, and those later layers need to be rebuilt. (Because of this, it’s always a good idea to put infrequently changed commands first, and more frequently changed commands last.)
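One way to see the layers behind an image, and how much each instruction contributes, is `docker history`. The following is a minimal sketch, assuming Docker is installed and the image has already been built or pulled locally; `show_layers` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: list each layer's size next to the instruction
# that created it, newest layer first.
show_layers() {
    docker history --no-trunc --format '{{.Size}}\t{{.CreatedBy}}' "$1"
}

# Usage (the image name is a placeholder):
#   show_layers nginx:1.15.1-alpine
```

Running this before and after a change makes it easier to spot which instructions produce large layers and which ones sit late enough in the Dockerfile to be rebuilt often.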

Unfortunately, many people (including myself) have read that Docker image layers have filesize overhead built into them, and that to make your containers smaller you should combine all of your commands into a single RUN statement. The side effect is that any time you need to change anything inside that RUN statement, Docker needs to rebuild everything in it from scratch, since it’s all in the same layer (which changed).

FROM nginx:1.15.1-alpine

ENV RUNTIME_DEPS ca-certificates curl

RUN echo "http://dl-cdn.alpinelinux.org/alpine/v3.7/main" >> /etc/apk/repositories && \
    apk upgrade --no-cache --update && \
    apk add --no-cache --virtual .runtime-deps $RUNTIME_DEPS && \
    chmod -Rf 0777 /var/log

...

By using separate RUN statements, as they were intended, you get faster rebuild times from the layer cache. This means that any layers (e.g., RUN statements) which haven’t changed since the last build do not need to be built again!

Yes, your development Docker image may be a little larger, but we will address this later in this post.

Installed dependencies should be runtime-only

This is the one that kills me the most because it can be so wasteful, and it stems from not understanding how to use the tools in your toolbox.

First, use a .dockerignore file. Again, this is very similar to how a .gitignore file works: you don’t need everything you use for development to end up inside your Docker image, so use .dockerignore to keep development-only files out of the build context.

You never need your .git/ directory to be copied into a Docker image. Once you’ve resolved your application dependencies, you also don’t need your composer.json, package.json, requirements.txt, or other package manager definitions in there. You only need your vendored code. (Even then, you don’t need the tests for the vendored code either, most of the time. You should ignore those as well.)
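As a rough illustration of the points above, the following sketch writes a starter .dockerignore. Every pattern in it is an assumption about a typical PHP project whose dependencies are vendored before the image is built; adjust it to your own layout, and do not ignore the package manager files if your Dockerfile resolves dependencies during the build.

```shell
# Write a starter .dockerignore into the current directory.
# All patterns below are assumptions about a typical project layout.
cat > .dockerignore <<'EOF'
# Version control metadata is never needed inside the image.
.git
.gitignore

# Package manager definitions, only once dependencies are already vendored.
composer.json
composer.lock

# Tests that ship inside vendored packages.
vendor/**/tests
EOF
```

The `**` wildcard matches any number of directories, so the last pattern covers test folders at any depth under vendor/.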

Some dependencies need to build binaries for the OS they’re running inside of. For example, Node.js apps often rely on Oniguruma. Many Python applications rely on MySQLdb. Both of these require that you install compilation tools and compile them on the OS that they run in.

Some companies solve this problem by installing GCC inside the Docker image.

Marty McFly looking very confused.

A better solution is to have build-time and run-time dependencies, wherein you uninstall the build-time dependencies once you’re done with them.

Here is an example of a PHP app that includes Redis support and installs the New Relic agent extension.

FROM php:7.2.8-fpm-alpine3.7

# Needed at build-time, then can be uninstalled.
ENV BUILD_DEPS alpine-sdk coreutils wget git autoconf re2c

# Should remain inside the container for runtime purposes.
ENV PERSISTENT_DEPS net-tools hiredis-dev gmp-dev

# PHP extensions to install.
ENV INSTALL_EXTENSIONS gmp json opcache pdo pdo_mysql

# New Relic values.
ENV NR_INSTALL_SILENT 1
ENV NR_VERSION 8.0.0.204

# Update the packages in the container to their latest security patches.
RUN apk upgrade --no-cache --update

# Install your build-time and runtime dependencies.
# Give these groups of dependencies names like `.build-deps`
# and `.persistent-deps` that we can refer to later.
RUN apk add --no-cache --virtual .build-deps $BUILD_DEPS
RUN apk add --no-cache --virtual .persistent-deps $PERSISTENT_DEPS

# Install the PHP extensions we need from the PHP repository.
# https://github.com/php/php-src/tree/master/ext
RUN docker-php-ext-install $INSTALL_EXTENSIONS

# Install the New Relic agent extension for PHP.
RUN wget -O /tmp/newrelic-php5.tar.gz https://download.newrelic.com/php_agent/archive/$NR_VERSION/newrelic-php5-$NR_VERSION-linux-musl.tar.gz
RUN tar -zxvf /tmp/newrelic-php5.tar.gz -C /usr/local/lib
RUN /usr/local/lib/newrelic*/newrelic-install install
RUN rm /usr/local/etc/php/conf.d/newrelic.ini

# Install the phpiredis extension for PHP.
RUN git clone https://github.com/nrk/phpiredis.git && cd phpiredis && phpize && ./configure --enable-phpiredis && make && make install

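# Uninstall the grouping of dependencies called `.build-deps`.
# Note: the earlier layers still contain these files, so the space is
# only reclaimed once the image is flattened (see the next section).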
RUN apk del .build-deps

...

Flattening your (base) images

Flattening your images is an extra step that you can take to make your images as small as possible. This is particularly useful if you are building/providing base images for other people to consume downstream.

As Thomas Uhrig writes:

We can use this mechanism to flatten and shrink a Docker container. If we save an image to the disk, its whole history will be preserved, but if we export a container, its history gets lost and the resulting tarball will be much smaller.

# Launch the container from a Docker image (options go before the image name)
docker run --detach <image>

# Export the running container to a tarball
docker export <container> > /tmp/docker-image.tar
 
# Import it back into Docker
cat /tmp/docker-image.tar | docker import - php-fpm:without-layers

By running a container and exporting the data as a tarball, you can remove all of the intermediate layers and history from the final image, removing filesize overhead and reducing the overall image size.
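One caveat: an image created with `docker import` contains only the filesystem, so metadata such as CMD, ENTRYPOINT, ENV, and EXPOSE from the original image is not carried over. The sketch below wraps the steps above in a hypothetical `flatten_image` helper and restores a startup command with the `--change` option. The default `CMD ["php-fpm"]` is an assumption for illustration; a real image will usually need further `--change` options to restore the rest of its metadata.

```shell
# Hypothetical helper: flatten a source image into a single-layer target image.
# Usage: flatten_image <source-image> <target-image> [CMD as a JSON array]
flatten_image() {
    source_image=$1
    target_image=$2
    cmd_json=$3
    # Assumed default startup command; override it with the third argument.
    [ -n "$cmd_json" ] || cmd_json='["php-fpm"]'

    # Create (but do not start) a container; export works on stopped containers.
    container_id=$(docker create "$source_image") || return 1

    # Stream the container filesystem straight into a new image and
    # re-apply the startup command that the import would otherwise drop.
    docker export "$container_id" \
        | docker import --change "CMD $cmd_json" - "$target_image"
    status=$?

    # Clean up the temporary container.
    docker rm "$container_id" >/dev/null
    return $status
}

# Usage (tags are the ones used elsewhere in this post):
#   flatten_image php-fpm:with-layers php-fpm:without-layers
```

Piping the export directly into the import avoids writing the intermediate tarball to disk, and using `docker create` avoids starting the application just to copy its filesystem.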

At the time of this writing, the latest version of PHP is 7.2.8 (actually, 7.2.9 was cut yesterday, but the updated image hasn’t been released yet), which builds on top of the Alpine Linux 3.7 image.

The Alpine Linux image clocks in at just under 5 MB. The PHP image adds a few layers, and brings things up to 78 MB. So far, both of these are smaller than the base CentOS or Ubuntu images.

Our application includes the New Relic agent for PHP, a few extensions, our application code, and our Composer vendor directory (without dev-dependencies).

composer install --prefer-dist --no-dev

We should remove things like tests from our vendor directory, but we haven’t done that yet at the time of this writing. With all of our (wonderfully cached) layers, this brings the decompressed image size to 408 MB.

After stripping out the history and removing all of the individual layers from the image (via a process called flattening), our final decompressed image size is a mere 197 MB.

$ docker images

REPOSITORY     TAG                   IMAGE ID       CREATED          SIZE
alpine         3.7                   791c3e2ebfcb   5 weeks ago      4.2MB
php            7.2.7-fpm-alpine3.7   9cf17fea14c0   5 weeks ago      78.3MB
php-fpm        with-layers           94121f6a6537   29 seconds ago   408MB
php-fpm        without-layers        8468ea1ee874   4 seconds ago    197MB

When you push your image up to a Docker registry (e.g., Docker Hub, Amazon ECR, Google Container Registry, Quay.io, Artifactory), the images will be compressed. Our final Docker image, compressed-at-rest, is only 72 MB.
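To get a rough idea of the compressed size before pushing, you can gzip the output of `docker save` locally. This is only an approximation, since a registry compresses each layer separately and the saved tarball also carries some metadata; `compressed_size` is a hypothetical helper name.

```shell
# Hypothetical helper: print the gzip-compressed size, in bytes,
# of the tarball that `docker save` produces for an image.
compressed_size() {
    docker save "$1" | gzip -c | wc -c
}

# Usage:
#   compressed_size php-fpm:without-layers
```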

A 72 MB Docker image for our application is small enough to push into our CI/CD pipeline in only a few seconds, and puts very little network or storage strain on our internal systems. It’s fast to download into my local development environment, and every step of the development and build processes is automated.

Reduced security vulnerabilities

Over my career, I’ve observed that engineers view the topic of “security” primarily through the lens of their job role.

  • Application engineers tend to view security as things like XSS vulnerabilities and SQL injections.

  • System engineers tend to view security as things like CVEs and intrusions.

  • Security engineers tend to see those things + TLS certificates + CIS Benchmarks + secrets management and rotation + user permissions + …

In this context, I’m referring primarily to security vulnerabilities along the lines of Heartbleed, ShellShock, and httpoxy. Because Alpine installs so little software by default, the attack surface is substantially reduced, oftentimes to the point where there are zero known vulnerabilities anywhere in your application container.

Logo for the Heartbleed vulnerability.


This is entirely unheard of in CentOS, Ubuntu, and other larger distributions. As a matter of fact, when our application went live and we underwent review with the security team, their scan of our hosts and containers found zero unpatched vulnerabilities, and they thought that the scan was bad or their software was broken.

Conclusion

The most important things to take away from this are:

  1. Big Docker images are a bad thing.
  2. Use Alpine Linux. Seriously.
  3. Remove your build-time dependencies.
  4. Flatten your images if you’re sharing them.
  5. The less software that is installed, the fewer security vulnerabilities there will be.
  6. Making your images as small as possible can greatly reduce the burden on the rest of your infrastructure.

Ryan Parman is an engineering manager with over 20 years of experience across software development, site reliability engineering, and security. He is the creator of SimplePie and AWS SDK for PHP, patented multifactor-authentication-as-a-service at WePay, defined much of the CI/CD and SRE disciplines at McGraw-Hill Education, and came up with the idea of “serverless, event-driven, responsive functions in the cloud” while at Amazon Web Services in 2010. Ryan's aptly-named blog, , is where he writes about ideas longer than . Ambivert. Curious. Not a coffee drinker.