Disclosure: I may earn affiliate revenue or commissions if you purchase products from links on my website. The prospect of compensation does not influence what I write about or how my posts are structured. The vast majority of articles on my website do not contain any affiliate links.

What have I been doing in the week since I wrote part one? Docker. A healthy amount of it at that. As much as I’d love to proudly shelve Docker in Practice between all the other books in my apartment that tend not to be great conversation starters, I must return the book to my colleague’s desk tomorrow morning. Before I do that, I have gleefully prepared the follow-up to my first post.

Technique 49: Dockerfile tricks for keeping your build small

The first place to look when trying to trim the size of a Docker image is the Dockerfile itself. Most commands in this file create a new layer. Layers, when designed carefully, allow for faster builds because all layers preceding a changed layer remain cached; the changed layer and every layer after it are the only ones that need to be rebuilt. Layers are also useful when different applications share a common command set: if the layers are identical, they only have to be built once.

However, each layer takes up space. By chaining commands (especially RUN commands for package installation), the number of layers can be reduced drastically. Consolidating commands reduces the overall flexibility of the build and leads to longer rebuilds if any part of a chained command changes, so it really depends on the use case. The approach that has worked best for me is to build the Dockerfile freely and then, once the application is functioning as expected, slim down the Dockerfile as a final step before testing the application in a sim environment. This is also probably the best approach when collaborating with others, as premature optimization can result in eccentric design decisions (e.g. if your image size isn’t constrained by defined requirements, don’t be a hero).
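As an illustration, here is a minimal sketch of the chaining idea, assuming a Debian-based image with curl standing in for whatever packages you actually need:

```dockerfile
FROM debian:stretch-slim

# Written as three separate RUN commands, this would create three layers,
# and the apt cache deleted in the last one would still occupy space in
# an earlier layer. Chained into one RUN, the cleanup actually shrinks
# the image.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
```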

Technique 51: Tiny Docker images with BusyBox and Alpine

I’ve almost finished migrating our containers from the CentOS base image to Alpine. The space savings are incredible: CentOS is roughly 200 MB, while Alpine is roughly 5 MB. Alpine might strike you as one of those operating systems that L33t Coders talk about at Ultimate Frisbee tournaments, but it turns out to be an incredibly useful base image with wide applications.

Using a minimal base image seems like a hassle at first, but for microservices, I’ve found it to be nearly the same as using CentOS in terms of how many packages I have to install and how the Dockerfile is arranged. Of course, some commands are different and the software packages provided by each might not be exactly analogous, but it’s painless to set up after you’ve gained some experience with it. Big fan of Alpine.
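To give a sense of what this looks like, here’s a hypothetical sketch of a small Python service on Alpine (app.py is a placeholder; apk is Alpine’s package manager, and the usual apt-get/yum idioms map onto it fairly directly):

```dockerfile
FROM alpine:3.6

# --no-cache avoids baking the package index into the layer
RUN apk add --no-cache python3

COPY app.py /app/app.py
CMD ["python3", "/app/app.py"]
```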

Other slimming techniques

Obviously, there are quite a few other ways to minimize an image. The most obvious is that in some cases you can build binaries outside of Docker and just load them in, instead of going through all the trouble of importing packages, building, and storing source code. This adds complexity and makes containers much more opaque, from making docker exec less useful to rendering most of the useful metadata meaningless. I think that in most cases this is unnecessary.

Other techniques include cleaning up unused space (such as what accumulates under /var) and using inotify-tools to find which files are actually being referenced so the rest can be removed. This process involves modifying the container on the fly and then using docker commit to create the slimmed-down image. I think this is kind of silly, but there’s no doubting its usefulness.
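A rough sketch of that commit-based workflow, with myimage and the deleted paths standing in for whatever your own inspection turns up:

```shell
# run the container, delete files the app never touches, then snapshot it
docker run -d --name slimming myimage
docker exec slimming sh -c 'rm -rf /var/cache/* /usr/share/doc'
docker commit slimming myimage:slim
```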

Technique 54: Big can be beautiful

Don’t get me started on monoliths. I’m a skeptic for good reason, but, that said, I do see the value in a base image. I had to dismantle a bear of a monolith at work. My goal was to create a new, slim monolith. What actually happened was that I realized how much unnecessary bullshit was in the original base image, and now some of the simpler services don’t use the refactored version at all. Really, its value was in providing a clear structure for adapting Swagger specifications into functional Python microservices. Now, as I push for regression testing and as Swarm and Compose come into wider use, I predict that the ‘monolith’ will stage a comeback.

Whatever you do, please heed this warning: “Be careful about what you add to the base–once it’s in the base image, it’s hard to remove, and the image can bloat fast.”

Technique 59: Containing a complex development environment
Technique 60: Running the Jenkins master within a Docker container


At work, I am engaged in a war of attrition with the SysEng group to get privileged access to Jenkins. However, as long as I have permission to use Docker freely, I can easily spin up my own instances of Jenkins, GitLab, and any other pieces of our development environment that are locked down. For anyone trying to gain traction for Docker adoption in the office, demonstrating how easy it is to spin up the entire whatever-you-call-it environment is probably the best way.
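For reference, spinning up a throwaway Jenkins master against the image on Docker Hub is about one line (the volume name here is arbitrary, and 50000 is the agent port):

```shell
docker run -d -p 8080:8080 -p 50000:50000 \
    -v jenkins_home:/var/jenkins_home jenkins
```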

Now, I do this mainly to diagnose issues and to test out some of my nutty ideas, not to give a middle finger to the SysEng guys, whom I respect very much. However, Jenkins slaves tend to diverge wildly from their original intended purpose and usage once teams in the organization start to “get it.” This is a good thing, but I think that the Jenkins pipeline of the future, especially when dealing with Docker images, should run inside Docker containers.

On this topic, GitLab’s pipelines are nothing to scoff at and offer native support for building and testing Docker images (Yo dawg, we built a Docker image in your Docker image so you can…). I’ll stop making that joke now.

Technique 65: Sharing Docker objects as TAR files

Recently, our Docker registry went down. Nobody really knew what to do besides me: on my first project, before I realized there was a registry, I had to save and load my images in order to distribute them.

These are useful commands to know. docker save produces a TAR file from an image, while docker load (presumably run after transferring the tarball over SSH) loads it back in.
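The full round trip looks something like this (image name and host are placeholders):

```shell
docker save -o myimage.tar myimage:latest
scp myimage.tar user@otherhost:
ssh user@otherhost docker load -i myimage.tar
```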

Technique 67: Using confd to enable zero-downtime switchover

“When is your downtime window?”

“Well, we don’t really have one.”

This is a reality faced by many production microservices. There are many solutions for enabling zero-downtime switchovers, and confd is probably the simplest. Personally, I use HAProxy (in a Docker container), and it’s been really, really useful.
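To give the flavor of the HAProxy approach, here’s a hypothetical fragment of an haproxy.cfg that fronts two container backends (the names, IPs, and ports are made up). Standing up the new version as the backup server, then swapping which one carries the backup flag and reloading HAProxy, moves traffic without a hard cutover:

```
frontend app
    bind *:80
    default_backend servers

backend servers
    server blue  172.17.0.2:8080 check
    server green 172.17.0.3:8080 check backup
```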

Technique 68: A simple Docker Compose cluster

“At the time of writing, Docker Compose isn’t recommended for use in production.”

Well, now it is. And it’s wonderful. I find joy in converting applications to use Docker Compose.
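A minimal docker-compose.yml in the shape I tend to start from (service names, ports, and the redis dependency are illustrative):

```yaml
version: "3"
services:
  web:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      - redis
  redis:
    image: redis:alpine
```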

Technique 78: A seamless Docker cluster with Swarm

The Swarm discussed in this book is actually the deprecated predecessor of the current swarm mode. Basically, Docker/Moby has aligned the container orchestration platform with its vision and best practices, and I’d like to think they know best when it comes to things like this. When it comes down to it, Swarm is so much easier than any of the other options recounted in this book (with the possible exception of systemd, which is kind of a different use case), and it is the only solution that makes sense for minor league users of Docker who don’t need a fully featured platform like Rancher, Kubernetes, or Apache Mesos.
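Getting a swarm-mode cluster going really is about this short (nginx stands in for a real service):

```shell
docker swarm init
# each worker then runs the 'docker swarm join ...' command printed above
docker service create --replicas 3 --name web -p 80:80 nginx:alpine
docker service ls
```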

Technique 89: Logging your containers to the host’s syslog
Technique 90: Sending Docker logs to your host’s output system

General guidance here is that you don’t want to be left without logs when you need them. Ideally, you should implement a logging methodology that is widely adopted and consistent with the goals of your organization. I think that using rsyslog and forwarding all logs to a common location is probably the best bet, as rsyslog also contains a module to forward logs to the ELK Stack.
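As a sketch, the syslog logging driver can be enabled per container, with the logs forwarded to a remote collector (the address and image name are placeholders):

```shell
docker run --log-driver=syslog \
    --log-opt syslog-address=udp://logs.example.com:514 myimage
```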

Technique 95: Using Docker to run cron jobs

Cron: what a headache. While it’s possible to use a tool like Ansible, it’s also viable to use Docker as a cron change delivery mechanism. I find this concept interesting and will explore implementing it in the workplace; the only negative is that dockerd suddenly has to be run (and maintained) on critical trading hosts. We’ll see how it goes.
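A hypothetical sketch of the idea on Alpine, where BusyBox’s crond reads root’s jobs from /etc/crontabs/root (the crontab file is assumed to sit alongside the Dockerfile):

```dockerfile
FROM alpine:3.6
COPY crontab /etc/crontabs/root
# -f keeps crond in the foreground so the container stays alive;
# -l 8 sets the log level
CMD ["crond", "-f", "-l", "8"]
```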

Technique 101: Debugging containers that fail on specific hosts

This happened to me once, and I can tell you it was not a fun week. I spoke with the guy who maintained the code repository that I thought was the root of my problem, and he recommended strace. Updated Docker security settings mean you have to start the container with certain flags to be able to run it, but I found it to provide a useful glimpse into the Linux system calls being made by the container. More than that, it taught me a lot about how Linux works in general.
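The flags in question: Docker’s default seccomp profile blocks ptrace, so strace needs the capability granted explicitly (myimage and mybinary are placeholders):

```shell
docker run --cap-add SYS_PTRACE myimage strace -f mybinary
```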

It’s time to close the book on this one. Though a new version is due out this summer, the current version is a fantastic learning tool. Thanks for reading.

Notes on Docker in Practice Part 2