What is a Docker Image? A Complete Guide for Beginners

[2024-05-22 03:14:22] ERROR: Failed to pull image "registry.internal/frontend-monolith:latest"
[2024-05-22 03:14:22] Kubelet: FailedToPull "failed to register layer: Error processing tar file(exit status 1): write /usr/local/lib/node_modules/very-large-useless-package: no space left on device"
[2024-05-22 03:15:01] CRITICAL: Node ip-10-0-42-11.ec2.internal is DiskPressure
[2024-05-22 03:15:10] ALERT: Production environment is down. 0/15 pods running.

$ docker history registry.internal/frontend-monolith:latest
IMAGE CREATED CREATED BY SIZE COMMENT
2 hours ago /bin/sh -c #(nop) CMD ["npm" "start"] 0B
2 hours ago /bin/sh -c npm install && npm run build 1.85GB
2 hours ago /bin/sh -c #(nop) COPY dir:7a8… in /app 850MB
2 hours ago /bin/sh -c apt-get update && apt-get upgrade -y 420MB
3 months ago /bin/sh -c #(nop) FROM node:20.11.1 1.1GB

It’s 4:00 AM. My coffee is cold, my eyes feel like they’ve been rubbed with sandpaper, and I’ve just spent the last three days of my life cleaning up a mess that shouldn’t have existed in the first place. The logs above are the autopsy of a “modern” deployment. A single docker image, pushed by a developer who thinks disk space is a magical, infinite resource provided by the cloud gods, managed to bring down an entire production cluster.

The Kubelet died. The storage driver choked. The nodes went into a death spiral because someone decided that a 3.2GB container image was a reasonable way to ship a frontend application. This isn’t just a technical failure; it’s a failure of discipline. It’s the result of a generation of engineers who treat the containerization process as a “black box” where they can dump their entire local development environment and hope the orchestrator figures it out.

I’m tired of it. I’m tired of the “it works on my machine” mentality being packaged into a bloated tarball and shoved into a registry. Let’s talk about why your docker image is a liability.

The 2.5GB “Hello World” and the Death of Common Sense

The first thing I saw when I looked at the Dockerfile for this disaster was FROM node:20.11.1. Not the slim version. Not the Alpine version. The full-fat, Debian-based, everything-including-the-kitchen-sink image. Why? Because the developer “didn’t want to deal with missing dependencies.”

When you pull a base image like that, you aren’t just pulling Node.js. You are pulling a full operating system distribution. You’re pulling build tools, compilers, headers, and documentation for libraries you will never, ever call. You are pulling gcc, make, and python3 (yes, inside a Node image) just in case some obscure native module needs to compile during npm install.

But it gets worse. Look at that apt-get upgrade -y line in the history. That is a crime against humanity. When you run an upgrade inside a Dockerfile layer, you are effectively creating a “snowflake” image. You are saying, “I want whatever the Debian mirrors happen to have at this exact microsecond.” It breaks build reproducibility. The same Dockerfile built a week apart can produce two different images. And because of how union filesystems work, that layer is now a permanent part of your docker image.

Docker assembles an image from a union filesystem: a stack of read-only layers. Every RUN, COPY, and ADD instruction in your Dockerfile creates a new layer (metadata instructions like CMD show up in the history at 0B). These layers are additive. If you install 400MB of updates in one layer and then try to “clean up” in the next, you haven’t saved a single byte. The 400MB is still there, buried in the lower layers, taking up space on the disk and consuming bandwidth during the pull. The cleanup only counts if it happens in the same RUN instruction as the install. At container start, the kernel mounts these layers through the overlay2 storage driver, stacking the read-only lowerdir layers under a writable upperdir and presenting them as a single merged view. Every layer you add is one more directory that a filesystem lookup may have to search.
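If the accounting sounds abstract, here is a toy model of it in Python. It is a sketch, not how overlay2 is implemented: the layer dictionaries, the paths, and the function names are all made up for illustration, and the sizes are in megabytes (1100 for a base layer, 400 for the updates). All it tries to show is why a delete in a later layer hides a file without shrinking the image.

```python
# Toy model of additive image layers. Each layer records the files it adds
# (path -> size in MB) and the paths it deletes. All names are hypothetical.

def image_size_mb(layers):
    """What sits on disk and travels over the wire: everything every layer added."""
    return sum(sum(layer["adds"].values()) for layer in layers)

def visible_size_mb(layers):
    """Size of the merged view that a running container sees."""
    merged = {}
    for layer in layers:
        for path in layer["deletes"]:
            merged.pop(path, None)  # the file is hidden, the lower layer is untouched
        merged.update(layer["adds"])
    return sum(merged.values())

layers = [
    {"adds": {"/usr/lib/base": 1100}, "deletes": []},          # base image
    {"adds": {"/var/cache/updates": 400}, "deletes": []},      # RUN that installs updates
    {"adds": {}, "deletes": ["/var/cache/updates"]},           # "cleanup" in a later RUN
]

print(image_size_mb(layers))    # 1500
print(visible_size_mb(layers))  # 1100
```

The container sees 1100MB, but the registry stores and every node pulls 1500MB. Merging the second and third layers into one RUN is the only version of this where the two numbers agree.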

Layer Caching: How Your Order of Operations is Killing the Build Server

I watched the CI/CD logs for this project. The build took 22 minutes. For a React app. Why? Because the developer put COPY . . at the top of the Dockerfile.

FROM node:20.11.1
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build

This is the peak of inefficiency. By copying the entire source directory before running npm install, you invalidate the cache for the most expensive part of the build—the dependency installation—every single time a single character changes in a README file.

For a COPY instruction, Docker calculates a checksum over the files being copied. If that checksum changes, the cache for that layer and every subsequent layer is discarded. You are forcing the build server to reach out to the npm registry, download hundreds of megabytes of node_modules, and write them to disk over and over again.

A sane person—someone who actually cares about the health of the registry and the build server—would do this:

FROM node:20.11.1-slim
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

By copying only the manifest files first, you allow Docker to cache the npm ci layer. Unless you change your dependencies, that 1.8GB of node_modules cruft stays in the cache. But even then, why are we shipping node_modules at all? Why is the production docker image carrying around the source code, the test suites, the linter configurations, and the node_modules folder that contains babel, webpack, and typescript? None of those are needed at runtime. They are build-time artifacts. Shipping them to production is like shipping the scaffolding along with the finished skyscraper.
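For anyone who wants to see the cache argument rather than take my word for it, here is a rough Python sketch. It assumes a simplified rule: each step’s cache key is a hash of the previous key, the instruction text, and the contents of any files it copies. The real builder’s checksums are more involved, the FROM and WORKDIR steps are left out, and the file list and function names are invented for the example.

```python
import hashlib

def cache_keys(steps, files):
    """Return one cache key per step; each key chains on the one before it."""
    keys, parent = [], ""
    for instruction, copied_paths in steps:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        for path in sorted(copied_paths):
            h.update(path.encode())
            h.update(files[path].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys

def reused_steps(steps, before, after):
    """Count the leading steps whose cache key survives the file change."""
    count = 0
    for old, new in zip(cache_keys(steps, before), cache_keys(steps, after)):
        if old != new:
            break  # everything after the first miss is rebuilt
        count += 1
    return count

ALL = ["package.json", "package-lock.json", "README.md", "src/App.jsx"]
MANIFESTS = ["package.json", "package-lock.json"]

copy_everything_first = [
    ("COPY . .", ALL),
    ("RUN npm install", []),
    ("RUN npm run build", []),
]
manifests_first = [
    ("COPY package.json package-lock.json ./", MANIFESTS),
    ("RUN npm ci", []),
    ("COPY . .", ALL),
    ("RUN npm run build", []),
]

before = {path: "v1" for path in ALL}
after = {**before, "README.md": "v2"}  # a one-character README edit

print(reused_steps(copy_everything_first, before, after))  # 0
print(reused_steps(manifests_first, before, after))        # 2
```

With the copy-everything order, the README edit throws away every step, including the dependency install. With the manifests copied first, the install step is still a cache hit and only the cheap steps rerun.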

The “docker history” Audit: Finding the Hidden Skeletons in Your Layers

When the cluster started screaming, I ran docker history --no-trunc. It’s a horror story in plain text. I found layers that were nothing but chown commands.

Did you know that if you run RUN chown -R node:node /app on a directory that contains 1GB of data, you have just added another 1GB to your docker image? The overlay2 driver doesn’t just change the metadata of the files. Because layers are immutable, it has to copy every file up into the new layer to apply the new ownership. You now have two copies of your data sitting on the disk. If the files need a different owner, set it while they are being copied in, with COPY --chown=node:node, so they are only written once.

I also found the .git directory. 800MB of git history, hidden inside the image. The developer forgot a .dockerignore file. So, every time they built the docker image, the entire history of the project—every deleted branch, every large binary ever accidentally committed, every secret ever pushed and then “removed”—was sent to the Docker daemon as part of the build context.

The build context is a tarball of your current directory. If you don’t have a .dockerignore, you are sending garbage over the wire. You are making the COPY . . command even more bloated than it already is. A proper .dockerignore should be the first thing you write. It should exclude node_modules, .git, dist, build, *.log, .env, and anything else that isn’t strictly necessary for the build.
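To show what that exclusion list buys you, here is a small Python approximation of the filtering. It is deliberately simplified and is not Docker’s matcher: as far as I know the real one follows Go’s filepath.Match rules plus `**` and `!` exceptions, none of which are handled here, and Python’s fnmatch lets `*` cross directory separators where Docker’s does not. The sample paths and function names are hypothetical.

```python
from fnmatch import fnmatch

# The exclusions named above, one pattern per .dockerignore line.
IGNORE_PATTERNS = ["node_modules", ".git", "dist", "build", "*.log", ".env"]

def is_ignored(path, patterns=IGNORE_PATTERNS):
    """True if the path, or any directory above it, matches a pattern."""
    parts = path.split("/")
    # Check "a", "a/b", "a/b/c" so that ignoring a directory ignores its contents.
    prefixes = ["/".join(parts[: i + 1]) for i in range(len(parts))]
    return any(fnmatch(prefix, pattern) for prefix in prefixes for pattern in patterns)

def build_context(paths, patterns=IGNORE_PATTERNS):
    """Keep only the paths that would still be sent to the daemon."""
    return [path for path in paths if not is_ignored(path, patterns)]

paths = [
    "package.json",
    "src/App.jsx",
    "node_modules/react/index.js",
    ".git/objects/pack/pack-1.pack",
    "npm-debug.log",
    ".env",
]

print(build_context(paths))  # ['package.json', 'src/App.jsx']
```

Six paths go in, two come out. On a real repository the ratio is usually far more lopsided, because node_modules and .git are where the bytes live.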

And let’s talk about ADD vs COPY. I see people using ADD like it’s a generic “put file here” command. It’s not. ADD has “magic” behavior. It can fetch files from remote URLs. It can automatically extract tarballs. This magic is dangerous. It’s non-deterministic. If you ADD a tarball from a URL, you have no guarantee that the content of that URL hasn’t changed. Use COPY. It’s explicit. It’s predictable. Predictability is the only thing that keeps me from losing my mind when the OOM killer starts reaping processes at 3 AM.
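If you genuinely need a remote artifact in the build, one way to get the predictability back is to pin its checksum and refuse anything else. The Python sketch below shows only the idea. It uses in-memory byte strings instead of a real download, and the payloads and function names are invented for the example.

```python
import hashlib

def verify_sha256(content: bytes, expected_hex: str) -> bytes:
    """Return content unchanged, or raise if it does not match the pinned digest."""
    actual = hashlib.sha256(content).hexdigest()
    if actual != expected_hex:
        raise ValueError(f"checksum mismatch: expected {expected_hex[:12]}, got {actual[:12]}")
    return content

# Hypothetical payloads standing in for what one URL served on two different days.
reviewed = b"tarball bytes as reviewed"
served_later = b"tarball bytes after upstream changed them"

# Digest recorded once, at review time, and committed next to the Dockerfile.
pinned = hashlib.sha256(reviewed).hexdigest()

def is_accepted(content: bytes) -> bool:
    """True only when the content still matches the pinned digest."""
    try:
        verify_sha256(content, pinned)
        return True
    except ValueError:
        return False

print(is_accepted(reviewed))      # True
print(is_accepted(served_later))  # False
```

In a pipeline, the same check would run on the downloaded file before the build, and the Dockerfile would then COPY the verified local copy.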

Multi-Stage Builds: The Only Way to Keep Your Sanity and Your Disk Space

If you aren’t using multi-stage builds in 2024, you shouldn’t be allowed to touch a production cluster. It is the single most effective tool we have to combat docker image bloat, and yet I see it ignored constantly.

A multi-stage build allows you to use a heavy, dependency-rich image for the build phase and then copy only the resulting artifacts into a tiny, minimal image for the runtime phase. Here is what the Dockerfile for that 3.2GB image should have looked like:

# Stage 1: The Build
FROM node:20.11.1-bookworm-slim AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: The Runtime
FROM nginx:1.25.4-alpine
COPY --from=builder /app/dist /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]

The difference? The first stage might be 1.5GB. The second stage—the one that actually gets pushed to the registry and pulled by the Kubernetes nodes—is about 30MB.

30MB vs 3.2GB.

Think about the implications for a second. A 30MB docker image pulls in seconds. It doesn’t trigger disk pressure alerts. It doesn’t saturate the network interface of the node. It doesn’t fill up the local image cache and force the Kubelet to start garbage collecting images that are actually needed.
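Some back-of-the-envelope arithmetic makes the gap concrete. The Python below assumes an effective pull throughput of 100 Mbit/s per node, which is a number I picked for illustration and not something measured on this cluster. It also ignores layer compression and extraction time, so treat the results as orders of magnitude.

```python
def pull_seconds(image_mb, throughput_mbit_per_s):
    """Idealized transfer time: megabytes to megabits, divided by throughput."""
    return image_mb * 8 / throughput_mbit_per_s

THROUGHPUT = 100  # assumed effective Mbit/s per node, purely illustrative

bloated = pull_seconds(3200, THROUGHPUT)  # the 3.2GB image
slim = pull_seconds(30, THROUGHPUT)       # the 30MB image

print(bloated)  # 256.0
print(slim)     # 2.4
```

Under that assumption the bloated image costs a node more than four minutes before the container can even start, and the slim one costs a couple of seconds. Change the throughput and both numbers move, but the ratio of roughly a hundred to one stays put.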

When you use COPY --from, you are cherry-picking the only things that matter. You are leaving behind the npm cache, the source code, the compilers, and the thousands of transient dependencies that exist only to transform your code into a static bundle. You are shipping the product, not the factory.

Distroless, Scratch, and the Art of Minimalist Suffering

For the truly disciplined, even Alpine 3.19.1 is too much. Alpine is great—it’s small, it uses musl instead of glibc, and it has a decent package manager (apk). But it still has a shell. It still has a package manager. It still has a filesystem hierarchy that a motivated attacker could use once they find a vulnerability in your application.

If you are shipping a Go binary or a statically linked Rust application, your docker image should start with FROM scratch.

scratch is an empty image. It has zero bytes. No shell, no /bin/ls, no /etc/passwd. You just COPY your binary in and you’re done. This is the ultimate form of containerization. It’s the smallest possible attack surface. It’s the most efficient use of resources.

If you’re running something like Java or Python and you can’t go full scratch, use “Distroless” images. These are images maintained by Google that contain only your application and its runtime dependencies. They don’t have a shell. You can’t exec into them to poke around. This makes developers cry because they can’t “debug” by running ls inside the container, but you know what? We have logs for that. We have telemetry. We have distributed tracing. You don’t need a shell in production. A shell in production is just a gift for someone who wants to turn your cluster into a crypto-miner.

When you reduce an image to its bare essentials, you aren’t just saving space. You are reducing the cognitive load of the SRE. I don’t have to worry about whether a vulnerability in libssl or ncurses is going to trigger a P0 security alert if those libraries aren’t even in the image.

The Security Tax of Laziness: Why Your Image is a CVE Playground

Every single megabyte of “cruft” you leave in your docker image is a potential entry point for an exploit. That 3.2GB image I had to delete? It had 412 known vulnerabilities. 85 of them were “Critical” or “High.”

Why? Because it was based on an old version of a full Debian image that hadn’t been patched in months. It had curl, wget, git, and a dozen other tools that are perfect for a post-exploitation scenario. If an attacker found a remote code execution (RCE) bug in the React app’s server-side rendering component, they would have a full suite of tools ready and waiting for them to start lateral movement through the network.

When you use a bloated docker image, you are paying a “security tax.” You are forcing the security team to sift through thousands of false positives in the container scans. You are making it impossible to distinguish between a real threat and the background noise of a poorly maintained base image.

And let’s talk about the OOM killer. When your container is bloated, it’s not just the disk that suffers. A bloated runtime often has a larger memory footprint, and in a Kubernetes environment, where we set requests and limits, that puts it closer to being reaped. Startup is worse. If your docker image takes 5 minutes to pull, every reschedule onto a fresh node begins with 5 minutes of nothing. If the container then needs another 2 minutes to initialize because it’s scanning a massive filesystem, the liveness and readiness probes are going to fail unless someone configured them to wait that long. The orchestrator will kill the container and try again. This is the “CrashLoopBackOff” hell that I spent my Saturday night navigating.
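The probe math is simple enough to sketch in Python. This is a rough model and not the kubelet’s exact timing: it treats the liveness budget as initialDelaySeconds plus periodSeconds times failureThreshold, and it uses what I understand to be the Kubernetes defaults (0, 10, and 3). The function names are made up for the example.

```python
def liveness_budget_seconds(initial_delay=0, period=10, failure_threshold=3):
    """Rough upper bound on how long a container may stay unhealthy after it starts."""
    return initial_delay + period * failure_threshold

def survives_startup(init_seconds, **probe_settings):
    """True if initialization finishes before the probe gives up on the container."""
    return init_seconds <= liveness_budget_seconds(**probe_settings)

# A container that needs 2 minutes to initialize, as in the paragraph above.
print(survives_startup(120))                     # False
print(survives_startup(120, initial_delay=120))  # True
```

With default settings the container gets about 30 seconds before it is restarted, so a 2 minute initialization never finishes. Raising the initial delay papers over it, but the honest fix is a container that doesn’t need 2 minutes to wake up.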

The “No space left on device” error is just the tip of the iceberg. It’s the final symptom of a systemic lack of care. It’s what happens when we prioritize “developer velocity” over operational stability. We’ve made it so easy to build and push a docker image that we’ve forgotten that someone actually has to run the damn thing.

I’m looking at the Prometheus dashboard now. The disk usage is back to normal. The pull times are down from minutes to seconds. The new images—the ones I rebuilt using multi-stage builds and Alpine—are sitting at a comfortable 45MB. The nodes are happy. The OOM killer is dormant.

But I know it won’t last. Tomorrow, some “full stack” wizard will decide they need a new library, and instead of adding it to the package.json, they’ll just apt-get install it in a new layer. They’ll push a 4GB image, and the cycle will start all over again.

If you’re reading this and you’re a developer: look at your docker history. If you see layers that are larger than your actual application code, you have failed. If you aren’t using a .dockerignore, you are being lazy. If you are shipping a shell to production, you are being reckless.

Disk space isn’t infinite. Bandwidth isn’t free. And my sleep is definitely not something you should be sacrificing because you couldn’t be bothered to learn how a filesystem works. Audit your docker image before I have to do it for you. Because next time, I might just let the cluster stay down.

I’m going to bed. Don’t page me unless the data center is literally on fire. And even then, check the docker image sizes first. It’s probably just another bloated layer trying to consume the world.
