# Master Docker Compose: A Guide to Multi-Container Apps

```
[2023-10-14 03:14:22] ERROR: Could not bind to port 8080. Address already in use.
[2023-10-14 03:14:22] DEBUG: Attempting to kill existing container 'api_v2_final_FINAL'...
[2023-10-14 03:14:23] Error response from daemon: No such container: api_v2_final_FINAL
[2023-10-14 03:14:23] CRITICAL: Bash script 'deploy_magic.sh' exited with code 127.
[2023-10-14 03:14:23] CRITICAL: Database connection string 'localhost:5432' failed.
[2023-10-14 03:14:23] FATAL: Production is down. 404 errors spiking.
[2023-10-14 03:14:24] SMS ALERT: [SRE_TEAM] - Wake up. The world is ending. Kevin pushed a script.
```

It’s 4:00 AM. I’ve consumed enough caffeine to kill a small horse, and I’m staring at a terminal screen that looks like a digital crime scene. The culprit? A 400-line bash script written by a junior developer who thought he could "simplify" our deployment process by manually wrapping `docker run` commands. 

He didn't use a manifest. He didn't use an orchestrator. He used "hope" and a series of nested `if` statements that checked for the existence of PID files that hadn't been relevant since 2012. 

The result? A cascading failure where the API tried to start before the database, the database couldn't find its volume because the path was hardcoded to a directory on Kevin's laptop, and the frontend was trying to talk to a Redis instance that existed only in the ethereal plane of a misconfigured bridge network.

If you are still using manual bash scripts to manage your containers, you are a liability. If you aren't using **docker compose**, you are essentially playing Jenga with a live grenade. This isn't about "streamlining" your workflow. This is about survival. This is about having a single source of truth that doesn't rely on the fragile memory of a human who hasn't slept.

## The Manual Script That Tried To Kill Me

Let’s look at what I found when I logged into the production jump box. Kevin’s script was a masterpiece of incompetence. It tried to manage container lifecycles using `grep` and `awk` to find container IDs. 

```bash
# DO NOT DO THIS. EVER.
docker stop $(docker ps -a -q --filter name=web)
docker rm $(docker ps -a -q --filter name=web)
docker run -d --name web_app_v3 -p 80:8080 --link db_prod:db myapp:latest
```

The `--link` flag? That’s been deprecated for years. It’s a ghost. A relic. And yet, there it was, failing because the `db_prod` container had crashed five minutes earlier due to an unhandled OOM (Out of Memory) event that the script didn’t bother to monitor.
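Compose wouldn’t have hidden that crash, and even plain Docker would have told you about it if anyone had asked. A quick sketch, assuming the container really was named `db_prod` as in the script:

```bash
# Did the kernel OOM-kill this container? (prints true/false)
$ docker inspect --format '{{.State.OOMKilled}}' db_prod

# Why did it exit, and when?
$ docker inspect --format '{{.State.ExitCode}} {{.State.FinishedAt}}' db_prod
```

Two commands. That’s all the monitoring the script needed and didn’t have.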

When you use `docker compose`, you aren’t just running containers; you are defining a state. You are telling the Docker Engine: “This is what the world should look like. Make it so.” With Docker Compose v2.20.2, we have the power to define dependencies, healthchecks, and resource constraints in a way that a bash script never could.

The manual approach fails because it is imperative: “do this, then do that.” If “this” fails, “that” happens anyway, or the whole thing hangs. `docker compose` is declarative. It doesn’t care about the “how” as much as the “what.”

## YAML: The Indentation-Sensitive Hell We Deserve

People complain about YAML. They say the indentation is finicky. They say it’s hard to read. To those people, I say: try reading a 400-line bash script with unquoted variables and no error handling at 3 AM. I will take a `docker-compose.yml` file any day of the week.

Below is the reconstruction of our stack. This is the blueprint of sanity. It uses docker compose to ensure that every service knows its place, its limits, and its neighbors.

```yaml
services:
  db:
    image: postgres:15-alpine
    container_name: production_db
    restart: unless-stopped
    environment:
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: app_production
    volumes:
      - db_data:/var/lib/postgresql/data
    networks:
      - backend_internal
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER} -d app_production"]
      interval: 10s
      timeout: 5s
      retries: 5
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 512M

  redis:
    image: redis:7-alpine
    container_name: production_cache
    restart: always
    networks:
      - backend_internal
    command: ["redis-server", "--appendonly", "yes"]

  api:
    build:
      context: ./api
      dockerfile: Dockerfile
    image: our-registry.com/api-service:v2.4.1
    container_name: production_api
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    environment:
      - DATABASE_URL=postgres://${DB_USER}:${DB_PASSWORD}@db:5432/app_production
      - REDIS_URL=redis://redis:6379/0
    networks:
      - backend_internal
      - frontend_external
    restart: on-failure:3

  frontend:
    image: nginx:stable-alpine
    container_name: production_frontend
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./dist:/usr/share/nginx/html:ro
    depends_on:
      - api
    networks:
      - frontend_external

networks:
  frontend_external:
    driver: bridge
  backend_internal:
    internal: true

volumes:
  db_data:
    driver: local
```

(Note there’s no `version:` key. Compose v2 treats it as obsolete and warns if you include it.)
Look at that. It’s beautiful. It’s a contract. It specifies that the `api` service won’t even try to start until the `db` service passes its `pg_isready` healthcheck. It isolates the database in a `backend_internal` network so some script kiddie can’t hit it directly from the public internet. It sets memory limits so a single memory leak doesn’t take down the entire host.

## Networking: Why Everything Is Broken By Default

In Kevin’s manual nightmare, he was trying to connect services using host IP addresses. Do you know what happens to a container’s IP address when it restarts? It changes. It’s ephemeral. Relying on hardcoded IPs in a containerized environment is like building a house on quicksand.

With `docker compose`, we get automatic service discovery. The `api` service doesn’t need to know that the database is at `172.18.0.3`. It just needs to know that the service is named `db`. The internal DNS resolver provided by the Docker Engine handles the rest.
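You can watch this resolution happen from inside a running service. A quick sketch — the exact tooling depends on what the image ships (`getent` is present in most Alpine and Debian bases):

```bash
# Resolve the 'db' service name from inside the api container
$ docker compose exec api getent hosts db
# Prints something like "172.18.0.3  db" — the IP is an example; yours will differ.
```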

Notice the network configuration in the YAML above. We have two distinct networks: frontend_external and backend_internal.

  1. `frontend_external`: This is where the Nginx container lives, exposing ports 80 and 443 to the world.
  2. `backend_internal`: This is a dark room. No outside traffic allowed. The database and Redis live here. The API acts as the bridge, sitting on both networks.

This is basic security posture, yet without `docker compose`, managing these bridge networks manually requires a series of `docker network create`, `docker network connect`, and `docker network disconnect` commands that no human can be trusted to execute correctly under pressure.
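For the record, here is roughly what Compose automates behind the scenes — the manual incantation you would otherwise be typing at 3 AM (a sketch; the names match the YAML above):

```bash
# The manual equivalent Compose handles for you — do not manage this by hand
$ docker network create frontend_external
$ docker network create --internal backend_internal
$ docker network connect backend_internal production_api
$ docker network connect frontend_external production_api
# ...and the mirror-image disconnect/rm dance on every teardown
```

Forget one `connect` and the API silently can’t reach the database. Compose never forgets.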

## The Dependency Lie: Why `depends_on` Isn’t Enough

Junior devs often think that adding `depends_on: [db]` to their `docker-compose.yml` is enough. It isn’t. All `depends_on` does in its simplest form is ensure that the `db` container has started. It doesn’t mean the database is ready.

Postgres takes time to initialize. It has to check its WAL logs, verify its data files, and open its sockets. If your API tries to connect the millisecond the Postgres container starts, it will crash.

This is why we use the long-form `depends_on` syntax with `condition: service_healthy`. By pairing this with a robust `healthcheck`, we ensure the orchestration layer actually understands the state of the application.
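One refinement worth knowing: Postgres can legitimately take a while on a cold start (initdb, WAL replay), and you don’t want that warm-up counted as failure. The healthcheck `start_period` field exists for exactly this. A sketch extending the healthcheck from the YAML above:

```yaml
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U ${DB_USER} -d app_production"]
  interval: 10s
  timeout: 5s
  retries: 5
  start_period: 30s   # failures during the first 30s don't count against retries
```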

```bash
# What happens when I run the command correctly
$ docker compose up -d
[+] Running 6/6
 ⠿ Network production_frontend_external  Created
 ⠿ Network production_backend_internal   Created
 ⠿ Container production_db               Healthy
 ⠿ Container production_cache            Started
 ⠿ Container production_api              Started
 ⠿ Container production_frontend         Started
```

If the database fails its healthcheck, the API won’t start. The system fails safely. It doesn’t enter a “half-alive” state where the frontend is up but showing 500 errors because the backend is in a crash loop.

## Secrets and Environment Variables: Stop Putting Passwords in Git

I found a file in the repository called `config_FINAL_v2.sh`. Inside were the production database credentials in plain text. I felt my left eye start to twitch.

`docker compose` supports `.env` files. This allows us to separate our configuration from our definition. The `docker-compose.yml` stays in version control, while the `.env` file—containing the actual secrets—stays on the secure build server or is injected at runtime.

```bash
# .env file - DO NOT COMMIT
DB_USER=admin_prod_user
DB_PASSWORD=a_very_long_and_complex_password_that_kevin_would_never_guess
```

In the YAML, we reference these using `${VARIABLE_NAME}`. It’s clean. It’s standard. It doesn’t involve `sed` or `grep` or any other text-processing wizardry that inevitably breaks when someone puts a special character in their password.
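Interpolation also supports defaults and hard requirements, which is how you make a missing secret fail loudly at parse time instead of mysteriously at runtime. A sketch using the variables above:

```yaml
environment:
  POSTGRES_USER: ${DB_USER:-app_user}                       # fall back to a default
  POSTGRES_PASSWORD: ${DB_PASSWORD:?DB_PASSWORD is not set}  # refuse to start if missing
```

With `:?`, Compose aborts with that error message before a single container starts. Fail fast, fail loud.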

## Persistence is Futile (Unless You Use Volumes)

The most heartbreaking part of the 48-hour outage was the data loss. Kevin’s script didn’t use volumes. He thought that as long as the container was running, the data was safe. When he ran his “cleanup” command—`docker rm $(docker ps -a -q)`—he wiped the database. Six months of user logs, gone.

We had backups, but restoring them took 12 hours because the backup script was also written by Kevin and it was trying to upload to an S3 bucket that didn’t exist.

In docker compose, volumes are first-class citizens.

```yaml
volumes:
  db_data:
    driver: local
```

By mapping `db_data:/var/lib/postgresql/data`, we ensure that the data lives on the host’s storage, independent of the container’s lifecycle. You can stop the container, delete it, upgrade the image to Postgres 16, and recreate it—the data stays. This is the difference between a toy and a production system.
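And because the volume outlives the container, backing it up becomes a one-liner instead of a Kevin-script. A hedged sketch — note that Compose prefixes named volumes with the project name, so the real volume may be called something like `myproject_db_data` (check with `docker volume ls`):

```bash
# Tar the contents of the named volume into the current directory
$ docker run --rm \
    -v db_data:/source:ro \
    -v "$(pwd)":/backup \
    alpine tar czf /backup/db_data-$(date +%F).tar.gz -C /source .
```

No PID files, no hardcoded laptop paths, and it works even while the database container is stopped.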

## The Build vs. Image Argument

One of the most powerful features of docker compose is the ability to handle both local development and production deployments using the same file (or overrides).

In development, you might use the build directive:

```yaml
api:
  build:
    context: .
    dockerfile: Dockerfile.dev
  volumes:
    - .:/app
```

In production, you use the image directive to pull a pre-built, scanned, and tagged image from your private registry:

```yaml
api:
  image: our-registry.com/api-service:v2.4.1
```

This ensures parity. The environment I’m running on my workstation is the same environment running in the cloud. We aren’t dealing with “it works on my machine” syndrome because the `docker-compose.yml` defines the entire machine.
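The mechanics behind the “(or overrides)” aside: Compose merges multiple `-f` files in order, with later files overriding earlier ones. A sketch, assuming a hypothetical `docker-compose.prod.yml` that carries only the production deltas:

```yaml
# docker-compose.prod.yml (hypothetical override file)
services:
  api:
    image: our-registry.com/api-service:v2.4.1
    restart: on-failure:3
```

Deploy with `docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d`. (With no `-f` flags at all, Compose automatically merges a `docker-compose.override.yml` if one exists — handy for dev-only bind mounts.)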

## Verifying the State of the World

When I finally got the stack running with `docker compose`, I needed to verify that everything was actually working. I didn’t use `ps -ef | grep app`. I used the tools built for the job.

```bash
$ docker compose ps

NAME                  IMAGE                                 COMMAND                  SERVICE    CREATED          STATUS                    PORTS
production_api        our-registry.com/api-service:v2.4.1   "python manage.py ru…"   api        10 minutes ago   Up 10 minutes
production_cache      redis:7-alpine                        "docker-entrypoint.s…"   redis      10 minutes ago   Up 10 minutes             6379/tcp
production_db         postgres:15-alpine                    "docker-entrypoint.s…"   db         10 minutes ago   Up 10 minutes (healthy)   5432/tcp
production_frontend   nginx:stable-alpine                   "/docker-entrypoint.…"   frontend   10 minutes ago   Up 10 minutes             0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp
```

The `(healthy)` tag next to the database is my security blanket. It means the `pg_isready` command is passing. It means the socket is open. It means I can go to sleep for at least twenty minutes before the next alert.
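If you want the raw health state without the full `ps` table — say, from a monitoring script — you can ask the engine directly. A sketch:

```bash
# Ask the engine for the container's health state: starting, healthy, or unhealthy
$ docker inspect --format '{{.State.Health.Status}}' production_db
```

One word on stdout, exit code 0. That’s a check even a pager-addled brain can script around.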

And when things do go wrong—because they always do—I don’t have to hunt through `/var/log/syslog` or some obscure file Kevin created in `/tmp`. I just stream the logs.

```bash
$ docker compose logs --tail=20 -f api

production_api  | [2023-10-14 04:45:12] INFO: Starting server on port 8000
production_api  | [2023-10-14 04:45:12] INFO: Connecting to database at db:5432
production_api  | [2023-10-14 04:45:13] INFO: Database connection established.
production_api  | [2023-10-14 04:45:13] INFO: Connecting to Redis at redis:6379
production_api  | [2023-10-14 04:45:13] INFO: Redis connection established.
production_api  | [2023-10-14 04:45:14] INFO: Application is ready to receive traffic.
production_api  | [2023-10-14 04:46:01] GET /api/v1/health 200 OK
production_api  | [2023-10-14 04:46:05] GET /api/v1/users 200 OK
```

## The Final Stand: Why I’m Not a Goat Farmer (Yet)

I’ve spent the last 48 hours cleaning up a mess that should never have happened. I’ve seen things in that bash script that will haunt my nightmares—unclosed loops, variables named `var1`, `var2`, and `var3_new`, and a complete lack of respect for the principles of idempotent infrastructure.

We are moving everything to `docker compose`. No exceptions. If a service isn’t in the compose file, it doesn’t exist. If a configuration parameter isn’t in the `.env` file, it isn’t real.

Is docker compose perfect? No. It has its quirks. YAML can be a pain. Sometimes the internal DNS resolver gets confused if you do too many hot-reloads. But compared to the alternative—the manual, script-driven chaos that nearly destroyed this company—it is a godsend.

It provides a common language for developers and SREs. It allows us to version our infrastructure. It allows us to spin up an entire replica of production in seconds for testing.
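That “replica of production” trick is the `--project-name` flag: Compose namespaces containers, networks, and default volumes by project, so two stacks from the same file can coexist. A sketch — note this only works cleanly if you drop the hardcoded `container_name:` fields from the YAML above, since fixed names collide across projects:

```bash
# Spin up an isolated copy of the stack under a different project name
$ docker compose -p staging up -d

# ...and tear it down, volumes and all, without touching production
$ docker compose -p staging down --volumes
```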

I’m tired. My eyes are burning, and I’m pretty sure I can hear the server rack humming in my sleep. But the stack is up. The healthchecks are green. The junior developer has been banned from using `chmod +x` on any file ending in `.sh`.

I’ve looked at the prices of land in Vermont. I’ve researched the dietary needs of goats. It’s a tempting life. No logs, no containers, no 3 AM calls. Just me, some animals, and a complete lack of internet connectivity. But as long as I have a well-structured docker-compose.yml and a functioning container runtime, I’ll stay in the trenches.

Just don’t let Kevin touch the production environment again. If I see one more `docker run` command without a `--restart` policy, I’m quitting on the spot.

Now, if you’ll excuse me, I’m going to go find a place to sleep where the sound of cooling fans can’t reach me. Use docker compose. Don’t be a Kevin. Your SREs will thank you, or at the very least, they won’t plot your demise in the middle of the night.

Final check. Everything is green.

```bash
$ docker compose ps --filter "status=running"
```
Output shows all services operational.

Sanity restored… for now.
