```
[2023-10-14 03:14:22] ERROR: Could not bind to port 8080. Address already in use.
[2023-10-14 03:14:22] DEBUG: Attempting to kill existing container 'api_v2_final_FINAL'...
[2023-10-14 03:14:23] Error response from daemon: No such container: api_v2_final_FINAL
[2023-10-14 03:14:23] CRITICAL: Bash script 'deploy_magic.sh' exited with code 127.
[2023-10-14 03:14:23] CRITICAL: Database connection string 'localhost:5432' failed.
[2023-10-14 03:14:23] FATAL: Production is down. 404 errors spiking.
[2023-10-14 03:14:24] SMS ALERT: [SRE_TEAM] - Wake up. The world is ending. Kevin pushed a script.
```
It’s 4:00 AM. I’ve consumed enough caffeine to kill a small horse, and I’m staring at a terminal screen that looks like a digital crime scene. The culprit? A 400-line bash script written by a junior developer who thought he could "simplify" our deployment process by manually wrapping `docker run` commands.
He didn't use a manifest. He didn't use an orchestrator. He used "hope" and a series of nested `if` statements that checked for the existence of PID files that hadn't been relevant since 2012.
The result? A cascading failure where the API tried to start before the database, the database couldn't find its volume because the path was hardcoded to a directory on Kevin's laptop, and the frontend was trying to talk to a Redis instance that existed only in the ethereal plane of a misconfigured bridge network.
If you are still using manual bash scripts to manage your containers, you are a liability. If you aren't using **docker compose**, you are essentially playing Jenga with a live grenade. This isn't about "streamlining" your workflow. This is about survival. This is about having a single source of truth that doesn't rely on the fragile memory of a human who hasn't slept.
## The Manual Script That Tried To Kill Me
Let’s look at what I found when I logged into the production jump box. Kevin’s script was a masterpiece of incompetence. It tried to manage container lifecycles using `grep` and `awk` to find container IDs.
```bash
# DO NOT DO THIS. EVER.
docker stop $(docker ps -a -q --filter name=web)
docker rm $(docker ps -a -q --filter name=web)
docker run -d --name web_app_v3 -p 80:8080 --link db_prod:db myapp:latest
```

The `--link` flag? That’s been deprecated for years. It’s a ghost. A relic. And yet, there it was, failing because the `db_prod` container had crashed five minutes earlier due to an unhandled OOM (Out of Memory) event that the script didn’t bother to monitor.
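You don’t even need a live daemon to see why that stop/rm pattern detonates. Here’s a minimal re-enactment with a stub `docker` shell function standing in for the CLI; the stub and the variable names are mine, purely illustrative:

```shell
# Stub 'docker' so we can demonstrate the failure mode without a daemon.
docker() {
  case "$1" in
    ps)   : ;;                             # pretend no container matches the filter
    stop) [ "$#" -gt 1 ] || return 1 ;;    # the real CLI errors when given no ID
  esac
}

# Kevin's pattern: when the substitution expands to nothing, 'docker stop'
# runs with zero IDs, fails -- and his script barreled on regardless.
if docker stop $(docker ps -a -q --filter name=web); then
  kevin_result=ok
else
  kevin_result=failed
fi

# A guarded variant: only call stop when there is actually something to stop.
ids=$(docker ps -a -q --filter name=web)
if [ -n "$ids" ]; then
  docker stop $ids
fi
guarded_result=ok
```

The guard is three extra characters of thinking. Kevin’s script had neither the guard nor `set -e`, so the failure was invisible until production noticed.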
When you use `docker compose`, you aren’t just running containers; you are defining a state. You are telling the Docker Engine: “This is what the world should look like. Make it so.” With Compose v2.20.2, we have the power to define dependencies, healthchecks, and resource constraints in a way that a bash script never could.
The manual approach fails because it is imperative. “Do this, then do that.” If “this” fails, “that” happens anyway, or the whole thing hangs. docker compose is declarative. It doesn’t care about the “how” as much as the “what.”
## YAML: The Indentation-Sensitive Hell We Deserve
People complain about YAML. They say the indentation is finicky. They say it’s hard to read. To those people, I say: try reading a 500-line bash script with unquoted variables and no error handling at 3 AM. I will take a `docker-compose.yml` file any day of the week.
Below is the reconstruction of our stack. This is the blueprint of sanity. It uses docker compose to ensure that every service knows its place, its limits, and its neighbors.
```yaml
version: '3.9'

services:
  db:
    image: postgres:15-alpine
    container_name: production_db
    restart: unless-stopped
    environment:
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: app_production
    volumes:
      - db_data:/var/lib/postgresql/data
    networks:
      - backend_internal
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER} -d app_production"]
      interval: 10s
      timeout: 5s
      retries: 5
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 512M

  redis:
    image: redis:7-alpine
    container_name: production_cache
    restart: always
    networks:
      - backend_internal
    command: ["redis-server", "--appendonly", "yes"]

  api:
    build:
      context: ./api
      dockerfile: Dockerfile
    image: our-registry.com/api-service:v2.4.1
    container_name: production_api
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    environment:
      - DATABASE_URL=postgres://${DB_USER}:${DB_PASSWORD}@db:5432/app_production
      - REDIS_URL=redis://redis:6379/0
    networks:
      - backend_internal
      - frontend_external
    restart: on-failure:3

  frontend:
    image: nginx:stable-alpine
    container_name: production_frontend
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./dist:/usr/share/nginx/html:ro
    depends_on:
      - api
    networks:
      - frontend_external

networks:
  frontend_external:
    driver: bridge
  backend_internal:
    internal: true

volumes:
  db_data:
    driver: local
```
Look at that. It’s beautiful. It’s a contract. It specifies that the api service won’t even try to start until the db service passes its pg_isready healthcheck. It isolates the database in a backend_internal network so some script kiddie can’t hit it directly from the public internet. It sets memory limits so a single memory leak doesn’t take down the entire host.
## Networking: Why Everything Is Broken By Default
In Kevin’s manual nightmare, he was trying to connect services using host IP addresses. Do you know what happens to a container’s IP address when it restarts? It changes. It’s ephemeral. Relying on hardcoded IPs in a containerized environment is like building a house on quicksand.
With docker compose, we get automatic service discovery. The api service doesn’t need to know that the database is at 172.18.0.3. It just needs to know that the service is named db. The internal DNS resolver provided by the Docker Engine handles the rest.
Notice the network configuration in the YAML above. We have two distinct networks: frontend_external and backend_internal.
- frontend_external: This is where the Nginx container lives, exposing ports 80 and 443 to the world.
- backend_internal: This is a dark room. No outside traffic allowed. The database and Redis live here. The API acts as the bridge, sitting on both networks.
This is basic security posture, yet without docker compose, managing these bridge networks manually requires a series of `docker network create`, `docker network connect`, and `docker network disconnect` commands that no human can be trusted to execute correctly under pressure.
## The Dependency Lie: Why `depends_on` Isn’t Enough
Junior devs often think that adding a bare `depends_on` entry for `db` to their `docker-compose.yml` is enough. It isn’t. All `depends_on` does in its short form is ensure that the `db` container has *started*. It doesn’t mean the database is ready.

Postgres takes time to initialize. It has to check its WAL logs, verify its data files, and open its sockets. If your API tries to connect the millisecond the Postgres container starts, it will crash.

This is why we use the long-form `depends_on` syntax with `condition: service_healthy`. By pairing this with a robust `healthcheck`, we ensure the orchestration layer actually understands the state of the application.
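Conceptually, `condition: service_healthy` boils down to a bounded retry loop around a probe. Here is a rough sketch in shell, with an invented `wait_healthy` helper and a fake probe that only passes on the third attempt; none of this is Compose internals, just the shape of the logic:

```shell
# Retry a probe up to N times before declaring the dependency dead.
# Mirrors the interval/retries knobs in the healthcheck above.
wait_healthy() {
  retries=$1; shift
  i=0
  while [ "$i" -lt "$retries" ]; do
    "$@" && return 0      # probe passed: the service is healthy
    i=$((i + 1))          # real orchestration sleeps an interval here
  done
  return 1                # retries exhausted: fail, don't start dependents
}

# Simulate a Postgres that only answers pg_isready on the third probe.
attempts=0
fake_pg_isready() {
  attempts=$((attempts + 1))
  [ "$attempts" -ge 3 ]
}

wait_healthy 5 fake_pg_isready && echo "db healthy after $attempts probes"
```

Compose runs the real probe (`pg_isready` in our file) on the configured `interval` and only flips the container to healthy once it passes, which is exactly when dependents are allowed to start.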
```
# What happens when I run the command correctly
$ docker compose up -d
[+] Running 6/6
 ⠿ Network production_frontend_external  Created
 ⠿ Network production_backend_internal   Created
 ⠿ Container production_db               Healthy
 ⠿ Container production_cache            Started
 ⠿ Container production_api              Started
 ⠿ Container production_frontend         Started
```
If the database fails its healthcheck, the API won’t start. The system fails safely. It doesn’t enter a “half-alive” state where the frontend is up but showing 500 errors because the backend is in a crash loop.
## Secrets and Environment Variables: Stop Putting Passwords in Git
I found a file in the repository called `config_FINAL_v2.sh`. Inside were the production database credentials in plain text. I felt my left eye start to twitch.

docker compose supports `.env` files. This allows us to separate our configuration from our definition. The `docker-compose.yml` stays in version control, while the `.env` file—containing the actual secrets—stays on the secure build server or is injected at runtime.
```
# .env file - DO NOT COMMIT
DB_USER=admin_prod_user
DB_PASSWORD=a_very_long_and_complex_password_that_kevin_would_never_guess
```
In the YAML, we reference these using `${VARIABLE_NAME}`. It’s clean. It’s standard. It doesn’t involve `sed` or `grep` or any other text-processing wizardry that inevitably breaks when someone puts a special character in their password.
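If you’ve never watched `sed`-based templating quietly eat a password, here is a contrived demonstration (the password and the `__PASS__` placeholder are made up; the point is the unescaped `&`):

```shell
# A password containing a sed metacharacter. '&' in a sed replacement means
# "the entire matched text", so the substitution silently corrupts the value.
DB_PASSWORD='s3cret&stuff'

mangled=$(echo 'password=__PASS__' | sed "s/__PASS__/${DB_PASSWORD}/")

# Plain variable expansion -- what compose does with ${DB_PASSWORD} -- is exact.
clean="password=${DB_PASSWORD}"

echo "$mangled"   # not what anyone typed
echo "$clean"
```

Variable expansion passes the value through byte for byte; `sed` rewrote it without so much as a warning. Guess which one pages you at 3 AM.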
## Persistence is Futile (Unless You Use Volumes)
The most heartbreaking part of the 48-hour outage was the data loss. Kevin’s script didn’t use volumes. He thought that as long as the container was running, the data was safe. When he ran his “cleanup” command—`docker rm $(docker ps -a -q)`—he wiped the database. Six months of user logs, gone.
We had backups, but restoring them took 12 hours because the backup script was also written by Kevin and it was trying to upload to an S3 bucket that didn’t exist.
In docker compose, volumes are first-class citizens.
```yaml
volumes:
  db_data:
    driver: local
```
By mapping `db_data:/var/lib/postgresql/data`, we ensure that the data lives on the host’s storage, independent of the container’s lifecycle. You can stop the container, delete it, upgrade the image to Postgres 16, and recreate it—the data stays. This is the difference between a toy and a production system.
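For contrast, here is roughly what Kevin’s approach looks like next to the named volume, in compose terms; the host path is a hypothetical stand-in for whatever was on his laptop:

```yaml
services:
  db:
    volumes:
      # Named volume: Docker manages the storage location. It survives
      # container removal and 'docker compose down'; only 'down -v' deletes it.
      - db_data:/var/lib/postgresql/data

      # Bind mount, Kevin-style: a hardcoded host path that exists on exactly
      # one machine in the universe.
      # - /Users/kevin/pgdata:/var/lib/postgresql/data
```

The named volume moves with the compose project; the bind mount moves with one laptop.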
## The Build vs. Image Argument
One of the most powerful features of docker compose is the ability to handle both local development and production deployments using the same file (or overrides).
In development, you might use the `build` directive:
```yaml
api:
  build:
    context: .
    dockerfile: Dockerfile.dev
  volumes:
    - .:/app
```
In production, you use the `image` directive to pull a pre-built, scanned, and tagged image from your private registry:
```yaml
api:
  image: our-registry.com/api-service:v2.4.1
```
This ensures parity. The environment I’m running on my workstation is the same environment running in the cloud. We aren’t dealing with “it works on my machine” syndrome because the docker-compose.yml defines the entire machine.
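The usual mechanics for that split are override files. A plausible layout, assuming the Compose convention of an automatically merged `docker-compose.override.yml` (contents here are illustrative):

```yaml
# docker-compose.override.yml -- picked up automatically by 'docker compose up'
# when it sits next to docker-compose.yml. Keep it off production hosts.
services:
  api:
    build:
      context: ./api
      dockerfile: Dockerfile.dev
    volumes:
      - ./api:/app   # live-mount source for hot reload in dev
```

In production you run `docker compose -f docker-compose.yml up -d`; passing `-f` explicitly disables the automatic override merge, so only the pinned `image:` from the base file is used.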
## Verifying the State of the World
When I finally got the stack running using docker compose, I needed to verify that everything was actually working. I didn’t use `ps -ef | grep app`. I used the tools built for the job.
```
$ docker compose ps
NAME                  IMAGE                                 COMMAND                  SERVICE    CREATED          STATUS                    PORTS
production_api        our-registry.com/api-service:v2.4.1   "python manage.py ru…"   api        10 minutes ago   Up 10 minutes
production_cache      redis:7-alpine                        "docker-entrypoint.s…"   redis      10 minutes ago   Up 10 minutes             6379/tcp
production_db         postgres:15-alpine                    "docker-entrypoint.s…"   db         10 minutes ago   Up 10 minutes (healthy)   5432/tcp
production_frontend   nginx:stable-alpine                   "/docker-entrypoint.…"   frontend   10 minutes ago   Up 10 minutes             0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp
```
The `(healthy)` tag next to the database is my security blanket. It means the `pg_isready` command is passing. It means the socket is open. It means I can go to sleep for at least twenty minutes before the next alert.
And when things do go wrong—because they always do—I don’t have to hunt through /var/log/syslog or some obscure file Kevin created in /tmp. I just stream the logs.
```
$ docker compose logs --tail=20 -f api
production_api | [2023-10-14 04:45:12] INFO: Starting server on port 8000
production_api | [2023-10-14 04:45:12] INFO: Connecting to database at db:5432
production_api | [2023-10-14 04:45:13] INFO: Database connection established.
production_api | [2023-10-14 04:45:13] INFO: Connecting to Redis at redis:6379
production_api | [2023-10-14 04:45:13] INFO: Redis connection established.
production_api | [2023-10-14 04:45:14] INFO: Application is ready to receive traffic.
production_api | [2023-10-14 04:46:01] GET /api/v1/health 200 OK
production_api | [2023-10-14 04:46:05] GET /api/v1/users 200 OK
```
## The Final Stand: Why I’m Not a Goat Farmer (Yet)
I’ve spent the last 48 hours cleaning up a mess that should never have happened. I’ve seen things in that bash script that will haunt my nightmares—unclosed loops, variables named `var1`, `var2`, and `var3_new`, and a complete lack of respect for the principles of idempotent infrastructure.
We are moving everything to docker compose. No exceptions. If a service isn’t in the compose file, it doesn’t exist. If a configuration parameter isn’t in the .env file, it isn’t real.
Is docker compose perfect? No. It has its quirks. YAML can be a pain. Sometimes the internal DNS resolver gets confused if you do too many hot-reloads. But compared to the alternative—the manual, script-driven chaos that nearly destroyed this company—it is a godsend.
It provides a common language for developers and SREs. It allows us to version our infrastructure. It allows us to spin up an entire replica of production in seconds for testing.
I’m tired. My eyes are burning, and I’m pretty sure I can hear the server rack humming in my sleep. But the stack is up. The healthchecks are green. The junior developer has been banned from using `chmod +x` on any file ending in `.sh`.
I’ve looked at the prices of land in Vermont. I’ve researched the dietary needs of goats. It’s a tempting life. No logs, no containers, no 3 AM calls. Just me, some animals, and a complete lack of internet connectivity. But as long as I have a well-structured docker-compose.yml and a functioning container runtime, I’ll stay in the trenches.
Just don’t let Kevin touch the production environment again. If I see one more `docker run` command without a `--restart` policy, I’m quitting on the spot.
Now, if you’ll excuse me, I’m going to go find a place to sleep where the sound of cooling fans can’t reach me. Use docker compose. Don’t be a Kevin. Your SREs will thank you, or at the very least, they won’t plot your demise in the middle of the night.
```bash
# Final check. Everything is green.
$ docker compose ps --filter "status=running"
```