Beep. Beep. Beep. Beep.
The sound isn’t even a sound anymore. It’s a physical weight pressing against my temples. 3:14 AM. The blue light of my monitor is the only thing keeping my retinas from fusing shut. I’ve been on rotation for 72 hours because Dave decided to “optimize” the etcd maintenance script and then promptly went on a hiking trip in a dead zone.
The cluster is screaming. I’m staring at a terminal window where the logs are scrolling so fast they look like static. My coffee is cold, my soul is leaking out of my ears, and the CEO just Slack-messaged me asking, “Hey, I’m at this conference, what is Kubernetes exactly? Is it why the site is down?”
Listen close, you suit-wearing vulture. I’m going to tell you exactly what this monster is, but I’m not going to use any of those glossy slide-deck metaphors. There are no captains, no ships, and no “seamless” transitions here. There is only technical debt, leaky abstractions, and the slow, grinding decay of my mental health.
Welcome to the post-mortem of my sanity.
1. Denial: It’s Just a Container, Right?
When you ask what is Kubernetes, you’re usually looking for a nice, clean definition. You want to hear that it’s a “platform for automating deployment, scaling, and management of containerized applications.” That’s the lie they tell you so you’ll sign the cloud bill.
In reality, Kubernetes is a distributed state machine designed to hide the fact that Linux is hard. Back in v1.18, we thought we had a handle on it. We thought, “Oh, it’s just Docker with a brain.” We were wrong. It is a massive, bloated API server sitting on top of a fragile consensus algorithm, pretending that your hardware doesn’t exist.
At its core, Kubernetes is an abstraction layer for Linux primitives. It takes things like namespaces (which isolate what a process can see) and cgroups (which limit what a process can consume) and wraps them in a layer of YAML so thick you can’t see the kernel anymore. When you “run a pod,” you aren’t running a magical cloud entity. You are asking the kubelet—a binary running on a physical or virtual node—to talk to a Container Runtime Interface (CRI), like containerd or CRI-O, to set up a series of Linux namespaces.
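You don’t actually need a cluster to see the trick. Here is a minimal sketch with stock Linux tooling (assuming root and a cgroup v2 host), which is loosely the kind of plumbing the kubelet drives for every pod:
# A pod, minus the marketing: a process shoved into its own namespaces.
# 'unshare' ships with util-linux; this needs root.
$ sudo unshare --pid --net --mount --uts --fork /bin/sh
# Inside that shell you are PID 1 and you have no network interfaces.
# The cgroup half: cap a process at half a core, which is what "cpu: 500m" means.
$ sudo mkdir /sys/fs/cgroup/demo
$ echo "50000 100000" | sudo tee /sys/fs/cgroup/demo/cpu.max
$ echo $$ | sudo tee /sys/fs/cgroup/demo/cgroup.procs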
It’s denial. We deny that the underlying hardware matters. We deny that networking is hard. We pretend that if we just describe our “desired state” in a text file, the universe will conspire to make it so.
$ kubectl get events --all-namespaces --sort-by='.lastTimestamp'
NAMESPACE     LAST SEEN   TYPE      REASON             OBJECT                    MESSAGE
default       12s         Warning   FailedScheduling   pod/api-gateway-7f5d69g   0/3 nodes are available: 3 Insufficient cpu.
kube-system   8s          Warning   Unhealthy          pod/etcd-main-node        Liveness probe failed: HTTP probe failed with statuscode: 500
prod          5s          Warning   BackOff            pod/db-migration-v2       Back-off restarting failed container
default       2s          Normal    Scheduled          pod/nginx-666             Successfully assigned default/nginx-666 to node-03
prod          1s          Warning   FailedMount        pod/legacy-app            MountVolume.SetUp failed for volume "data" : rpc error: code = Internal desc = target not found
Look at that. That’s the heartbeat of denial. Insufficient cpu. Liveness probe failed. This is the system telling you that your abstraction is crashing into the brick wall of reality.
2. Anger: The YAML Indentation That Broke the Camel’s Back
You want to talk about anger? Let’s talk about YAML. Kubernetes is governed by the “Declarative Model.” This means instead of telling the computer how to do something, you tell it what you want, and then you pray to the gods of the reconciliation loop that it actually happens.
The reconciliation loop is the infinite “while” loop at the heart of the kube-controller-manager. It looks at the “Current State” (which is usually “on fire”) and compares it to the “Desired State” (the YAML you wrote). If they don’t match, it tries to fix it.
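You can watch the loop fight back in real time. A quick illustration (the pod name is made up; grab a real one from kubectl get pods):
# Delete a pod that belongs to a Deployment...
$ kubectl delete pod nightmare-7f5d69b4c-x2kfp
# ...then watch the controller notice that current state (2 replicas) no longer
# matches desired state (3) and spawn a replacement:
$ kubectl get pods -l app=nightmare -w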
But here’s the catch: the “fix” often involves more YAML. You end up with manifests that are 400 lines long just to run a simple Go binary. If you miss two spaces in your indentation on line 247, the whole thing fails with an error message that looks like it was written by a cryptographer on acid.
Here is a “simple” deployment manifest I’m currently staring at. It’s a monument to our collective failure as a species:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: over-engineered-microservice
  namespace: prod-west-2
  labels:
    app: nightmare
    tier: backend
    version: v1.30.1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nightmare
  template:
    metadata:
      labels:
        app: nightmare
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - nightmare
              topologyKey: "kubernetes.io/hostname"
      containers:
        - name: app-container
          image: our-registry.io/bloated-image:latest@sha256:deadbeef1234567890
          resources:
            limits:
              cpu: "500m"
              memory: "1024Mi"
            requests:
              cpu: "250m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            exec:
              command:
                - /bin/sh
                - -c
                - "ps aux | grep app"
          env:
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-creds
                  key: password
Look at the affinity section. We have to spend twenty lines of code just to tell the cluster: “Please don’t put all the pods on the same machine so that when the machine dies, the whole site doesn’t go dark.” This is what we call “cloud-native resilience.” I call it “babysitting a temperamental toddler.”
3. Bargaining: Trying to Replicate the Control Plane with Bash and Hope
At 3:45 AM, you start bargaining. You think, “Maybe I don’t need this. Maybe I could have just used a bash script and some systemd units.” But you can’t. Because you need the Control Plane.
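For the record, here is roughly what that bargain looks like. A deliberately naive, entirely hypothetical sketch; note everything it does not do, because that list is the actual answer to “why Kubernetes”:
#!/bin/bash
# poor-mans-scheduler.sh: a hypothetical sketch of the bargain, not advice.
# Round-robins containers onto hosts. No consensus, no health checks,
# no bin-packing, no failover. If the box running this dies, so does "prod".
HOSTS=(node-01 node-02 node-03)
i=0
for app in api-gateway worker cron-runner; do
  host=${HOSTS[$((i % ${#HOSTS[@]}))]}
  ssh "$host" "docker run -d --restart=always --name $app our-registry.io/$app:latest"
  i=$((i + 1))
done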
The Control Plane is the brain. It consists of the kube-apiserver, kube-scheduler, kube-controller-manager, and the dark heart: etcd.
etcd is a distributed key-value store. It uses the Raft consensus algorithm. To understand what is Kubernetes, you have to understand Raft. Raft is a way for a group of computers to agree on a single state, even if some of them are slow, partitioned, or flat-out dead. (Crashed, mind you, not lying. Raft assumes nodes fail honestly; it is not Byzantine fault tolerant.) It relies on a “Leader.” If the leader dies, the “Followers” hold an election.
The problem? Elections take time. And in the world of CAP theorem (Consistency, Availability, Partition Tolerance), etcd chooses Consistency and Partition Tolerance. It will happily sacrifice Availability. If your network has a hiccup and the nodes can’t talk to each other, etcd stops accepting writes. The cluster freezes. The API server starts returning 500 errors.
I’m currently bargaining with a three-node etcd cluster that has lost quorum: quorum for three members is two, node 3 is flat-out dead, and the two survivors can’t reach each other, so no one can win an election. No leader, no writes.
# etcdctl endpoint status --write-out=table
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://10.0.1.10:2379 | 8e9e05b5a640dd01 | 3.5.13 | 156 MB | false | false | 12 | 1004567 | 1004560 | |
| http://10.0.1.11:2379 | 7d4e05b5a640ee02 | 3.5.13 | 156 MB | false | false | 12 | 1004568 | 1004560 | |
| http://10.0.1.12:2379 | | | | | | | | | Error |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Node 3 is gone. It’s not just “off.” It’s corrupted. The WAL (Write-Ahead Log) is mangled. I’m trying to restore from a snapshot taken four hours ago, knowing full well that any “desired state” changes made since then are vaporized. This is the “bargain.” We get high availability for our apps, but we pay for it with the extreme fragility of the state store.
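If you ever end up here, the ritual looks roughly like this. A hedged sketch only; the paths, names, and IPs are invented, and the etcd disaster-recovery docs outrank a 4 AM rant:
# Step one: stop the kube-apiserver so nothing keeps writing to the corpse.
# Step two: restore the snapshot into a fresh data directory.
$ ETCDCTL_API=3 etcdctl snapshot restore /backups/etcd-0300.db \
    --name etcd-main-node \
    --initial-cluster etcd-main-node=http://10.0.1.10:2380 \
    --initial-advertise-peer-urls http://10.0.1.10:2380 \
    --data-dir /var/lib/etcd-restored
# Step three: point etcd at the new data dir, restart everything, and accept
# that four hours of cluster state never happened.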
4. Depression: Staring into the Abyssal Void of the Kube-Proxy
If the Control Plane is the brain, the networking is the nervous system, and it’s currently suffering from multiple sclerosis.
When people ask what is Kubernetes, they rarely want to hear about the Container Network Interface (CNI). But the CNI is where the real pain lives. Kubernetes doesn’t actually have a built-in networking solution. It just has a specification. You have to choose a plugin: Calico, Flannel, Cilium, Weave. Each one is a different flavor of hell.
Let’s talk about kube-proxy. This is the component that manages the “Service” abstraction. When you hit a Service IP, kube-proxy uses iptables or IPVS to mangle your packets and redirect them to a pod IP.
Have you ever looked at an iptables dump on a node with 500 services? It’s a 10,000-line scroll of doom. Every packet that enters the node has to be evaluated against these rules. It’s a linear search. It’s inefficient. It’s 1990s technology trying to support 2024 scale.
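Don’t take my word for it. Go stare into the void on any node running kube-proxy in iptables mode:
# Count the NAT rules kube-proxy is maintaining on this node:
$ iptables-save -t nat | grep -c KUBE
# Follow one Service's trail of breadcrumbs (chain names are per-cluster hashes):
$ iptables -t nat -L KUBE-SERVICES -n | head -20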
And then there’s the CNI plugin itself. Let’s take Calico. It uses BGP (Border Gateway Protocol)—the same protocol that runs the actual Internet—to distribute routes between your nodes. Think about that. You are running a mini-Internet inside your rack just so Pod A can talk to Pod B.
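If you want to see your mini-Internet with your own eyes, Calico will show it to you (assuming calicoctl is installed on the node):
# Show the BGP peering mesh this node maintains with every other node:
$ sudo calicoctl node status
# Yes, that is a real routing daemon (BIRD) running on your app servers.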
If the CNI fails, you get the dreaded ContainerCreating status. You check the logs, and you see:
NetworkPlugin cni failed to set up pod "nginx-666_default" network: failed to delegate: failed to set up bridge: "cni0" already has an IP address
You spend four hours debugging a bridge interface that shouldn’t exist, only to realize that a stale veth pair is hanging around from a pod that died three days ago. You start to wonder if the “Abyssal Void” is actually just a 10.0.0.0/8 subnet that no one bothered to document.
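The fix I eventually stumbled into is brutally low-tech. Heavy hedging here: whether this is right depends on your CNI, and it takes the node’s pod networking down with it, so drain first:
# Get the pods off the node before you start cutting wires.
$ kubectl drain node-03 --ignore-daemonsets --delete-emptydir-data
# On the node itself: look at the bridge that "already has an IP address"...
$ ip addr show cni0
# ...then delete it and let the CNI plugin rebuild it for the next pod sandbox.
$ sudo ip link delete cni0
$ sudo systemctl restart kubelet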
The complexity of CNI is staggering. You have to worry about MTU (Maximum Transmission Unit) sizes. If your CNI uses VXLAN encapsulation, it wraps a 50-byte header around every packet. So if your pod interfaces are configured for an MTU of 1500, but the overlay can only actually carry 1450 after the encapsulation tax, your packets get fragmented or silently dropped. Your database connections will hang. Your API calls will time out. And you will sit there, at 4:15 AM, wondering why curl works but your application doesn’t. It’s because of a 50-byte header. That is the depth of the depression.
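The 4:15 AM diagnostic, for whatever it’s worth: set the Don’t Fragment bit and find the real path MTU yourself (the pod IP here is hypothetical):
# 1472 bytes of ICMP payload + 28 bytes of headers = a full 1500-byte packet.
# With VXLAN eating 50 bytes of that, this one dies:
$ ping -M do -s 1472 10.244.2.17
# ...and this one survives (1422 + 28 = 1450, the real path MTU):
$ ping -M do -s 1422 10.244.2.17
# Then check what MTU the CNI actually configured:
$ ip link show | grep mtu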
5. Acceptance: Embracing the Distributed State Machine
I’ve reached acceptance. Not because I like it, but because I have no other choice. The etcd cluster is finally coming back online after I manually scrubbed the data directory and forced a new member addition.
To truly answer what is Kubernetes, you have to accept that it is a “Reconciliation Engine.” It is a system that exists to constantly correct itself.
When you submit a YAML file to the kube-apiserver, the following happens (you can watch the whole dance yourself; there’s a sketch below):
1. The API Server validates the YAML (and usually complains about a missing field).
2. It stores the object in etcd.
3. The Scheduler sees a new Pod object with no nodeName. It looks at the resource requests and the available nodes and picks the one that is the least overloaded (or the one that is the most broken, it’s a coin toss).
4. The Scheduler updates the Pod object with the nodeName.
5. The Kubelet on that specific node is “watching” the API server. It sees its name on the Pod.
6. The Kubelet pulls the image using the CRI.
7. The Kubelet tells the CNI to give the pod an IP.
8. The Kubelet starts the container.
9. The Kubelet reports back to the API server: “I’m running.”
This is the “Distributed State Machine.” It’s a series of independent actors watching a central database and acting on changes. It’s beautiful in a horrifying, “I-can’t-believe-this-actually-works” kind of way.
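You can watch steps 3 through 9 play out from the outside. A minimal sketch; the pod and image are throwaways:
# Create a pod and watch the object mutate as each actor takes its turn:
$ kubectl run watch-me --image=nginx:1.25
$ kubectl get pod watch-me -w -o wide
# Pending (no nodeName) -> Pending (scheduled) -> ContainerCreating (CRI + CNI) -> Running
# Or ask for the scheduler's verdict directly:
$ kubectl get pod watch-me -o jsonpath='{.spec.nodeName}'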
But then you see the reality in the logs:
$ kubectl describe pod api-gateway-7f5d69g
Events:
  Type     Reason      Age                From                Message
  ----     ------      ----               ----                -------
  Normal   Scheduled   4m                 default-scheduler   Successfully assigned default/api-gateway-7f5d69g to node-01
  Normal   Pulling     3m (x3 over 4m)    kubelet             Pulling image "our-registry.io/bloated-image:latest"
  Warning  Failed      3m (x3 over 4m)    kubelet             Failed to pull image: rpc error: code = Unknown desc = Error response from daemon: Get https://our-registry.io/v2/: net/http: TLS handshake timeout
  Warning  Failed      3m (x3 over 4m)    kubelet             Error: ImagePullBackOff
  Normal   BackOff     2m (x6 over 4m)    kubelet             Back-off pulling image "our-registry.io/bloated-image:latest"
  Warning  Unhealthy   1m (x10 over 3m)   kubelet             Liveness probe failed: Get "http://10.244.1.45:8080/healthz": dial tcp 10.244.1.45:8080: connect: connection refused
Acceptance is knowing that ImagePullBackOff is your new best friend. Acceptance is knowing that CrashLoopBackOff usually means you forgot to set an environment variable. Acceptance is realizing that Kubernetes isn’t here to make your life easy; it’s here to make your failures standardized.
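Acceptance also comes with muscle memory. The triage liturgy I now type without conscious thought (substitute your own doomed pod):
# The three commands that solve most CrashLoopBackOffs at 4 AM:
$ kubectl describe pod api-gateway-7f5d69g          # events: pulls, probes, OOM kills
$ kubectl logs api-gateway-7f5d69g --previous       # stdout of the last dead container
$ kubectl get pod api-gateway-7f5d69g -o yaml | grep -A5 'lastState:'   # exit codes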
6. The Final Bill: Why We Suffer Through It Anyway
So, CEO, if you’re still awake and haven’t closed this tab to go look at more AI-generated art, here is the answer.
What is Kubernetes?
It is the most expensive, complex, and frustrating way to run a “Hello World” app ever devised by man. It is a system that requires a dedicated team of sleep-deprived engineers to maintain a “cloud-native” posture. It is a collection of binaries that spend 90% of their time talking to each other and 10% of their time actually running your code.
We suffer through it because the alternative is worse. The alternative is “Snowflake Servers.” The alternative is “It works on my machine.” Kubernetes gives us a common language for our misery. It gives us a way to describe infrastructure that—theoretically—can be moved from AWS to GCP to Azure without rewriting everything (though we all know that’s a lie because of LoadBalancer annotations and CSI driver differences).
It’s 4:45 AM now. The etcd cluster is green. The api-gateway pods are finally Running. The iptables rules have settled. My PagerDuty is quiet, for now.
Kubernetes is a mirror. It reflects the complexity of your organization. If your app is a mess, Kubernetes will make it a distributed mess. If your team doesn’t understand networking, Kubernetes will ensure no one understands networking.
I’m going to close my laptop now. I’m going to try to sleep for three hours before the next “reconciliation loop” decides that my desired state of “being asleep” doesn’t match the current state of “production is down.”
Don’t ask me “what is” Kubernetes again. Just pay the cloud bill and leave me alone. I have more YAML to write.