Kubernetes Issues
Recovery steps can affect shared environments. Prefer reversible or inspect-only steps first, then escalate to stronger actions only when the evidence supports it.
Use this page when the cluster itself is misbehaving, not just a single app inside it.
Some of the commands here perform forceful cleanup. Confirm the symptom before you run them.
9. Namespace Stuck in Terminating
Problem: kubectl get ns shows a namespace stuck in Terminating indefinitely.
Cause: Finalizers on the namespace or its child resources prevent deletion.
What this means in plain language: Kubernetes is waiting for cleanup steps that never finished, so the namespace never fully disappears.
Risk level: Destructive. This bypasses normal cleanup.
Fix: Force-remove finalizers from the stuck namespace:
kubectl get ns <namespace> -o json | python3 -c "
import json,sys; ns=json.load(sys.stdin); ns['spec']['finalizers']=[]; json.dump(ns,sys.stdout)
" | kubectl replace --raw "/api/v1/namespaces/<namespace>/finalize" -f -
10. Namespace Not Found on Fresh Cluster
Problem: Applications fail with namespaces "X" not found on a fresh cluster.
Cause: Target namespaces do not exist yet and the Application does not create them.
What this means in plain language: ArgoCD is trying to deploy into a namespace that does not exist yet.
Risk level: Safe configuration change.
Fix: Add CreateNamespace=true to every Application in root/:
syncPolicy:
  syncOptions:
    - CreateNamespace=true
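In context, that block sits under spec.syncPolicy of each Application CR in root/. The sketch below shows the placement; the name, repo URL, and path are placeholders, not values from the repo:

```yaml
# Illustrative Application CR; metadata, repoURL, and path are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/gitops-repo.git
    path: components/example
  destination:
    server: https://kubernetes.default.svc
    namespace: example   # created automatically when missing
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
```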
11. Bitnami Image Pull Failure
Problem: Pod image pull fails for a semver-pinned Bitnami tag like bitnami/redis:8.2.1-debian-12-r0.
Cause: Bitnami stopped publishing semver-pinned tags on Docker Hub.
What this means in plain language: the image tag you asked Kubernetes to pull is no longer published upstream.
Risk level: Safe chart/config change.
Fix (option 1): Pin by digest in Helm values:
image:
  digest: sha256:<current-digest>
Fix (option 2): Upgrade the vendored chart to the latest version, which defaults to tag: latest.
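For option 1, the digest field usually sits alongside the registry and repository keys in the chart's image values. The layout below follows the common Bitnami values structure; verify the key names against the vendored chart, and keep the digest placeholder until you look up the real one:

```yaml
# Sketch of digest pinning, assuming the standard Bitnami image value layout.
image:
  registry: docker.io
  repository: bitnami/redis
  # When digest is set, most Bitnami charts ignore the tag field.
  digest: sha256:<current-digest>
```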
12. ESO Not Syncing Secrets After Fresh Build
Problem: External Secrets Operator is not syncing secrets after a fresh cluster build.
Cause: The aws-credentials Kubernetes Secret (bootstrap dependency) is missing.
What this means in plain language: the secret-sync controller is missing the one bootstrap credential it needs in order to fetch everything else.
Risk level: Sensitive but routine. You are creating cluster credentials.
Fix:
kubectl create secret generic aws-credentials \
  -n external-secrets \
  --from-literal=access-key-id="$AWS_ACCESS_KEY_ID" \
  --from-literal=secret-access-key="$AWS_SECRET_ACCESS_KEY"
kubectl rollout restart deployment/external-secrets -n external-secrets
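It is worth failing fast if either environment variable is unset before running the create command, since kubectl will otherwise happily store empty values. The require_env helper below is a hypothetical sketch, not part of the repo:

```shell
# Hypothetical pre-flight check: refuse to proceed when a required
# environment variable is unset or empty.
require_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required env var: $name" >&2
      return 1
    fi
  done
}

# Example: run this before the kubectl create secret command above.
# require_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY || exit 1
```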
Background: Vendored Charts
Crawbl vendors all third-party Helm charts into Git instead of fetching them from external repos at sync time.
This is background context rather than a failure mode, but it explains why some chart-related fixes look unusual.
Vendoring prevents OOM kills, avoids external repo downtime during sync, and pins exact versions.
Adding a New Chart
helm repo add <repo-name> <repo-url> && helm repo update
helm pull <repo-name>/<chart-name> --version <version> --untar -d components/<name>/chart/
Then create root/<name>.yaml (the Application CR) and components/<name>/envs/dev.yaml (environment overrides), then commit and push.
Upgrading a Chart
rm -rf components/<name>/chart
helm pull <repo-name>/<chart-name> --version <new-version> --untar -d components/<name>/chart/
git diff components/<name>/chart # review changes before committing
git add components/<name>/chart && git commit -m "Upgrade <name> to <new-version>" && git push
ArgoCD auto-syncs within 3 minutes. For OCI charts, use helm pull oci://<registry>/<chart> --version <version>.
13. Chart Upgrade Breaks Sync
Problem: After upgrading a vendored chart, ArgoCD sync fails with validation errors or immutable field conflicts.
Cause: New chart version changed resource specs that Kubernetes cannot update in-place (StatefulSet volume claims, CRDs, renamed resources).
What this means in plain language: the new chart version changed a resource in a way Kubernetes cannot safely patch in place.
Risk level: Depends on the fix. Review first, then choose the least disruptive option.
Fix: Review the diff before committing. For immutable fields, delete the StatefulSet with --cascade=orphan (see ArgoCD issue #5). For CRD conflicts, apply CRDs manually first:
kubectl apply -f components/<name>/chart/crds/
Helm Repository Reference
helm repo add jetstack https://charts.jetstack.io
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add external-dns https://kubernetes-sigs.github.io/external-dns/
# Envoy Gateway uses OCI — no repo add needed
🔗 Terms On This Page
If a term below is unfamiliar, open its glossary entry. For the full list, go to Internal Glossary.
- ArgoCD: The GitOps deployment system that keeps the cluster aligned with what is committed in Git.
- External Secrets Operator: The controller that copies secrets from AWS Secrets Manager into Kubernetes Secrets.
- Helm Chart: A packaged set of Kubernetes templates and values used to deploy an application.