
CrashLoopBackOff

Before You Change Anything

Start with inspection and narrowing steps first. Some fixes on these debugging pages mutate shared resources, so keep observation separate from recovery.

A pod in CrashLoopBackOff means Kubernetes keeps starting the container and the container keeps crashing.

Kubernetes doubles the wait between restarts each time: 10s, 20s, 40s, and so on, capped at 5 minutes.

Use this page when the container starts, fails, and restarts, not when it is stuck pulling images or waiting on scheduling.
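The back-off schedule above is just a doubling delay with a cap. This loop prints the sequence so the shape is easy to see; it is purely illustrative, with the 300s cap taken from the 5-minute maximum mentioned above:

```shell
# Print the crash back-off delays: start at 10s, double on each
# restart, cap at 300s (5 minutes). Illustration only.
delay=10
for attempt in 1 2 3 4 5 6 7; do
  printf '%s ' "$delay"
  delay=$((delay * 2))
  if [ "$delay" -gt 300 ]; then delay=300; fi
done
echo
# → 10 20 40 80 160 300 300
```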

Symptoms

kubectl get pods -n backend

Output shows:

NAME                            READY   STATUS             RESTARTS   AGE
orchestrator-6f8b4c9d7-x2k4p    0/1     CrashLoopBackOff   5          3m
Debug: CrashLoopBackOff (interactive decision tree): What do you see in the pod logs?

If you do not want to use the decision tree, use this short recovery flow:

Step 1

Check the logs first

Start with the orchestrator logs. The --previous flag reads the logs of the last terminated container, which usually contains the actual crash output.

kubectl logs -n backend deployment/orchestrator
kubectl logs -n backend deployment/orchestrator --previous

Common patterns:

  • connection refused on port 5432
  • secret not found or key not found
  • migration failed
  • panic:
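A quick way to surface those patterns is to grep the previous container's logs for all of them at once. The log lines below are invented samples so the sketch runs anywhere; on a live cluster, pipe `kubectl logs --previous` into the same grep:

```shell
# Filter crash output for the common failure signatures listed above.
# Real usage:
#   kubectl logs -n backend deployment/orchestrator --previous \
#     | grep -E 'connection refused|not found|migration failed|panic:'
printf '%s\n' \
  'dial tcp 10.0.0.12:5432: connect: connection refused' \
  'starting worker pool' \
  'panic: runtime error: invalid memory address' \
  | grep -E 'connection refused|not found|migration failed|panic:'
```

Only the two matching lines survive the filter, which keeps a noisy crash log readable.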
Step 2

Check pod events

Events tell you whether the failure is about mount problems, scheduling, or repeated container exits.

kubectl describe pod -n backend -l app=orchestrator
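To cut the describe output down to just the failures, filter for Warning events. The sample lines below stand in for a captured event dump so the sketch is runnable; the live equivalent is `kubectl get events -n backend --field-selector type=Warning --sort-by=.lastTimestamp`:

```shell
# Keep only Warning events from a captured event listing.
# Sample input; on a cluster use:
#   kubectl get events -n backend --field-selector type=Warning \
#     --sort-by=.lastTimestamp
printf '%s\n' \
  'Normal   Pulled    Successfully pulled image' \
  'Warning  BackOff   Back-off restarting failed container orchestrator' \
  'Normal   Created   Created container orchestrator' \
  | grep '^Warning'
```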
Step 3

Diagnose by error type

Match the error to the right branch.

For missing secrets:

kubectl get secret -n backend
kubectl get externalsecret -n backend
kubectl get secret orchestrator-vault-secrets -n backend -o yaml
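If the Secret exists, also confirm that the specific key the container expects is present and decodes to a sane value. Kubernetes stores Secret data base64-encoded; the key name DATABASE_URL and the encoded value below are made-up examples:

```shell
# Real usage (key name is an example; substitute the one from the
# pod's env/volume spec):
#   kubectl get secret orchestrator-vault-secrets -n backend \
#     -o jsonpath='{.data.DATABASE_URL}' | base64 -d
# Secret values are base64-encoded at rest, so the decode step is:
echo 'cG9zdGdyZXM6Ly9sb2NhbGhvc3Q6NTQzMg==' | base64 -d
# → postgres://localhost:5432
```

An empty or garbage decode here points at the sync from the external store rather than at the application.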

For database failures:

kubectl get pods -n backend -l app.kubernetes.io/name=postgresql
kubectl logs -n backend -l app.kubernetes.io/name=postgresql
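If the postgres pod is Running but the app still gets connection refused, check its logs for the line postgres prints once it is actually accepting connections. Sample lines below keep the sketch runnable; the live equivalent pipes the kubectl logs command above into the same grep:

```shell
# Confirm from the postgres logs that startup finished.
# Live equivalent:
#   kubectl logs -n backend -l app.kubernetes.io/name=postgresql \
#     | grep -c 'ready to accept connections'
printf '%s\n' \
  'LOG:  starting PostgreSQL 16.2' \
  'LOG:  database system is ready to accept connections' \
  | grep -c 'ready to accept connections'
# → 1
```

A count of 0 means postgres is still initializing or crash-looping itself, and the orchestrator failure is downstream.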

For migration failures:

kubectl port-forward -n backend svc/backend-postgresql 5432:5432
psql -h localhost -U crawbl -d crawbl -c "SELECT * FROM schema_migrations;"

For Go panics, read the stack trace, fix the code path, and redeploy.

Step 4

Restart only after the root cause is fixed

If the fix was a config or secret update, restart the workload.

kubectl rollout restart deployment/orchestrator -n backend

If the fix required code, push to main and let CI deliver the new image.

Step 5

Verify recovery

Watch the pod return to Running and 1/1, then hit the health endpoint.

kubectl get pods -n backend -w
curl -s https://dev.api.crawbl.com/v1/health
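For scripted verification, a bounded poll beats watching interactively because a broken deploy fails fast instead of hanging. The probe is replaced by `true` here so the sketch runs without the cluster; swap in the real curl shown above:

```shell
# Poll the health endpoint up to 5 times, ~5s apart.
# `true` stands in for the real probe:
#   curl -fsS https://dev.api.crawbl.com/v1/health >/dev/null
for attempt in 1 2 3 4 5; do
  if true; then
    echo "healthy after attempt $attempt"
    break
  fi
  sleep 5
done
```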

What's next: Secret Sync Failures

🔗 Terms On This Page

If a term below is unfamiliar, open its glossary entry. For the full list, go to Internal Glossary.

  • CrashLoopBackOff: A Kubernetes restart state where a container keeps crashing and retries are spaced farther apart.
  • External Secrets Operator: The controller that copies secrets from AWS Secrets Manager into Kubernetes Secrets.