The Crawbl Logs Guide
A step-by-step reference for searching and understanding logs in the Crawbl dev cluster. No prior LogsQL experience required.
Chapter 1: Getting Started
What is VictoriaLogs and why we use it
VictoriaLogs is a log database that collects output from every container in the Crawbl cluster automatically.
- No more SSH-ing into machines or running `kubectl logs` against individual pods
- Open a browser, type a query, and search across every service at once
- Lightweight -- runs as a single pod
- Efficient storage -- compresses logs well
- Purpose-built query language -- designed specifically for log data
- Replaces heavier alternatives like Elasticsearch or Loki for our dev cluster
How logs flow from your app to VictoriaLogs
Every log line takes this path:
Your app writes to stdout/stderr
|
v
containerd (the container runtime) writes it to a file on disk
|
v
Fluent Bit (runs on every node) reads the file, attaches Kubernetes
metadata (namespace, pod name, container name), and ships it over HTTP
|
v
VictoriaLogs stores the enriched record and makes it searchable
This happens automatically for every container. You do not need to configure anything in your application -- just write to stdout.
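The enrichment step in the diagram can be sketched as a simple merge: Fluent Bit wraps the raw line and attaches the `kubernetes.*` metadata fields used throughout this guide. This is an illustrative sketch of the behavior, not Fluent Bit's actual code; the `enrich` function name and sample values are our own.

```python
def enrich(raw_line: str, namespace: str, pod: str, container: str, stream: str) -> dict:
    """Mimic Fluent Bit's enrichment step: wrap the raw log line and
    attach the Kubernetes metadata fields before shipping the record."""
    return {
        "_msg": raw_line,
        "kubernetes.namespace_name": namespace,
        "kubernetes.pod_name": pod,
        "kubernetes.container_name": container,
        "stream": stream,
    }

# A line the orchestrator wrote to stdout, as VictoriaLogs would receive it:
record = enrich(
    "request started",
    namespace="backend",
    pod="orchestrator-795499fd8b-sgctg",
    container="orchestrator",
    stream="stdout",
)
```

Every field in the enriched record is then individually searchable, which is why the `_stream:{...}` filters in the next chapters work.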
How to open the UI
Open your browser and go to:
https://dev.logs.crawbl.com/select/vmui
No credentials are required in the dev environment.
You will see three areas:
| Area | Location | Purpose |
|---|---|---|
| Query bar | Top center | Type queries here. Press Enter to run. |
| Time range picker | Top right | Controls the search window. Defaults to last 1 hour. |
| Results pane | Below | Shows matching log lines. Click any line to expand metadata. |
Your first query
Type * into the query bar and press Enter.
- This matches every log line in the selected time range
- You will see logs from the orchestrator, Redis, ArgoCD, Fluent Bit, and everything else mixed together
If the results are empty, check the time range picker. Extend it to "Last 24 hours" to confirm logs exist.
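If you prefer the command line, the same data can be fetched over HTTP. A hedged sketch: `/select/logsql/query` is VictoriaLogs' standard query endpoint, but confirm the path and parameters against your deployment; the helper below only builds the URL, it does not send the request.

```python
from urllib.parse import urlencode

BASE = "https://dev.logs.crawbl.com"  # same host the vmui page uses

def build_query_url(query: str, limit: int = 10) -> str:
    """Build a VictoriaLogs HTTP query URL. The vmui page at /select/vmui
    is a front-end over this same query endpoint."""
    params = urlencode({"query": query, "limit": limit})
    return f"{BASE}/select/logsql/query?{params}"

url = build_query_url('_stream:{kubernetes.container_name="orchestrator"} error', limit=5)
```

You can paste the resulting URL into `curl` to get matching log lines back as JSON, one record per line.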
Chapter 2: Understanding the Log Structure
Every log line has metadata
Fluent Bit attaches metadata fields from the Kubernetes API to every log line it ships. These fields tell you exactly where the log came from without needing to read the message itself.
The fields you will use
kubernetes.namespace_name -- which namespace the pod lives in
| Namespace | What lives here |
|---|---|
backend | Orchestrator, webhook, reaper, PostgreSQL, Redis, pgweb, docs, website |
userswarms | ZeroClaw AI agent pods (one per user workspace) |
monitoring | Fluent Bit, VictoriaMetrics, VictoriaLogs |
argocd | ArgoCD server, repo server, application controller, Redis |
cert-manager | Certificate management controllers |
envoy-gateway-system | Envoy Gateway and proxy pods |
external-dns | DNS record sync controller |
external-secrets | Secrets sync from AWS Secrets Manager |
userswarm-controller | Metacontroller (creates agent pods) |
kubernetes.pod_name -- the full pod name including the random suffix
| Pod name | What it is |
|---|---|
orchestrator-795499fd8b-sgctg | The Crawbl API server |
userswarm-webhook-7db4d6cdcd-wq6rx | The webhook that creates agent pods |
e2e-reaper-29587560-k65dv | CronJob that cleans up test resources |
backend-postgresql-0 | The PostgreSQL database |
backend-redis-master-0 | The Redis instance |
zeroclaw-workspace-81a5f386-c6a6-4c0a-b6a-3353eb37c1-0 | A ZeroClaw agent pod |
victoria-logs-0 | VictoriaLogs itself |
kubernetes.container_name -- the container name within the pod
This is often the most useful field for filtering.
Common values: orchestrator, webhook, reaper, zeroclaw, redis, postgresql, fluent-bit, server (ArgoCD), repo-server (ArgoCD), application-controller (ArgoCD), vmsingle (VictoriaMetrics), vlogs (VictoriaLogs), docs, website.
stream -- stdout vs stderr
- stdout -- normal output
- stderr -- error output
Go panics, stack traces, and fatal errors always go to stderr. Filter on stream="stderr" to catch them fast.
How to use _stream filters
The _stream:{...} syntax tells VictoriaLogs which logs to look at. Fields inside the braces are ANDed together -- all conditions must match.
Inside _stream:{...}, field values must use exact match with =. Do not use regex operators like =~ inside _stream:{} -- use kubernetes.container_name (which is stable across pod restarts) instead of trying to regex-match on kubernetes.pod_name. If you need pattern matching, filter outside the stream selector using field:~"pattern".
Filter by namespace:
_stream:{kubernetes.namespace_name="backend"}
What this does: Returns all logs from every pod in the backend namespace.
Filter by namespace + container:
_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="orchestrator"}
What this does: Narrows to only the orchestrator container in the backend namespace.
The four stream fields are: kubernetes.namespace_name, kubernetes.pod_name, kubernetes.container_name, and stream. You can combine any of them inside _stream:{...}.
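The selector rules above can be captured in a small helper. This is our own illustrative sketch of the guide's conventions, not part of VictoriaLogs; it builds a `_stream:{...}` selector and rejects regex-looking values, since only exact `=` matches belong inside the braces.

```python
ALLOWED_STREAM_FIELDS = {
    "kubernetes.namespace_name",
    "kubernetes.pod_name",
    "kubernetes.container_name",
    "stream",
}

def build_stream_selector(**fields: str) -> str:
    """Build a _stream:{...} selector from keyword arguments.
    Dots are not valid in Python kwargs, so use '__' (e.g.
    kubernetes__namespace_name) and it is converted back to '.'."""
    parts = []
    for key, value in sorted(fields.items()):
        key = key.replace("__", ".")
        if key not in ALLOWED_STREAM_FIELDS:
            raise ValueError(f"{key} is not a stream field")
        if any(ch in value for ch in "*~"):
            raise ValueError("no regex inside _stream:{...}; filter outside it instead")
        parts.append(f'{key}="{value}"')
    return "_stream:{" + ", ".join(parts) + "}"
```

For example, `build_stream_selector(kubernetes__namespace_name="backend", kubernetes__container_name="orchestrator")` produces the orchestrator selector used throughout Chapter 3.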
Chapter 3: Logs for Every Crawbl Service
This chapter covers every service running in the cluster. For each one, you get the exact query to see its logs and a query for when things go wrong.
Orchestrator
The main backend API -- handles authentication, user management, swarm requests, and all mobile app traffic.
See all logs:
_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="orchestrator"}
Filter errors only:
_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="orchestrator"} (level:ERROR OR level:WARN)
- JSON structured logs via Go's `slog` -- Fluent Bit extracts each field to top level, so `level`, `msg`, `method`, `path`, `request_id` are all directly searchable
- `_msg` shows the human-readable message value (e.g. `request started`), not the raw JSON string
- Normal operation shows INFO-level request logs
- Filter by any extracted field, e.g. `method:POST level:ERROR` for failed POST requests
UserSwarm Webhook
Receives requests from Metacontroller and creates ZeroClaw agent pods in the `userswarms` namespace.
See all logs:
_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="webhook"}
Filter errors only:
_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="webhook"} (error OR panic OR "exit code")
- Pod creation events and validation results
- Resource allocation decisions
- Panic or exit code messages indicate crash failures
E2E Reaper
A CronJob that periodically cleans up resources created by end-to-end tests.
See all logs:
_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="reaper"}
Filter errors only:
_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="reaper"} (error OR failed)
- Counts of deleted resources
- Successful cleanup confirmations
- Failed deletion attempts
The reaper runs as a CronJob, so its pod name changes on each run. Filter by kubernetes.container_name="reaper", which stays the same across all runs.
ZeroClaw Runtime
AI agent runtimes -- each user workspace gets its own pod in the `userswarms` namespace.
See all logs:
_stream:{kubernetes.namespace_name="userswarms"}
Filter errors only:
_stream:{kubernetes.namespace_name="userswarms"} (error OR "exit code" OR OOMKilled OR CrashLoopBackOff)
- Agent startup and initialization messages
- LLM API call results and errors
- OOMKilled or CrashLoopBackOff indicate resource issues
Use the full pod name for a specific workspace:
_stream:{kubernetes.namespace_name="userswarms", kubernetes.pod_name="zeroclaw-workspace-81a5f386-c6a6-4c0a-b6a-3353eb37c1-0"}
PostgreSQL
The primary data store for the platform.
See all logs:
_stream:{kubernetes.namespace_name="backend", kubernetes.pod_name="backend-postgresql-0"}
Filter errors only:
_stream:{kubernetes.pod_name="backend-postgresql-0"} (ERROR OR FATAL OR "deadlock detected" OR "too many connections")
- Connection counts and slow query warnings
- Checkpoint activity
- Startup and recovery messages
PostgreSQL uses uppercase ERROR and FATAL in its log output -- these are not the same as lowercase error. Use the exact casing shown above.
Redis
Handles caching and pub/sub messaging for the platform.
See all logs:
_stream:{kubernetes.namespace_name="backend", kubernetes.pod_name="backend-redis-master-0"}
Filter errors only:
_stream:{kubernetes.pod_name="backend-redis-master-0"} (error OR "OOM" OR "maxmemory" OR "connection refused")
- Connection events and client counts
- Memory warnings (`maxmemory`, `OOM`)
- Persistence status (RDB/AOF save results)
Redis produces very few logs during normal operation. If this query returns empty results, that is expected -- Redis only logs significant events like startup, shutdown, or memory warnings. Use the Metrics Guide to monitor Redis health via redis_up and redis_memory_used_bytes instead.
ArgoCD
Syncs the cluster state to match what is committed in the `crawbl-argocd-apps` Git repo.
See all logs:
_stream:{kubernetes.namespace_name="argocd"}
See only the server (handles syncs):
_stream:{kubernetes.namespace_name="argocd", kubernetes.container_name="server"}
See the application controller (detects drift):
_stream:{kubernetes.namespace_name="argocd", kubernetes.container_name="application-controller"}
Filter errors only:
_stream:{kubernetes.namespace_name="argocd"} (error OR failed OR "sync failed" OR "ComparisonError")
- Sync status changes and health check results
- `ComparisonError` means manifest generation failed
- `sync failed` usually points to invalid YAML or missing resources
Envoy Gateway
The public entry point for all traffic. Handles TLS termination and routes requests to backend services.
See all logs:
_stream:{kubernetes.namespace_name="envoy-gateway-system"}
Filter errors only:
_stream:{kubernetes.namespace_name="envoy-gateway-system"} (error OR "503" OR "upstream connect" OR "no healthy upstream")
- `503` errors mean the upstream service is down
- `no healthy upstream` means Envoy cannot reach the backend pods
- `upstream connect` failures indicate networking issues
Cert-Manager
Automatically provisions and renews TLS certificates from Let's Encrypt using DNS-01 challenges via Cloudflare.
See all logs:
_stream:{kubernetes.namespace_name="cert-manager"}
Filter errors only:
_stream:{kubernetes.namespace_name="cert-manager"} (error OR "challenge failed" OR "not ready" OR "acme")
- Certificate issuance and renewal events
- DNS-01 challenge progress
- ACME protocol errors or rate limits
External DNS
Automatically creates and updates Cloudflare DNS records to point at the cluster's load balancer.
See all logs:
_stream:{kubernetes.namespace_name="external-dns"}
Filter errors only:
_stream:{kubernetes.namespace_name="external-dns"} (error OR "failed" OR "403" OR "rate limit")
- DNS record create/update events
- Cloudflare API errors (`403`, rate limits)
- Sync interval logs
External Secrets
Reads secrets from AWS Secrets Manager and creates matching Kubernetes Secret objects.
See all logs:
_stream:{kubernetes.namespace_name="external-secrets"}
Filter errors only:
_stream:{kubernetes.namespace_name="external-secrets"} (error OR "SecretSyncError" OR "AccessDeniedException" OR "not found")
- Secret sync success/failure events
- `AccessDeniedException` means an IAM permissions issue
- `SecretSyncError` means the secret exists but could not be written to Kubernetes
Fluent Bit
Collects logs from every node and ships them to VictoriaLogs.
See all logs:
_stream:{kubernetes.namespace_name="monitoring", kubernetes.container_name="fluent-bit"}
Filter errors only:
_stream:{kubernetes.container_name="fluent-bit"} (error OR "retry" OR "chunk" OR "backpressure")
- Retry counts and chunk errors indicate delivery problems
- Backpressure warnings mean VictoriaLogs cannot ingest fast enough
- If logs are missing from other services, check Fluent Bit first
If Fluent Bit is unhealthy, no logs are being collected from any service. This is the first thing to check when logs seem to be missing.
VictoriaMetrics
Stores cluster and application metrics. Exposes a Prometheus-compatible API.
See all logs:
_stream:{kubernetes.namespace_name="monitoring", kubernetes.container_name="vmsingle"}
Filter errors only:
_stream:{kubernetes.namespace_name="monitoring", kubernetes.container_name="vmsingle"} (error OR "out of memory" OR "disk")
- Ingestion rate and storage usage
- Out-of-memory or disk-full warnings
- Scrape target errors
VictoriaLogs
The log storage system you are querying right now. Its own logs help diagnose ingestion or storage problems.
See all logs:
_stream:{kubernetes.namespace_name="monitoring", kubernetes.container_name="vlogs"}
Filter errors only:
_stream:{kubernetes.namespace_name="monitoring", kubernetes.container_name="vlogs"} (error OR "disk" OR "ingestion")
- Ingestion errors or slow flushes
- Disk space warnings
- Query timeout messages
Docs Site
The Docusaurus documentation site served at `dev.docs.crawbl.com`.
See all logs:
_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="docs"}
Filter errors only:
_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="docs"} (error OR "502" OR "upstream")
- Nginx access and error logs
- `502` means the upstream Docusaurus process crashed
- Static asset 404s
Website
The public-facing `crawbl.com` marketing site.
See all logs:
_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="website"}
Filter errors only:
_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="website"} (error OR "502" OR "upstream")
- Nginx access and error logs
- `502` means the upstream process crashed
- Static asset 404s
Chapter 4: Common Troubleshooting Scenarios
Each scenario walks you through the exact queries to run, in order.
"The API is returning 500 errors"
Step 1 -- Check the orchestrator for errors:
_stream:{kubernetes.container_name="orchestrator"} level:ERROR
What this does: Shows all ERROR-level log lines from the orchestrator.
Step 2 -- If the error mentions the database, check PostgreSQL:
_stream:{kubernetes.pod_name="backend-postgresql-0"} (ERROR OR FATAL)
What this does: Shows PostgreSQL errors and fatal messages.
Step 3 -- If the error mentions Redis, check Redis:
_stream:{kubernetes.pod_name="backend-redis-master-0"} error
What this does: Shows all Redis error logs.
Step 4 -- Check if the problem is at the gateway level (request never reaching the orchestrator):
_stream:{kubernetes.namespace_name="envoy-gateway-system"} ("503" OR "no healthy upstream")
What this does: Shows gateway-level failures where requests could not be routed.
Start at the orchestrator and work outward. Most 500s originate in the API code itself, not infrastructure.
"A user's AI agent isn't starting"
Step 1 -- Check the webhook for pod creation errors:
_stream:{kubernetes.container_name="webhook"} error
What this does: Shows errors during agent pod creation.
Step 2 -- Check if the agent pod exists and is logging:
_stream:{kubernetes.namespace_name="userswarms"}
What this does: Shows all logs from agent pods.
Step 3 -- Look for crash loops or OOM kills in agent pods:
_stream:{kubernetes.namespace_name="userswarms"} ("exit code" OR OOMKilled OR error)
What this does: Surfaces agent pods that are crashing or running out of memory.
Step 4 -- Check the metacontroller for scheduling issues:
_stream:{kubernetes.namespace_name="userswarm-controller"}
What this does: Shows metacontroller logs to diagnose why a pod was not scheduled.
If Step 2 returns nothing, the pod was never created. Focus on the webhook (Step 1) and metacontroller (Step 4).
"ArgoCD sync is failing"
Step 1 -- Check for sync errors across all ArgoCD components:
_stream:{kubernetes.namespace_name="argocd"} ("sync failed" OR error)
What this does: Shows all sync failures and errors across ArgoCD.
Step 2 -- Narrow to the repo server (where manifest generation happens):
_stream:{kubernetes.namespace_name="argocd", kubernetes.container_name="repo-server"} error
What this does: Shows errors during Helm/Kustomize rendering.
Step 3 -- Check if a specific app is mentioned:
_stream:{kubernetes.namespace_name="argocd"} "orchestrator" error
What this does: Filters ArgoCD errors related to the orchestrator app.
Replace "orchestrator" with the name of whatever application is failing.
"TLS certificate isn't renewing"
Step 1 -- Check cert-manager for challenge failures:
_stream:{kubernetes.namespace_name="cert-manager"} (error OR "challenge" OR "not ready")
What this does: Shows certificate issuance errors and challenge status.
Step 2 -- Check if the Cloudflare API token is valid:
_stream:{kubernetes.namespace_name="cert-manager"} ("403" OR "unauthorized" OR "cloudflare")
What this does: Surfaces authentication failures with the Cloudflare API.
Step 3 -- Verify the external-secrets operator synced the Cloudflare token:
_stream:{kubernetes.namespace_name="external-secrets"} ("cloudflare" OR error)
What this does: Checks if the secret containing the Cloudflare token was delivered to the cluster.
If the Cloudflare token expired or was rotated, all certificate renewals will fail. Update it in AWS Secrets Manager and restart external-secrets.
"DNS records aren't updating"
Step 1 -- Check external-dns for errors:
_stream:{kubernetes.namespace_name="external-dns"} (error OR "failed")
What this does: Shows all external-dns errors.
Step 2 -- Look for Cloudflare API rate limits or auth issues:
_stream:{kubernetes.namespace_name="external-dns"} ("rate limit" OR "403" OR "unauthorized")
What this does: Surfaces API authentication or throttling problems.
"The database is slow"
Step 1 -- Check PostgreSQL for slow query warnings:
_stream:{kubernetes.pod_name="backend-postgresql-0"} ("duration" OR "slow" OR "lock")
What this does: Surfaces slow queries, lock waits, and duration warnings.
Step 2 -- Check if connections are being exhausted:
_stream:{kubernetes.pod_name="backend-postgresql-0"} ("too many connections" OR "remaining connection")
What this does: Shows connection pool exhaustion warnings.
Step 3 -- Cross-reference with orchestrator logs to see which requests are slow:
_stream:{kubernetes.container_name="orchestrator"} (level:WARN OR level:ERROR) AND (database OR postgres OR sql)
What this does: Correlates application-level warnings with database issues.
Check connection counts first. Most "slow database" issues are actually connection pool exhaustion.
"Redis is not responding"
Step 1 -- Check Redis logs directly:
_stream:{kubernetes.pod_name="backend-redis-master-0"} (error OR "OOM")
What this does: Shows Redis errors and out-of-memory events.
Step 2 -- Check the orchestrator for Redis connection errors:
_stream:{kubernetes.container_name="orchestrator"} ("redis" OR "connection refused")
What this does: Shows application-side Redis connection failures.
"A new deployment broke something -- what changed?"
Set your time range to the 10 minutes around the deployment before running these queries.
Step 1 -- Check for errors across the backend namespace:
_stream:{kubernetes.namespace_name="backend"} (error OR panic OR fatal)
What this does: Broad sweep for any errors in the backend after deploy.
Step 2 -- Watch the orchestrator's startup sequence (set time range to just after the deploy):
_stream:{kubernetes.container_name="orchestrator"} | sort by (_time) asc
What this does: Shows the orchestrator boot sequence in chronological order.
Step 3 -- Check if ArgoCD had issues during the sync:
_stream:{kubernetes.namespace_name="argocd"} "sync" (error OR failed)
What this does: Shows ArgoCD sync failures that may have caused a bad rollout.
Chapter 5: Advanced Queries
Combining conditions (AND, OR, NOT)
| Operator | Syntax | Example |
|---|---|---|
| AND | Space-separated words | error database -- lines with both words |
| OR | OR between words | error OR panic -- lines with either word |
| NOT | NOT before a word | error NOT "404" -- errors excluding 404s |
AND example:
_stream:{kubernetes.container_name="orchestrator"} error database
What this does: Matches lines containing both "error" AND "database".
OR example:
_stream:{kubernetes.container_name="orchestrator"} (error OR panic)
What this does: Matches lines containing either "error" or "panic".
NOT example:
_stream:{kubernetes.container_name="orchestrator"} error NOT "404"
What this does: Shows errors but excludes 404-related lines.
AND, OR, and NOT must be uppercase. Lowercase and, or, not will be treated as literal words to search for.
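Because AND binds tighter than OR in LogsQL, an OR group combined with a stream selector should be wrapped in parentheses, otherwise the OR escapes the selector's scope. A small helper (our own sketch, not part of any tool) that builds correctly scoped groups:

```python
def any_of(*terms: str) -> str:
    """OR-group that stays correctly scoped when ANDed with other
    filters, since AND binds tighter than OR in LogsQL."""
    return "(" + " OR ".join(terms) + ")"

# ANDed (by whitespace) with a stream selector, the parens keep the OR contained:
query = '_stream:{kubernetes.container_name="orchestrator"} ' + any_of("error", "panic")
```

Without the parentheses, `_stream:{...} error OR panic` would match `panic` lines from every container, not just the orchestrator.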
Regex matching
Use container_name for stable filtering instead of regex on pod names. Container names do not change across restarts or CronJob runs:
_stream:{kubernetes.namespace_name="userswarms", kubernetes.container_name="zeroclaw"}
What this does: Matches all ZeroClaw workspace pods regardless of pod name suffix.
Pattern matching on field values -- use field:~"pattern" outside the stream selector:
_stream:{kubernetes.namespace_name="userswarms"} kubernetes.pod_name:~"zeroclaw-workspace-81a5.*"
What this does: First selects all logs from the userswarms namespace, then filters to pods matching the pattern.
Pattern matching in the log message -- use a `:~` regex filter in a filter pipe:
_stream:{kubernetes.container_name="orchestrator"} | filter _msg:~"user_id=[0-9]+"
What this does: Finds log lines containing a numeric user_id field.
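To preview what a pattern like `user_id=[0-9]+` will match before running it against the cluster, you can test it locally with plain Python `re` (the sample lines are illustrative; LogsQL uses RE2-style regexps, which this pattern stays within):

```python
import re

pattern = re.compile(r"user_id=[0-9]+")

lines = [
    "request started user_id=42 path=/v1/auth",
    "request started path=/v1/health",   # no user_id: should not match
    "cache hit user_id=abc",             # non-numeric id: should not match
]

matches = [line for line in lines if pattern.search(line)]
```

Only the first line matches, confirming the pattern requires a numeric id.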
Counting and statistics
Count errors per container:
_stream:{kubernetes.namespace_name="backend"} error | stats by (kubernetes.container_name) count() as errors
What this does: Groups error logs by container name and counts them.
Count total errors (run over different time ranges to spot spikes):
_stream:{kubernetes.container_name="orchestrator"} level:ERROR | stats count() as error_count
What this does: Shows the total error count, useful for detecting spikes in a time range.
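What `stats by (...) count()` computes is an ordinary group-and-count. Simulated locally on sample records (illustrative data, shaped like the enriched records VictoriaLogs stores):

```python
from collections import Counter

# Sample error records as a query might return them (illustrative).
records = [
    {"kubernetes.container_name": "orchestrator", "_msg": "error: db timeout"},
    {"kubernetes.container_name": "orchestrator", "_msg": "error: bad request"},
    {"kubernetes.container_name": "webhook", "_msg": "error: pod create failed"},
]

# Equivalent of: ... | stats by (kubernetes.container_name) count() as errors
errors_by_container = Counter(r["kubernetes.container_name"] for r in records)
```

The result maps each container name to its error count, just like the `errors` column in the query output.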
Sorting results
Most recent first:
_stream:{kubernetes.container_name="orchestrator"} error | sort by (_time) desc
What this does: Shows the newest errors at the top.
Oldest first (follow a startup sequence):
_stream:{kubernetes.container_name="orchestrator"} | sort by (_time) asc | limit 100
What this does: Shows the first 100 log lines in chronological order.
Time-based filtering
Relative time (last N minutes/hours/days):
_stream:{kubernetes.namespace_name="backend"} error _time:5m
What this does: Shows errors from the last 5 minutes only.
| Shorthand | Meaning |
|---|---|
_time:5m | Last 5 minutes |
_time:1h | Last 1 hour |
_time:24h | Last 24 hours |
_time:7d | Last 7 days |
Exact time range:
_stream:{kubernetes.namespace_name="backend"} error _time:[2026-04-04T14:00:00Z, 2026-04-04T14:30:00Z]
What this does: Shows errors within a precise 30-minute window.
Use relative time (_time:5m) for quick checks. Use exact ranges when investigating a known incident window.
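The shorthand table maps directly from durations; a small helper for rendering a `timedelta` as a `_time:` filter (the `time_filter` name is our own, and it always picks the largest unit that divides evenly):

```python
from datetime import timedelta

def time_filter(delta: timedelta) -> str:
    """Render a timedelta as a LogsQL relative _time filter (m/h/d)."""
    seconds = int(delta.total_seconds())
    if seconds % 86400 == 0:
        return f"_time:{seconds // 86400}d"   # whole days
    if seconds % 3600 == 0:
        return f"_time:{seconds // 3600}h"    # whole hours
    return f"_time:{seconds // 60}m"          # fall back to minutes
```

For example, `time_filter(timedelta(minutes=5))` gives `_time:5m`, matching the first row of the table.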
JSON field filtering (for structured logs)
The orchestrator and webhook emit JSON logs via Go's slog. Fluent Bit's parser filter automatically extracts every top-level JSON key into a separate field before the record reaches VictoriaLogs. This means you do not need to match raw JSON substrings -- fields are already indexed and directly searchable.
The orchestrator emits JSON like:
{"time":"2026-04-04T12:00:00Z","level":"INFO","msg":"request received","method":"GET","path":"/v1/health","request_id":"abc123"}
After Fluent Bit parses it, VictoriaLogs receives individual fields: level, _msg, method, path, request_id, etc.
Old (wrong) -- matching a raw JSON substring:
_stream:{kubernetes.container_name="orchestrator"} "level":"ERROR"
New (correct) -- querying the extracted field directly:
_stream:{kubernetes.container_name="orchestrator"} level:ERROR
Filter by method and level:
_stream:{kubernetes.container_name="orchestrator"} method:POST level:ERROR
What this does: Finds ERROR-level logs for POST requests.
Filter by path:
_stream:{kubernetes.container_name="orchestrator"} path:/v1/auth level:ERROR
What this does: Finds ERROR-level logs for the /v1/auth endpoint.
_msg vs message
The `_msg` field contains the human-readable `msg` value from slog (e.g. `request started`), not the raw JSON string. Use `_msg` when you want to search or display the log message text.
Any service that writes JSON to stdout gets this treatment for free. Fluent Bit's parser filter detects JSON output and promotes every top-level key to its own searchable field -- no per-service configuration required.
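The promotion described above can be demonstrated in miniature. This is a sketch of the observable behavior (top-level keys become fields, the slog `msg` value lands in `_msg`), not Fluent Bit's actual implementation:

```python
import json

def promote_json_fields(raw_line: str) -> dict:
    """Sketch of the JSON handling: every top-level key becomes its own
    field, and the slog 'msg' value is exposed as _msg."""
    record = json.loads(raw_line)
    record["_msg"] = record.pop("msg", raw_line)
    return record

# The orchestrator sample line from above:
line = ('{"time":"2026-04-04T12:00:00Z","level":"INFO","msg":"request received",'
        '"method":"GET","path":"/v1/health","request_id":"abc123"}')
fields = promote_json_fields(line)
```

After promotion, `level`, `method`, `path`, and `request_id` exist as separate keys and `fields["_msg"]` holds the message text, which is why `level:ERROR` works while matching the raw JSON substring does not.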
Selecting specific fields
_stream:{kubernetes.container_name="orchestrator"} error | fields _time, _msg
What this does: Strips away Kubernetes metadata and shows only timestamp and message.
Limiting results
_stream:{kubernetes.namespace_name="backend"} error | limit 20
What this does: Returns only the first 20 matches.
Start with a small limit when exploring. You can always increase it once you know the query returns what you want.
Combining pipes
Pipes chain left to right with |:
_stream:{kubernetes.namespace_name="backend"} error
| fields _time, kubernetes.container_name, _msg
| sort by (_time) desc
| limit 50
What this does: Gets error logs from backend, keeps only three fields, sorts newest first, and returns the top 50.
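The pipe chain above is just filter, then project, then sort, then truncate. The same shape in Python over sample records (illustrative data) makes the order of operations concrete:

```python
records = [
    {"_time": "2026-04-04T12:00:03Z", "kubernetes.container_name": "webhook",
     "_msg": "error: pod create failed", "stream": "stderr"},
    {"_time": "2026-04-04T12:00:01Z", "kubernetes.container_name": "orchestrator",
     "_msg": "request started", "stream": "stdout"},
    {"_time": "2026-04-04T12:00:02Z", "kubernetes.container_name": "orchestrator",
     "_msg": "error: db timeout", "stream": "stderr"},
]

result = [
    {k: r[k] for k in ("_time", "kubernetes.container_name", "_msg")}  # | fields ...
    for r in sorted(
        (r for r in records if "error" in r["_msg"]),                  # word filter
        key=lambda r: r["_time"], reverse=True,                        # | sort desc
    )
][:50]                                                                 # | limit 50
```

Each stage consumes the previous stage's output, exactly as the `|` pipes do in LogsQL.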
Chapter 6: Quick Reference Card
| I want to... | Query |
|---|---|
| See everything | * |
| See all orchestrator logs | _stream:{kubernetes.container_name="orchestrator"} |
| See orchestrator errors | _stream:{kubernetes.container_name="orchestrator"} level:ERROR |
| See all backend namespace logs | _stream:{kubernetes.namespace_name="backend"} |
| See errors across all namespaces | error OR panic OR fatal |
| See webhook logs | _stream:{kubernetes.container_name="webhook"} |
| See all agent runtime logs | _stream:{kubernetes.namespace_name="userswarms"} |
| See a specific agent pod | _stream:{kubernetes.namespace_name="userswarms", kubernetes.pod_name="zeroclaw-workspace-81a5f386-c6a6-4c0a-b6a-3353eb37c1-0"} |
| See PostgreSQL errors | _stream:{kubernetes.pod_name="backend-postgresql-0"} (ERROR OR FATAL) |
| See Redis logs | _stream:{kubernetes.pod_name="backend-redis-master-0"} |
| See ArgoCD sync errors | _stream:{kubernetes.namespace_name="argocd"} ("sync failed" OR error) |
| See cert-manager issues | _stream:{kubernetes.namespace_name="cert-manager"} error |
| See Envoy Gateway logs | _stream:{kubernetes.namespace_name="envoy-gateway-system"} |
| See external-dns logs | _stream:{kubernetes.namespace_name="external-dns"} |
| See Fluent Bit logs | _stream:{kubernetes.container_name="fluent-bit"} |
| See only stderr output | _stream:{kubernetes.container_name="orchestrator", stream="stderr"} |
| Count errors per container | _stream:{kubernetes.namespace_name="backend"} error | stats by (kubernetes.container_name) count() as errors |
| See last 5 minutes only | _stream:{kubernetes.container_name="orchestrator"} _time:5m |
| Find a specific error message | _stream:{kubernetes.namespace_name="backend"} "connection refused" |
| See the docs site logs | _stream:{kubernetes.namespace_name="backend", kubernetes.container_name="docs"} |
Retention
Logs are retained for 14 days. Records older than 14 days are automatically deleted. If you need to investigate something older, check whether any exports or captures exist before the window closes.
🔗 Terms On This Page
If a term below is unfamiliar, open its glossary entry. For the full list, go to Internal Glossary.