
The Crawbl Logs Guide

A step-by-step reference for searching and understanding logs in the Crawbl dev cluster. No prior LogsQL experience required.


Chapter 1: Getting Started

What is VictoriaLogs and why we use it

VictoriaLogs is a log database that collects output from every container in the Crawbl cluster automatically.

  • No more SSH-ing into machines or running kubectl logs against individual pods
  • Open a browser, type a query, and search across every service at once

Why VictoriaLogs?
  • Lightweight -- runs as a single pod
  • Efficient storage -- compresses logs well
  • Purpose-built query language -- designed specifically for log data
  • Replaces heavier alternatives like Elasticsearch or Loki for our dev cluster

How logs flow from your app to VictoriaLogs

Every log line takes this path:

Your app writes to stdout/stderr
|
v
containerd (the container runtime) writes it to a file on disk
|
v
Fluent Bit (runs on every node) reads the file, attaches Kubernetes
metadata (namespace, pod name, container name), and ships it over HTTP
|
v
VictoriaLogs stores the enriched record and makes it searchable
tip

This happens automatically for every container. You do not need to configure anything in your application -- just write to stdout.
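To see the attached metadata in practice, run a query like the following (the fields and limit pipes are standard LogsQL; the field names are the ones Fluent Bit ships in this cluster):

* | fields _time, kubernetes.namespace_name, kubernetes.pod_name, kubernetes.container_name, _msg | limit 5

What this does: Shows five log lines stripped down to their timestamp, Kubernetes metadata, and message, so you can confirm the enrichment pipeline is working end to end.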

How to open the UI

Open your browser and go to:

https://dev.logs.crawbl.com/select/vmui

No credentials are required in the dev environment.

You will see three areas:

Area | Location | Purpose
--- | --- | ---
Query bar | Top center | Type queries here. Press Enter to run.
Time range picker | Top right | Controls the search window. Defaults to last 1 hour.
Results pane | Below | Shows matching log lines. Click any line to expand metadata.

Your first query

Type * into the query bar and press Enter.

  • This matches every log line in the selected time range
  • You will see logs from the orchestrator, Redis, ArgoCD, Fluent Bit, and everything else mixed together
tip

If the results are empty, check the time range picker. Extend it to "Last 24 hours" to confirm logs exist.


Chapter 2: Understanding the Log Structure

Every log line has metadata

Fluent Bit attaches metadata fields from the Kubernetes API to every log line it ships. These fields tell you exactly where the log came from without needing to read the message itself.

The fields you will use

kubernetes.namespace_name -- which namespace the pod lives in

Namespace | What lives here
--- | ---
backend | Orchestrator, webhook, reaper, PostgreSQL, Redis, pgweb, docs, website
userswarms | ZeroClaw AI agent pods (one per user workspace)
monitoring | Fluent Bit, VictoriaMetrics, VictoriaLogs
argocd | ArgoCD server, repo server, application controller, Redis
cert-manager | Certificate management controllers
envoy-gateway-system | Envoy Gateway and proxy pods
external-dns | DNS record sync controller
external-secrets | Secrets sync from AWS Secrets Manager
userswarm-controller | Metacontroller (creates agent pods)

kubernetes.pod_name -- the full pod name including the random suffix

Pod name | What it is
--- | ---
orchestrator-795499fd8b-sgctg | The Crawbl API server
userswarm-webhook-7db4d6cdcd-wq6rx | The webhook that creates agent pods
e2e-reaper-29587560-k65dv | CronJob that cleans up test resources
backend-postgresql-0 | The PostgreSQL database
backend-redis-master-0 | The Redis instance
zeroclaw-workspace-81a5f386-c6a6-4c0a-b6a-3353eb37c1-0 | A ZeroClaw agent pod
victoria-logs-0 | VictoriaLogs itself

kubernetes.container_name -- the container name within the pod

This is often the most useful field for filtering.

Common values: orchestrator, webhook, reaper, zeroclaw, redis, postgresql, fluent-bit, server (ArgoCD), repo-server (ArgoCD), application-controller (ArgoCD), vmsingle (VictoriaMetrics), vlogs (VictoriaLogs), docs, website.

stream -- stdout vs stderr

  • stdout -- normal output
  • stderr -- error output
warning

Go panics, stack traces, and fatal errors always go to stderr. Filter on stream="stderr" to catch them fast.
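For example, to see only error-stream output across a whole namespace (a sketch -- combine stream with any other stream fields as needed):

_stream:{kubernetes.namespace_name="backend", stream="stderr"}

What this does: Returns only stderr lines from every pod in the backend namespace, which is where panics and stack traces land.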

How to use _stream filters

The _stream:{...} syntax tells VictoriaLogs which logs to look at. Fields inside the braces are ANDed together -- all conditions must match.

_stream only supports exact match

Inside _stream:{...}, field values must use exact match with =. Regex operators like =~ are not valid inside _stream:{}. Instead of trying to regex-match kubernetes.pod_name, filter on kubernetes.container_name, which is stable across pod restarts. If you genuinely need pattern matching, apply it outside the stream selector with field:~"pattern".

Filter by namespace:

_stream:{kubernetes.namespace_name="backend"}

What this does: Returns all logs from every pod in the backend namespace.

Filter by namespace + container:

_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="orchestrator"}

What this does: Narrows to only the orchestrator container in the backend namespace.

Available stream fields

The four stream fields are: kubernetes.namespace_name, kubernetes.pod_name, kubernetes.container_name, and stream. You can combine any of them inside _stream:{...}.
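For example, all four can be combined to pin down one container's error stream (illustrative -- swap in the pod and container you care about):

_stream:{kubernetes.namespace_name="backend", kubernetes.pod_name="backend-postgresql-0", kubernetes.container_name="postgresql", stream="stderr"}

What this does: Matches only stderr output from the postgresql container of the backend-postgresql-0 pod.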


Chapter 3: Logs for Every Crawbl Service

This chapter covers every service running in the cluster. For each one, you get the exact query to see its logs and a query for when things go wrong.


Orchestrator

The main backend API -- handles authentication, user management, swarm requests, and all mobile app traffic.

See all logs:

_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="orchestrator"}

Filter errors only:

_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="orchestrator"} (level:ERROR OR level:WARN)
What to look for
  • JSON structured logs via Go's slog -- Fluent Bit extracts each field to top level, so level, msg, method, path, request_id are all directly searchable
  • _msg shows the human-readable message value (e.g. request started), not the raw JSON string
  • Normal operation shows INFO-level request logs
  • Filter by any extracted field, e.g.: method:POST level:ERROR for failed POST requests
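Because request_id is extracted as its own field, one log line is enough to pull the full trace of a request (the id value here is illustrative):

_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="orchestrator"} request_id:abc123

What this does: Shows every log line the orchestrator emitted for that single request, across all log levels.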

UserSwarm Webhook

Receives requests from Metacontroller and creates ZeroClaw agent pods in the userswarms namespace.

See all logs:

_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="webhook"}

Filter errors only:

_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="webhook"} (error OR panic OR "exit code")
What to look for
  • Pod creation events and validation results
  • Resource allocation decisions
  • Panic or exit code messages indicate crash failures

E2E Reaper

A CronJob that periodically cleans up resources created by end-to-end tests.

See all logs:

_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="reaper"}

Filter errors only:

_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="reaper"} (error OR failed)
What to look for
  • Counts of deleted resources
  • Successful cleanup confirmations
  • Failed deletion attempts
info

The reaper runs as a CronJob, so its pod name changes on each run. Filter by kubernetes.container_name="reaper", which stays the same across all runs.


ZeroClaw Runtime

AI agent runtimes -- each user workspace gets its own pod in the userswarms namespace.

See all logs:

_stream:{kubernetes.namespace_name="userswarms"}

Filter errors only:

_stream:{kubernetes.namespace_name="userswarms"} (error OR "exit code" OR OOMKilled OR CrashLoopBackOff)
What to look for
  • Agent startup and initialization messages
  • LLM API call results and errors
  • OOMKilled or CrashLoopBackOff indicate resource issues
Filtering a specific workspace

Use the full pod name for a specific workspace:

_stream:{kubernetes.namespace_name="userswarms", kubernetes.pod_name="zeroclaw-workspace-81a5f386-c6a6-4c0a-b6a-3353eb37c1-0"}

PostgreSQL

The primary data store for the platform.

See all logs:

_stream:{kubernetes.namespace_name="backend", kubernetes.pod_name="backend-postgresql-0"}

Filter errors only:

_stream:{kubernetes.pod_name="backend-postgresql-0"} (ERROR OR FATAL OR "deadlock detected" OR "too many connections")
What to look for
  • Connection counts and slow query warnings
  • Checkpoint activity
  • Startup and recovery messages
warning

PostgreSQL uses uppercase ERROR and FATAL in its log output -- these are not the same as lowercase error. Use the exact casing shown above.


Redis

Handles caching and pub/sub messaging for the platform.

See all logs:

_stream:{kubernetes.namespace_name="backend", kubernetes.pod_name="backend-redis-master-0"}

Filter errors only:

_stream:{kubernetes.pod_name="backend-redis-master-0"} (error OR "OOM" OR "maxmemory" OR "connection refused")
What to look for
  • Connection events and client counts
  • Memory warnings (maxmemory, OOM)
  • Persistence status (RDB/AOF save results)
note

Redis produces very few logs during normal operation. If this query returns empty results, that is expected -- Redis only logs significant events like startup, shutdown, or memory warnings. Use the Metrics Guide to monitor Redis health via redis_up and redis_memory_used_bytes instead.


ArgoCD

Syncs the cluster state to match what is committed in the crawbl-argocd-apps Git repo.

See all logs:

_stream:{kubernetes.namespace_name="argocd"}

See only the server (handles syncs):

_stream:{kubernetes.namespace_name="argocd", kubernetes.container_name="server"}

See the application controller (detects drift):

_stream:{kubernetes.namespace_name="argocd", kubernetes.container_name="application-controller"}

Filter errors only:

_stream:{kubernetes.namespace_name="argocd"} (error OR failed OR "sync failed" OR "ComparisonError")
What to look for
  • Sync status changes and health check results
  • ComparisonError means manifest generation failed
  • sync failed usually points to invalid YAML or missing resources

Envoy Gateway

The public entry point for all traffic. Handles TLS termination and routes requests to backend services.

See all logs:

_stream:{kubernetes.namespace_name="envoy-gateway-system"}

Filter errors only:

_stream:{kubernetes.namespace_name="envoy-gateway-system"} (error OR "503" OR "upstream connect" OR "no healthy upstream")
What to look for
  • 503 errors mean the upstream service is down
  • no healthy upstream means Envoy cannot reach the backend pods
  • upstream connect failures indicate networking issues

Cert-Manager

Automatically provisions and renews TLS certificates from Let's Encrypt using DNS-01 challenges via Cloudflare.

See all logs:

_stream:{kubernetes.namespace_name="cert-manager"}

Filter errors only:

_stream:{kubernetes.namespace_name="cert-manager"} (error OR "challenge failed" OR "not ready" OR "acme")
What to look for
  • Certificate issuance and renewal events
  • DNS-01 challenge progress
  • ACME protocol errors or rate limits

External DNS

Automatically creates and updates Cloudflare DNS records to point at the cluster's load balancer.

See all logs:

_stream:{kubernetes.namespace_name="external-dns"}

Filter errors only:

_stream:{kubernetes.namespace_name="external-dns"} (error OR "failed" OR "403" OR "rate limit")
What to look for
  • DNS record create/update events
  • Cloudflare API errors (403, rate limits)
  • Sync interval logs

External Secrets

Reads secrets from AWS Secrets Manager and creates matching Kubernetes Secret objects.

See all logs:

_stream:{kubernetes.namespace_name="external-secrets"}

Filter errors only:

_stream:{kubernetes.namespace_name="external-secrets"} (error OR "SecretSyncError" OR "AccessDeniedException" OR "not found")
What to look for
  • Secret sync success/failure events
  • AccessDeniedException means IAM permissions issue
  • SecretSyncError means the secret exists but could not be written to Kubernetes

Fluent Bit

Collects logs from every node and ships them to VictoriaLogs.

See all logs:

_stream:{kubernetes.namespace_name="monitoring", kubernetes.container_name="fluent-bit"}

Filter errors only:

_stream:{kubernetes.container_name="fluent-bit"} (error OR "retry" OR "chunk" OR "backpressure")
What to look for
  • Retry counts and chunk errors indicate delivery problems
  • Backpressure warnings mean VictoriaLogs cannot ingest fast enough
  • If logs are missing from other services, check Fluent Bit first
caution

If Fluent Bit is unhealthy, no logs are being collected from any service. This is the first thing to check when logs seem to be missing.


VictoriaMetrics

Stores cluster and application metrics. Exposes a Prometheus-compatible API.

See all logs:

_stream:{kubernetes.namespace_name="monitoring", kubernetes.container_name="vmsingle"}

Filter errors only:

_stream:{kubernetes.namespace_name="monitoring", kubernetes.container_name="vmsingle"} (error OR "out of memory" OR "disk")
What to look for
  • Ingestion rate and storage usage
  • Out-of-memory or disk-full warnings
  • Scrape target errors

VictoriaLogs

The log storage system you are querying right now. Its own logs help diagnose ingestion or storage problems.

See all logs:

_stream:{kubernetes.namespace_name="monitoring", kubernetes.container_name="vlogs"}

Filter errors only:

_stream:{kubernetes.namespace_name="monitoring", kubernetes.container_name="vlogs"} (error OR "disk" OR "ingestion")
What to look for
  • Ingestion errors or slow flushes
  • Disk space warnings
  • Query timeout messages

Docs Site

The Docusaurus documentation site served at dev.docs.crawbl.com.

See all logs:

_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="docs"}

Filter errors only:

_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="docs"} (error OR "502" OR "upstream")
What to look for
  • Nginx access and error logs
  • 502 means the upstream Docusaurus process crashed
  • Static asset 404s

Website

The public-facing crawbl.com marketing site.

See all logs:

_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="website"}

Filter errors only:

_stream:{kubernetes.namespace_name="backend", kubernetes.container_name="website"} (error OR "502" OR "upstream")
What to look for
  • Nginx access and error logs
  • 502 means the upstream process crashed
  • Static asset 404s

Chapter 4: Common Troubleshooting Scenarios

Each scenario walks you through the exact queries to run, in order.


"The API is returning 500 errors"

Step 1 -- Check the orchestrator for errors:

_stream:{kubernetes.container_name="orchestrator"} level:ERROR

What this does: Shows all ERROR-level log lines from the orchestrator.

Step 2 -- If the error mentions the database, check PostgreSQL:

_stream:{kubernetes.pod_name="backend-postgresql-0"} (ERROR OR FATAL)

What this does: Shows PostgreSQL errors and fatal messages.

Step 3 -- If the error mentions Redis, check Redis:

_stream:{kubernetes.pod_name="backend-redis-master-0"} error

What this does: Shows all Redis error logs.

Step 4 -- Check if the problem is at the gateway level (request never reaching the orchestrator):

_stream:{kubernetes.namespace_name="envoy-gateway-system"} ("503" OR "no healthy upstream")

What this does: Shows gateway-level failures where requests could not be routed.

tip

Start at the orchestrator and work outward. Most 500s originate in the API code itself, not infrastructure.
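If the orchestrator shows many errors, a quick aggregation narrows down which endpoint is failing (a sketch using the stats and sort pipes; path is one of the extracted slog fields):

_stream:{kubernetes.container_name="orchestrator"} level:ERROR | stats by (path) count() as errors | sort by (errors) desc

What this does: Counts ERROR lines per request path and lists the worst offenders first.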


"A user's AI agent isn't starting"

Step 1 -- Check the webhook for pod creation errors:

_stream:{kubernetes.container_name="webhook"} error

What this does: Shows errors during agent pod creation.

Step 2 -- Check if the agent pod exists and is logging:

_stream:{kubernetes.namespace_name="userswarms"}

What this does: Shows all logs from agent pods.

Step 3 -- Look for crash loops or OOM kills in agent pods:

_stream:{kubernetes.namespace_name="userswarms"} ("exit code" OR OOMKilled OR error)

What this does: Surfaces agent pods that are crashing or running out of memory.

Step 4 -- Check the metacontroller for scheduling issues:

_stream:{kubernetes.namespace_name="userswarm-controller"}

What this does: Shows metacontroller logs to diagnose why a pod was not scheduled.

warning

If Step 2 returns nothing, the pod was never created. Focus on the webhook (Step 1) and metacontroller (Step 4).


"ArgoCD sync is failing"

Step 1 -- Check for sync errors across all ArgoCD components:

_stream:{kubernetes.namespace_name="argocd"} ("sync failed" OR error)

What this does: Shows all sync failures and errors across ArgoCD.

Step 2 -- Narrow to the repo server (where manifest generation happens):

_stream:{kubernetes.namespace_name="argocd", kubernetes.container_name="repo-server"} error

What this does: Shows errors during Helm/Kustomize rendering.

Step 3 -- Check if a specific app is mentioned:

_stream:{kubernetes.namespace_name="argocd"} "orchestrator" error

What this does: Filters ArgoCD errors related to the orchestrator app.

note

Replace "orchestrator" with the name of whatever application is failing.


"TLS certificate isn't renewing"

Step 1 -- Check cert-manager for challenge failures:

_stream:{kubernetes.namespace_name="cert-manager"} (error OR "challenge" OR "not ready")

What this does: Shows certificate issuance errors and challenge status.

Step 2 -- Check if the Cloudflare API token is valid:

_stream:{kubernetes.namespace_name="cert-manager"} ("403" OR "unauthorized" OR "cloudflare")

What this does: Surfaces authentication failures with the Cloudflare API.

Step 3 -- Verify the external-secrets operator synced the Cloudflare token:

_stream:{kubernetes.namespace_name="external-secrets"} ("cloudflare" OR error)

What this does: Checks if the secret containing the Cloudflare token was delivered to the cluster.

caution

If the Cloudflare token expired or was rotated, all certificate renewals will fail. Update it in AWS Secrets Manager and restart external-secrets.


"DNS records aren't updating"

Step 1 -- Check external-dns for errors:

_stream:{kubernetes.namespace_name="external-dns"} (error OR "failed")

What this does: Shows all external-dns errors.

Step 2 -- Look for Cloudflare API rate limits or auth issues:

_stream:{kubernetes.namespace_name="external-dns"} ("rate limit" OR "403" OR "unauthorized")

What this does: Surfaces API authentication or throttling problems.


"The database is slow"

Step 1 -- Check PostgreSQL for slow query warnings:

_stream:{kubernetes.pod_name="backend-postgresql-0"} ("duration" OR "slow" OR "lock")

What this does: Surfaces slow queries, lock waits, and duration warnings.

Step 2 -- Check if connections are being exhausted:

_stream:{kubernetes.pod_name="backend-postgresql-0"} ("too many connections" OR "remaining connection")

What this does: Shows connection pool exhaustion warnings.

Step 3 -- Cross-reference with orchestrator logs to see which requests are slow:

_stream:{kubernetes.container_name="orchestrator"} (level:WARN OR level:ERROR) AND (database OR postgres OR sql)

What this does: Correlates application-level warnings with database issues.

tip

Check connection counts first. Most "slow database" issues are actually connection pool exhaustion.
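To see whether slow queries cluster around a particular time (for example, right after a deploy), bucket the matches by time -- a sketch using a stats time bucket:

_stream:{kubernetes.pod_name="backend-postgresql-0"} "duration" | stats by (_time:10m) count() as slow_queries

What this does: Counts duration-warning lines in ten-minute buckets, so a sudden cluster of slow queries stands out.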


"Redis is not responding"

Step 1 -- Check Redis logs directly:

_stream:{kubernetes.pod_name="backend-redis-master-0"} (error OR "OOM")

What this does: Shows Redis errors and out-of-memory events.

Step 2 -- Check the orchestrator for Redis connection errors:

_stream:{kubernetes.container_name="orchestrator"} ("redis" OR "connection refused")

What this does: Shows application-side Redis connection failures.


"A new deployment broke something -- what changed?"

info

Set your time range to the 10 minutes around the deployment before running these queries.

Step 1 -- Check for errors across the backend namespace:

_stream:{kubernetes.namespace_name="backend"} (error OR panic OR fatal)

What this does: Broad sweep for any errors in the backend after deploy.

Step 2 -- Watch the orchestrator's startup sequence (set time range to just after the deploy):

_stream:{kubernetes.container_name="orchestrator"} | sort by (_time) asc

What this does: Shows the orchestrator boot sequence in chronological order.

Step 3 -- Check if ArgoCD had issues during the sync:

_stream:{kubernetes.namespace_name="argocd"} "sync" (error OR failed)

What this does: Shows ArgoCD sync failures that may have caused a bad rollout.


Chapter 5: Advanced Queries

Combining conditions (AND, OR, NOT)

Operator | Syntax | Example
--- | --- | ---
AND | Space-separated words | error database -- lines with both words
OR | OR between words | error OR panic -- lines with either word
NOT | NOT before a word | error NOT "404" -- errors excluding 404s

AND example:

_stream:{kubernetes.container_name="orchestrator"} error database

What this does: Matches lines containing both "error" AND "database".

OR example:

_stream:{kubernetes.container_name="orchestrator"} (error OR panic)

What this does: Matches lines containing either "error" or "panic".

NOT example:

_stream:{kubernetes.container_name="orchestrator"} error NOT "404"

What this does: Shows errors but excludes 404-related lines.

caution

AND, OR, and NOT must be uppercase. Lowercase and, or, not will be treated as literal words to search for. Note also that AND binds tighter than OR, so _stream:{...} error OR panic parses as (_stream:{...} error) OR panic -- wrap OR groups in parentheses to keep them scoped to the stream filter.


Regex matching

Use container_name for stable filtering instead of regex on pod names. Container names do not change across restarts or CronJob runs:

_stream:{kubernetes.namespace_name="userswarms", kubernetes.container_name="zeroclaw"}

What this does: Matches all ZeroClaw workspace pods regardless of pod name suffix.

Pattern matching on field values -- use field:~"pattern" outside the stream selector:

_stream:{kubernetes.namespace_name="userswarms"} kubernetes.pod_name:~"zeroclaw-workspace-81a5.*"

What this does: First selects all logs from the userswarms namespace, then filters to pods matching the pattern.

Pattern matching in the log message -- use a _msg:~"pattern" regex match in a filter pipe:

_stream:{kubernetes.container_name="orchestrator"} | filter _msg:~"user_id=[0-9]+"

What this does: Finds log lines containing a numeric user_id field.
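If you need the matched value itself rather than the whole line, the extract pipe can pull it into its own field (a sketch -- the user_id pattern and the uid field name are illustrative):

_stream:{kubernetes.container_name="orchestrator"} "user_id=" | extract "user_id=<uid>" | stats by (uid) count() as lines

What this does: Pulls the user_id value out of each matching line into a uid field, then counts lines per user.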


Counting and statistics

Count errors per container:

_stream:{kubernetes.namespace_name="backend"} error | stats by (kubernetes.container_name) count() as errors

What this does: Groups error logs by container name and counts them.

Count errors over time (spot spikes):

_stream:{kubernetes.container_name="orchestrator"} level:ERROR | stats count() as error_count

What this does: Shows the total error count, useful for detecting spikes in a time range.
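To actually see the spike rather than a single total, group the counts into time buckets (a sketch -- LogsQL accepts a time bucket inside stats by):

_stream:{kubernetes.container_name="orchestrator"} level:ERROR | stats by (_time:5m) count() as error_count

What this does: Counts errors in five-minute buckets, so a sudden jump stands out against the baseline.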


Sorting results

Most recent first:

_stream:{kubernetes.container_name="orchestrator"} error | sort by (_time) desc

What this does: Shows the newest errors at the top.

Oldest first (follow a startup sequence):

_stream:{kubernetes.container_name="orchestrator"} | sort by (_time) asc | limit 100

What this does: Shows the first 100 log lines in chronological order.


Time-based filtering

Relative time (last N minutes/hours/days):

_stream:{kubernetes.namespace_name="backend"} error _time:5m

What this does: Shows errors from the last 5 minutes only.

Shorthand | Meaning
--- | ---
_time:5m | Last 5 minutes
_time:1h | Last 1 hour
_time:24h | Last 24 hours
_time:7d | Last 7 days

Exact time range:

_stream:{kubernetes.namespace_name="backend"} error _time:[2026-04-04T14:00:00Z, 2026-04-04T14:30:00Z]

What this does: Shows errors within a precise 30-minute window.

tip

Use relative time (_time:5m) for quick checks. Use exact ranges when investigating a known incident window.


JSON field filtering (for structured logs)

The orchestrator and webhook emit JSON logs via Go's slog. Fluent Bit's parser filter automatically extracts every top-level JSON key into a separate field before the record reaches VictoriaLogs. This means you do not need to match raw JSON substrings -- fields are already indexed and directly searchable.

The orchestrator emits JSON like:

{"time":"2026-04-04T12:00:00Z","level":"INFO","msg":"request received","method":"GET","path":"/v1/health","request_id":"abc123"}

After Fluent Bit parses it, VictoriaLogs receives individual fields: level, _msg, method, path, request_id, etc.

Old (wrong) -- matching a raw JSON substring:

_stream:{kubernetes.container_name="orchestrator"} "level":"ERROR"

New (correct) -- querying the extracted field directly:

_stream:{kubernetes.container_name="orchestrator"} level:ERROR

Filter by method and level:

_stream:{kubernetes.container_name="orchestrator"} method:POST level:ERROR

What this does: Finds ERROR-level logs for POST requests.

Filter by path:

_stream:{kubernetes.container_name="orchestrator"} path:/v1/auth level:ERROR

What this does: Finds ERROR-level logs for the /v1/auth endpoint.

_msg vs message

The _msg field contains the human-readable msg value from slog (e.g. request started), not the raw JSON string. Use _msg when you want to search or display the log message text.

JSON field extraction is automatic

Any service that writes JSON to stdout gets this treatment for free. Fluent Bit's parser filter detects JSON output and promotes every top-level key to its own searchable field -- no per-service configuration required.


Selecting specific fields

_stream:{kubernetes.container_name="orchestrator"} error | fields _time, _msg

What this does: Strips away Kubernetes metadata and shows only timestamp and message.


Limiting results

_stream:{kubernetes.namespace_name="backend"} error | limit 20

What this does: Returns only the first 20 matches.

tip

Start with a small limit when exploring. You can always increase it once you know the query returns what you want.


Combining pipes

Pipes chain left to right with |:

_stream:{kubernetes.namespace_name="backend"} error
| fields _time, kubernetes.container_name, _msg
| sort by (_time) desc
| limit 50

What this does: Gets error logs from backend, keeps only three fields, sorts newest first, and returns the top 50.


Chapter 6: Quick Reference Card

I want to... | Query
--- | ---
See everything | *
See all orchestrator logs | _stream:{kubernetes.container_name="orchestrator"}
See orchestrator errors | _stream:{kubernetes.container_name="orchestrator"} level:ERROR
See all backend namespace logs | _stream:{kubernetes.namespace_name="backend"}
See errors across all namespaces | error OR panic OR fatal
See webhook logs | _stream:{kubernetes.container_name="webhook"}
See all agent runtime logs | _stream:{kubernetes.namespace_name="userswarms"}
See a specific agent pod | _stream:{kubernetes.namespace_name="userswarms", kubernetes.pod_name="zeroclaw-workspace-81a5f386-c6a6-4c0a-b6a-3353eb37c1-0"}
See PostgreSQL errors | _stream:{kubernetes.pod_name="backend-postgresql-0"} (ERROR OR FATAL)
See Redis logs | _stream:{kubernetes.pod_name="backend-redis-master-0"}
See ArgoCD sync errors | _stream:{kubernetes.namespace_name="argocd"} ("sync failed" OR error)
See cert-manager issues | _stream:{kubernetes.namespace_name="cert-manager"} error
See Envoy Gateway logs | _stream:{kubernetes.namespace_name="envoy-gateway-system"}
See external-dns logs | _stream:{kubernetes.namespace_name="external-dns"}
See Fluent Bit logs | _stream:{kubernetes.container_name="fluent-bit"}
See only stderr output | _stream:{kubernetes.container_name="orchestrator", stream="stderr"}
Count errors per container | _stream:{kubernetes.namespace_name="backend"} error \| stats by (kubernetes.container_name) count() as errors
See last 5 minutes only | _stream:{kubernetes.container_name="orchestrator"} _time:5m
Find a specific error message | _stream:{kubernetes.namespace_name="backend"} "connection refused"
See the docs site logs | _stream:{kubernetes.namespace_name="backend", kubernetes.container_name="docs"}

Retention

warning

Logs are retained for 14 days. Records older than 14 days are automatically deleted. If you need to investigate something older, check whether any exports or captures exist before the window closes.

🔗 Terms On This Page

If a term below is unfamiliar, open its glossary entry. For the full list, go to Internal Glossary.

  • DOKS: DigitalOcean Kubernetes, the managed Kubernetes service used for the Crawbl cluster.
  • ArgoCD: The GitOps deployment system that keeps the cluster aligned with what is committed in Git.