Runtime Wire Contract

The wire contract between the orchestrator and each Agent Runtime pod uses gRPC bidirectional streaming over port 42618. Every user message routed to a swarm flows through this protocol.

Warning: This is a gRPC contract, not an HTTP webhook. The legacy HTTP/JSON webhook format was replaced with the gRPC bidi stream defined in proto/agentruntime/v1/runtime.proto.

Service Definition

service AgentRuntime {
  rpc Converse(stream ConverseRequest) returns (stream ConverseEvent);
}

The orchestrator opens a Converse bidi stream and sends one ConverseRequest per user turn. For each turn, the runtime streams ConverseEvent messages back until a DoneEvent closes that turn. The stream itself stays open across multiple turns.
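The turn lifecycle can be sketched in Go as a receive loop that stops at DoneEvent without closing the stream. This is only an illustration: `Event`, `EventStream`, `collectTurn`, and `sliceStream` are hypothetical stand-ins, since the generated protobuf types are not shown on this page. Only the event names come from the contract.

```go
package main

import (
	"errors"
	"io"
)

// Event is a hypothetical stand-in for the generated ConverseEvent type.
// Kind names the oneof case that is set, e.g. "ChunkEvent".
type Event struct {
	Kind string
	Text string
}

// EventStream is a hypothetical stand-in for the receive side of Converse.
type EventStream interface {
	Recv() (Event, error)
}

// collectTurn reads events until a DoneEvent ends the current turn.
// It returns the concatenated chunk text and leaves the stream open,
// so the caller can send the next ConverseRequest on it.
func collectTurn(s EventStream) (string, error) {
	var text string
	for {
		ev, err := s.Recv()
		if errors.Is(err, io.EOF) {
			return text, errors.New("stream closed before DoneEvent")
		}
		if err != nil {
			return text, err
		}
		switch ev.Kind {
		case "ChunkEvent":
			text += ev.Text
		case "DoneEvent":
			return text, nil
		}
	}
}

// sliceStream replays a fixed list of events, which makes it possible
// to exercise the loop without a running server.
type sliceStream struct {
	events []Event
}

func (s *sliceStream) Recv() (Event, error) {
	if len(s.events) == 0 {
		return Event{}, io.EOF
	}
	ev := s.events[0]
	s.events = s.events[1:]
	return ev, nil
}
```

Calling `collectTurn` a second time on the same stream reads the next turn, which mirrors how one stream serves several turns.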

Connection Details

Property	Value
Protocol	gRPC (bidi stream)
Port	42618
Service address	`{serviceName}.{namespace}.svc.cluster.local:42618`
Authentication	HMAC bearer token in `authorization` gRPC metadata
Session correlation	`session_id` field in request, NOT a header

Agent Runtime pods are not exposed through a public route. The orchestrator resolves the service over cluster DNS.
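As a small illustration of the address template in the table, the following Go helper assembles the in-cluster target. The function name and the sample service and namespace values are made up for the example.

```go
package main

import "fmt"

// runtimePort is the gRPC port listed in the connection table.
const runtimePort = 42618

// runtimeAddress builds the cluster-DNS target for a runtime Service.
// The helper itself is illustrative and not taken from the codebase.
func runtimeAddress(serviceName, namespace string) string {
	return fmt.Sprintf("%s.%s.svc.cluster.local:%d", serviceName, namespace, runtimePort)
}
```

For example, `runtimeAddress("swarm-a", "tenants")` yields `swarm-a.tenants.svc.cluster.local:42618`.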

Request: ConverseRequest

Each message on the request stream carries a single user turn:

Field	Type	Required	Description
`session_id`	string	Yes	Conversation ID for session continuity
`message`	string	Yes	The user's message text
`agent_id`	string	No	Target agent slug (e.g., `"wally"`). Empty means the Manager routes.
`system_prompt`	string	No	Optional per-turn system prompt override
`workspace_id`	string	First request	Workspace identifier (required on the first request in a stream)
`user_id`	string	First request	User identifier (required on the first request in a stream)

Example

{
  "session_id": "conv-uuid-here",
  "message": "What meetings do I have tomorrow?",
  "agent_id": "wally",
  "workspace_id": "ws-uuid",
  "user_id": "user-uuid"
}
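The field requirements in the table could be checked along these lines. The struct is a hand-written stand-in for the generated message, and whether the runtime actually rejects requests this way is an assumption. The sketch only encodes the "Required" column.

```go
package main

import "errors"

// ConverseRequest mirrors the fields in the request table.
// It is a stand-in, not the generated protobuf type.
type ConverseRequest struct {
	SessionID    string
	Message      string
	AgentID      string // optional; empty means the Manager routes
	SystemPrompt string // optional per-turn override
	WorkspaceID  string
	UserID       string
}

// validate checks the required fields. first reports whether this is the
// first request on the stream, which also needs workspace_id and user_id.
func validate(req ConverseRequest, first bool) error {
	if req.SessionID == "" {
		return errors.New("session_id is required")
	}
	if req.Message == "" {
		return errors.New("message is required")
	}
	if first && (req.WorkspaceID == "" || req.UserID == "") {
		return errors.New("workspace_id and user_id are required on the first request")
	}
	return nil
}
```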

Response: ConverseEvent

The server sends back a stream of ConverseEvent messages. Each message carries exactly one of the following event types in a oneof:

Event type	Description
`ChunkEvent`	Partial text delta streamed token-by-token
`ThinkingEvent`	Reasoning-step delta (extended thinking)
`ToolCallEvent`	Emitted before a tool is invoked
`ToolResultEvent`	Emitted after a tool invocation completes
`UsageEvent`	Token usage counts after each LLM call
`DoneEvent`	Terminal event with aggregated turns and model name

ChunkEvent

{ "agent_id": "wally", "text": "Based on " }

ChunkEvents are streamed as partial text deltas. The orchestrator forwards them to the mobile app as Socket.IO message.chunk events.

ThinkingEvent

ThinkingEvent has the same shape as ChunkEvent but carries reasoning-step text. The UI may render it differently, for example in a collapsed thinking section.

ToolCallEvent

{
  "agent_id": "wally",
  "tool": "web_search",
  "args_json": "{\"query\": \"meetings tomorrow\"}",
  "call_id": "call-123"
}

ToolResultEvent

{
  "call_id": "call-123",
  "result_json": "{\"results\": [...]}",
  "error": false
}
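The two examples share a call_id, which suggests that a consumer pairs each result with its call through that field. A hedged Go sketch of that pairing follows. It also decodes `args_json`, which is a JSON document carried as a string. The struct definitions are stand-ins derived from the JSON examples, and `pendingCalls` is a hypothetical helper.

```go
package main

import "encoding/json"

// ToolCallEvent and ToolResultEvent follow the JSON examples on this page.
type ToolCallEvent struct {
	AgentID  string `json:"agent_id"`
	Tool     string `json:"tool"`
	ArgsJSON string `json:"args_json"`
	CallID   string `json:"call_id"`
}

type ToolResultEvent struct {
	CallID     string `json:"call_id"`
	ResultJSON string `json:"result_json"`
	Error      bool   `json:"error"`
}

// pendingCalls tracks tool calls that have not produced a result yet.
type pendingCalls map[string]ToolCallEvent

// onCall remembers a call until its result arrives.
func (p pendingCalls) onCall(ev ToolCallEvent) {
	p[ev.CallID] = ev
}

// onResult looks up the originating call by call_id and removes it
// from the pending set. ok is false for an unknown call_id.
func (p pendingCalls) onResult(ev ToolResultEvent) (call ToolCallEvent, ok bool) {
	call, ok = p[ev.CallID]
	delete(p, ev.CallID)
	return call, ok
}

// decodeArgs parses the JSON document embedded in args_json.
func decodeArgs(ev ToolCallEvent) (map[string]any, error) {
	var args map[string]any
	err := json.Unmarshal([]byte(ev.ArgsJSON), &args)
	return args, err
}
```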

UsageEvent

Emitted after each LLM GenerateContent call returns. Multiple UsageEvents may be emitted per turn if tool calls trigger multiple LLM round-trips.

{
  "agent_id": "wally",
  "model": "gpt-4o",
  "prompt_tokens": 1250,
  "completion_tokens": 340,
  "total_tokens": 1590,
  "cached_tokens": 800,
  "thoughts_tokens": 0,
  "tool_use_prompt_tokens": 200,
  "call_sequence": 0
}

The orchestrator accumulates these for quota enforcement and forwards them to the analytics pipeline via NATS.
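Because a turn can produce several UsageEvents, a consumer needs a running total. The sketch below shows one way to accumulate them in Go. The field types, the `TurnUsage` type, and the quota check are assumptions for illustration, and the orchestrator's real accounting is not shown on this page.

```go
package main

// UsageEvent mirrors a subset of the fields in the example payload.
// The integer widths are assumed.
type UsageEvent struct {
	Model            string
	PromptTokens     int64
	CompletionTokens int64
	TotalTokens      int64
	CallSequence     int32
}

// TurnUsage is a running total over all LLM calls in one turn.
type TurnUsage struct {
	Calls            int
	PromptTokens     int64
	CompletionTokens int64
	TotalTokens      int64
}

// Add folds one UsageEvent into the running total.
func (t *TurnUsage) Add(ev UsageEvent) {
	t.Calls++
	t.PromptTokens += ev.PromptTokens
	t.CompletionTokens += ev.CompletionTokens
	t.TotalTokens += ev.TotalTokens
}

// OverQuota reports whether the accumulated total exceeds limit.
// The limit and its meaning are hypothetical.
func (t *TurnUsage) OverQuota(limit int64) bool {
	return t.TotalTokens > limit
}
```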

DoneEvent

Terminal event for a single turn. Carries the aggregated turns array and model identifier.

{
  "model": "gpt-4o",
  "turns": [
    { "agent_id": "wally", "text": "Based on your calendar, you have three meetings tomorrow..." }
  ]
}

How Agent Routing Works

Step 1: Choose a responder

The orchestrator decides whether the message should go to a specific delegate agent or to the Manager.

Step 2: Open or reuse a gRPC stream

The orchestrator opens a Converse bidi stream to the workspace's runtime pod, or reuses an existing stream for the same conversation.

Step 3: Send the ConverseRequest

The request includes the user message, the session ID, and an optional agent_id. The runtime resolves the agent slug against its ADK agent configuration. An empty agent_id leaves routing to the Manager.

Step 4: Stream events back

The runtime streams ChunkEvents (text deltas), ThinkingEvents, ToolCallEvents, ToolResultEvents, UsageEvents, and a terminal DoneEvent back to the orchestrator.

Step 5: Orchestrator processes events

Each event type triggers different orchestrator behavior. Chunks are forwarded to the mobile app via Socket.IO, usage events update Postgres counters and are published to NATS, and done events finalize and persist the messages.
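The per-event behavior in the last step maps naturally onto a type switch. The Go sketch below uses hypothetical event structs and a `Sinks` struct of callbacks in place of the real Socket.IO, Postgres, and NATS integrations, none of which are shown here.

```go
package main

// Minimal stand-ins for three of the event types.
type ChunkEvent struct{ AgentID, Text string }
type UsageEvent struct{ TotalTokens int64 }
type DoneEvent struct{ Model string }

// Sinks groups hypothetical side-effect hooks, one per behavior
// described in the routing steps.
type Sinks struct {
	ForwardChunk func(agentID, text string) // e.g. emit a Socket.IO message.chunk
	RecordUsage  func(totalTokens int64)    // e.g. update counters, publish to NATS
	Finalize     func(model string)         // e.g. persist the finished messages
}

// dispatch routes a single event to its sink and reports whether the
// turn is finished. Event types without a sink are ignored in this sketch.
func dispatch(ev any, s Sinks) (done bool) {
	switch e := ev.(type) {
	case ChunkEvent:
		s.ForwardChunk(e.AgentID, e.Text)
	case UsageEvent:
		s.RecordUsage(e.TotalTokens)
	case DoneEvent:
		s.Finalize(e.Model)
		return true
	}
	return false
}
```

Keeping the side effects behind callbacks means the routing logic can be tested without any of the downstream systems.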

Session Continuity

The session_id field in ConverseRequest links requests to a conversation. The runtime uses this to maintain per-conversation state (ADK session) across turns. The orchestrator passes the conversation UUID as the session ID.
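Since the conversation UUID doubles as the session ID, an orchestrator-side cache of open streams can be keyed by it. The following Go sketch illustrates that idea only. `Stream`, `streamPool`, and the `open` callback are hypothetical names and do not describe the actual client in internal/userswarm/client/.

```go
package main

import "sync"

// Stream is a hypothetical handle for an open Converse stream.
type Stream struct {
	SessionID string
}

// streamPool keeps at most one open stream per session_id.
type streamPool struct {
	mu      sync.Mutex
	streams map[string]*Stream
	open    func(sessionID string) (*Stream, error)
}

// newStreamPool takes the function used to open a new stream.
func newStreamPool(open func(sessionID string) (*Stream, error)) *streamPool {
	return &streamPool{streams: make(map[string]*Stream), open: open}
}

// get returns the existing stream for a conversation, or opens one.
func (p *streamPool) get(sessionID string) (*Stream, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if s, ok := p.streams[sessionID]; ok {
		return s, nil
	}
	s, err := p.open(sessionID)
	if err != nil {
		return nil, err
	}
	p.streams[sessionID] = s
	return s, nil
}
```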

Authentication

Runtime pods authenticate incoming requests with an HMAC token. The orchestrator computes the token using CRAWBL_MCP_SIGNING_KEY as the secret:

HMAC-SHA256(secret, userID:workspaceID)

The signed token is sent in the authorization gRPC metadata header. The runtime validates it before processing any request.
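A minimal Go sketch of the signing formula, using the standard library's crypto/hmac package. The hex encoding of the digest and the function names are assumptions. This page does not state how the token is encoded or how it is formatted inside the authorization value.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
)

// signToken computes HMAC-SHA256(secret, userID:workspaceID).
// Hex-encoding the digest is an assumption made for this sketch.
func signToken(secret []byte, userID, workspaceID string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(userID + ":" + workspaceID))
	return hex.EncodeToString(mac.Sum(nil))
}

// verifyToken recomputes the expected token and compares it to the
// presented one in constant time, as a runtime-side check might.
func verifyToken(secret []byte, userID, workspaceID, token string) bool {
	expected := signToken(secret, userID, workspaceID)
	return hmac.Equal([]byte(expected), []byte(token))
}
```

The constant-time comparison via `hmac.Equal` avoids leaking how many leading characters of a forged token were correct.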

Error Handling

If the runtime encounters an error during a turn, it sends a synthetic DoneEvent whose model field is prefixed with "ERROR:":

{
  "model": "ERROR: INTERNAL: runner: context deadline exceeded",
  "turns": []
}

The orchestrator detects this pattern and surfaces a user-visible error. The stream stays open for subsequent turns.
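Detecting the error convention comes down to a prefix check on the model field. The helper below is an illustrative Go sketch, and the name `turnError` is made up for the example.

```go
package main

import "strings"

// errorPrefix marks a synthetic DoneEvent that reports a failed turn.
const errorPrefix = "ERROR:"

// turnError inspects a DoneEvent's model field. If it carries the
// error prefix, the remaining text is returned with failed == true.
func turnError(model string) (msg string, failed bool) {
	if !strings.HasPrefix(model, errorPrefix) {
		return "", false
	}
	return strings.TrimSpace(strings.TrimPrefix(model, errorPrefix)), true
}
```

With the example payload above, `turnError` returns `INTERNAL: runner: context deadline exceeded` and `true`. For a normal model name such as `gpt-4o`, it returns an empty string and `false`.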

Key Source Files

File	Purpose
`proto/agentruntime/v1/runtime.proto`	gRPC service and message definitions
`internal/agentruntime/server/converse.go`	Runtime-side Converse handler
`internal/orchestrator/service/chatservice/stream.go`	Orchestrator-side stream processing
`internal/userswarm/client/`	gRPC client used by orchestrator

What's Next

See Secrets Management to understand how API keys and credentials are delivered to Agent Runtime pods.
